feat: Add Nordic AXON NPU backend for nRF54LM20B #18863
petriok wants to merge 1 commit into pytorch:main
Add ExecuTorch backend for Nordic Semiconductor's AXON NPU, targeting the nRF54LM20B (ARM Cortex-M33 + hardware neural network accelerator).

Follows the same composition pattern as the Ethos-U backend: reuses TOSABackend for TOSA lowering, then compiles to AXON command buffers via Nordic's compiler library.

Python backend (backends/nordic/):
- AxonBackend: @Final BackendDetails with TOSA composition
- AxonPartitioner: extends TOSAPartitioner with AXON constraint checks
- AxonQuantizer: wraps TOSAQuantizer with AXON INT8 defaults
- AxonCompileSpec: hardware constraints and SDK path configuration
- TOSA-to-AXON compiler bridge with per-op converters
- Subgraph naming (content-hash), marker format, header code generation
- Operator support checks (FC 2048, Conv 16x16, Pool 32x32, max 2 inputs)

C++ runtime (backends/nordic/runtime/):
- AxonBackend delegate: marker-based multi-subgraph lookup, profiling API
- Op extensions: single-precision sigmoid/tanh CPU callbacks

Zephyr integration:
- Kconfig: EXECUTORCH_BUILD_NORDIC_AXON (depends on NRF_AXON)
- CMakeLists: auto-link executorch_delegate_axon

Examples (examples/nordic/):
- hello_axon: minimal sin(x) regression (1 AXON subgraph)
- multi_layer: chained FC classifier with profiling (1 subgraph)
- simple_rnn: RNN with recurrent state, multi-subgraph delegation (2 subgraphs)
- Each includes export script, Zephyr firmware, and setup instructions

Tests: 48 passed, 6 skipped (require Nordic SDK)

Hardware verified on nRF54LM20DK: all three examples produce correct inference output with AXON NPU acceleration.

The Nordic sdk-edge-ai (containing the AXON compiler library) is an external dependency, not redistributed. Discovered via SDK_EDGE_AI_PATH environment variable, the same pattern as Ethos-U's dependency on Vela.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
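The operator support checks listed in the commit message can be pictured with a small sketch. This is not the code from the PR: `AxonLimits` and `is_supported` are hypothetical names, and reading "FC 2048" as a feature-count cap and "Conv 16x16" / "Pool 32x32" as kernel and window caps is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AxonLimits:
    """Hypothetical limits read off the commit message (interpretation assumed)."""
    max_fc_features: int = 2048   # assumed meaning of "FC 2048"
    max_conv_kernel: int = 16     # assumed meaning of "Conv 16x16"
    max_pool_window: int = 32     # assumed meaning of "Pool 32x32"
    max_inputs: int = 2           # "max 2 inputs"


def is_supported(
    op: str,
    *,
    features: int = 0,
    kernel: Tuple[int, int] = (1, 1),
    num_inputs: int = 1,
    limits: AxonLimits = AxonLimits(),
) -> bool:
    """Return True if the op fits the assumed limits, else False (stay on CPU)."""
    if num_inputs > limits.max_inputs:
        return False
    if op == "fc":
        return features <= limits.max_fc_features
    if op == "conv":
        return max(kernel) <= limits.max_conv_kernel
    if op == "pool":
        return max(kernel) <= limits.max_pool_window
    # Ops this sketch does not know about are conservatively rejected.
    return False
```

A partitioner built this way would delegate only the nodes for which the check returns True and leave everything else to the portable CPU kernels.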
Lists NCS v3.3.0-preview3 and links to pytorch/executorch#18863 as explicit prerequisites for the deploy notebook.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run PyTorch models on Nordic's AXON NPU.

This repository packages the showcase side of the AXON backend work: models, notebooks, and the deployment environment. The backend itself is proposed upstream as pytorch/executorch#18863.

Included:
- Three hardware-verified showcase models (nRF54LM20DK), each with a PyTorch training path, INT8 PT2E quantization with AxonQuantizer, AxonPartitioner delegation, and Zephyr firmware:
  * Anomaly detection: autoencoder, 428 params, ~230 µs/inference
  * Image classifier: 8×8 CNN, 1,508 params, ~680 µs/inference
  * Keyword spotting: MFCC CNN, 16,332 params, ~1,600 µs/inference
- Four progressive Jupyter notebooks covering first principles, each of the three showcase tasks, and end-to-end flash + serial verification on the DK.
- Docker environment with everything pre-installed: NCS toolchain v3.3.0-preview3, ExecuTorch with the AXON backend, PyTorch (CPU), SEGGER J-Link, Jupyter Lab. One build + one run to reach a working environment.
- Architecture and supported-ops guides covering TOSA composition with the ARM Ethos-U backend, the AxonPartitioner, and the current NPU op set (FC, Conv1D/2D, depthwise Conv, pool, element-wise, ReLU family) plus op extensions for sigmoid, tanh, and softmax.

Apache 2.0 licensed. Nordic's sdk-edge-ai is proprietary and mounted by the user at runtime; SEGGER J-Link is downloaded at Docker build under SEGGER's terms. See README for the full third-party table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
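Because sdk-edge-ai is not redistributed and the PR says it is discovered via the SDK_EDGE_AI_PATH environment variable, a lookup along the following lines is one plausible shape for that discovery. The helper name `find_sdk_edge_ai` is hypothetical and the real backend may validate the path differently; only the variable name comes from the PR text.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_sdk_edge_ai(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the SDK directory named by SDK_EDGE_AI_PATH, or None if unusable.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    raw = env.get("SDK_EDGE_AI_PATH")
    if not raw:
        # Variable unset or empty: the caller can skip SDK-dependent steps.
        return None
    path = Path(raw).expanduser()
    # Only accept an existing directory; a stale path is treated as "not found".
    return path if path.is_dir() else None
```

Returning None instead of raising fits the reported test run, where SDK-dependent tests are skipped rather than failed when the SDK is absent.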
Created a showcase repository: https://github.com/ioteai/axon-ai
Summary
ExecuTorch backend for Nordic Semiconductor's AXON NPU on the nRF54LM20B (ARM Cortex-M33 + hardware neural network accelerator).

Follows the same composition pattern as the Ethos-U backend: reuses TOSABackend for TOSA lowering, then compiles to AXON command buffers via Nordic's compiler library.

Python backend (backends/nordic/):
- AxonBackend, AxonPartitioner, AxonQuantizer, AxonCompileSpec

C++ runtime (backends/nordic/runtime/):
- AxonBackend delegate (marker-based multi-subgraph lookup, profiling API)
- Op extensions (sigmoid/tanh CPU callbacks)

Zephyr integration:
- EXECUTORCH_BUILD_NORDIC_AXON Kconfig option
- executorch_delegate_axon auto-linked in CMake

Examples (examples/nordic/):
- hello_axon
- multi_layer
- simple_rnn

Each example includes an export script, Zephyr firmware, and step-by-step README.

Nordic's sdk-edge-ai (containing the AXON compiler library) is an external dependency, not redistributed. It is discovered via the SDK_EDGE_AI_PATH environment variable, the same pattern as Ethos-U's dependency on Vela.

Test plan
hello_axon: sin(1.57) = 0.967, 163 us @ 128 MHz, 1 AXON delegate
multi_layer: 4/4 classes correct, ~210 us, 1 delegate (chained layers)
simple_rnn: 4 RNN steps, ~690 us/step, 2 delegates (tanh breaks chain)
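To compare these timings on a clock-independent basis, the latencies can be converted to core cycles. The sketch below assumes the 128 MHz clock quoted for hello_axon also applies to the other two measurements, which the test plan does not state; `cycles` is a hypothetical helper name.

```python
def cycles(latency_us: int, clock_mhz: int) -> int:
    """Convert a latency in microseconds to clock cycles (us * MHz = cycles)."""
    return latency_us * clock_mhz


CLOCK_MHZ = 128  # stated for hello_axon; assumed for the other examples

# Latencies copied from the test plan above (approximate where marked with ~).
measurements_us = {"hello_axon": 163, "multi_layer": 210, "simple_rnn_step": 690}

for name, latency in measurements_us.items():
    print(f"{name}: {cycles(latency, CLOCK_MHZ)} cycles")
# hello_axon works out to 163 * 128 = 20864 cycles
```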