38 changes: 15 additions & 23 deletions backends/nxp/README.md
@@ -15,7 +15,8 @@ networks, as well as the ability to adapt and scale to new model architectures,
to AI workloads. ML application development with the eIQ Neutron NPU is fully supported by the
[eIQ machine learning software development environment](https://www.nxp.com/design/design-center/software/eiq-ml-development-environment/eiq-toolkit-for-end-to-end-model-development-and-deployment:EIQ-TOOLKIT).
The eIQ AI SW Stack provides a streamlined development experience for developers and end-users of NXP products.
eIQ extensions connect broader AI ecosystems to the edge, such as the NVIDIA TAO extension, which enables developers
to bring AI models trained and fine-tuned with TAO to NXP-powered edge devices.


## Supported NXP platforms
@@ -35,37 +36,28 @@ improvements. NXP and the ExecuTorch community are actively developing this codebase.

## Neutron Backend implementation and SW architecture
Neutron Backend uses the eIQ Neutron Converter as ML compiler to compile the delegated subgraph to Neutron microcode.

The Neutron Backend, in its early prototype phase, is based on existing NXP products such as
onnx2tflite, known from NXP's eIQ Toolkit.
The **onnx2tflite** is a converter from the ONNX format to LiteRT (formerly known as TFLite).
It consists of 3 stages:
* ONNX Model Parsing
* Tensor Format Inference, to identify tensors using the channels-first layout
* ONNX to LiteRT Conversion
* Optimization Passes, which operate on top of the LiteRT format
* LiteRT Serialization
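The Tensor Format Inference stage deals with the layout mismatch between channels-first (NCHW) tensors, common in ONNX and PyTorch, and the channels-last (NHWC) layout LiteRT expects. A minimal, stdlib-only sketch of the shape permutation involved (illustrative only, not the converter's actual code):

```python
# Illustrative only: the NCHW -> NHWC shape permutation that tensor
# format inference must account for; not the converter's real code.
def nchw_to_nhwc(shape):
    """Permute a channels-first (N, C, H, W) shape to channels-last."""
    n, c, h, w = shape
    return (n, h, w, c)

print(nchw_to_nhwc((1, 3, 32, 32)))  # (1, 32, 32, 3)
```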

Due to the similarities between ONNX-to-LiteRT and Edge-to-LiteRT conversion, the Neutron Backend
currently leverages the Tensor Format Inference and the LiteRT Optimizer.
This should be considered a temporary solution, intended to be replaced with:
* Dim Order (https://github.com/pytorch/executorch/issues/4873)
* Corresponding ExecuTorch/ATen passes

before reaching higher maturity status by the end of 2025.
For the **eIQ Neutron N3** class, the Neutron Converter accepts the ML model in LiteRT format; the Neutron Backend
therefore uses the LiteRT flatbuffers format as the IR between ExecuTorch and the Neutron Converter ML compiler.
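Conceptually, the backend's job is to partition the graph, hand the supported subgraph to the Neutron Converter for compilation to microcode, and leave the rest to the CPU. A toy, stdlib-only sketch of that partitioning idea (the operator names and supported set below are illustrative assumptions, not the real partitioner, which operates on the Edge graph):

```python
# Toy illustration of delegation: split a flat list of ops into a
# Neutron-delegated subgraph and a CPU fallback. Purely illustrative;
# the real backend partitions an Edge dialect graph, not a list.
NEUTRON_SUPPORTED = {"conv2d", "relu", "linear"}  # assumed example set

def partition(ops):
    delegated = [op for op in ops if op in NEUTRON_SUPPORTED]
    fallback = [op for op in ops if op not in NEUTRON_SUPPORTED]
    return delegated, fallback

delegated, fallback = partition(["conv2d", "relu", "softmax"])
print(delegated, fallback)  # ['conv2d', 'relu'] ['softmax']
```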

## Layout
The current code base is as follows:
* `backend/ir/` - TFLite/LiteRT based IR to represent the Edge Subgraph, taken from onnx2tflite code base and extended to
support Edge Dialect to LiteRT conversion.
* `backend/ir/converter` - Neutron Backend's conversion from the Edge (ATen) dialect to LiteRT/TFLite. The subfolder
`node_converters` is structured as a single module for each Edge operator.
* `backend/ir/lib` - automatically generated handlers from LiteRT flatbuffers schema.
* `backend/ir/tflite_generator` and `backend/ir/tflite_optimizer` handle the serialization
of the in-memory built subgraph for delegation into LiteRT/TFLite flatbuffers
representation. Code taken from the onnx2tflite tool.
* `edge_passes` - Various passes operating on Edge dialect level.
* `quantizer` - Neutron Backend quantizer implementation.
* `runtime` - Neutron Backend runtime implementation, for running the compiled model on device.
* `tests/` - Unit tests for Neutron backend.
* `tests/converter/node_converter` - Operator-level unit tests.

* `examples/nxp/` - Example models and scripts for running them.

## Examples
Please see this [README.md](https://github.com/pytorch/executorch/blob/main/examples/nxp/README.md).

## Help & Improvements
If you have problems or questions or have suggestions for ways to make
52 changes: 39 additions & 13 deletions examples/nxp/README.md
@@ -1,20 +1,46 @@
# ExecuTorch Neutron Backend examples
This directory contains examples demonstrating the use of the ExecuTorch AoT flow to convert a PyTorch model to the
ExecuTorch format and delegate the model computation to the eIQ Neutron NPU using the eIQ Neutron Backend.

## Layout
* `experimental/` - contains the CifarNet model example.
* `models` - various example models.
* `aot_neutron_compile.py` - script with end-to-end ExecuTorch AoT Neutron Backend workflow.
* `README.md` - this file.
* `run_aot_example.sh` - utility script for `aot_neutron_compile.py`.
* `setup.sh` - setup script for Neutron Converter installation.

## Setup
Please finish the [Setting up ExecuTorch](https://pytorch.org/executorch/main/getting-started-setup) tutorial first.

Run the `setup.sh` script to install the `neutron-converter`:
```commandline
$ ./examples/nxp/setup.sh
```

## Supported models
* CifarNet
* MobileNetV2

## PyTorch Model Delegation to Neutron Backend
We will start with an example script that converts the model. This example shows the preparation of the CifarNet model.
It is the same model that is part of the `example_cifarnet` in the
[MCUXpresso SDK](https://www.nxp.com/design/design-center/software/development-software/mcuxpresso-software-and-tools-/mcuxpresso-software-development-kit-sdk:MCUXpresso-SDK).

The NXP MCUXpresso software and tools offer comprehensive development solutions designed to help accelerate embedded
system development of applications based on MCUs from NXP. The MCUXpresso SDK includes a flexible set of peripheral
drivers designed to speed up and simplify development of embedded applications.

The steps are expected to be executed from the `executorch` root folder.

1. Run the `aot_neutron_compile.py` example with the `cifar10` model
```commandline
$ python -m examples.nxp.aot_neutron_compile --quantize \
--delegate --neutron_converter_flavor SDK_25_09 -m cifar10
```

2. This generates a `cifar10_nxp_delegate.pte` file, which can be used with the MCUXpresso SDK `cifarnet_example`
project, presented [here](https://mcuxpresso.nxp.com/mcuxsdk/latest/html/middleware/eiq/executorch/docs/nxp/topics/example_applications.html#how-to-build-and-run-executorch-cifarnet-example).
This project will guide you through deploying your PTE model to the device.
To get the MCUXpresso SDK, follow this [guide](https://mcuxpresso.nxp.com/mcuxsdk/latest/html/middleware/eiq/executorch/docs/nxp/topics/getting_mcuxpresso.html)
and use MCUXpresso SDK v25.09.00.
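As background on the `--quantize` flag used above: it runs the Neutron quantizer before delegation, mapping float tensors to int8. A minimal sketch of the affine int8 (de)quantization scheme such quantizers typically use (illustrative math only, not the quantizer's API; the scale and zero-point values are made up for the example):

```python
# Background sketch of affine int8 quantization: q = round(x / scale) + zp,
# clamped to the int8 range. Illustrative only; the actual parameters are
# derived by the Neutron quantizer during calibration.
def quantize(x, scale, zero_point):
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to int8 range

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

q = quantize(0.5, scale=0.02, zero_point=0)
print(q)  # 25
```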