diff --git a/docs/source/backends-nxp.md b/docs/source/backends-nxp.md
index 4783b4a5bc6..5fcfaa21912 100644
--- a/docs/source/backends-nxp.md
+++ b/docs/source/backends-nxp.md
@@ -10,14 +10,14 @@ For up-to-date status about running ExecuTorch on Neutron Backend please visit t
 ## Features
 
-Executorch v1.0 supports running machine learning models on selected NXP chips (for now only i.MXRT700).
+ExecuTorch v1.0 supports running machine learning models on selected NXP chips (for now only i.MXRT700).
 Among currently supported machine learning models are:
-- Convolution-based neutral networks
-- Full support for MobileNetv2 and CifarNet
+- Convolution-based neural networks
+- Full support for MobileNetV2 and CifarNet
 
 ## Prerequisites (Hardware and Software)
 
-In order to succesfully build executorch project and convert models for NXP eIQ Neutron Backend you will need a computer running Windows or Linux.
+To successfully build the ExecuTorch project and convert models for the NXP eIQ Neutron Backend, you will need a computer running Linux.
 If you want to test the runtime, you'll also need:
 - Hardware with NXP's [i.MXRT700](https://www.nxp.com/products/i.MX-RT700) chip or a testing board like MIMXRT700-AVK
@@ -32,9 +32,94 @@ To test converting a neural network model for inference on NXP eIQ Neutron Backe
 ./examples/nxp/aot_neutron_compile.sh [model (cifar10 or mobilenetv2)]
 ```
 
-For a quick overview how to convert a custom PyTorch model, take a look at our [exmple python script](https://github.com/pytorch/executorch/tree/release/1.0/examples/nxp/aot_neutron_compile.py).
+For a quick overview of how to convert a custom PyTorch model, take a look at our [example Python script](https://github.com/pytorch/executorch/tree/release/1.0/examples/nxp/aot_neutron_compile.py).
+
+### Partitioner API
+
+The partitioner is defined in `NeutronPartitioner` in `backends/nxp/neutron_partitioner.py`. It has the following
+arguments:
+* `compile_spec` - a list of key-value pairs defining the compilation, e.g. the target platform (i.MXRT700) and the Neutron Converter flavor.
+* `custom_delegation_options` - custom options controlling which nodes are delegated.
+
+### Quantization
+
+Quantization for the Neutron Backend is defined in `NeutronQuantizer` in `backends/nxp/quantizer/neutron_quantizer.py`.
+Quantization follows the PT2E workflow, and INT8 quantization is supported. Operators are quantized statically: activations
+follow an affine per-tensor quantization scheme, while weights follow a symmetric per-tensor scheme.
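+
+To make the two schemes concrete, the sketch below illustrates the underlying mapping in plain
+PyTorch (a generic illustration of per-tensor affine vs. symmetric INT8 quantization, not Neutron-specific code):
+
+```python
+import torch
+
+x = torch.randn(1, 8)  # example activation tensor
+w = torch.randn(4, 8)  # example weight tensor
+
+# Affine (asymmetric) per-tensor scheme, as used for activations:
+# a scale and zero point map the observed float range onto INT8.
+x_scale = (x.max() - x.min()) / 255.0
+x_zero_point = torch.round(-128 - x.min() / x_scale).clamp(-128, 127)
+x_q = torch.clamp(torch.round(x / x_scale) + x_zero_point, -128, 127).to(torch.int8)
+
+# Symmetric per-tensor scheme, as used for weights: the zero point is fixed at 0.
+w_scale = w.abs().max() / 127.0
+w_q = torch.clamp(torch.round(w / w_scale), -127, 127).to(torch.int8)
+```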
+
+#### Supported operators
+
+List of ATen operators supported by the Neutron quantizer:
+
+`abs`, `adaptive_avg_pool2d`, `addmm`, `add.Tensor`, `avg_pool2d`, `cat`, `conv1d`, `conv2d`, `dropout`,
+`flatten.using_ints`, `hardtanh`, `hardtanh_`, `linear`, `max_pool2d`, `mean.dim`, `pad`, `permute`, `relu`, `relu_`,
+`reshape`, `view`, `softmax.int`, `sigmoid`, `tanh`, `tanh_`
+
+#### Example
+```python
+import torch
+from executorch.backends.nxp.quantizer.neutron_quantizer import NeutronQuantizer
+from torchao.quantization.pt2e.quantize_pt2e import convert_pt2e, prepare_pt2e
+
+# Prepare your model in ATen dialect
+aten_model = get_model_in_aten_dialect()
+# Prepare calibration inputs: each tuple is one example, with one item per model input
+calibration_inputs: list[tuple[torch.Tensor, ...]] = get_calibration_inputs()
+quantizer = NeutronQuantizer()
+
+m = prepare_pt2e(aten_model, quantizer)
+for data in calibration_inputs:
+    m(*data)
+m = convert_pt2e(m)
+```
 
 ## Runtime Integration
 
-To learn how to run the converted model on the NXP hardware, use one of our example projects on using executorch runtime from MCUXpresso IDE example projects list.
-For more finegrained tutorial, visit [this manual page](https://mcuxpresso.nxp.com/mcuxsdk/latest/html/middleware/eiq/executorch/docs/nxp/topics/example_applications.html).
+To learn how to run the converted model on NXP hardware, use one of the ExecuTorch runtime example projects from the MCUXpresso IDE example projects list.
+For a more fine-grained tutorial, visit [this manual page](https://mcuxpresso.nxp.com/mcuxsdk/latest/html/middleware/eiq/executorch/docs/nxp/topics/example_applications.html).
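+
+The `.pte` file that these example projects load is produced by the ahead-of-time flow described
+above. Below is a minimal sketch of that last step, reusing `m` and `calibration_inputs` from the
+quantization example and assuming the `generate_neutron_compile_spec` helper used by
+[aot_neutron_compile.py](https://github.com/pytorch/executorch/tree/release/1.0/examples/nxp/aot_neutron_compile.py),
+which is the place to check for the exact current API:
+
+```python
+import torch
+from executorch.backends.nxp.neutron_partitioner import NeutronPartitioner
+from executorch.backends.nxp.nxp_backend import generate_neutron_compile_spec  # assumed helper
+from executorch.exir import to_edge_transform_and_lower
+
+# Export the quantized model; one calibration tuple serves as example inputs.
+exported_program = torch.export.export(m, calibration_inputs[0])
+
+# Partition and lower the supported subgraphs to the Neutron NPU (i.MXRT700).
+edge_program = to_edge_transform_and_lower(
+    exported_program,
+    partitioner=[NeutronPartitioner(generate_neutron_compile_spec("imxrt700"))],
+)
+
+# Serialize the program; the MCUXpresso example projects load this file on the device.
+with open("model.pte", "wb") as f:
+    f.write(edge_program.to_executorch().buffer)
+```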