3 changes: 3 additions & 0 deletions docs/source/embedded-section.md
@@ -26,6 +2,8 @@ Start here for C++ development with ExecuTorch runtime APIs and essential tutori

- {doc}`tutorial-arm-ethos-u` — Export a simple PyTorch model for the ExecuTorch Ethos-U backend
- {doc}`raspberry_pi_llama_tutorial` — Deploy a LLaMA model on a Raspberry Pi
- {doc}`pico2_tutorial` — Deploy a demo MNIST model on the Raspberry Pi Pico 2


```{toctree}
:hidden:
@@ -38,3 +40,4 @@ using-executorch-building-from-source
embedded-backends
tutorial-arm-ethos-u
raspberry_pi_llama_tutorial
pico2_tutorial
198 changes: 198 additions & 0 deletions docs/source/pico2_tutorial.md
@@ -0,0 +1,198 @@
# Pico2: A Simple MNIST Tutorial

Deploy your PyTorch models directly to the Raspberry Pi Pico2 microcontroller with ExecuTorch.

## What You'll Build

A 28×28 MNIST digit classifier running on a memory-constrained, low-power microcontroller:

- Input: ASCII art digits (0, 1, 4, 7)
- Output: Real-time predictions via USB serial
- Memory: <400KB total footprint

## Prerequisites

- Complete the [Environment Setup section](https://docs.pytorch.org/executorch/1.0/using-executorch-building-from-source.html)

- Accept the Arm EULA and set up the toolchain as described in the [Ethos-U development requirements](https://docs.pytorch.org/executorch/1.0/backends-arm-ethos-u.html#development-requirements)

- Verify the Arm toolchain is on your `PATH`:

```bash
which arm-none-eabi-gcc # --> arm/ethos-u-scratch/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi/bin/
```

## Step 1: Generate a `.pte` from the Example Model

- Use the [provided example model](https://github.com/pytorch/executorch/blob/main/examples/raspberry_pi/pico2/export_mlp_mnist.py)

```bash
python export_mlp_mnist.py # Creates balanced_tiny_mlp_mnist.pte
```

- **Note:** This is a hand-crafted MNIST classifier (proof of concept), not a production-trained model. This tiny MLP recognizes digits 0, 1, 4, and 7 using manually designed feature detectors.

## Step 2: Build Firmware for Pico2

```bash
# Generate the model (created in Step 1)
python export_mlp_mnist.py  # Creates balanced_tiny_mlp_mnist.pte

# Build Pico2 firmware (one command!)
./executorch/examples/rpi/build_firmware_pico.sh --model=balanced_tiny_mlp_mnist.pte
# Creates executorch_pico.uf2, a firmware image for Pico2
```

Output: **executorch_pico.uf2** firmware file in `examples/raspberry_pi/pico2/build/`

**Note:** The `build_firmware_pico.sh` script converts the model `.pte` into a hex array and generates C code for it via this helper [script](https://github.com/pytorch/executorch/blob/main/examples/raspberry_pi/pico2/pte_to_array.py). The generated C code is then compiled into the final `.uf2` binary, which is flashed to the Pico2.
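Conceptually, the helper embeds the model bytes as a C array. The stdlib-only sketch below shows the idea; the real `pte_to_array.py` may use different variable names and formatting.

```python
# Simplified sketch of converting .pte bytes into a C byte-array header.
# Variable naming and formatting are illustrative, not pte_to_array.py's exact output.
def pte_to_c_array(pte_bytes: bytes, var_name: str = "model_pte") -> str:
    lines = [f"const unsigned char {var_name}[] = {{"]
    # Emit 12 bytes per line as hex literals
    for i in range(0, len(pte_bytes), 12):
        chunk = pte_bytes[i:i + 12]
        lines.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines.append("};")
    lines.append(f"const unsigned int {var_name}_len = {len(pte_bytes)};")
    return "\n".join(lines)


# Tiny demo input standing in for a real .pte file's contents:
header = pte_to_c_array(b"\x00\x01\x02")
print(header)
```

The generated header is then compiled into the firmware so the model lives in flash rather than being loaded from a filesystem.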

## Step 3: Flash to Pico2

1. Hold the BOOTSEL button on the Pico2
2. Connect USB → it mounts as the `RPI-RP2` drive
3. Drag and drop the `executorch_pico.uf2` file onto the drive
4. Release BOOTSEL → the Pico2 reboots with your model

## Step 4: Verify Deployment

**Success indicators:**

- LED blinks 10× at 500ms → Model running ✅
- LED blinks 10× at 100ms → Error, check serial ❌

**View predictions:**

```bash
# Connect a serial terminal
screen /dev/tty.usbmodem1101 115200
```

Expected output, something like:

```
=== Digit 7 ===
############################
############################
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
####
###

Input stats: 159 white pixels out of 784 total
Running neural network inference...
✅ Neural network results:
Digit 0: 370.000
Digit 1: 0.000
Digit 2: -3.000
Digit 3: -3.000
Digit 4: 860.000
Digit 5: -3.000
Digit 6: -3.000
Digit 7: 1640.000 ← PREDICTED
Digit 8: -3.000
Digit 9: -3.000

PREDICTED: 7 (Expected: 7) ✅ CORRECT!
```

## Memory Optimization Tips

### Pico2 Constraints

- 520KB SRAM (runtime memory)
- 4MB Flash (model storage)
- Keep models small; large models will not fit
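As a quick sanity check before flashing, you can compare your `.pte` size against the Pico2's raw memory budgets. This is a hedged sketch: the thresholds are the chip's totals, and the runtime's own overhead means usable space is smaller in practice.

```python
# Rough pre-flash sanity check of a .pte file against Pico2 memory budgets.
# Thresholds are raw chip totals; usable space is smaller in practice.
import os

SRAM_BYTES = 520 * 1024        # Pico2 on-chip SRAM
FLASH_BYTES = 4 * 1024 * 1024  # Pico2 onboard QSPI flash


def check_model_size(pte_path: str) -> str:
    """Return a rough verdict on whether a .pte may fit on the Pico2."""
    size = os.path.getsize(pte_path)
    if size > FLASH_BYTES:
        return "too large for flash: quantize or prune"
    if size > SRAM_BYTES:
        return "fits in flash, but may not fit in SRAM at runtime"
    return "within raw SRAM budget (runtime overhead still applies)"


# Example with a dummy 2 KB file standing in for a real .pte:
with open("dummy.pte", "wb") as f:
    f.write(b"\0" * 2048)
print(check_model_size("dummy.pte"))
```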

### Common Issues

- "Memory allocation failed" → reduce model size and apply quantization
- "Operator missing" → use a selective build: `--operators=add,mul,relu`
- "Import error" → check the `arm-none-eabi-gcc` toolchain setup

To resolve these issues, refer to the following guides:

- [ExecuTorch Quantization Optimization Guide](https://docs.pytorch.org/executorch/1.0/quantization-optimization.html)
- [Model Export &amp; Lowering](https://docs.pytorch.org/executorch/1.0/using-executorch-export.html)
- [Selective Build support](https://docs.pytorch.org/executorch/1.0/kernel-library-selective-build.html)

### Firmware Size Analysis

```bash
cd <root of executorch repo>
ls -al examples/raspberry_pi/pico2/build/executorch_pico.elf
```

- **Overall section sizes**

```bash
arm-none-eabi-size -A examples/raspberry_pi/pico2/build/executorch_pico.elf
```

- **Detailed section breakdown**

```bash
arm-none-eabi-objdump -h examples/raspberry_pi/pico2/build/executorch_pico.elf
```

- **Symbol sizes (largest consumers)**

```bash
arm-none-eabi-nm --print-size --size-sort --radix=d examples/raspberry_pi/pico2/build/executorch_pico.elf | tail -20
```

### Model Memory Footprint

- **Model data specifically**

```bash
arm-none-eabi-nm --print-size --size-sort --radix=d examples/raspberry_pi/pico2/build/executorch_pico.elf | grep -i model
```

- **Check what's in .bss (uninitialized data)**

```bash
arm-none-eabi-objdump -t examples/raspberry_pi/pico2/build/executorch_pico.elf | grep ".bss" | head -10
```

- **Memory map overview**

```bash
arm-none-eabi-readelf -l examples/raspberry_pi/pico2/build/executorch_pico.elf
```

## Next Steps

### Scale up your deployment

- Use a real, production-trained model
- Optimize further → INT8 quantization, pruning

### Happy Inference!

**Result:** PyTorch model → Pico2 deployment in 4 simple steps 🚀
Total tutorial time: ~15 minutes

**Conclusion:** Real-time inference on a memory-constrained, low-power microcontroller: a complete PyTorch → ExecuTorch → Pico2 demo MNIST deployment
39 changes: 23 additions & 16 deletions examples/raspberry_pi/pico2/README.md
@@ -4,44 +4,48 @@ This document outlines the steps required to run a simple MNIST digit recognitio

## Demo Model: Hand-crafted MNIST Classifier

The included `export_mlp_mnist.py` creates a demonstration model with hand-crafted weights (not production-trained). This tiny MLP recognizes digits 0, 1, 4, and 7 using manually designed feature detectors.
The included `export_mlp_mnist.py` (in `examples/raspberry_pi/pico2`) creates a demonstration model with hand-crafted weights (not production-trained). This tiny MLP recognizes digits 0, 1, 4, and 7 using manually designed feature detectors.
Note: This is a proof-of-concept. For production use, train your model on real MNIST data.

## Bring Your Own Model
## Bring Your Own Model and Deploy

This example demonstrates ExecuTorch's ability to take your own PyTorch model and deploy it to the Pico2 with one simple script. The complete pipeline works from any PyTorch model to a runnable binary:

### Train your PyTorch model
- Use existing demo model (examples/raspberry_pi/pico2/export_mlp_mnist.py) or bring your own model
- Build firmware with one command and pass the model file (.pte) as an argument
- Deploy directly to Pico2

Export using `torch.export()` and `to_edge()`
Build firmware with one command
Deploy directly to Pico2
### Important Caveats

#### Important Caveats:

- Memory constraints - Models must fit in 520KB SRAM
- Memory constraints - Models must fit in 520KB SRAM (Pico2)
- Missing operators - Some ops may not be supported
- Selective builds - Include only operators your model uses
- Selective builds - Include only operators your model uses if you want to reduce binary size

## Memory Constraints & Optimization

- Critical: Pico2 has limited memory:
- 520KB SRAM (on-chip static RAM)
- 4MB QSPI Flash (onboard storage)
- Critical: Pico2 has limited memory
- 520KB SRAM (on-chip static RAM)
- 4MB QSPI Flash (onboard storage)

### Apply optimization techniques to models that do not fit in Pico2 memory

Large models will not fit. Keep your `.pte` files small!

- Quantization (INT8, INT4)
- Model pruning
- Operator fusion
- Selective builds (include only needed operators)
For more details , refer to the [ExecuTorch Quantization Optimization Guide](https://docs.pytorch.org/executorch/1.0/quantization-optimization.html), [Model Export & Lowering](https://docs.pytorch.org/executorch/1.0/using-executorch-export.html) and [Selective Build support](https://docs.pytorch.org/executorch/1.0/kernel-library-selective-build.html)

For more details, refer to the following guides:

- [ExecuTorch Quantization Optimization Guide](https://docs.pytorch.org/executorch/1.0/quantization-optimization.html)
- [Model Export &amp; Lowering](https://docs.pytorch.org/executorch/1.0/using-executorch-export.html)
- [Selective Build support](https://docs.pytorch.org/executorch/1.0/kernel-library-selective-build.html)

## (Prerequisites) Prepare the Environment for Arm

Set up the ExecuTorch development environment. Also see the instructions for setting up the environment for Arm.
Make sure you have the toolchain configured correctly. Refer to this [setup](https://docs.pytorch.org/executorch/1.0/backends-arm-ethos-u.html#development-requirements) for more details.
Make sure you have the toolchain configured correctly. Refer to this [setup](https://docs.pytorch.org/executorch/main/backends-arm-ethos-u.html#development-requirements) for more details.

```bash
which arm-none-eabi-gcc
@@ -73,6 +77,7 @@ Hold the BOOTSEL button on Pico2 and connect to your computer. It mounts as `RPI
### Verify Execution

The Pico2 LED blinks 10 times at 500ms intervals for successful execution. Via serial terminal, you'll see:

```bash
...
...
@@ -134,9 +139,11 @@ Running neural network inference...
### Debugging via Serial Terminal

On macOS/Linux:

```bash
screen /dev/tty.usbmodem1101 115200
```

Replace `/dev/tty.usbmodem1101` with your device path. If the LED blinks 10 times at 100ms intervals, check the logs for errors; if it blinks 10 times at 500ms intervals, execution succeeded.

Result: A complete PyTorch → ExecuTorch → Pico2 demo neural network deployment! 🚀
Result: A complete PyTorch → ExecuTorch → Pico2 demo MNIST deployment! 🚀