From c11c5b3fbbbfee597f114f0e5c93d311db1aa8cb Mon Sep 17 00:00:00 2001
From: Siddartha Pothapragada
Date: Thu, 16 Oct 2025 05:58:47 -0700
Subject: [PATCH] Summary: Add Pico2 Tutorial

---
 docs/source/embedded-section.md       |   3 +
 docs/source/pico2_tutorial.md         | 198 ++++++++++++++++++++++++++
 examples/raspberry_pi/pico2/README.md |  39 ++---
 3 files changed, 224 insertions(+), 16 deletions(-)
 create mode 100644 docs/source/pico2_tutorial.md

diff --git a/docs/source/embedded-section.md b/docs/source/embedded-section.md
index 5636a7546dc..aac64190030 100644
--- a/docs/source/embedded-section.md
+++ b/docs/source/embedded-section.md
@@ -26,6 +26,8 @@ Start here for C++ development with ExecuTorch runtime APIs and essential tutori
 - {doc}`tutorial-arm-ethos-u` — Export a simple PyTorch model for the ExecuTorch Ethos-U backend
 - {doc}`raspberry_pi_llama_tutorial` — Deploy a LLaMA model on a Raspberry Pi
+- {doc}`pico2_tutorial` — Deploy a demo MNIST model on the Raspberry Pi Pico 2
+
 ```{toctree}
 :hidden:
@@ -38,3 +40,4 @@ using-executorch-building-from-source
 embedded-backends
 tutorial-arm-ethos-u
 raspberry_pi_llama_tutorial
+pico2_tutorial

diff --git a/docs/source/pico2_tutorial.md b/docs/source/pico2_tutorial.md
new file mode 100644
index 00000000000..7098df11b05
--- /dev/null
+++ b/docs/source/pico2_tutorial.md
@@ -0,0 +1,198 @@
# Pico2: A Simple MNIST Tutorial

Deploy your PyTorch models directly to the Raspberry Pi Pico2 microcontroller with ExecuTorch.
## What You'll Build

A 28×28 MNIST digit classifier running on a memory-constrained, low-power microcontroller:

- Input: ASCII art digits (0, 1, 4, 7)
- Output: Real-time predictions via USB serial
- Memory: <400KB total footprint

## Prerequisites

- [Environment Setup section](https://docs.pytorch.org/executorch/1.0/using-executorch-building-from-source.html)

- Accept the EULA agreement and set up the Arm toolchain as described in [this guide](https://docs.pytorch.org/executorch/1.0/backends-arm-ethos-u.html#development-requirements)

- Verify the Arm toolchain:

```bash
which arm-none-eabi-gcc # --> arm/ethos-u-scratch/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi/bin/
```

## Step 1: Generate a .pte from the Example Model

- Use the [provided example model](https://github.com/pytorch/executorch/blob/main/examples/raspberry_pi/pico2/export_mlp_mnist.py)

```bash
python export_mlp_mnist.py # Creates balanced_tiny_mlp_mnist.pte
```

- **Note:** This is a hand-crafted MNIST classifier (proof of concept), not production-trained. This tiny MLP recognizes digits 0, 1, 4, and 7 using manually designed feature detectors.

## Step 2: Build Firmware for Pico2

```bash
# Generate the model
python export_mlp_mnist.py # Creates balanced_tiny_mlp_mnist.pte

# Build the Pico2 firmware (one command!)
./executorch/examples/rpi/build_firmware_pico.sh --model=balanced_tiny_mlp_mnist.pte # Creates executorch_pico.uf2, a firmware image for Pico2
```

Output: **executorch_pico.uf2** firmware file (examples/raspberry_pi/pico2/build/)

**Note:** The `build_firmware_pico.sh` script converts the given model `.pte` into a hex array and generates C code for it via this helper [script](https://github.com/pytorch/executorch/blob/main/examples/raspberry_pi/pico2/pte_to_array.py). The generated C code is then compiled into the final `.uf2` binary, which is flashed to Pico2.
## Step 3: Flash to Pico2

1. Hold the BOOTSEL button on Pico2
2. Connect USB → it mounts as the `RPI-RP2` drive
3. Drag & drop the `executorch_pico.uf2` file
4. Release BOOTSEL → Pico2 reboots with your model

## Step 4: Verify Deployment

**Success indicators:**

- LED blinks 10× at 500ms → Model running ✅
- LED blinks 10× at 100ms → Error, check serial ❌

**View predictions:**

```bash
# Connect a serial terminal
screen /dev/tty.usbmodem1101 115200

# Expected output (something like):

=== Digit 7 ===
############################
############################
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
 ####
####
###

Input stats: 159 white pixels out of 784 total
Running neural network inference...
✅ Neural network results:
 Digit 0: 370.000
 Digit 1: 0.000
 Digit 2: -3.000
 Digit 3: -3.000
 Digit 4: 860.000
 Digit 5: -3.000
 Digit 6: -3.000
 Digit 7: 1640.000 ← PREDICTED
 Digit 8: -3.000
 Digit 9: -3.000

PREDICTED: 7 (Expected: 7) ✅ CORRECT!
```

## Memory Optimization Tips

### Pico2 Constraints

- 520KB SRAM (runtime memory)
- 4MB Flash (model storage)
- Keep models small

### Common Issues

- "Memory allocation failed" → Reduce model size and use quantization
- "Operator missing" → Use a selective build: `--operators=add,mul,relu`
- "Import error" → Check the `arm-none-eabi-gcc` toolchain setup
To resolve the issues above, refer to the following guides:

- [ExecuTorch Quantization Optimization Guide](https://docs.pytorch.org/executorch/1.0/quantization-optimization.html)
- [Model Export & Lowering](https://docs.pytorch.org/executorch/1.0/using-executorch-export.html)
- [Selective Build support](https://docs.pytorch.org/executorch/1.0/kernel-library-selective-build.html)

### Firmware Size Analysis

```bash
cd
ls -al examples/raspberry_pi/pico2/build/executorch_pico.elf
```

- **Overall section sizes**

```bash
arm-none-eabi-size -A examples/raspberry_pi/pico2/build/executorch_pico.elf
```

- **Detailed section breakdown**

```bash
arm-none-eabi-objdump -h examples/raspberry_pi/pico2/build/executorch_pico.elf
```

- **Symbol sizes (largest consumers)**

```bash
arm-none-eabi-nm --print-size --size-sort --radix=d examples/raspberry_pi/pico2/build/executorch_pico.elf | tail -20
```

### Model Memory Footprint

- **Model data specifically**

```bash
arm-none-eabi-nm --print-size --size-sort --radix=d examples/raspberry_pi/pico2/build/executorch_pico.elf | grep -i model
```

- **Check what's in .bss (uninitialized data)**

```bash
arm-none-eabi-objdump -t examples/raspberry_pi/pico2/build/executorch_pico.elf | grep ".bss" | head -10
```

- **Memory map overview**

```bash
arm-none-eabi-readelf -l examples/raspberry_pi/pico2/build/executorch_pico.elf
```

## Next Steps

### Scale Up Your Deployment

- Use a real, production-trained model
- Optimize further → INT8 quantization, pruning

### Happy Inference!
**Result:** PyTorch model → Pico2 deployment in 4 simple steps 🚀
Total tutorial time: ~15 minutes

**Conclusion:** Real-time inference on memory-constrained, low-power microcontrollers: a complete PyTorch → ExecuTorch → Pico2 demo MNIST deployment

diff --git a/examples/raspberry_pi/pico2/README.md b/examples/raspberry_pi/pico2/README.md
index 976754d6c5e..e9da5a7fd1d 100644
--- a/examples/raspberry_pi/pico2/README.md
+++ b/examples/raspberry_pi/pico2/README.md
@@ -4,44 +4,48 @@ This document outlines the steps required to run a simple MNIST digit recognitio

## Demo Model: Hand-crafted MNIST Classifier

-The included `export_mlp_mnist.py` creates a demonstration model with hand-crafted weights (not production-trained). This tiny MLP recognizes digits 0, 1, 4, and 7 using manually designed feature detectors.
+The included `export_mlp_mnist.py` (in examples/raspberry_pi/pico2) creates a demonstration model with hand-crafted weights (not production-trained). This tiny MLP recognizes digits 0, 1, 4, and 7 using manually designed feature detectors.

Note: This is a proof-of-concept. For production use, train your model on real MNIST data.

-## Bring Your Own Model
+## Bring Your Own Model and Deploy

This demo shows ExecuTorch's ability to take your own PyTorch model and deploy it to Pico2 with one simple script.
The complete pipeline works from any PyTorch model to a runnable binary:

-### Train your PyTorch model
+- Use the existing demo model (examples/raspberry_pi/pico2/export_mlp_mnist.py) or bring your own model
+- Build the firmware with one command, passing the model file (.pte) as an argument
+- Deploy directly to Pico2

-Export using `torch.export()` and `to_edge()`
-Build firmware with one command
-Deploy directly to Pico2

+### Important Caveats

-#### Important Caveats:
-
-- Memory constraints - Models must fit in 520KB SRAM
+- Memory constraints - Models must fit in 520KB SRAM (Pico2)
 - Missing operators - Some ops may not be supported
-- Selective builds - Include only operators your model uses
+- Selective builds - Include only the operators your model uses if you want to reduce binary size

## Memory Constraints & Optimization

-- Critical: Pico2 has limited memory:
-- 520KB SRAM (on-chip static RAM)
-- 4MB QSPI Flash (onboard storage)
+- Critical: Pico2 has limited memory
+  - 520KB SRAM (on-chip static RAM)
+  - 4MB QSPI Flash (onboard storage)

### Always apply optimization techniques on large models that do not fit in Pico2 memory

Large models will not fit. Keep your `.pte` files small!
- Quantization (INT8, INT4)
- Model pruning
- Operator fusion
- Selective builds (include only needed operators)

-For more details, refer to the [ExecuTorch Quantization Optimization Guide](https://docs.pytorch.org/executorch/1.0/quantization-optimization.html), [Model Export & Lowering](https://docs.pytorch.org/executorch/1.0/using-executorch-export.html) and [Selective Build support](https://docs.pytorch.org/executorch/1.0/kernel-library-selective-build.html)
+
+For more details, refer to the following guides:
+
+- [ExecuTorch Quantization Optimization Guide](https://docs.pytorch.org/executorch/1.0/quantization-optimization.html)
+- [Model Export & Lowering](https://docs.pytorch.org/executorch/1.0/using-executorch-export.html)
+- [Selective Build support](https://docs.pytorch.org/executorch/1.0/kernel-library-selective-build.html)

## (Prerequisites) Prepare the Environment for Arm

Set up the ExecuTorch development environment. Also see the instructions for setting up the environment for Arm.

-Make sure you have the toolchain configured correctly. Refer to this [setup](https://docs.pytorch.org/executorch/1.0/backends-arm-ethos-u.html#development-requirements) for more details.
+Make sure you have the toolchain configured correctly. Refer to this [setup](https://docs.pytorch.org/executorch/main/backends-arm-ethos-u.html#development-requirements) for more details.

```bash
which arm-none-eabi-gcc

@@ -73,6 +77,7 @@ Hold the BOOTSEL button on Pico2 and connect to your computer. It mounts as `RPI

### Verify Execution

The Pico2 LED blinks 10 times at 500ms intervals for successful execution. Via the serial terminal, you'll see:

```bash
...
...
@@ -134,9 +139,11 @@ Running neural network inference...

### Debugging via Serial Terminal

On macOS/Linux:

```bash
screen /dev/tty.usbmodem1101 115200
```

Replace `/dev/tty.usbmodem1101` with your device path.
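Beyond watching the output in `screen`, predictions can also be captured programmatically. A small host-side helper can parse the `PREDICTED:` lines out of the captured serial log. This is a hypothetical sketch based on the sample output format above (wiring it to a live port, e.g. via `pyserial`, is left to the reader):

```python
# Hypothetical host-side parser for the firmware's serial log lines.
# Assumes the "PREDICTED: <digit>" format shown in the sample output.
import re

def parse_prediction(log_line: str):
    """Extract the predicted digit from a firmware log line, or None."""
    match = re.search(r"PREDICTED:\s*(\d)", log_line)
    return int(match.group(1)) if match else None

print(parse_prediction("PREDICTED: 7 (Expected: 7) CORRECT!"))  # -> 7
print(parse_prediction("Running neural network inference..."))  # -> None
```

This kind of helper is handy for scripting regression checks: flash a firmware build, replay a set of test digits, and assert on the parsed predictions.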
If the LED blinks 10 times at 100ms intervals, check the logs for errors; if it blinks 10 times at 500ms intervals, execution was successful!

-Result: A complete PyTorch → ExecuTorch → Pico2 demo neural network deployment! 🚀
+Result: A complete PyTorch → ExecuTorch → Pico2 demo MNIST deployment! 🚀