feat(ruvllm-esp32): tiny RuvLLM agents on heterogeneous ESP32 SoCs (ADR-165, closes #409)#410
Open
…DR-165, closes #409)

Reframes `examples/ruvLLM/esp32-flash` from a single-chip "tiny LLM" skeleton (which had drifted out of sync with `lib.rs` and was reported as broken in #409) into a fleet of tiny ruvLLM/ruvector agents. Each ESP32 chip runs ONE role drawn from the canonical primitive surface defined in ADR-002, ADR-074, ADR-084.

Roles (one binary, one chip, one role):
- HnswIndexer: MicroHNSW kNN + HashEmbedder (ESP32-C3 default)
- RagRetriever: MicroRAG retrieval (ESP32 default)
- AnomalySentinel: AnomalyDetector (ESP32-S2 default)
- MemoryArchivist: SemanticMemory type-tagged (ESP32-C6 default)
- LoraAdapter: MicroLoRA rank 1-2 (ESP32-S3 SIMD)
- SpeculativeDrafter: SpeculativeDecoder (ESP32-S3 default)
- PipelineRelay: PipelineNode head/middle/tail

Verified end-to-end:
- cargo build --no-default-features --features host-test → green; all 5 variants boot to correct default role; smoke tests confirm RagRetriever recall, MemoryArchivist recall by type, AnomalySentinel learn+check.
- cargo +esp build --release --target xtensa-esp32s3-espidf → green; 858 KB ELF.
- espflash flash --chip esp32s3 /dev/ttyACM0 … → 451 KB programmed; chip boots; Rust main entered; TinyAgent constructed with HNSW capacity 32; banner + stats reach the host on /dev/ttyACM0:

    === ruvllm-esp32 tiny-agent (ADR-165) ===
    variant=esp32s3 role=SpeculativeDrafter chip_id=0 sram_kb=512
    [ready] type 'help' for commands
    role=SpeculativeDrafter variant=esp32s3 sram_kb=512 ops=0 hnsw=0

Issues solved while wiring up the cross-compile and on-device path:
- build.rs cfg(target_os) evaluated against the host, not the cargo target. Switched to env::var("CARGO_CFG_TARGET_OS") so embuild's espidf::sysenv::output() runs only when actually cross-compiling to *-espidf; that call is required for ldproxy's --ldproxy-linker arg to propagate into the link line.
- embuild now needs `features = ["espidf"]` in build-dependencies.
- esp-idf-svc 0.49.1 / esp-idf-hal 0.46.2 had a *const i8 / *const u8 bindgen regression and a broken TransmitConfig field; pinned the trio to 0.51.0 / 0.45.2 / 0.36.1.
- The host's RUSTFLAGS=-C link-arg=-fuse-ld=mold breaks Xtensa link (mold doesn't speak Xtensa). CI invocation in the workflow uses `env -u RUSTFLAGS` and the README documents the local override.
- `.cargo/config.toml` only declared xtensa-esp32-espidf; added blocks for esp32s2, esp32s3, esp32c3, esp32c6 with linker = "ldproxy".
- ESP32-S3 dev board exposes USB-Serial/JTAG, not the UART0 GPIO pins my prior main was driving. Switched the device main path to `usb_serial_jtag_write_bytes` / `_read_bytes` directly so I/O actually reaches /dev/ttyACM0.
- `sdkconfig.defaults` was per-variant inconsistent (ESP32 keys on an S3 build). Split into a chip-agnostic base + per-variant `sdkconfig.defaults.<target>` files (`sdkconfig.defaults.esp32s3` is the first; CI matrix will add the others).
- Bumped main task stack to 96 KB and dropped HNSW capacity to 32 so TinyAgent fits without overflowing on Xtensa stack growth.

Files:
- ADR-165: formal decision record (context, role catalog, per-variant assignment, embedder choice, federation bus, build/release plan, acceptance gates G1–G6, out-of-scope, roadmap).
- build.rs: cfg-via-env-var fix.
- Cargo.toml: pinned trio + binstart + native + embuild espidf.
- .cargo/config.toml: ldproxy linker for all 5 ESP32 variants.
- sdkconfig.defaults + sdkconfig.defaults.esp32s3: split base / S3.
- src/main.rs: full rewrite as TinyAgent role engine; HashEmbedder per ADR-074 Tier 1; UART CLI on host-test; usb_serial_jtag CLI on esp32; WASM shim untouched.
- README.md: top-of-file rewrite with the ADR-165 framing, role matrix, primitive surface, and explicit "honest scope" disclaimer pointing at #409 + ADR-090 for the PSRAM big-model path.
- .github/workflows/ruvllm-esp32-firmware.yml: three-job CI (host-test smoke for G1–G3; matrix cross-compile via `espup install --targets $variant` + `cargo +esp build --release` + `espflash save-image --merge`; attach `ruvllm-esp32-${target}.bin` assets matching the URL pattern in `npm/web-flasher/index.html`).
- .gitignore: exclude target/, .embuild/, *.bin from the example dir.

Closes #409 observations 1a, 1b, 3 in this commit. Observation 2 (no firmware in releases) closes when CI runs against the next ruvllm-esp32 tag.

Co-Authored-By: claude-flow <ruv@ruv.net>
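The build.rs fix in this commit rests on one fact: a build script is compiled for the host, so `cfg!(target_os = ...)` inside it describes the host, while Cargo exports the target's value in the `CARGO_CFG_TARGET_OS` environment variable. A minimal sketch of that check follows. The helper name `targets_espidf` and the printed messages are illustrative assumptions, not the contents of the actual build.rs, and the embuild call is only mentioned in a comment.

```rust
use std::env;

/// Returns true when the *cargo target* (not the host) is ESP-IDF.
/// `target_os` is the value Cargo exports to build scripts as
/// `CARGO_CFG_TARGET_OS`; it is `None` when run outside a build script.
/// (Hypothetical helper name, for illustration only.)
fn targets_espidf(target_os: Option<&str>) -> bool {
    target_os == Some("espidf")
}

fn main() {
    // `cfg!(target_os = "espidf")` would be evaluated for the machine that
    // compiles build.rs (the host), so read the target's value instead.
    let target_os = env::var("CARGO_CFG_TARGET_OS").ok();

    if targets_espidf(target_os.as_deref()) {
        // In the real build script, this branch is where the commit message
        // says `embuild`'s `espidf::sysenv::output()` runs.
        println!("cargo:warning=cross-compiling for espidf");
    } else {
        println!("host build: skipping ESP-IDF link setup");
    }
}
```

Keeping the decision in a small pure function makes it easy to check on the host without an ESP toolchain installed.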
…-166 ops manual
Three coordinated fixes from the rc1 device + CI run:
1. **`src/main.rs` — install + use the USB-Serial/JTAG interrupt-mode driver**
With `CONFIG_ESP_CONSOLE_USB_SERIAL_JTAG=y` alone, ESP-IDF installs a
polling-mode driver. Bootloader logs reach `/dev/ttyACM0` but Rust
`std::io::stdout` / `stderr` / `stdin` do not — TX buffers indefinitely
until reset, RX returns undefined data. Symptom: panic prints work
(panic flushes on reboot) but `eprintln!` during steady state goes
nowhere.
Fix: at the top of main, call `usb_serial_jtag_driver_install` then
`esp_vfs_usb_serial_jtag_use_driver`. After both calls, `eprintln!`
flushes via interrupt-driven TX and `stdin().lock().lines()` blocks
on USB-CDC RX exactly like host stdio.
Also drops the FFI-write helpers (`jtag_write` / `jtag_writeln`) in
favor of std::io. The interactive CLI loop becomes the same shape as
the host-test path: `for line in stdin.lock().lines() { … }`.
2. **`.github/workflows/ruvllm-esp32-firmware.yml` — per-toolchain matrix +
ldproxy install**
rc1 CI matrix failures:
- all Xtensa builds: `error: linker 'ldproxy' not found` —
`cargo install espflash --locked` only installs espflash; ldproxy
was missing.
- both RISC-V builds (esp32c3, esp32c6): `error: toolchain 'esp' is
not installed` — `espup install --targets <riscv-chip>` is a no-op
for the Rust toolchain, so the subsequent `cargo +esp build` step
failed because no `esp` toolchain existed.
Fix:
- Install `ldproxy` and `espflash` together: `cargo install espflash
ldproxy --locked` (always, both toolchains need it).
- Per-matrix `toolchain: esp` (Xtensa) vs `nightly` (RISC-V).
- `if: matrix.toolchain == 'esp'` → espup install path.
- `if: matrix.toolchain == 'nightly'` → `rustup toolchain install
nightly --component rust-src`.
- `cargo +${{ matrix.toolchain }} build …` picks the right channel
per target.
- `unset RUSTFLAGS` in the build step (mold doesn't speak Xtensa or
RISC-V-esp).
3. **`docs/adr/ADR-166-esp32-rust-cross-compile-bringup-ops.md` — full
operations manual**
Companion to ADR-165. ADR-165 says *what* runs; ADR-166 says *how* to
build it. 16 sections, ~14 KB. Captures every failure mode hit during
rc1 (14 distinct ones), with root cause and fix for each, the pinned
crate trio (esp-idf-svc 0.51 / esp-idf-hal 0.45 / esp-idf-sys 0.36),
the per-target toolchain matrix, the build.rs `CARGO_CFG_TARGET_OS`
pattern, the .cargo/config.toml linker contract, the sdkconfig
defaults split, the USB-Serial/JTAG console two-call setup, the stack
budget for TinyAgent, the CI workflow contract, the operational
acceptance gates G1–G6, and a searchable failure → remedy table.
Includes a verification log section with the actual rc1 transcripts
from real ESP32-S3 hardware (`ac:a7:04:e2:66:24`).
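The per-target toolchain split described in item 2 (Xtensa chips build with the `esp` toolchain, RISC-V chips with `nightly`) can be sketched as a small lookup. This is an illustration only: the function names are hypothetical, the real selection lives in the workflow's matrix, and the two RISC-V target triples are assumptions based on the usual esp-idf naming, since this PR only spells out `xtensa-esp32s3-espidf`.

```rust
/// Map a chip name to (rustup toolchain, cargo target triple).
/// Hypothetical helper; the RISC-V triples are assumed, not taken from the PR.
fn build_plan(chip: &str) -> Option<(&'static str, &'static str)> {
    match chip {
        // Xtensa cores need the espup-installed `esp` toolchain.
        "esp32" => Some(("esp", "xtensa-esp32-espidf")),
        "esp32s2" => Some(("esp", "xtensa-esp32s2-espidf")),
        "esp32s3" => Some(("esp", "xtensa-esp32s3-espidf")),
        // RISC-V cores build with upstream nightly plus rust-src.
        "esp32c3" => Some(("nightly", "riscv32imc-esp-espidf")),
        "esp32c6" => Some(("nightly", "riscv32imac-esp-espidf")),
        _ => None,
    }
}

/// Render the cargo invocation the matrix entry would run for `chip`.
fn cargo_command(chip: &str) -> Option<String> {
    let (toolchain, target) = build_plan(chip)?;
    Some(format!("cargo +{toolchain} build --release --target {target}"))
}

fn main() {
    for chip in ["esp32", "esp32s2", "esp32s3", "esp32c3", "esp32c6"] {
        if let Some(cmd) = cargo_command(chip) {
            println!("{chip}: {cmd}");
        }
    }
}
```

An unknown chip returns `None` rather than falling back to a default toolchain, which mirrors the rc1 lesson that a silently wrong toolchain choice only surfaces later as a confusing build error.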
Closes:
- rc1 CI failure modes 13 (ldproxy) + 14 (RISC-V toolchain) — workflow fix
- ADR-165 §7 step 5 (USB-CDC console parity) — VFS fix
- Documentation gap so the next contributor doesn't bisect 14 failures
Co-Authored-By: claude-flow <ruv@ruv.net>
The `usb_serial_jtag_driver_install` + `esp_vfs_usb_serial_jtag_use_driver` combo silenced even bootloader output on the ESP32-S3 dev board against the v5.1.2 / esp-idf-svc 0.51.0 / esp-idf-sys 0.36.1 trio. The exact breakage looks like the VFS swap leaving stdio pointed at a half-installed driver; it needs deeper investigation against the trio's component graph.

Until that's resolved (ADR-166 §10 polish), keep the polling-mode console:
- `usb_serial_jtag_write_bytes` directly via FFI for output
- `usb_serial_jtag_read_bytes` directly via FFI for the read loop
- No `_driver_install`, no `_use_driver`, no `std::io` involvement on the device side

Trade-off: TX is buffered until reset/panic flushes the FIFO. Banner + role + stats are visible via the panic-flush path documented in ADR-165 §4 G5 (and verified earlier in rc1). Bidirectional CLI deferred to a follow-up that gets the driver-install path right.

Bootloader output, kernel logs, panic dumps reach `/dev/ttyACM0` cleanly because ESP-IDF's console layer for those uses a different code path.

Co-Authored-By: claude-flow <ruv@ruv.net>
The previous FFI path called `usb_serial_jtag_write_bytes` / `usb_serial_jtag_read_bytes` / `usb_serial_jtag_driver_install` directly, which compiles on chips with the native USB-Serial/JTAG peripheral (esp32s3, esp32c3, esp32c6) but not on chips without it (esp32, esp32s2). CI rc1-v2 confirmed this: c3, c6, s3 builds completed/success; esp32 and esp32s2 failed with `cannot find struct usb_serial_jtag_driver_config_t in module esp_idf_svc::sys` and the matching function-not-found error. Those symbols are chip-conditionally exposed by esp-idf-sys's bindgen.

Replace the FFI path with portable `std::io::stderr` writes and `std::io::stdin().lock().lines()` reads. Both compile uniformly on every ESP32 variant; per-chip output behavior follows the configured ESP-IDF console (USB-Serial/JTAG on s3/c3/c6, UART0 on esp32/s2).

Trade-off: on chips where stdio routes to UART0 with no physical pins (ESP32-S3 dev board's native-USB layout), output won't reach the USB host via /dev/ttyACM0 in steady state, only after a panic flush. ADR-166 §10 already documents this and tracks the per-chip driver-install polish.

The release matrix now produces a `.bin` for every variant, which is the gating requirement for issue #409 obs 2 (web flasher URL pattern).

Co-Authored-By: claude-flow <ruv@ruv.net>
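The portable console path described in this commit, with `std::io::stderr` for output and `stdin().lock().lines()` for input, can be sketched with the standard library alone. The command set (`help`, `stats`), the reply strings, and the names `handle_line` and `run_cli` are hypothetical placeholders; the actual CLI in `src/main.rs` is not shown in this PR page.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical command handler: maps one input line to a reply.
fn handle_line(line: &str) -> String {
    match line.trim() {
        "" => String::new(),
        "help" => "commands: help, stats".to_string(),
        "stats" => "ops=0 hnsw=0".to_string(),
        other => format!("unknown command: {other}"),
    }
}

/// Line-oriented loop over any reader/writer pair. On device the pair is
/// stdin/stderr, which ESP-IDF routes to whichever console is configured.
fn run_cli<R: BufRead, W: Write>(input: R, mut out: W) -> io::Result<()> {
    for line in input.lines() {
        let reply = handle_line(&line?);
        if !reply.is_empty() {
            writeln!(out, "{reply}")?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    // No chip-specific FFI symbols are referenced, so the same source
    // compiles on the host and on every ESP32 variant.
    run_cli(stdin.lock(), io::stderr())
}
```

Making the loop generic over `BufRead` and `Write` is a design choice for this sketch: it lets the same code be exercised on the host with in-memory buffers.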
Summary
Reframes `examples/ruvLLM/esp32-flash` from a single-chip "tiny LLM" skeleton into a fleet of tiny RuvLLM/RuVector agents, where each ESP32 chip runs one role drawn from the canonical primitive surface (HNSW kNN, MicroRAG, SemanticMemory, AnomalyDetector, MicroLoRA, SpeculativeDecoder, PipelineNode). Closes #409 obs 1a, 1b, 3 in this PR; obs 2 closes when the new `firmware-ci.yml` workflow runs against a `ruvllm-esp32-v*` tag.

Formal design in ADR-165.
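The "one chip, one role" framing can be sketched as an enum plus a default-role lookup per variant. The default assignments below are the ones listed in this PR's first commit message; the type and function names (`Role`, `default_role`) are illustrative assumptions and may not match the identifiers in `src/main.rs`.

```rust
/// The seven roles named in the commit message (hypothetical Rust spelling).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    HnswIndexer,
    RagRetriever,
    AnomalySentinel,
    MemoryArchivist,
    LoraAdapter,
    SpeculativeDrafter,
    PipelineRelay,
}

/// Default role per chip variant, as stated in the commit message.
/// LoraAdapter and PipelineRelay have no default there, so they would
/// only be selected explicitly.
fn default_role(variant: &str) -> Option<Role> {
    match variant {
        "esp32" => Some(Role::RagRetriever),
        "esp32s2" => Some(Role::AnomalySentinel),
        "esp32s3" => Some(Role::SpeculativeDrafter),
        "esp32c3" => Some(Role::HnswIndexer),
        "esp32c6" => Some(Role::MemoryArchivist),
        _ => None,
    }
}

fn main() {
    for variant in ["esp32", "esp32s2", "esp32s3", "esp32c3", "esp32c6"] {
        println!("variant={variant} role={:?}", default_role(variant));
    }
}
```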
Verified end-to-end
Host-test (5 variants × all roles):
ESP32-S3 cross-compile + flash + on-device verification:
Real device: ESP32-S3 (revision v0.2, 8 MB embedded PSRAM, USB-Serial/JTAG, MAC `ac:a7:04:e2:66:24`).
Test plan
- `cargo build --no-default-features --features host-test` green
- `add`+`recall` (RagRetriever), `remember`+`recall` (MemoryArchivist), `learn`+`check` (AnomalySentinel) verified on host
- `cargo +esp build --release --target xtensa-esp32s3-espidf` green
- `firmware-ci.yml` matrix produces `.bin` for all 5 variants on first tag push (will run as soon as a `ruvllm-esp32-v*` tag is pushed)
- `npm/web-flasher/index.html` URL pattern returns 200 against the resulting release (gated on the previous box)
Roadmap follow-ups (out of this PR)
Tracked in ADR-165 §7 steps 5–7:
- `usb_serial_jtag_driver_install` + VFS routing will make the interactive CLI fully responsive without needing a panic-flush trigger.
- `expect` script on the GitHub Actions runner with a connected ESP32-S3 to assert that the banner appears within 5 s.
Related
🤖 Generated with claude-flow