Rust bootloader for resource-constrained microcontrollers. Fits in the CH32V003's 1920-byte system flash with full trial boot, CRC16 app validation, OB-based metadata, and version reporting — leaving the entire 16KB user flash for the application.
I built tinyboot for OpenServoCore, where CH32V006-based servo boards need seamless firmware updates over the existing DXL TTL bus — no opening the shell, no debug probe, just flash over the same wire the servos already talk on.
The existing options didn't fit:
- CH32 factory bootloader: Fixed to 115200 baud on PD5/PD6 with no way to configure UART pins, baud rate, or TX-enable for RS-485. Uses a sum-mod-256 checksum that silently drops bad commands with no error response. No CRC verification, no trial boot, no boot state machine. See ch32v003-bootloader-docs for the reverse-engineered protocol details.
- embassy-boot: A well-designed bootloader, but requires ~8KB of flash. That's half the V003's 16KB user flash, and doesn't fit in system flash at all. Not practical for MCUs with 16-32KB total.
I took it as a challenge to fit a proper bootloader — with a real protocol, CRC16 validation, trial boot, and configurable transport — into the CH32V003's 1920-byte system flash. The key inspiration was rv003usb by cnlohr, whose software USB implementation includes a 1920-byte bootloader in system flash. That project proved it was possible to fit meaningful code in that space, and showed me that the entire 16KB of user flash could be left free for the application.
Beyond the usual Cargo profile tricks (opt-level = "z", LTO, codegen-units = 1, panic = "abort"), fitting a real bootloader in 1920 bytes required some more deliberate choices:
- No HAL crates: bare-metal register access via PAC crates only; HAL abstractions are too expensive for this budget
- Custom runtime: no qingke-rt; the bootloader doesn't need a vector table, interrupts, or static initialization, so the startup is just GP/SP init and a jump to main (20 bytes of assembly instead of ~1.4KB of full runtime)
- Symmetric frame format: the same `Frame` struct is used for both requests and responses with one shared parse and format path, eliminating code duplication
- `repr(C)` frame with union data: CRC is computed directly over the struct memory via pointer cast; no serialization step, no intermediate buffer
- `MaybeUninit` frame buffer: the 76-byte `Frame` struct is reused every iteration without zero-initialization
- Bit-bang CRC16: no lookup table, trades speed for ~512 bytes of flash savings
- OB bit-clear state transitions: forward state changes (Idle→Updating, trial consumption) flip 1→0 bits without erasing, avoiding the cost of a full erase+rewrite cycle and the code to preserve OB contents
- Avoid `memset`/`memcpy`: these pull in expensive core routines; manual byte loops and volatile writes keep the linker from dragging in library code
- `.write()` over `.modify()`: register writes use direct writes instead of read-modify-write, saving the read and mask operations
- Aggressive code deduplication: shared flash operation primitives across erase, write, and OB operations (see the flash HAL)
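To make the bit-bang CRC16 trade-off concrete, here is a minimal sketch of a table-free CRC16 using the CCITT polynomial 0x1021. The function name and the 0xFFFF initial value are assumptions for illustration; tinyboot's actual routine and parameters may differ.

```rust
/// Bitwise CRC16 with the CCITT polynomial 0x1021 and no lookup table.
/// ASSUMPTION: initial value 0xFFFF (the "CCITT-FALSE" variant); the
/// variant used by tinyboot is not stated in this README.
pub fn crc16_ccitt(data: &[u8]) -> u16 {
    let mut crc: u16 = 0xFFFF;
    for &byte in data {
        // Fold the next byte into the high half of the register.
        crc ^= (byte as u16) << 8;
        for _ in 0..8 {
            // Shift out one bit; apply the polynomial if that bit was set.
            crc = if crc & 0x8000 != 0 {
                (crc << 1) ^ 0x1021
            } else {
                crc << 1
            };
        }
    }
    crc
}
```

A table-driven version replaces the inner loop with one lookup into a 256-entry `u16` table, which is where the roughly 512 bytes go. The loop above costs eight iterations per byte instead.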
tinyboot is structured as a library, not a monolithic binary. The core logic and protocol are platform-agnostic crates; chip-specific details live in separate ch32-* crates. To build your bootloader, you create a small crate with a main.rs that wires up your pin configuration, baud rate, and flash layout — see the examples for exactly this. The same split applies on the app side: tinyboot-ch32-app integrates into your application so it can confirm a successful boot and reboot into the bootloader on command, enabling fully remote firmware updates without physical access.
- Tiny: Fits in 1920 bytes of CH32V003 system flash, leaving all 16KB user flash for the application
- CRC16 validation: Every frame is CRC16-CCITT protected; app image is verified end-to-end after flashing
- Trial boot: New firmware gets a limited number of boot attempts; if the app doesn't confirm, the bootloader takes over automatically
- Boot state machine: Idle / Updating / Validating lifecycle tracked in option bytes with forward-only bit transitions (no erase needed for state advances)
- Version reporting: Boot and app versions packed into flash, queryable over the wire
- Configurable transport: The protocol runs over any `embedded_io::Read + Write` stream. The CH32 implementation supports UART with configurable pins, baud rate, and optional TX-enable for RS-485 / DXL TTL, but the core is transport-agnostic, so USB, SPI, Bluetooth, or WiFi would work just as well
- App-side integration: The app can confirm a successful boot and request bootloader entry over the wire, enabling fully remote firmware updates without physical access
- Library, not binary: Build your bootloader by creating a small crate that wires up your specific hardware; the core logic is reusable across chips
- Modular and portable: Platform-agnostic core with four traits (`Transport`, `Storage`, `BootMetaStore`, `BootCtl`) that you implement for your MCU; the protocol, state machine, and CLI work unchanged
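The forward-only bit transitions mentioned in the list above can be modelled on the host. This is a sketch under stated assumptions: the state encoding, the `FlashByte` model, and the function names are all hypothetical and are not tinyboot's actual option-byte layout. The only property it relies on is that programming flash can clear bits but not set them.

```rust
/// HYPOTHETICAL encoding: each forward step only clears bits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum BootState {
    Idle = 0b11,
    Updating = 0b01,
    Validating = 0b00,
}

/// Models one flash cell: programming ANDs the new value into the old one,
/// so bits can go 1 -> 0 but never 0 -> 1 without an erase.
pub struct FlashByte(u8);

impl FlashByte {
    pub fn new(value: u8) -> Self {
        FlashByte(value)
    }
    pub fn program(&mut self, value: u8) {
        self.0 &= value;
    }
    pub fn read(&self) -> u8 {
        self.0
    }
}

/// Advance the stored state; returns false if the move would need an erase.
pub fn advance(cell: &mut FlashByte, next: BootState) -> bool {
    let target = next as u8;
    if cell.read() & target != target {
        return false; // some bit would have to go 0 -> 1
    }
    cell.program(target);
    cell.read() == target
}

/// Consume one trial by clearing the lowest set bit of the counter.
pub fn consume_trial(trials: &mut FlashByte) -> bool {
    let current = trials.read();
    if current == 0 {
        return false; // no trials left
    }
    trials.program(current & (current - 1));
    true
}
```

In this model, Idle to Updating to Validating needs no erase, while going back to Idle is rejected by `advance` because it would have to set bits again.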
| Crate / Example | Category | Description |
|---|---|---|
| `tinyboot` | core | Platform-agnostic bootloader core (protocol dispatcher, boot state machine, app validation) |
| `tinyboot-protocol` | core | Wire protocol (frame format, CRC16, commands) |
| `tinyboot-ch32-hal` | ch32 | Minimal HAL (flash, GPIO, USART, RCC) |
| `tinyboot-ch32-boot` | ch32 | Bootloader platform (storage, boot control, OB metadata) |
| `tinyboot-ch32-app` | ch32 | App-side boot client (confirm, request update) |
| `tinyboot-cli` | host | CLI firmware flasher over UART |
| `examples/ch32/system-flash` | example | Full-featured bootloader in 1920 bytes of system flash, all 16KB free for app |
| `examples/ch32/user-flash` | example | Same bootloader in user flash, with room for extras like defmt logging |
The workspace uses edition 2024.
- Library crates and CLI — stable Rust 1.85+
- CH32 examples (bootloader and app binaries): nightly, for `-Zbuild-std` on `riscv32ec-unknown-none-elf`
- Build your bootloader: create a small crate with a `main.rs` that configures your pins, baud rate, and flash layout. The system-flash example puts the bootloader in system flash, leaving all user flash for your app. The user-flash example keeps it in user flash instead, which gives more room for bootloader features (e.g. defmt logging) or debugging the bootloader itself.
- Flash the bootloader to system flash using wlink:

      wlink flash --address 0x1FFFF000 target/riscv32ec-unknown-none-elf/release/boot

- Install the CLI and flash your app over UART:

      cargo install tinyboot-cli
      tinyboot flash target/riscv32ec-unknown-none-elf/release/app --reset
Adding a new chip within an existing family (e.g. another CH32 variant) is straightforward — add the register definitions to the existing HAL crate and a feature flag. No new crates needed.
Porting to an entirely new MCU family (e.g. STM32) requires a parallel set of crates. The core crates (tinyboot, tinyboot-protocol, tinyboot-cli) are platform-agnostic — you implement four traits and provide a minimal HAL. Here's what that looks like:
The HAL crate provides low-level register access shared between the boot and app crates. It covers the bare minimum operations both sides need:
- Flash — unlock, erase page, write halfword/word, lock, option byte access
- GPIO — configure pin mode, set high/low (for TX-enable if using RS-485)
- USART — init with baud rate, blocking read byte, blocking write byte, flush
- RCC/clock — enable peripheral clocks
- Reset — system reset, and optionally jump-to-address for user-flash bootloaders
For CH32, we use ch32-metapac for register definitions. For STM32, you could use stm32-metapac or raw PAC crates. The HAL should be minimal — this code runs in a bootloader, not an application.
The boot crate implements the core boot traits using the HAL. There are four traits in `tinyboot::traits::boot`:
| Trait | What to implement |
|---|---|
| `Transport` | Any `embedded_io::Read + Write` stream: UART, RS-485, USB, SPI, even WiFi or Bluetooth. The protocol doesn't care what carries the bytes |
| `Storage` | Implement `embedded_storage::NorFlash` (erase, write) and provide `as_slice()` for zero-copy flash reads, plus `unlock()` |
| `BootMetaStore` | Read/write boot state, trial counter, and app checksum from your chip's equivalent of option bytes or a reserved flash page |
| `BootCtl` | `is_boot_requested()` checks your boot flag (OB bit, RAM magic, GPIO pin, etc.); `system_reset()` resets or jumps to app |
Wire them together in a Platform struct and pass it to Core::new(platform).run().
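As a rough illustration of how a platform's boot flag, boot state, and trial counter could combine into a boot decision, here is a host-side sketch. Everything in it is an assumption made for illustration: the traits are simplified stand-ins (not the real definitions in `tinyboot::traits::boot`), and `decide`, `Decision`, and `MockPlatform` are hypothetical names. The decision order is one plausible reading of the trial-boot behaviour described above, not the crate's confirmed logic.

```rust
// Stand-in types for illustration only; tinyboot's real traits may use
// different method names, signatures, and error handling.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle,
    Updating,
    Validating,
}

pub trait BootCtl {
    fn is_boot_requested(&self) -> bool;
}

pub trait BootMetaStore {
    fn state(&self) -> State;
    fn trials_remaining(&self) -> u8;
    fn consume_trial(&mut self);
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    RunApp,
    StayInBootloader,
}

/// ASSUMED decision order: explicit request first, then stored state.
pub fn decide<P: BootCtl + BootMetaStore>(p: &mut P) -> Decision {
    if p.is_boot_requested() {
        return Decision::StayInBootloader;
    }
    match p.state() {
        State::Idle => Decision::RunApp,
        State::Updating => Decision::StayInBootloader,
        State::Validating => {
            if p.trials_remaining() > 0 {
                // Spend one attempt; the app must confirm before they run out.
                p.consume_trial();
                Decision::RunApp
            } else {
                Decision::StayInBootloader
            }
        }
    }
}

/// In-memory platform used to exercise `decide` without hardware.
pub struct MockPlatform {
    pub requested: bool,
    pub state: State,
    pub trials: u8,
}

impl BootCtl for MockPlatform {
    fn is_boot_requested(&self) -> bool {
        self.requested
    }
}

impl BootMetaStore for MockPlatform {
    fn state(&self) -> State {
        self.state
    }
    fn trials_remaining(&self) -> u8 {
        self.trials
    }
    fn consume_trial(&mut self) {
        self.trials = self.trials.saturating_sub(1);
    }
}
```

A mock like this makes it possible to test the chip-independent logic on the host before any register-level code exists.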
The app crate implements `tinyboot::traits::app::BootClient` using the HAL:
- `confirm()`: transition boot state from Validating back to Idle
- `request_update()`: set your boot request flag
- `system_reset()`: reset the system
The core tinyboot::app::App handles command polling and dispatch generically — you just provide the BootClient implementation.
The entire protocol (frame format, CRC, sync, commands), the boot state machine (Idle/Updating/Validating transitions, trial boot logic, app validation), the CLI, and the host-side flashing workflow all work unchanged. You only write the chip-specific glue.
tinyboot currently supports the CH32V003 over UART / RS-485. This is tested and working end-to-end for both system-flash and user-flash configurations.
What's coming: I have dev boards on order for several other CH32 family chips (V006, V103, V203, X035, etc.) and plan to add support as they arrive. The architecture is already chip-agnostic — it's mostly a matter of adding HAL implementations and testing on real hardware.
Want a different transport or chip? File an issue. USB support in particular would be a natural addition. Transports like Bluetooth are harder to fit in system flash, though newer CH32 chips with larger system flash regions may make it feasible — no guarantees.
The crates use unsafe in targeted places, primarily to meet the extreme size constraints of system flash (1920 bytes):
- `repr(C)` unions and `MaybeUninit`: zero-copy frame access and avoiding zero-initialization overhead
- `read_volatile` / `write_volatile`: direct flash reads/writes, version reads, and boot request flag access
- `transmute`: enum conversions (boot state) and function pointer cast for jump-to-address
- `from_raw_parts`: zero-copy flash slice access in the storage layer
- Linker section attributes: placing version data and boot metadata at fixed flash addresses
- `export_name` / `extern "C"`: runtime entry points and linker symbol access
- Critical section impl: no-op implementation since the bootloader runs with interrupts disabled
These are deliberate trade-offs — safe alternatives would pull in extra code that doesn't fit. The unsafe is confined to data layout, memory access, and hardware boundaries; the bootloader state machine and protocol logic are safe Rust.
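For readers unfamiliar with the pattern, here is a small sketch of viewing a `repr(C)` struct as raw bytes through a pointer cast and `from_raw_parts`, which is the kind of zero-copy access the list refers to. `DemoFrame` and its fields are hypothetical and much smaller than tinyboot's real 76-byte `Frame`; the sketch only shows why the `unsafe` can be sound when the layout is controlled.

```rust
/// HYPOTHETICAL frame layout, not tinyboot's actual `Frame`.
/// Every field is `u8`-based, so `repr(C)` yields a layout with no padding.
#[repr(C)]
pub struct DemoFrame {
    pub cmd: u8,
    pub len: u8,
    pub payload: [u8; 4],
}

impl DemoFrame {
    /// View the struct's memory as a byte slice without copying.
    pub fn as_bytes(&self) -> &[u8] {
        // SAFETY: `DemoFrame` is `repr(C)` with only `u8` fields, so every
        // byte is initialized and there is no padding. The returned slice
        // borrows `self`, so it cannot outlive the struct.
        unsafe {
            core::slice::from_raw_parts(
                self as *const Self as *const u8,
                core::mem::size_of::<Self>(),
            )
        }
    }
}
```

A checksum routine can then run directly over `frame.as_bytes()` with no serialization step. With wider or mixed-size fields the padding bytes would be uninitialized, so this cast is only sound for layouts chosen with that in mind.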
Contributions are welcome — especially new chip ports and transport implementations. If you're thinking about adding support for a new MCU family, the Porting to a New Chip section above covers the trait surface you'd need to implement.
Please open an issue before starting a large PR so we can discuss the approach.
Licensed under either of Apache License, Version 2.0 or MIT License at your option.
