test: validate per-chip flash offsets against an authoritative source (recurring ROM-offset bug) #279

@zackees

Context

Per-chip flash layout values, most critically the second-stage bootloader offset, are hand-maintained in the per-MCU JSON configs (crates/fbuild-build/src/esp32/configs/*.json, key esptool.flash_offsets.bootloader). These are ROM-defined constants: get one wrong and the chip's ROM reads garbage at its fixed load address and enters an "invalid header: 0x…" reboot loop. Nothing currently cross-checks these values against an authoritative source, so a wrong value ships silently and only surfaces as a dead board.

This has bitten us repeatedly. Most recently #278: esp32p4 and esp32c5 both shipped bootloader: "0x0" but the ROM expects 0x2000, producing the invalid header: 0x47550ff7 boot loop on ESP32-P4 hardware. The pattern (a new/edited chip config carrying the wrong offset) is the recurring failure mode, not a one-off.

Authoritative values (ESP-IDF ESP_BOOTLOADER_OFFSET, also exposed as build.bootloader_addr in arduino-esp32 boards.txt):

Offset   Chips
0x0      esp32s3, esp32c2, esp32c3, esp32c6, esp32h2, esp32c61
0x1000   esp32, esp32s2
0x2000   esp32p4, esp32c5

Current state in-tree after #278 (all correct now, but only guarded by a hand-written table):

esp32=0x1000  esp32s2=0x1000
esp32s3=0x0   esp32c2=0x0  esp32c3=0x0  esp32c6=0x0  esp32h2=0x0
esp32c5=0x2000  esp32p4=0x2000

#278 added test_bootloader_flash_offsets_match_rom (crates/fbuild-build/src/esp32/mcu_config.rs), but it asserts each config against a hardcoded table in the same repo — it catches an accidental edit to an existing config, yet a newly-added chip can still ship a wrong offset by being added to both the config and the table with the same mistake. We have board-validation tooling already (ci/validate_boards.py, ci/board_sources.py) that pulls external sources, which is the natural place to cross-check.

Proposal

Validate the flash offsets against an authoritative external source rather than (only) an in-repo table:

  • Extend ci/validate_boards.py (or add a sibling check) to read build.bootloader_addr from the installed arduino-esp32 boards.txt (and/or ESP-IDF ESP_BOOTLOADER_OFFSET) and assert it matches each esp32*.json's esptool.flash_offsets.bootloader. Run it in CI.
  • Treat a missing/mismatched offset for any supported chip as a hard failure, so adding a new chip forces the correct value.
  • Consider extending the same authoritative cross-check to the other flash_offsets (partitions 0x8000, firmware 0x10000) and to non-ESP32 platforms that carry analogous ROM-defined layout constants.
  • Keep the fast in-repo unit test as a first line of defense, but make the CI/external check the source of truth.
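
As a starting point for the first bullet, here is a minimal sketch of extracting per-chip bootloader offsets from boards.txt. It assumes each board declares its chip as `<board>.build.mcu=<chip>` and its offset as `<board>.build.bootloader_addr=<hex>`; the function name `bootloader_offsets_by_chip` is hypothetical and not part of ci/board_sources.py. Boards that share a chip but disagree on the offset are treated as an error rather than silently resolved.

```python
import re
from collections import defaultdict

# Assumed line shapes in boards.txt (not verified against every release):
#   <board>.build.mcu=<chip>
#   <board>.build.bootloader_addr=<hex>
_LINE = re.compile(r"^([^.#\s]+)\.build\.(mcu|bootloader_addr)=(\S+)\s*$")


def bootloader_offsets_by_chip(boards_txt: str) -> dict[str, int]:
    """Return {chip: bootloader offset} derived from boards.txt content."""
    props: dict[str, dict[str, str]] = defaultdict(dict)
    for line in boards_txt.splitlines():
        match = _LINE.match(line.strip())
        if match:
            board, key, value = match.groups()
            props[board][key] = value

    # Collect every offset seen for a chip so disagreements are visible.
    per_chip: dict[str, set[int]] = defaultdict(set)
    for board_props in props.values():
        if "mcu" in board_props and "bootloader_addr" in board_props:
            per_chip[board_props["mcu"]].add(int(board_props["bootloader_addr"], 16))

    conflicts = {chip: sorted(vals) for chip, vals in per_chip.items() if len(vals) > 1}
    if conflicts:
        raise ValueError(f"boards disagree on bootloader offset: {conflicts}")
    return {chip: next(iter(vals)) for chip, vals in per_chip.items()}


# Hypothetical excerpt, for illustration only.
SAMPLE = """
# comment line
esp32.build.mcu=esp32
esp32.build.bootloader_addr=0x1000
esp32s3.build.mcu=esp32s3
esp32s3.build.bootloader_addr=0x0
esp32p4.build.mcu=esp32p4
esp32p4.build.bootloader_addr=0x2000
"""

offsets = bootloader_offsets_by_chip(SAMPLE)
```

Offsets are compared as integers rather than strings so that spellings such as 0x0 and 0x0000 are treated as equal.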

Acceptance criteria

  • A CI-run check fails when any esp32*.json bootloader offset disagrees with the authoritative source (arduino-esp32 boards.txt build.bootloader_addr and/or ESP-IDF), verified by temporarily mutating one config and observing the failure.
  • The check covers every chip in crates/fbuild-build/src/esp32/configs/ and fails (not silently skips) on an unknown/missing chip.
  • Adding a new ESP32 variant with a wrong/absent bootloader offset cannot pass CI.
  • Docs/comment point maintainers at the authoritative source so the value isn't guessed.
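
The first three criteria could be exercised with something like the sketch below. It assumes each config file is named after its chip (for example esp32p4.json) and nests the value as esptool → flash_offsets → bootloader, as the dotted key above suggests; `check_bootloader_offsets` and `parse_offset` are hypothetical names. A chip that is missing from the authoritative mapping, or a config without an offset, is reported as an error instead of being skipped.

```python
import json
import tempfile
from pathlib import Path


def parse_offset(value) -> int:
    """Parse "0x2000", "4096", or an int into an integer offset."""
    return int(str(value), 0)


def check_bootloader_offsets(config_dir: Path, authoritative: dict[str, int]) -> list[str]:
    """Return one error string per config that is missing, unknown, or mismatched."""
    errors = []
    for path in sorted(config_dir.glob("esp32*.json")):
        chip = path.stem  # assumption: file name is the chip name
        config = json.loads(path.read_text())
        raw = config.get("esptool", {}).get("flash_offsets", {}).get("bootloader")
        if raw is None:
            errors.append(f"{chip}: config has no esptool.flash_offsets.bootloader")
            continue
        if chip not in authoritative:
            errors.append(f"{chip}: no authoritative bootloader offset found")
            continue
        actual, expected = parse_offset(raw), authoritative[chip]
        if actual != expected:
            errors.append(f"{chip}: config has {actual:#x}, authoritative source says {expected:#x}")
    return errors


def write_config(directory: Path, chip: str, offset: str) -> None:
    """Write a minimal, illustrative config containing only the bootloader offset."""
    body = {"esptool": {"flash_offsets": {"bootloader": offset}}}
    (directory / f"{chip}.json").write_text(json.dumps(body))


# Demonstration: a correct config passes, a mutated one fails.
with tempfile.TemporaryDirectory() as tmp:
    configs = Path(tmp)
    write_config(configs, "esp32p4", "0x2000")
    clean = check_bootloader_offsets(configs, {"esp32p4": 0x2000})
    write_config(configs, "esp32p4", "0x0")  # simulate the #278 mistake
    mutated = check_bootloader_offsets(configs, {"esp32p4": 0x2000})
```

A CI wrapper would exit non-zero whenever the returned list is non-empty, which is what makes the mutation check in the first criterion observable.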

Decisions

  • Type: test/CI enhancement (not a code bug). The offsets are currently correct in-tree (fixed in #278, "ESP32-P4 (eco2) won't boot: bootloader flashed at 0x0 (should be 0x2000) + eco5-linked bootloader/app crash on eco2 silicon"); this issue is the guardrail to stop this class of bug from recurring.
  • Priority: P2 — no active breakage right now, but it's a repeat foot-gun with severe (dead-board) impact; worth doing before the next chip is added.
  • Authoritative source: arduino-esp32 boards.txt build.bootloader_addr as the primary cross-check, since fbuild already resolves that framework and ci/board_sources.py knows how to read it; ESP-IDF ESP_BOOTLOADER_OFFSET as a secondary reference. Triage may prefer one over the other.
  • Scope: bootloader offset first, other flash_offsets/platforms as follow-up — bootloader offset is the one with catastrophic, ROM-level failure; the rest are lower-risk.
  • Home: ci/validate_boards.py is the proposed location, but this is a guess; triage can re-route it.
