provision.py: esptool v5 incompat + NVS partition wipes existing keys when partial update #391

@ruvnet


Found while validating ADR-073 / cognitum-agent PR #60 (UDP:5005 CSI ingest) against a physical ESP32-S3 node on COM8. Two bugs in firmware/esp32-csi-node/provision.py, plus one lower-priority frame-loss observation.

Bug 1: esptool v5 syntax incompat

provision.py line 153 uses esptool v4 syntax:

"write_flash",

With esptool v5.x (current: 5.1.0) this fails — v5 renamed the command to write-flash (hyphenated). The PR #60 review already flagged this; opening here to track the fix.

Repro:

pip install 'esptool>=5.0'
python provision.py --port COM8 --target-port 5005

Fix: one-line change, write_flash → write-flash. Or detect the esptool version and branch. Verified locally: after the edit, flashing succeeded and the ESP32 booted with the new config.
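A minimal sketch of the "detect esptool version and branch" option, assuming esptool is installed as a Python package in the same environment as provision.py. The helper names flash_command_for and detect_flash_command are hypothetical and not part of provision.py; the v4/v5 spellings are the ones described above.

```python
from importlib.metadata import PackageNotFoundError, version


def flash_command_for(esptool_version: str) -> str:
    """Return the flash subcommand spelling for a given esptool version string."""
    # Only the major version matters: "5.1.0" -> 5, "4.8.1" -> 4.
    major = int(esptool_version.split(".")[0])
    return "write-flash" if major >= 5 else "write_flash"


def detect_flash_command() -> str:
    """Pick the spelling for the installed esptool (hypothetical helper)."""
    try:
        return flash_command_for(version("esptool"))
    except (PackageNotFoundError, ValueError):
        # Assumption: fall back to the v4 spelling when the version is unknown.
        return "write_flash"
```

The result of detect_flash_command() could then replace the hard-coded "write_flash" string in the argument list that provision.py passes to esptool.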

Bug 2: Partial provision.py invocation wipes all other NVS keys

build_nvs_csv only emits rows for args explicitly passed on the CLI. The generated NVS binary then replaces the entire csi_cfg namespace on flash, silently wiping keys the user didn't pass.

Concrete repro (this caused WiFi to fail in the field):

# Starting NVS has ssid, password, target_ip, target_port=5006, node_id=2, hop_channels
python provision.py --port COM8 --target-port 5005
# After flash: only target_port=5005 survives — ssid/password/target_ip/node_id/hop_count/chan_list/dwell_ms all erased
# Node boots, logs "Retrying WiFi connection (10/10)" and never reconnects

Expected: either

  • read current NVS, merge new values, re-emit full set; or
  • refuse to flash if the invocation doesn't include WiFi creds; or
  • document clearly in --help that provision.py is a full-replace tool and all settings must be provided every time.
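The first two options can be sketched as a merge step followed by a guard. This is only an illustration under stated assumptions: it assumes the current NVS values are already available as a dict (how to read them back from the device is not covered here), and merge_nvs_config, missing_required, and REQUIRED_KEYS are hypothetical names, not existing provision.py code.

```python
# Assumption: these two keys are the minimum needed for WiFi to come up.
REQUIRED_KEYS = ("ssid", "password")


def merge_nvs_config(existing: dict, cli_args: dict) -> dict:
    """Overlay CLI values on the existing config, skipping args that were not passed."""
    merged = dict(existing)
    # Args left at None were not given on the command line, so keep the old value.
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged


def missing_required(config: dict) -> list:
    """Return the required keys that are absent or empty in the final config."""
    return [k for k in REQUIRED_KEYS if not config.get(k)]
```

With this shape, a partial invocation such as --target-port 5005 would re-emit the full key set, and the tool could refuse to flash whenever missing_required returns a non-empty list.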

Bug 3 (Phase 2 hint): ~11% UDP frame loss on LAN-local capture

When capturing directly on the ESP's target host (no network hops), 17/147 frames failed ADR-018 magic validation in an 8-second sample. This matches the udp_packets_dropped counter behavior noted in PR #60 post-merge review. Not urgent, but worth instrumenting on the firmware side — likely channel-hop boundary truncation.

Sample capture (post-fix):

  • 130 valid frames, 17 invalid in 8s
  • Payload sizes varied 148–404 bytes across channels 3/5/9
  • All valid frames parsed cleanly: magic=0xC5110001, node=2, nsub=64, freq=2432MHz, rssi -34..-70 dBm, monotonic sequence
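For reference, the loss figure above can be reproduced with a small host-side check like the one below. It is a sketch, not the capture script actually used: it assumes the magic occupies the first 4 bytes of each UDP payload in little-endian order, which the ADR-018 layout may or may not match, and the function names are hypothetical.

```python
import struct

ADR018_MAGIC = 0xC5110001  # magic value reported for valid frames in this capture


def has_valid_magic(payload: bytes) -> bool:
    """Check the leading 4 bytes against the magic (assumed little-endian, offset 0)."""
    if len(payload) < 4:
        return False
    (magic,) = struct.unpack_from("<I", payload, 0)
    return magic == ADR018_MAGIC


def loss_ratio(payloads) -> float:
    """Fraction of captured payloads that fail the magic check."""
    payloads = list(payloads)
    if not payloads:
        return 0.0
    invalid = sum(1 for p in payloads if not has_valid_magic(p))
    return invalid / len(payloads)
```

With 130 valid and 17 invalid payloads, loss_ratio gives 17/147, or about 11.6%.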

Filing this for tracking; happy to PR Bug 1 & Bug 2 fixes if the project wants them.
