esp32-csi-node: mDNS discovery for seed_url + provisioning UX fixes #574

@proffesor-for-testing

Summary

esp32-csi-node's swarm_bridge feature requires seed_url and seed_token to be set by hand: provision.py writes them to the csi_cfg NVS namespace. Customers must know their Cognitum Seed's IP, run a Python tool, navigate provision.py's full-replace footgun (issue #391), and re-provision whenever the seed's DHCP lease changes. mDNS discovery would eliminate the manual seed_url step entirely for the common "ESP32 + Cognitum Seed on the same WiFi" case.

Why now

Customer onboarding friction. A real customer spent 4+ hours on what should have been a "flash and go" ESP32 ↔ Seed integration. The provision.py footgun and hand-set seed_url were two of the larger time sinks. mDNS + a USB-CDC pairing handshake should reduce this to under five minutes per node.

A separate seed-side issue exists for the cog fan-out gap that made the data invisible even after Brian got it flowing — cognitum-one/seed#166 (proposal pending @ruvnet review). This RuView-side discovery work is independent of that seed-side architectural decision and can ship on its own.

Done looks like

  • ESP32 with empty seed_url periodically queries _cognitum-seed._tcp.local (or similar) via mDNS while on WiFi
  • On match, optionally obtains seed_token over the USB-CDC pairing channel if the ESP32 is plugged into the seed via USB at first boot; otherwise falls back to the operator pasting the token from the CLI
  • Empty-by-default semantics preserved — explicit seed_url from provision.py still wins
  • DHCP lease changes don't break the link (re-discovery on POST failure)
  • provision.py no longer wipes the csi_cfg namespace on partial reconfigure (#391, "provision.py: esptool v5 incompat + NVS partition wipes existing keys when partial update")
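
The precedence rule in the list above (an explicit seed_url wins, discovery only fills an empty one) could be sketched roughly as follows. This is a minimal illustration, not existing firmware code: the helper name resolve_seed_url, its parameters, and the plain-HTTP URL scheme are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the current firmware.
 * Picks the URL swarm_bridge should POST to:
 *   1. a non-empty provisioned seed_url always wins;
 *   2. otherwise a discovered address and port are formatted into a URL;
 *   3. otherwise `out` is left empty (bridge stays disabled).
 * Returns 1 if `out` holds a usable URL, 0 otherwise. */
int resolve_seed_url(const char *provisioned,
                     const char *discovered_ip, unsigned discovered_port,
                     char *out, size_t out_len)
{
    if (out == NULL || out_len == 0) {
        return 0;
    }
    out[0] = '\0';

    if (provisioned != NULL && provisioned[0] != '\0') {
        /* Explicit configuration from provision.py takes precedence. */
        int n = snprintf(out, out_len, "%s", provisioned);
        if (n < 0 || (size_t)n >= out_len) {
            out[0] = '\0';  /* would not fit: treat as unusable */
            return 0;
        }
        return 1;
    }

    if (discovered_ip != NULL && discovered_ip[0] != '\0' && discovered_port != 0) {
        /* Assumption: the seed speaks plain HTTP on the advertised port. */
        int n = snprintf(out, out_len, "http://%s:%u", discovered_ip, discovered_port);
        if (n < 0 || (size_t)n >= out_len) {
            out[0] = '\0';  /* would not fit: treat as unusable */
            return 0;
        }
        return 1;
    }

    return 0;  /* nothing provisioned, nothing discovered */
}
```

Keeping the decision in one pure function would make the "existing deployments are unaffected" guarantee easy to unit-test off-target, independent of the mDNS stack.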

Work items

  • mDNS query in new seed_discovery.c (or extend nvs_config.c) when seed_url is empty at boot
  • Re-discovery loop on swarm_bridge POST failure (handle DHCP lease churn)
  • Optional: USB-CDC pairing flow that fetches seed_token without provision.py
  • provision.py additive-by-default mode (fixes the #391 footgun) so partial reconfigures don't wipe other NVS keys
  • Update firmware/esp32-csi-node/README.md quick-start with the discovery path
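
One possible shape for the re-discovery trigger in the work items above is a small failure counter that swarm_bridge consults after each POST. The sketch below is only an illustration under stated assumptions: the type seed_link_t, the function seed_link_report_post, and the threshold of three consecutive failures are hypothetical, and it assumes that a provisioned seed_url is never replaced by discovery.

```c
#include <stdbool.h>

/* Hypothetical threshold: consecutive POST failures tolerated before
 * a discovered address is treated as stale. */
#define SEED_MAX_POST_FAILURES 3

typedef struct {
    bool url_is_provisioned;   /* true if seed_url came from NVS via provision.py */
    int  consecutive_failures; /* failed POSTs since the last success */
} seed_link_t;

/* Records the outcome of one swarm_bridge POST.
 * Returns true when the caller should run mDNS discovery again. */
bool seed_link_report_post(seed_link_t *link, bool post_ok)
{
    if (post_ok) {
        link->consecutive_failures = 0;
        return false;
    }

    link->consecutive_failures++;

    if (link->url_is_provisioned) {
        /* Assumption: an explicit seed_url is never overridden by discovery. */
        return false;
    }

    if (link->consecutive_failures >= SEED_MAX_POST_FAILURES) {
        /* Start counting afresh for whatever address discovery returns. */
        link->consecutive_failures = 0;
        return true;
    }
    return false;
}
```

Requiring several consecutive failures rather than one avoids flooding the network with mDNS queries on a single dropped packet; the right threshold would need tuning against real DHCP churn.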

Status

Net-new feature work, no breaking changes proposed. Empty-default behaviour preserved so existing deployments are unaffected.


Technical detail

Current state

firmware/esp32-csi-node/main/nvs_config.c:308:

if (nvs_get_str(handle, "seed_url", cfg->seed_url, &len) != ESP_OK) {
    cfg->seed_url[0] = '\0';  /* Disabled by default */
}

swarm_bridge.c logs "seed_url is empty — swarm bridge disabled" and never attempts to reach a seed unless provision.py was run with --seed-url. Customers without that prior knowledge see no integration with their Cognitum Seed, even when both devices are on the same network.

provision.py header warning (kept verbatim from script):

WARNING -- FULL-REPLACE SEMANTICS (issue #391):
    Every invocation REPLACES the entire `csi_cfg` NVS namespace on the device.
    Any key you don't pass on the CLI is erased.

This means a customer who provisions WiFi credentials, then later wants to add --seed-url, must re-pass every other flag (SSID, password, target-ip, node-id, zone) or lose that state. Easy to miss, painful to debug.
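
The difference between the current full-replace behaviour and the proposed additive mode can be modelled with a tiny in-memory key/value store. This is a language-neutral illustration written in C to match the firmware snippets; it is not provision.py's code, and the type names, function names, and the "ssid" key are made up for the example.

```c
#include <stddef.h>
#include <string.h>

#define CFG_MAX_KEYS 8

typedef struct {
    const char *key;
    const char *value;
} cfg_entry_t;

/* In-memory stand-in for the csi_cfg namespace (illustrative only). */
typedef struct {
    cfg_entry_t entries[CFG_MAX_KEYS];
    size_t count;
} cfg_ns_t;

/* Returns the value stored under `key`, or NULL if the key is absent. */
const char *cfg_get(const cfg_ns_t *ns, const char *key)
{
    for (size_t i = 0; i < ns->count; i++) {
        if (strcmp(ns->entries[i].key, key) == 0) {
            return ns->entries[i].value;
        }
    }
    return NULL;
}

/* Additive update: keys in `update` are overwritten or appended,
 * every other key is left untouched. Returns 0, or -1 if the store is full. */
int cfg_apply_additive(cfg_ns_t *ns, const cfg_entry_t *update, size_t n)
{
    for (size_t u = 0; u < n; u++) {
        size_t i = 0;
        while (i < ns->count && strcmp(ns->entries[i].key, update[u].key) != 0) {
            i++;
        }
        if (i == ns->count) {
            if (ns->count == CFG_MAX_KEYS) {
                return -1;
            }
            ns->count++;
        }
        ns->entries[i] = update[u];
    }
    return 0;
}

/* Full replace (today's semantics): the namespace becomes exactly `update`,
 * so any key not passed again is lost. */
int cfg_apply_replace(cfg_ns_t *ns, const cfg_entry_t *update, size_t n)
{
    ns->count = 0;
    return cfg_apply_additive(ns, update, n);
}
```

With a store holding only "ssid", applying an update that contains only "seed_url" through cfg_apply_replace drops "ssid", while cfg_apply_additive keeps both keys.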

Proposed mDNS service record (seed side, for coordination)

Cognitum Seed would advertise:

_cognitum-seed._tcp.local  port=8080
TXT: device_id=<uuid>, version=<seed-firmware-version>, paired=<bool>

The ESP32 would query for this service, prefer a seed whose TXT record reports paired=true and a matching version, and fall back to the first seed found.
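
The selection rule could look roughly like the sketch below, which works on TXT fields already parsed out of the mDNS answers. Everything here is hypothetical: the struct seed_candidate_t, the function seed_pick_candidate, and the wanted_version parameter (the issue does not say what the seed's version is matched against). It also reads "paired+matching version" as requiring both conditions at once.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One mDNS answer, reduced to the TXT fields of the proposed record. */
typedef struct {
    const char *device_id;
    const char *version;
    bool paired;
} seed_candidate_t;

/* Returns the index of the preferred candidate, or -1 if there is none.
 * Preference: first candidate that is paired AND whose version equals
 * wanted_version; otherwise the first candidate found. */
int seed_pick_candidate(const seed_candidate_t *c, size_t n,
                        const char *wanted_version)
{
    if (c == NULL || n == 0) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        if (c[i].paired &&
            c[i].version != NULL && wanted_version != NULL &&
            strcmp(c[i].version, wanted_version) == 0) {
            return (int)i;
        }
    }
    return 0;  /* fall back to first-found */
}
```

Falling back to the first answer keeps single-seed networks working even before pairing has happened, which is the common case the issue targets.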

Pairing handshake (optional, USB-CDC only)

If ESP32 is plugged into the seed via USB at first boot:

  1. ESP32 enumerates as USB-CDC device
  2. Seed detects new CDC device, polls a small pairing endpoint on the ESP32
  3. Seed mints a seed_token, writes it into ESP32 NVS via the pairing channel
  4. ESP32 reboots, comes up with seed_token set, joins WiFi, finds seed via mDNS, starts streaming

Customer experience reduces to: "plug ESP32 into seed USB, wait 30 s, unplug, mount somewhere on WiFi."

File pointers

  • firmware/esp32-csi-node/main/swarm_bridge.c — current HTTP POST loop
  • firmware/esp32-csi-node/main/nvs_config.c:306-322 — current seed_url / seed_token / swarm_ingest_sec defaults
  • firmware/esp32-csi-node/provision.py — full-replace semantics, footgun warning at lines 15-19
  • ADR-066 (this repo, docs/adr/ADR-066-esp32-swarm-seed-coordinator.md) — swarm bridge architectural context
  • ADR-060 (provisioning) — referenced as related context

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests