fix(firmware): OTA upload fails closed when no PSK in NVS (RuView#596 audit)#623

Merged: ruvnet merged 1 commit into main from fix/ota-fail-closed on May 18, 2026

@ruvnet ruvnet commented May 18, 2026

Pre-existing critical-severity bug, not from PR #596

firmware/esp32-csi-node/main/ota_update.c:44-49 (current main) had:

static bool ota_check_auth(httpd_req_t *req)
{
    if (s_ota_psk[0] == '\0') {
        /* No PSK provisioned — auth disabled (permissive for dev). */
        return true;
    }
    ...
}

So a freshly flashed node, or any node on which nobody had yet provisioned an OTA PSK, accepted attacker-controlled firmware over plain HTTP on port 8032 from any host on the same WiFi network. There was no Secure Boot V2, no signed-image verification, and no transport encryption, so a single request from the LAN could brick or backdoor a node.

I flagged this in the deep security review of PR #596 but it was pre-existing code in main, not introduced by #596 — so it stayed live as a critical production issue until this commit.

⚠ Breaking change

After this PR, any deployment that relied on the unauthenticated OTA path working out of the box will need to provision a PSK before subsequent OTA pushes succeed. The OTA HTTP server itself still starts when no PSK is set, so operators can:

python firmware/esp32-csi-node/provision.py --port COM7 --ota-psk <hex> --force-partial

to write the NVS key over USB-CDC without re-flashing. Only the POST /ota upload endpoint refuses requests until that's done.

Boot-time ESP_LOGW makes the new posture visible on serial:

W (...) ota_update: No OTA PSK in NVS — OTA upload endpoint will REJECT all
                    requests until provisioned (provision.py --ota-psk <hex>).
                    Fail-closed per RuView#596.

The CHANGELOG entry under [Unreleased] / Security explicitly calls out the breaking change.

Fix shape

  static bool ota_check_auth(httpd_req_t *req)
  {
      if (s_ota_psk[0] == '\0') {
-         /* No PSK provisioned — auth disabled (permissive for dev). */
-         return true;
+         /* No PSK provisioned — fail closed (RuView#596 audit). */
+         ESP_LOGW(TAG, "OTA rejected: no PSK in NVS (run provision.py --ota-psk <hex>)");
+         return false;
      }
      ...
  }
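
The decision the patched function makes can be modelled on the host. The sketch below is a Python model, not the firmware code: the part of ota_check_auth() after the empty-PSK check is elided in the diff, so the Bearer parsing and the constant-time comparison are assumptions, based only on the Authorization: Bearer header used in the test plan.

```python
import hmac
from typing import Optional


def ota_check_auth(stored_psk: str, authorization_header: Optional[str]) -> bool:
    """Host-side model of the fail-closed decision (hypothetical, not firmware code)."""
    if not stored_psk:
        # No PSK provisioned: reject every request (fail closed).
        return False
    if authorization_header is None:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme != "Bearer":
        return False
    # Assumption: constant-time comparison. The firmware's actual comparison
    # is not shown in the PR.
    return hmac.compare_digest(token.encode(), stored_psk.encode())


if __name__ == "__main__":
    print(ota_check_auth("", "Bearer anything"))   # unprovisioned node
    print(ota_check_auth("a1b2", "Bearer wrong"))  # wrong token
    print(ota_check_auth("a1b2", "Bearer a1b2"))   # matching token
```

Note that the empty-PSK check has to come first: without it, an empty token would compare equal to an empty stored PSK.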

Test plan

  • Edit compiles cleanly (Rust workspace: cargo check -p wifi-densepose-sensing-server --no-default-features, 4.4s; unrelated to the firmware change, run only as a sanity check)
  • python scripts/check_fix_markers.py: 20/20 markers pass, including the new RuView#596-ota-fail-closed marker, which forbids the old "auth disabled (permissive for dev)" string
  • (Reviewer, hardware) Flash, boot without PSK, confirm boot log shows the new ESP_LOGW
  • (Reviewer, hardware) curl -X POST http://<node>:8032/ota -H 'Authorization: Bearer wrong' --data-binary @dummy.bin returns 403
  • (Reviewer, hardware) provision.py --ota-psk <hex> --force-partial, then same curl with correct PSK returns 200
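
For reviewers who prefer a script to curl, the two hardware checks above could be driven from Python along these lines. This is a sketch under assumptions: the helper names are hypothetical, the port and path come from the curl command in the test plan, and nothing here has been run against a real node.

```python
import urllib.error
import urllib.request

OTA_PORT = 8032  # port named in the PR description


def build_ota_request(host: str, token: str, image: bytes) -> urllib.request.Request:
    """Build the same POST /ota request as the curl command in the test plan."""
    return urllib.request.Request(
        url=f"http://{host}:{OTA_PORT}/ota",
        data=image,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def post_ota(host: str, token: str, image: bytes, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    request = build_ota_request(host, token, image)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what the check needs.
        return err.code
```

Per the test plan, post_ota(node, "wrong", image) should return 403 on an unprovisioned node, and the same call with the provisioned PSK should return 200 after running provision.py.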

Regression guard

Fix-marker RuView#596-ota-fail-closed requires the substrings "fail-closed", "see RuView#596 audit" and "OTA rejected: no PSK in NVS" in ota_update.c, and forbids the old "auth disabled (permissive for dev)" and "No PSK provisioned — auth disabled" strings. Any revert would fail the fix-marker CI workflow.
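
A require/forbid substring check of this kind is simple to sketch. The code below is an illustration only: the FixMarker structure and check_marker function are hypothetical and do not reflect the real schema of scripts/fix-markers.json or the implementation of check_fix_markers.py, and the substring lists are one reading of the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FixMarker:
    """Hypothetical marker shape: substrings a file must and must not contain."""
    name: str
    path: str
    require: List[str] = field(default_factory=list)
    forbid: List[str] = field(default_factory=list)


def check_marker(marker: FixMarker, text: str) -> List[str]:
    """Return a list of problems; an empty list means the marker passes."""
    problems = []
    for needle in marker.require:
        if needle not in text:
            problems.append(f"{marker.name}: missing required substring {needle!r}")
    for needle in marker.forbid:
        if needle in text:
            problems.append(f"{marker.name}: found forbidden substring {needle!r}")
    return problems


OTA_FAIL_CLOSED = FixMarker(
    name="RuView#596-ota-fail-closed",
    path="firmware/esp32-csi-node/main/ota_update.c",
    require=["fail-closed", "see RuView#596 audit", "OTA rejected: no PSK in NVS"],
    forbid=["auth disabled (permissive for dev)", "No PSK provisioned — auth disabled"],
)
```

In CI, a runner would read the file at marker.path, call check_marker, and exit non-zero if any problems are returned, which is how a revert to the permissive branch would be caught.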

🤖 Generated with claude-flow

… audit)

ota_check_auth() previously returned true when s_ota_psk[0] == '\0'
("permissive for dev"). A freshly-flashed node — or any node where
nobody had provisioned an OTA PSK yet — accepted attacker-controlled
firmware over plain HTTP on port 8032 from any host on the WiFi. No
Secure Boot V2, no signed-image verification, no transport encryption.
Single LAN call could brick or backdoor a node.

This was flagged in the deep security review of PR #596 but was a
PRE-EXISTING bug in main, not new code from that PR — so it stood as
a critical-severity production issue until this commit.

Fix:
- ota_check_auth() now returns false when no PSK is provisioned, with
  ESP_LOGW("OTA rejected: no PSK in NVS …") at the call site so the
  operator can diagnose the rejection from serial logs
- ota_update_init() ESP_LOGW message updated to surface the new posture
  at boot ("upload endpoint will REJECT all requests until provisioned")
- Doc comment on ota_check_auth() rewritten to make the contract
  explicit and reference the audit

The OTA HTTP server itself still starts even when no PSK is set. That
lets the operator run `provision.py --ota-psk <hex>` over USB-CDC to
write the NVS key without reflashing the firmware. The upload endpoint
just refuses every request in the meantime.

Breaking change for any deployment that depended on the unauthenticated
OTA path working out of the box. Documented in CHANGELOG under
[Unreleased] / Security so it's visible at the next release cut.

Fix-marker RuView#596-ota-fail-closed (scripts/fix-markers.json)
requires the new behaviour and forbids the old "permissive for dev"
fallback strings, so a future revert fails CI.

Co-Authored-By: claude-flow <ruv@ruv.net>
@ruvnet ruvnet merged commit 281c4cb into main May 18, 2026
6 checks passed
@ruvnet ruvnet deleted the fix/ota-fail-closed branch May 18, 2026 12:56