Summary
The WebSocket /ws/sensing broadcast continues to emit source: "esp32" indefinitely after the ESP32 hardware loses power or network connectivity. The UI ("Sensing" tab) then keeps showing "LIVE — ESP32 HARDWARE Connected" with cached/frozen sensor values, with no indication that the data source is offline.
AppStateInner::effective_source() already implements the correct 5-second stale-detection (returns "esp32:offline" when no UDP frame has arrived within ESP32_OFFLINE_TIMEOUT), but it is only invoked from REST endpoints (/health, etc.), never from the WS broadcast path.
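For readers without the source open, the stale check described above can be sketched roughly as follows. This is a minimal illustration, not the actual `AppStateInner` code: the `SourceState` struct, its field names, the `effective_source_at` helper (which takes the clock as a parameter so it can be tested), and the choice to treat "no frame ever received" as offline are all assumptions.

```rust
use std::time::{Duration, Instant};

// The issue states a 5-second timeout; the constant's real definition is not shown here.
const ESP32_OFFLINE_TIMEOUT: Duration = Duration::from_secs(5);

/// Hypothetical stand-in for the part of `AppStateInner` that tracks the data source.
struct SourceState {
    /// Configured source, e.g. "esp32".
    source: String,
    /// Arrival time of the most recent UDP frame, if any (assumed field).
    last_frame: Option<Instant>,
}

impl SourceState {
    /// Returns "esp32:offline" once no frame has arrived within the timeout.
    /// `now` is injected so the behaviour can be checked without sleeping.
    fn effective_source_at(&self, now: Instant) -> String {
        if self.source != "esp32" {
            // Non-ESP32 sources are passed through unchanged in this sketch.
            return self.source.clone();
        }
        match self.last_frame {
            Some(t) if now.saturating_duration_since(t) <= ESP32_OFFLINE_TIMEOUT => {
                "esp32".to_string()
            }
            // Assumption: never having received a frame also counts as offline.
            _ => "esp32:offline".to_string(),
        }
    }
}
```

The point of the sketch is only that the offline decision is a pure function of the last-frame timestamp, so any code path that holds the state can call it, including the WS broadcast loop.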
Reproduce
- Flash v0.6.4-esp32 firmware, provision WiFi + --target-ip.
- Start sensing-server --source esp32 --bind-addr 0.0.0.0.
- Open http://localhost:8080/ui/index.html → "Sensing" tab. Verify the green "LIVE — ESP32 HARDWARE" banner and live tick / amplitude data.
- Unplug the ESP32 (or power it off).
- Wait 30+ seconds.
Observed
- UI still displays "LIVE — ESP32 HARDWARE Connected".
- WS payload still emits "source": "esp32".
- tick field frozen at the last received value.
- All nodes[].amplitude, features.*, vital_signs.* frozen at the last frame, but re-broadcast every tick.
- classification keeps emitting whatever motion/presence the cached frame implied.
Expected
- After 5 seconds with no UDP frames, WS payload should emit "source": "esp32:offline" so the UI can switch to an offline/disconnected state (the same way the REST /health endpoint already reports "status": "degraded").
Verification
GET /health correctly reports the offline state:
{"clients":1,"source":"esp32:offline","status":"degraded","tick":44680}
WS /ws/sensing does not — it keeps emitting:
{"type":"sensing_update","source":"esp32","tick":44540, ...same frozen payload...}
Root Cause
effective_source() (main.rs:679) is correct, and is called from REST endpoints (main.rs:2167, 2735, 2789, 2810, 2820, 2844, 3472).
The two WS broadcast sites use a string literal instead:
// main.rs:3791 (edge-vitals path)
let mut update = SensingUpdate {
    msg_type: "sensing_update".to_string(),
    timestamp: chrono::Utc::now().timestamp_millis() as f64 / 1000.0,
    source: "esp32".to_string(), // <-- bypasses effective_source()
    tick,
    ...

// main.rs:4018 (raw CSI path)
let mut update = SensingUpdate {
    msg_type: "sensing_update".to_string(),
    timestamp: chrono::Utc::now().timestamp_millis() as f64 / 1000.0,
    source: "esp32".to_string(), // <-- same bug
    tick,
    ...
Suggested Fix
Both call sites already hold the AppStateInner guard s in scope:
- source: "esp32".to_string(),
+ source: s.effective_source(),
(Single-line change at each site. No new locks, no allocation churn beyond what effective_source already does.)
Why This Matters (Safety)
The project README and various ADRs market use-cases including:
- Fall detection / elder-care monitoring
- Overnight vital-sign tracking (apnea screening)
- Presence-based safety triggers
Silently re-publishing the last received frame as "LIVE" is a silent-failure pattern: a deployed system whose ESP32 lost power could continue reporting "breathing normal / present" indefinitely, masking a real emergency. The 5s timeout was clearly intended to prevent exactly this — the fix just needs to extend coverage to the WS path.
Related / Wider Scope
This patch is the minimum fix and is sufficient to let the UI flip to an offline indicator. A more complete fix could additionally:
- Suppress WS broadcasts entirely when stale (skip the tick loop when effective_source().ends_with(":offline")), or
- Add a stale: bool field on SensingUpdate that downstream consumers (UI, recorders, cluster aggregators) can branch on.
The two existing per-node NodeFeatureSnapshot.stale flags (main.rs:478, 511-512) are a precedent for the second approach.
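Both options above could look roughly like the sketch below. It is an illustration under assumptions, not a patch: the `SensingUpdate` here is a trimmed-down stand-in with only three fields, and `is_offline`, `build_update`, and `should_broadcast` are hypothetical helper names that do not exist in main.rs.

```rust
/// Hypothetical, trimmed-down stand-in for the real `SensingUpdate`.
#[derive(Debug, Clone, PartialEq)]
struct SensingUpdate {
    source: String,
    tick: u64,
    /// Proposed new field: true when the upstream sensor is considered offline.
    stale: bool,
}

/// True when a source string carries the ":offline" suffix used by `effective_source()`.
fn is_offline(source: &str) -> bool {
    source.ends_with(":offline")
}

/// Option 2: keep broadcasting, but mark the update as stale.
/// `effective_source` is expected to be the value returned by `s.effective_source()`.
fn build_update(effective_source: String, tick: u64) -> SensingUpdate {
    let stale = is_offline(&effective_source);
    SensingUpdate { source: effective_source, tick, stale }
}

/// Option 1: skip the broadcast entirely while the source is offline.
fn should_broadcast(effective_source: &str) -> bool {
    !is_offline(effective_source)
}
```

One trade-off to weigh: with full suppression, clients would have to infer the offline state from silence, whereas the stale flag tells them explicitly. Sending at least one update with the offline source before suppressing may be a reasonable middle ground.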
#519 ("Ghost person detection, FPS infinity, skeleton flickering and jumpy vitals with ESP32-S3 multi-node setup") may be partially related — both stem from the broadcast loop not knowing the upstream sensor state has degraded.
Environment
- Firmware: v0.6.4-esp32 (esp32-csi-node.bin SHA256 prefix 0066d74d35b0dbca…)
- Server: main branch
- Hardware: ESP32-S3 N16R8 (Waveshare DEV-KIT), single-node, channel 6 / 2.4 GHz
- Host: Windows 11, sensing-server built with stable-x86_64-pc-windows-gnu, --no-default-features
- UI: Chrome on localhost:8080/ui/index.html