feat: per-room calibration system (ADR-151) + cognitum-v0 appliance integration spec #989

Open
ruvnet wants to merge 21 commits into main from feat/adr-151-calibration-api

ruvnet commented Jun 9, 2026

Owner

Opens the per-room calibration work for review (separate from the firmware HR fix, which shipped in #988 / v0.7.1-esp32). Additive — no existing behavior changed.

What's here

ADR-151 — Per-Room Calibration & Specialized Model Training

docs/adr/ADR-151-room-calibration-specialist-training.md (indexed in docs/adr/README.md). Formalizes the room-first pipeline:
baseline → enroll → extract → train → a bank of small specialized ruVector models (breathing, heartbeat, restlessness, posture, presence, anomaly), each a lightweight LoRA/HNSW head distilled from the frozen, Hugging-Face-published RF Foundation Encoder (ADR-150). Local-first (only the room-agnostic base is public); honest STALE degradation on baseline drift. Builds on ADR-135 (baseline = environmental fingerprint) and reuses rapid_adapt.rs + the ruvsense extractors.

wifi-densepose calibrate-serve — Stage 1 HTTP API (for a UI)

An Axum server (CORS-enabled) wrapping the existing ADR-135 CalibrationRecorder, so a UI can drive an empty-room baseline capture from the ESP32 CSI stream:

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/calibration/health | liveness + UDP ingest stats |
| POST | /api/v1/calibration/start | { tier?, duration_s?, room_id?, min_frames? } |
| GET | /api/v1/calibration/status | live progress (poll for UI) |
| POST | /api/v1/calibration/stop | finalize early |
| GET | /api/v1/calibration/result | finalized baseline summary |
| GET | /api/v1/calibration/baselines | list persisted baselines |

A single background task owns the UDP socket + recorder (handlers talk to it over an mpsc channel + shared status snapshot), keeping the &mut recorder lock-free. Reuses the existing calibrate.rs ESP32 wire parser (made pub(crate)).
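The single-owner design described above can be sketched with the standard library alone. This is an illustrative model, not the code in calibrate-serve: it uses an OS thread and `std::sync::mpsc` in place of the async task and channel, and `Command`, `Status`, and `spawn_ingest` are hypothetical names. The bounded channel size of 8 follows the review later in this thread.

```rust
use std::sync::mpsc::{sync_channel, Sender, SyncSender};
use std::sync::{Arc, RwLock};
use std::thread;

/// Commands the HTTP handlers would send to the ingest task (hypothetical names).
enum Command {
    Start,
    Frame(f32),           // stands in for one parsed CSI frame (mean amplitude)
    Stop(Sender<Status>), // carries a reply channel for the finalized status
}

#[derive(Clone, Debug, Default, PartialEq)]
struct Status {
    recording: bool,
    frames: u64,
    amp_sum: f32,
}

/// Spawn the one thread that owns the mutable recorder state.
/// Returns the bounded command sender and a shared read-only snapshot.
fn spawn_ingest() -> (SyncSender<Command>, Arc<RwLock<Status>>) {
    let (tx, rx) = sync_channel::<Command>(8);
    let snapshot = Arc::new(RwLock::new(Status::default()));
    let shared = Arc::clone(&snapshot);
    thread::spawn(move || {
        // Exclusively owned by this thread: mutating it needs no lock.
        let mut local = Status::default();
        for cmd in rx {
            let mut reply_to = None;
            match cmd {
                Command::Start => local = Status { recording: true, ..Status::default() },
                Command::Frame(amp) => {
                    if local.recording {
                        local.frames += 1;
                        local.amp_sum += amp;
                    }
                }
                Command::Stop(reply) => {
                    local.recording = false;
                    reply_to = Some(reply);
                }
            }
            // Published per command for brevity; the review later in this
            // thread asks for this to be throttled to a 200 ms tick.
            *shared.write().unwrap() = local.clone();
            if let Some(reply) = reply_to {
                let _ = reply.send(local.clone());
            }
        }
    });
    (tx, snapshot)
}
```

Handlers only ever hold the sender and the snapshot, so the recorder itself is never shared across threads.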

scripts/csi-udp-relay.py

Firewall-free relay for local Windows ESP32 testing without admin (python.exe is already allowed): binds the public CSI port and forwards to a loopback port the server listens on.

Validation

  • 19 CLI tests pass on current main; crate builds clean.
  • End-to-end on live hardware (ESP32-S3, edge_tier=0 raw CSI): start → 120 frames → state=complete, finalized 52-subcarrier baseline persisted; relay path confirmed (:5005 → :5006 → calibrate-serve).

Scope / follow-ups

This is Stage 1 only (baseline capture + API). Stages 2–4 of ADR-151 (guided enrollment, feature extraction bridge, the specialist bank) are the follow-on work in the ADR's phase plan.

🤖 Generated with claude-flow

ruvnet added 7 commits June 9, 2026 11:47
Room-first calibration -> bank of small specialised ruVector models
(breathing, heartbeat, restlessness, posture, presence, anomaly) distilled
from the frozen Hugging-Face-published RF Foundation Encoder (ADR-150).

Four-stage local-first pipeline: baseline (ADR-135 environmental fingerprint)
-> guided enrollment (NEW EnrollmentProtocol, clean anchors not hours) ->
feature extraction (reuse signal_features + ruvsense) -> specialist bank
training (rapid_adapt LoRA heads, RVF storage, HNSW prototypes).

Invariants: specialisation over scale; local heads over a shared public base;
honest STALE degradation on baseline drift. Indexes ADR-149/150/151.

Co-Authored-By: claude-flow <ruv@ruv.net>
…35/151)

Adds `wifi-densepose calibrate-serve` — an Axum HTTP API that wraps the
ADR-135 CalibrationRecorder so a UI (or any client) can drive an empty-room
baseline capture remotely. Stage 1 ("teach the room") of the ADR-151 room
calibration & training pipeline.

A single background task owns the UDP socket (ESP32 0xC511_0001 frames) and
the optional active recorder; HTTP handlers talk to it over an mpsc command
channel and read a shared status snapshot, keeping the &mut recorder
lock-free. CORS permissive so a browser UI can call it.

Endpoints (/api/v1/calibration/*):
  GET  /health      liveness + UDP ingest stats (frames_seen, streaming)
  POST /start       { tier?, duration_s?, room_id?, min_frames? }
  GET  /status      live progress (state, frames, progress, z, eta) — poll for UI
  POST /stop        finalize the current session early
  GET  /result      finalized baseline summary (amp/phase-dispersion averages)
  GET  /baselines   list persisted baseline .bin files

Reuses the existing calibrate.rs ESP32 wire parser (made pub(crate)); honest
abort when <10 frames arrive in the window (e.g. ESP32 not streaming).

Verified end-to-end over loopback: start -> 300 replayed HT20 frames ->
state=complete, 52-subcarrier baseline, phase_dispersion_avg=0.00096
(concentrated/valid), persisted to disk; all 6 endpoints exercised.
CLI: 19 tests pass; crate builds clean.

Co-Authored-By: claude-flow <ruv@ruv.net>
Windows Defender blocks inbound LAN UDP to a freshly-built binary without an
admin allow-rule; python.exe is already allowed. This relay binds the public
CSI port and forwards each datagram verbatim to a loopback port where
`calibrate-serve --udp-bind 127.0.0.1 --udp-port 5006` listens (loopback is
firewall-exempt). No admin required.

Validated: ESP32-format 0xC5110001 frames -> :5005 -> relay -> :5006 ->
calibrate-serve -> state=complete, 52-subcarrier baseline,
phase_dispersion_avg=0.00098 (clean). Completes the no-admin live-test path.

Co-Authored-By: claude-flow <ruv@ruv.net>
…alist bank, runtime

New crate wifi-densepose-calibration implementing the per-room pipeline beyond
Stage-1 baseline:

- anchor.rs: guided-anchor sequence + event-sourced EnrollmentSession (Stage 2)
- enrollment.rs: AnchorQualityGate + AnchorRecorder — gates anchors against the
  ADR-135 baseline deviation (presence/motion), re-prompts bad captures
- extract.rs: Features + AnchorFeature — autocorrelation periodicity (breathing/
  HR bands), variance/motion (Stage 3)
- specialist.rs: 6 small room-calibrated models — presence (learned threshold),
  posture (nearest-prototype), breathing/heartbeat (band periodicity),
  restlessness (calm/active normalization), anomaly (novelty vs anchors) (Stage 4)
- bank.rs: SpecialistBank — train/persist + baseline-drift STALE invalidation
- runtime.rs: MixtureOfSpecialists — presence short-circuit + anomaly veto +
  stale flagging (Stage 5)

Statistical heads make the pipeline runnable/validatable today; the ADR-150 HF
RF Foundation Encoder backbone is the documented upgrade path. 29 unit tests pass.

Co-Authored-By: claude-flow <ruv@ruv.net>
Integrates the wifi-densepose-calibration crate into the CLI as four
subcommands driving the full Stage 2–5 pipeline against a live ESP32 raw-CSI
stream (edge_tier=0):

- enroll: walks the guided anchor sequence, gates each capture against the
  ADR-135 baseline deviation (re-prompts bad anchors), writes labelled features
- train-room: fits the SpecialistBank from the enrollment, persists JSON
- room-status: prints a trained bank's summary
- room-watch: live mixture-of-specialists readout (presence/posture/breathing/
  heart/restless) over a rolling window, with anomaly veto + STALE flagging

Per-frame scalar is the mean CSI amplitude (carries presence/motion + breathing
modulation). Validated end-to-end on the live ESP32 (COM8, edge_tier=0): the
real parser → feature extraction → runtime detected breathing (~16–31 BPM) on
hardware. Full multi-anchor enrollment accuracy requires the operator to perform
the poses; phase-based breathing extraction is a noted refinement.

48 tests pass (29 calibration + 19 CLI).

Co-Authored-By: claude-flow <ruv@ruv.net>
Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet changed the title from "feat(cli): ADR-151 per-room calibration & UI-drivable calibration API (calibrate-serve)" to "feat: ADR-151 per-room calibration & specialist training (full pipeline)" Jun 9, 2026
ruvnet commented Jun 9, 2026

Owner Author

Updated: now the full ADR-151 pipeline (was Stage-1 only)

Pushed Stages 2–5 — the calibration system is complete, not just baseline capture.

New crate wifi-densepose-calibration:

  • anchor.rs / enrollment.rs: guided anchors + adaptive quality gate, event-sourced session (Stage 2)
  • extract.rs: AnchorFeature via autocorrelation periodicity (breathing/HR) + variance/motion (Stage 3)
  • specialist.rs: 6 small room-calibrated models for presence (learned threshold), posture (nearest-prototype), breathing/heartbeat (band periodicity), restlessness (calm/active), and anomaly (novelty) (Stage 4)
  • bank.rs: SpecialistBank train/persist + baseline-drift STALE invalidation
  • runtime.rs: MixtureOfSpecialists with presence short-circuit + anomaly veto + confidence gating (Stage 5)

CLI: enroll / train-room / room-status / room-watch (+ the Stage-1 calibrate-serve API and relay already here).

Validation: 48 tests (29 calibration + 19 CLI). Live ESP32-S3 (COM8, edge_tier=0): the real parser → feature-extraction → mixture runtime detected breathing (~16–31 BPM) end-to-end.

Specialists are statistical heads (runnable today); the frozen ADR-150 HF RF Foundation Encoder backbone, RVF/HNSW storage, and a phase-based breathing carrier are documented follow-ups in ADR-151 §4.
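To make the "autocorrelation periodicity" idea behind the statistical breathing/heartbeat heads concrete, here is a minimal sketch. It assumes the per-frame scalar is a mean-amplitude series sampled at a fixed rate; `periodicity_bpm` and its parameters are hypothetical names, not the crate's API. The `r0 <= 1e-6` and `best_lag == 0` guards mirror the ones named in the code review in this thread.

```rust
/// Estimate the dominant period of `samples` inside a BPM band using the
/// (unnormalized) autocorrelation of the mean-removed signal.
/// Returns None when the signal is flat or no positive peak lies in the band.
fn periodicity_bpm(
    samples: &[f64],
    sample_rate_hz: f64,
    min_bpm: f64,
    max_bpm: f64,
) -> Option<f64> {
    let n = samples.len();
    if n < 4 || sample_rate_hz <= 0.0 || min_bpm <= 0.0 || max_bpm <= min_bpm {
        return None;
    }
    // Remove the DC component so a constant offset does not dominate.
    let mean = samples.iter().sum::<f64>() / n as f64;
    let x: Vec<f64> = samples.iter().map(|s| s - mean).collect();

    // Zero-lag energy; a (near) flat signal carries no periodicity.
    let r0: f64 = x.iter().map(|v| v * v).sum();
    if r0 <= 1e-6 {
        return None;
    }

    // Convert the BPM band to a lag range in samples (fast rate = short lag).
    let min_lag = ((sample_rate_hz * 60.0 / max_bpm).floor() as usize).max(1);
    let max_lag = ((sample_rate_hz * 60.0 / min_bpm).ceil() as usize).min(n - 1);

    let mut best_lag = 0usize;
    let mut best_r = 0.0f64;
    for lag in min_lag..=max_lag {
        let r: f64 = x[..n - lag].iter().zip(&x[lag..]).map(|(a, b)| a * b).sum();
        if r > best_r {
            best_r = r;
            best_lag = lag;
        }
    }
    if best_lag == 0 {
        return None; // no positive correlation peak inside the band
    }
    Some(60.0 * sample_rate_hz / best_lag as f64)
}
```

Under these assumptions a breathing head would call it with a band such as 6 to 30 BPM and a heartbeat head with a higher band; the real extractor may weight, normalize, or window the series differently.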

ruvnet added 4 commits June 9, 2026 12:27
The max-variance-subcarrier carrier locked onto motion artifacts (not
breathing) and also had an out-of-bounds bug on variable CSI subcarrier
counts. Reverted to the mean-amplitude carrier, which is validated live to
detect breathing. Phase-based extraction on a stable subcarrier remains the
proper higher-SNR refinement (ADR-151 §4).

Co-Authored-By: claude-flow <ruv@ruv.net>
MultiNodeMixture fuses several co-located nodes (each with its own
room-calibrated SpecialistBank) into one RoomState:
- presence: OR across nodes (any node seeing a person wins)
- posture/breathing/heartbeat: highest-confidence node (best viewpoint)
- restlessness/anomaly: max across nodes
- veto: any node's physically-implausible signal vetoes the room's vitals
  (anti-hallucination, same as single-node runtime) + presence short-circuit
- stale: any node's STALE flag propagates

Same-room multistatic only; cross-room is federation (ADR-105), not fusion.
6 unit tests (presence OR, best-confidence breathing, single-node veto,
staleness). 35 calibration tests pass.
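The fusion rules listed in this commit message can be sketched as follows. This is a simplified model under stated assumptions, not the MultiNodeMixture implementation: `NodeReading`, `FusedRoom`, and `fuse` are hypothetical names, only breathing stands in for the vitals, and the real RoomState schema is not reproduced here.

```rust
/// One node's output for the current window (hypothetical, simplified shape).
#[derive(Clone, Debug)]
struct NodeReading {
    present: bool,
    breathing_bpm: Option<f64>,
    breathing_conf: f64,
    restlessness: f64,
    anomaly: f64,
    veto: bool,
    stale: bool,
}

/// Fused room-level view (hypothetical, simplified shape).
#[derive(Debug, PartialEq)]
struct FusedRoom {
    present: bool,
    breathing_bpm: Option<f64>,
    restlessness: f64,
    anomaly: f64,
    vetoed: bool,
    stale: bool,
}

fn fuse(nodes: &[NodeReading]) -> FusedRoom {
    // presence: OR across nodes; veto and stale: any node propagates.
    let present = nodes.iter().any(|n| n.present);
    let vetoed = nodes.iter().any(|n| n.veto);
    let stale = nodes.iter().any(|n| n.stale);

    // restlessness / anomaly: max across nodes.
    let restlessness = nodes.iter().map(|n| n.restlessness).fold(0.0, f64::max);
    let anomaly = nodes.iter().map(|n| n.anomaly).fold(0.0, f64::max);

    // breathing: take the node with the highest confidence.
    let best = nodes
        .iter()
        .filter(|n| n.breathing_bpm.is_some())
        .max_by(|a, b| a.breathing_conf.total_cmp(&b.breathing_conf));

    // Presence short-circuit and veto both suppress the vitals.
    let breathing_bpm = if present && !vetoed {
        best.and_then(|n| n.breathing_bpm)
    } else {
        None
    };

    FusedRoom { present, breathing_bpm, restlessness, anomaly, vetoed, stale }
}
```

Taking the maximum for anomaly and letting any single veto win is the conservative choice: one node with an implausible signal is enough to withhold the room's vitals.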

Co-Authored-By: claude-flow <ruv@ruv.net>
`room-watch --node-bank N:path` (repeatable) groups live CSI frames by node_id
and fuses per-node banks via MultiNodeMixture. Validated live on COM8 (node 9,
edge_tier=0): frames grouped + fused end-to-end. True 2-node fusion is covered
by unit tests; a second raw-CSI node is the hardware blocker. 54 tests pass.

Co-Authored-By: claude-flow <ruv@ruv.net>
…erview

Detailed cross-repo integration spec for cognitum-one/v0-appliance: data
contracts (CSI wire format, ADR-135 baseline binary, enrollment/bank/RoomState
JSON schemas), calibrate-serve HTTP API, public crate API, Pi5+Hailo tiering,
and a 5-step appliance integration plan. Grounded in the verified cognitum-v0
inventory (aarch64, cargo 1.96, HAILO10H, ruview-vitals-worker:50054).

Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet changed the title from "feat: ADR-151 per-room calibration & specialist training (full pipeline)" to "feat: per-room calibration system (ADR-151) + cognitum-v0 appliance integration spec" Jun 9, 2026
ruvnet commented Jun 9, 2026

Owner Author

📋 Integration reference for cognitum-one/v0-appliance

Added `docs/integration/calibration-appliance-integration.md` — the detailed cross-repo integration spec to build the appliance side from. It covers everything needed to consume this calibration system on cognitum-v0:

Data contracts (the integration surface):

  • CSI ingest wire format (ESP32 0xC5110001, byte layout)
  • Baseline binary format (ADR-135, magic 0xCA1B0001)
  • Enrollment / SpecialistBank / RoomState JSON schemas (copy-paste ready)

API + crate surface:

  • calibrate-serve HTTP endpoints (/api/v1/calibration/*, CORS) with request/response shapes
  • Public crate API (wifi-densepose-calibration: enrollment → features → specialists → bank → mixture/multistatic)

Pi 5 + Hailo tiering (verified on cognitum-v0: aarch64, cargo 1.96.0, HAILO10H, ruview-vitals-worker:50054):

  • Calibration service → Pi CPU (pure Rust, no BLAS/GPU — builds native aarch64)
  • The 6 specialists are microsecond CPU models — they do not need the Hailo HAT
  • The HAT is for the heavy tier (ADR-150 RF Foundation Encoder + neural pose → HEF), a documented follow-on
  • ⚠️ The Pi wlan0 is managed/no-nexmon → the Pi is a CSI processor, not a radio; CSI comes from the ESP32 nodes

5-step appliance integration plan: vendor the crate → wire the CSI source (tee the vitals-worker stream or point ESP32s at the calibration port) → run as embedded worker or calibrate-serve sidecar → expose /api/v1/calibration/* (+ small additive enroll/train endpoints) behind the appliance gateway → optional Hailo backbone tier.

This PR is the implementation; the doc is the spec — together they're the reference to base the appliance integration on. Status: 54 tests, hardware-validated (breathing read live; multistatic node-id fusion). Known follow-ups (phase carrier, RVF/HNSW, enroll/train HTTP endpoints, Hailo backbone, federation) are listed in the doc §7.

ruvnet commented Jun 9, 2026

Owner Author

Review + Pi-5 validation + optimization

Reviewed the calibration crate + calibrate-serve and validated on the aarch64 (Pi 5) target.

✅ Code review — sound, honest milestone

  • CSI parser (parse_csi_packet) is properly bounds-checked — 20-byte header guard, magic check before indexing, n_pairs*2 payload-length guard before the IQ loop; n_antennas/n_subcarriers are u8 so no usize overflow, and RECV_BUF=2048 caps the datagram. No panic on malformed/truncated UDP.
  • Recorder is streaming (Welford) — no per-frame buffering, no unbounded memory on long captures.
  • STALE + anomaly-veto are real and tested, not aspirational (runtime.rs, bank.rs, multistatic.rs). The 6 specialists' math is clean — every divide guarded (r0<=1e-6, span.max(1e-3), best_lag==0), no NaN/÷0/overflow found.
  • Single-UDP-owner + bounded mpsc(8) + RwLock snapshot design is deadlock-free.
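For readers who have not opened the parser, the guard ordering described in the first bullet (header length, then magic, then payload length, all before touching IQ data) looks roughly like this. It is a sketch, not `parse_csi_packet`: the 20-byte header, the 0xC5110001 magic, and the i8/i8 IQ pairs come from this thread, but the offsets of `n_antennas` and `n_subcarriers` (bytes 5 and 6) are assumptions made only for illustration, as are all type and function names.

```rust
const MAGIC_ESP32: u32 = 0xC511_0001;
const HEADER_LEN: usize = 20;

#[derive(Debug, PartialEq)]
enum ParseError {
    TooShort,
    BadMagic,
    Truncated,
}

#[derive(Debug)]
struct CsiFrame {
    n_antennas: u8,
    n_subcarriers: u8,
    amplitudes: Vec<f32>,
}

fn parse_csi_sketch(buf: &[u8]) -> Result<CsiFrame, ParseError> {
    // 1. Header guard: never index past a short datagram.
    if buf.len() < HEADER_LEN {
        return Err(ParseError::TooShort);
    }
    // 2. Magic check (little-endian) before reading any other field.
    let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if magic != MAGIC_ESP32 {
        return Err(ParseError::BadMagic);
    }
    // ASSUMED field offsets, for illustration only.
    let n_antennas = buf[5];
    let n_subcarriers = buf[6];
    // u8 * u8 widened to usize cannot overflow.
    let n_pairs = n_antennas as usize * n_subcarriers as usize;

    // 3. Payload-length guard before the IQ loop.
    let payload = &buf[HEADER_LEN..];
    if payload.len() < n_pairs * 2 {
        return Err(ParseError::Truncated);
    }
    let amplitudes = payload[..n_pairs * 2]
        .chunks_exact(2)
        .map(|iq| {
            let i = iq[0] as i8 as f32;
            let q = iq[1] as i8 as f32;
            (i * i + q * q).sqrt()
        })
        .collect();
    Ok(CsiFrame { n_antennas, n_subcarriers, amplitudes })
}
```

Every malformed input returns an error value rather than panicking, which is the property the review is checking for.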

⛔ Must-fix before any LAN exposure (--http-bind 0.0.0.0)

  1. No auth on any route + permissive CORS, while the --http-bind help text invites 0.0.0.0. Add a bearer token before exposing.
  2. room_id path traversal: calibrate_api.rs:377 interpolates client-supplied room_id straight into the baseline write path ({output_dir}/{room_id}-{uuid}.bin); ../ or / is a file-write primitive. Sanitize to [A-Za-z0-9_-].
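As a rough illustration of the first must-fix, a bearer-token check reduces to the logic below. This is a framework-free sketch with hypothetical function names; how it is wired into the server's routing layer is deliberately left out. The constant-time comparison is an extra precaution added here, not something the review asks for.

```rust
/// Returns true only when the Authorization header value is exactly
/// "Bearer <expected_token>". An empty configured token never authorizes.
fn is_authorized(authorization: Option<&str>, expected_token: &str) -> bool {
    if expected_token.is_empty() {
        return false;
    }
    let Some(value) = authorization else {
        return false;
    };
    let Some(presented) = value.strip_prefix("Bearer ") else {
        return false;
    };
    constant_time_eq(presented.as_bytes(), expected_token.as_bytes())
}

/// Compare two byte strings without returning early on the first mismatch,
/// so timing does not reveal how many leading bytes matched.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```

A handler or middleware would reject the request with 401 whenever this returns false.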

⚠️ Should-fix

  • Per-frame status.write().await + SessionStatus clone under UDP flood → CPU starvation of the tick/command path (calibrate_api.rs:316). Throttle the snapshot to the 200 ms tick.
  • std::fs::write (blocking) on the async ingest task during finalize — move to spawn_blocking/tokio::fs.
  • ADR §2.4 says STALE fires on "baseline drift beyond τ", but the code triggers on baseline_id string inequality, not a drift threshold — tighten the ADR wording or add the drift comparison.

🔬 Pi-5 (aarch64) validation

  • cross test -p wifi-densepose-calibration --target aarch64-unknown-linux-gnu → 35/35 pass under qemu-aarch64 (specialists, extractors, runtime veto, multistatic, STALE). Algorithms are correct on the Pi architecture; ~0.03 s for the suite corroborates the "microseconds on Pi CPU" claim.
  • Build blocker (also the optimization): the full calibrate-serve CLI fails to cross-compile to aarch64 because openssl-sys is pulled transitively by ort (ONNX runtime) → wifi-densepose-nn → wifi-densepose-mat → wifi-densepose-cli. The calibration path is pure-CPU statistical and needs none of ort/ONNX/openssl.

🚀 Optimization recommendation

Feature-gate the ort/NN stack out of a calibration-only build (or switch the HTTP dep to rustls instead of native-tls). Today, deploying calibrate-serve to the appliance drags the entire ONNX runtime + OpenSSL — it won't cross-compile and bloats the binary, for zero calibration benefit (the spec itself says specialists "do not need the Hailo HAT"). A --no-default-features calibration build that excludes ort would cross-compile clean for the Pi and ship small.

Net: algorithms validated on-target, code is solid for a local-first milestone; gate the API auth/path-traversal before LAN exposure, and decouple calibrate-serve from ort so it actually cross-compiles for the appliance.

…h traversal, throttle

Resolves the review on #989:

- **Cross-compile (the appliance blocker):** make wifi-densepose-mat optional
  and feature-gate it (`mat`), so `cargo build -p wifi-densepose-cli
  --no-default-features` excludes the mat→nn→ort(ONNX)→openssl-sys chain.
  Verified: `cargo tree --no-default-features` shows 0 ort/openssl deps →
  calibration cross-compiles clean for the Pi.
- **Security (must-fix before LAN):**
  - `--token` / CALIBRATE_TOKEN bearer-auth middleware on every route; warns if
    bound non-loopback without a token.
  - sanitize client-supplied `room_id` to [A-Za-z0-9_-] (≤64) before it reaches
    the baseline write path — kills the `../` file-write primitive. + test.
- **Perf:** stop locking shared status + cloning SessionStatus on every UDP
  frame — counters/snapshot flush on the 200 ms tick instead (no CPU
  starvation under flood). finalize write moved to async `tokio::fs::write`.
- **Docs:** ADR-151 STALE wording matches the impl (baseline-id change;
  drift-threshold = P6 refinement); integration doc gets the
  `--no-default-features` build + auth/sanitize notes.

35 calibration + 15 CLI tests (no-default) / 20 CLI (default) pass.

Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet commented Jun 9, 2026

Owner Author

Review addressed — pushed 0f8c5c91

Thanks for the on-target review (and the qemu-aarch64 run). All findings fixed:

⛔ Must-fix

  • Path traversal (calibrate_api.rs): room_id is now run through sanitize_room_id() → [A-Za-z0-9_-], ≤64 chars, empty→default, before it touches {output_dir}/{room_id}-{uuid}.bin. ../, /, ..\ all neutralised. Unit-tested (sanitize_blocks_path_traversal).
  • No auth — added --token <T> / CALIBRATE_TOKEN; a from_fn_with_state middleware requires Authorization: Bearer <T> on every route. Server warns loudly if bound to a non-loopback address without a token. (Used an axum middleware rather than tower-http's ValidateRequestHeaderLayer::bearer, which isn't exposed in our 0.6 feature set.)
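The sanitization rule stated above (keep only [A-Za-z0-9_-], cap at 64 characters, fall back when nothing survives) can be sketched in a few lines. This is not the project's `sanitize_room_id()`: the function name is hypothetical, and the literal fallback "default" is an assumption about what "empty→default" means.

```rust
/// Keep only ASCII letters, digits, '_' and '-', cap the result at 64
/// characters, and fall back to a fixed name if nothing is left.
fn sanitize_name(raw: &str) -> String {
    let cleaned: String = raw
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '_' || *c == '-')
        .take(64)
        .collect();
    if cleaned.is_empty() {
        // ASSUMED fallback value; the thread only says "empty→default".
        "default".to_string()
    } else {
        cleaned
    }
}
```

Filtering disallowed characters (rather than rejecting the request) is consistent with the later note in this thread that bank=../../etc/passwd is neutralized to 'etcpasswd'. Because '.', '/' and '\' never survive, the result cannot contain a path separator or a parent-directory component.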

🚀 The optimization / aarch64 blocker (the big one)

  • wifi-densepose-mat is now optional + feature-gated (mat). cargo build -p wifi-densepose-cli --no-default-features drops the mat→nn→ort(ONNX)→openssl-sys chain entirely.
  • Verified: cargo tree -p wifi-densepose-cli --no-default-features → 0 ort/openssl-sys deps. The appliance build is now cargo build -p wifi-densepose-cli --no-default-features --release (documented in the integration doc §6).

⚠️ Should-fix

  • Per-frame status.write() flood — removed. The frame path no longer takes the lock or clones SessionStatus; frames_seen/last_frame_ms are local counters and the snapshot is flushed once per 200 ms tick. Completion check stays per-frame (cheap).
  • Blocking std::fs::write in finalize — now tokio::fs::write(...).await.
  • STALE wording — ADR-151 §2.4 updated: the impl triggers on baseline-id change (conservative; any re-baseline = new id), and the drift-threshold trigger is called out as the P6 refinement.
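The per-frame versus per-tick distinction in the first should-fix is easy to show in isolation. The sketch below keeps counters local to the ingest path and touches the shared snapshot at most once per 200 ms; the struct and method names are hypothetical, and time is passed in as milliseconds so the logic stays testable without a clock.

```rust
use std::sync::RwLock;

const TICK_MS: u64 = 200;

/// What readers (for example a status handler) are allowed to see.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Snapshot {
    frames_seen: u64,
    last_frame_ms: u64,
}

/// Counters owned by the ingest path; updating them takes no lock.
#[derive(Default)]
struct IngestCounters {
    frames_seen: u64,
    last_frame_ms: u64,
    last_flush_ms: u64,
}

impl IngestCounters {
    /// Hot path: called once per UDP frame, lock-free.
    fn on_frame(&mut self, now_ms: u64) {
        self.frames_seen += 1;
        self.last_frame_ms = now_ms;
    }

    /// Cold path: publish to the shared snapshot at most once per tick.
    /// Returns true when a flush actually happened.
    fn on_tick(&mut self, now_ms: u64, shared: &RwLock<Snapshot>) -> bool {
        if now_ms.saturating_sub(self.last_flush_ms) < TICK_MS {
            return false;
        }
        *shared.write().unwrap() = Snapshot {
            frames_seen: self.frames_seen,
            last_frame_ms: self.last_frame_ms,
        };
        self.last_flush_ms = now_ms;
        true
    }
}
```

Under a flood of frames the write lock is taken roughly five times per second regardless of the frame rate, which is what removes the contention with the tick/command path.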

Tests

35 calibration + 15 CLI (--no-default-features) / 20 CLI (default). Both feature configs build.

Ready for the appliance pull — --no-default-features is the Pi build, --token gates the API.

ruvnet commented Jun 9, 2026

Owner Author

✅ Re-verified on commit 0f8c5c91 — all findings fixed, + live real-CSI run on the Pi 5

Pulled the addressed commit and re-validated on cognitum-v0 (real aarch64 Pi 5):

Fixes confirmed:

  • --no-default-features drops ort/openssl-sys: cargo build -p wifi-densepose-cli --no-default-features --target aarch64-unknown-linux-gnu now succeeds (EXIT 0, 45 s, 2.1 MB aarch64 ELF with the calibrate-serve API). The appliance build is unblocked, which was the headline.
  • sanitize_room_id() gates the write path; bearer-token middleware verified live (health → no token vs Bearer token both exercised on the Pi).
  • 35/35 calibration tests pass under qemu-aarch64.

Live real-CSI end-to-end (not synthetic): tapped the appliance's real nexmon CSI (ADR-018 v6, 0xC5110006, from cog-csi-nexmon-adapter) via a fanout target, transcoded v6→ESP32 0xC5110001, and fed calibrate-serve:

  • 6,813 real frames ingested (streaming:true), baseline capture completed: 20 frames, 52 subcarriers (HT20), amp_mean 2.65 / amp_var 1.70 / phase_dispersion 0.565 → 860-byte ADR-135 baseline.
  • The motion quality-gate fired on real RF (motion_flagged:true, z_max 3.17) — correctly flagging a non-static room, exactly as designed.
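Since the recorder is described in the review as streaming (Welford), a compact reminder of that update rule may help when reading amp_mean / amp_var figures like the ones above. This is the textbook algorithm, not the CalibrationRecorder itself; the struct name, the use of sample (n-1) variance, and the `z_score` helper are assumptions, and the actual motion gate's z definition may differ.

```rust
/// Welford's online mean/variance: O(1) memory, one pass, no frame buffering.
#[derive(Default)]
struct Welford {
    n: u64,
    mean: f64,
    m2: f64, // running sum of squared deviations from the current mean
}

impl Welford {
    fn push(&mut self, x: f64) {
        self.n += 1;
        let delta = x - self.mean;
        self.mean += delta / self.n as f64;
        // Uses the updated mean; this is what keeps the update numerically stable.
        self.m2 += delta * (x - self.mean);
    }

    /// Sample variance (n - 1 denominator); None until two samples exist.
    fn variance(&self) -> Option<f64> {
        if self.n < 2 {
            None
        } else {
            Some(self.m2 / (self.n - 1) as f64)
        }
    }

    /// Deviation of `x` from the running mean in standard deviations.
    /// ASSUMED helper: shown only to illustrate how a z-style gate could
    /// be derived from the same statistics.
    fn z_score(&self, x: f64) -> Option<f64> {
        let var = self.variance()?;
        if var <= 1e-12 {
            return None;
        }
        Some((x - self.mean) / var.sqrt())
    }
}
```

One accumulator per subcarrier would be enough to produce a per-subcarrier baseline without storing any frames.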

Note for the appliance side: the V0's real CSI is nexmon ADR-018 v6 (0xC5110006), not ESP32 0xC5110001 — they're a 20-byte-header reframe apart (<IBBHBbb5sI vs <IBBBBHIbbI, same i8/i8 IQ). A ~30-line transcoder bridges them; worth folding a --source-format adr018v6 flag into calibrate-serve (or the appliance worker) so the Pi's own nexmon stream drives calibration directly without ESP32 nodes.

Net: code solid, fixes verified on-target, and the calibration pipeline runs on real Pi-5 cluster CSI. 👍

ruvnet commented Jun 9, 2026

Owner Author

Scope clarification on the real-CSI run (so the record is precise)

To be accurate about what my Pi-5 run did and didn't cover:

What was validated end-to-end on real cluster CSI: the baseline-capture stage only (Stage 1 of the baseline → enroll → extract → train pipeline). The live nexmon stream (ADR-018 v6) was transcoded to 0xC5110001, calibrate-serve ingested 6,813 real frames, and finalized an 860-byte ADR-135 baseline (52 SC HT20). That proves the capture path + the HTTP API + the auth gate work on-target.

What was NOT done — i.e. this is not a full room calibration:

  • The captured baseline came back motion_flagged: true (z_max 3.17) — a non-clean room, so it's a pipeline-validation artifact, not a usable reference baseline. A real calibration needs a quiet empty-room capture where the motion gate passes.
  • No enroll (anchors) and no train-room were run — so no specialist bank was produced or runtime-validated against live CSI. The specialists are validated only by the (excellent) unit tests + the qemu-aarch64 run, not by a live enroll→train→infer loop.

Build optimization (the --no-default-features decoupling of ort/openssl) is confirmed working on aarch64 — credit to your 0f8c5c91; I verified it, didn't author it.

Net: capture-stage + API + auth are proven on real Pi-5 CSI; the enroll/train stages and a clean-baseline calibration remain unproven on-target. The cleanest way to close that on the appliance is the --source-format adr018v6 flag I suggested (drives calibration from the Pi's own nexmon, no ESP32/transcoder) — happy to prototype it against this branch if useful.

ruvnet added 2 commits June 9, 2026 13:30
Plain-language overviews + feature lists, comparison tables (symbolic graph vs
predictive occupancy; graph vs grid vs event-log), usage, and technical
details. Adds readme = "README.md" to both manifests so they render on
crates.io on the next release.

Co-Authored-By: claude-flow <ruv@ruv.net>
Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet commented Jun 9, 2026

Owner Author

Agreed on every point — that's the precise scope and I don't want the record overstating it. Let me harmonize the two validation environments so the matrix is unambiguous:

Stage Your Pi-5 (real nexmon→0xC5110001, 6,813 frames) My ESP32-S3 (COM8, edge_tier=0) qemu-aarch64 / unit
baseline capture ✅ + API + auth gate ✅ (smaller, 120-frame)
clean empty-room baseline motion_flagged (z_max 3.17) — artifact, not a reference ❌ same (occupied room)
enroll (anchors) ❌ (needs operator poses) logic ✅
train-room (bank) logic ✅ (35 tests)
runtime infer ❌ on-target ◐ single-node breathing ~16–31 BPM via room-watch (not a trained bank — stateless breathing head) + multistatic node-id grouping veto/STALE/fusion ✅

So: nothing has done a clean enroll → train-room → room-watch against live CSI on either box. The specialists are unit + qemu validated only; the one live runtime signal (breathing on my box) used the stateless head, not a trained bank. I'll make sure the PR body/ADR §7 say exactly that rather than implying a full calibration ran.

On --source-format adr018v6: yes, that's the right call, and it's strictly better than my ESP32+transcoder path for the appliance: drive calibrate/calibrate-serve/enroll straight from the Pi's own nexmon stream, no transcoder hop. It slots in cleanly next to the existing parse_csi_packet as an alternate decoder selected by a --source-format {esp32c511|adr018v6} flag. If you share the ADR-018 v6 frame layout (or the transcoder you used for the 6,813-frame run), I'll wire the decoder + the flag against this branch. Then the close-the-gap test is concrete: quiet empty-room calibrate (motion gate must pass) → enroll the 8 anchors → train-room → room-watch, all on the Pi off nexmon. Happy to take the decoder if you take the on-Pi enroll run, or I can do both if you drop me the format.

Net: capture + API + auth proven on real Pi-5 CSI ✅; clean-baseline + enroll/train/infer are the remaining on-target gap, and --source-format adr018v6 is the clean way to close it.

ruvnet added 4 commits June 9, 2026 13:35
…clean enroll→train→infer not yet on-target)

Aligns ADR-151 §7 + the appliance integration doc with the PR #989 scope
clarification: nothing has run a clean baseline → enroll → train → infer on
live CSI; the live breathing read used the stateless head, not a trained bank.
Adds --source-format adr018v6 to the backlog.

Co-Authored-By: claude-flow <ruv@ruv.net>
…I window)

Adds a live RoomState readout over HTTP — the appliance UI's main need. The
ingest task maintains a rolling per-frame scalar window (flushed on the 200 ms
tick, no per-frame lock); the handler loads a bank (resolved as a sanitized
name under output_dir — same path-traversal defense as room_id), runs the
MixtureOfSpecialists over the window, returns RoomState JSON.

Validated live (ESP32-S3 via relay): breathing 14-19 BPM over HTTP; a
bank=../../etc/passwd query is neutralized to 'etcpasswd' (no traversal).

Co-Authored-By: claude-flow <ruv@ruv.net>
…ke_case

- POST /api/v1/room/train: { room_id, baseline_id, anchors[] } → trains a
  SpecialistBank and persists it as <output_dir>/<room_id>.json (path-sanitized),
  readable via /room/state?bank=<room_id>. Completes the HTTP train→infer loop.
- Fix data-contract bug: AnchorLabel serialized as PascalCase variant names
  (serde default) while as_str() + the integration doc used snake_case. Added
  #[serde(rename_all = "snake_case")] so the JSON wire format matches the
  documented contract (empty/stand_still/…). Locked with a roundtrip test.

Validated live (ESP32-S3): POST train (4 anchors → 6 specialists, persisted) →
GET /room/state returns RoomState with the trained presence/restlessness; the
synthetic-vs-real scale mismatch correctly triggers the anomaly veto. 36
calibration tests pass.

Co-Authored-By: claude-flow <ruv@ruv.net>
…enroll/status)

Closes the last HTTP gap — the appliance can now drive the ENTIRE calibration
pipeline over HTTP without the CLI:
  baseline (start/stop) -> enroll/anchor x8 -> room/train -> room/state

- POST /enroll/anchor { room_id, baseline, label, duration_s? }: the ingest task
  loads the baseline (sanitized name under output_dir), captures the anchor for
  the duration against it (AnchorRecorder + per-frame series), runs the quality
  gate, and on completion replies with the verdict + accumulates the AnchorFeature
  in an in-server enrollment map keyed by room_id. Re-prompts on rejection.
- GET /enroll/status?room=<id>: accepted anchors, next, complete.
- POST /room/train now falls back to the in-server enrollment when anchors[] is
  omitted.

Validated live (ESP32-S3): capture baseline -> enroll stand_still (271 frames,
6s) -> gate correctly rejects "no person detected (presence_z 0.90 < 1.50)"
relative to a same-occupancy baseline (a clean empty-room baseline is the
documented on-target prerequisite). Builds clean; CLI tests pass.

Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet commented Jun 9, 2026

Owner Author

Update — full pipeline now drivable over HTTP (no CLI needed)

Closed the enroll/train HTTP gap from my earlier reply. calibrate-serve now exposes the entire baseline → enroll → extract → train → runtime loop:

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/calibration/health | ingest stats |
| POST/GET/POST | …/calibration/{start,status,stop} | baseline capture |
| GET | …/calibration/{result,baselines} | finalized baseline + list |
| POST | /api/v1/enroll/anchor | {room_id,baseline,label,duration_s?} → capture one guided anchor against a baseline; gate verdict + progress |
| GET | /api/v1/enroll/status?room=<id> | accepted anchors, next, complete |
| POST | /api/v1/room/train | {room_id,baseline_id,anchors[]?} → train + persist bank (anchors optional if enrolled in-server) |
| GET | /api/v1/room/state?bank=<name> | live mixture-of-specialists RoomState over the CSI window |

All hardened the same way: bearer auth (--token), path-traversal sanitization on every client-supplied name (room_id, bank, baseline), 200 ms-throttled status, async writes. Validated live on the ESP32:

  • POST /room/train (4 anchors → 6 specialists, persisted) → GET /room/state returns RoomState.
  • POST /enroll/anchor (271 frames, 6 s) → gate correctly rejects presence_z 0.90 < 1.50 against a same-occupancy baseline.
  • Data-contract fix: AnchorLabel JSON is now snake_case (test-locked) — matches §3.3.

Remaining gaps are external-gated, not code:

  1. A clean empty-room baseline → enroll→train→infer run on-target (needs the room emptied + the operator performing the 8 poses).
  2. --source-format adr018v6 to drive from the Pi's own nexmon (needs the frame layout from your 6,813-frame run).
  3. ADR-150 Hailo backbone; phase-based breathing carrier; RVF/HNSW persistence.

The integration doc §4/§6 is updated with the full endpoint set + the "drive everything over HTTP" flow.

ruvnet added 2 commits June 9, 2026 14:34
…points

Factor the router into build_router() (shared by execute + tests) and add
tower-oneshot integration tests (no network/ingest needed):
- health + descriptor → 200
- POST /room/train persists the bank; GET /room/state → 200; train with no
  anchors/enrollment → 400
- path-traversal: /room/state?bank=../../etc/passwd → 404 (sanitized, never
  reads outside output_dir)
- enroll/status empty; /enroll/anchor with an unknown label → 400

CI regression coverage for the endpoints added this session. 18 CLI tests pass.

Co-Authored-By: claude-flow <ruv@ruv.net>
…--no-default-features`

Making wifi-densepose-mat optional in the CLI (for the aarch64/ort decouple)
exposed a latent feature bug: mat's `api` module compiles unconditionally and
uses serde, but `serde` was an optional dep enabled only via the `api`/`serde`
features. Previously the CLI's *unconditional* mat dependency enabled those
features transitively, so `--workspace --no-default-features` still got serde;
once mat became optional+gated, the workspace build lost it →
`error[E0432]: unresolved import serde` across mat's api/* (CI red).

mat already pulls serde_json + axum unconditionally, so making `serde`
non-optional has no real cost and restores the workspace build. Does NOT affect
the aarch64 CLI build (mat isn't built there at all): verified
`cargo tree -p wifi-densepose-cli --no-default-features` still shows 0
ort/openssl deps, and `cargo test --workspace --no-default-features` compiles
clean.

Co-Authored-By: claude-flow <ruv@ruv.net>
…erge)

Co-Authored-By: claude-flow <ruv@ruv.net>