fix(align): stream JSONL + support sensing_update format (unblocks ADR-079 P8) by ruvnet · Pull Request #641 · ruvnet/RuView

ruvnet · 2026-05-19T18:50:43Z

Summary

Two real blockers found while running ADR-079 P7→P8 end-to-end for the first time against a 30-min paired session:

1. Node V8 string limit (~512 MB) on the 750 MB CSI recording. `fs.readFileSync(_, 'utf8').split('\n')` errored with `Cannot create a string longer than 0x1fffffe8 characters`. Replaced `loadJsonl` with a 1 MiB byte-buffer streaming reader that decodes line-by-line.
2. Schema mismatch with the current sensing-server. The aligner filtered on legacy `raw_csi` / `feature` types; the live server emits a single `sensing_update` record per tick (with `nodes[].amplitude` and top-level `features`). Result: 0 frames matched every time. Added a `sensing_update` branch that projects each tick into `rawCsi`/`features` entries the existing windowing logic can consume, and updated `extractCsiMatrix` to use already-extracted amplitudes when `iqHex` is absent. `timestamp` is now accepted as either ISO string or numeric float-seconds.
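A minimal sketch of the kind of streaming reader described in the first item, not the code merged in this PR. The name `streamJsonl` and its `chunkSize` parameter are hypothetical. The sketch assumes newline-delimited UTF-8 JSON. It reads fixed-size byte chunks, splits on the newline byte, and decodes only complete lines, so no string larger than one record is ever created.

```javascript
const fs = require('node:fs');

// Hypothetical helper (name and signature are illustrative): yields one parsed
// record per JSONL line without materializing the whole file as one string.
function* streamJsonl(path, chunkSize = 1024 * 1024) {
  const fd = fs.openSync(path, 'r');
  const chunk = Buffer.allocUnsafe(chunkSize); // reused for every read
  let pending = Buffer.alloc(0); // bytes of a line that spans chunk boundaries
  try {
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, chunk, 0, chunkSize, null)) > 0) {
      // Buffer.concat copies, so `data` stays valid after `chunk` is overwritten.
      const data = Buffer.concat([pending, chunk.subarray(0, bytesRead)]);
      let start = 0;
      let nl;
      // Splitting on byte 0x0a is safe: UTF-8 multi-byte sequences never contain it.
      while ((nl = data.indexOf(0x0a, start)) !== -1) {
        const line = data.toString('utf8', start, nl).trim();
        if (line) yield JSON.parse(line); // blank lines are skipped
        start = nl + 1;
      }
      pending = data.subarray(start);
    }
    // Handle a final line that has no trailing newline.
    const tail = pending.toString('utf8').trim();
    if (tail) yield JSON.parse(tail);
  } finally {
    fs.closeSync(fd);
  }
}
```

Consumers can iterate with `for (const record of streamJsonl(file)) { ... }`. Peak memory is then roughly one chunk plus the longest single line. A malformed line makes `JSON.parse` throw; a production reader may prefer to skip and count such lines instead.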

End-to-end verified: 1,077 paired samples produced at `--min-confidence 0.3 --window-frames 20`; downstream `train-wiflow-supervised.js` runs to completion.

The PCK gap that came out of this run (0% on every joint, more data + GPU needed) is tracked separately in #640. It is a training concern, not an aligner concern.
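For the second blocker, the projection and timestamp handling could look roughly like the sketch below. This is an illustration under assumptions, not the merged code. The function names `toEpochMs` and `projectSensingUpdate`, the `type` discriminator field, and the output fields `tMs` and `nodeIndex` are hypothetical. Only `sensing_update`, `nodes[].amplitude`, top-level `features`, and the two `timestamp` forms come from the description above.

```javascript
// Normalize a timestamp to epoch milliseconds. Accepts an ISO 8601 string
// (legacy records) or numeric float-seconds (current sensing-server).
function toEpochMs(timestamp) {
  if (typeof timestamp === 'number') return timestamp * 1000;
  const ms = Date.parse(timestamp);
  if (Number.isNaN(ms)) throw new TypeError(`Unparseable timestamp: ${timestamp}`);
  return ms;
}

// Hypothetical projection: one sensing_update tick becomes one rawCsi entry
// per node that carries an amplitude array, plus at most one features entry.
// The `type` field name and the output field names are assumptions.
function projectSensingUpdate(record) {
  if (!record || record.type !== 'sensing_update') return null;
  const tMs = toEpochMs(record.timestamp);
  const rawCsi = [];
  (record.nodes ?? []).forEach((node, nodeIndex) => {
    if (Array.isArray(node.amplitude)) {
      rawCsi.push({ tMs, nodeIndex, amplitude: node.amplitude });
    }
  });
  const features = record.features ? [{ tMs, features: record.features }] : [];
  return { rawCsi, features };
}
```

Converting both timestamp forms to a single numeric unit up front lets the downstream windowing code compare legacy and current recordings without caring which format produced them.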

Test plan

- Aligner produces 1,077 paired samples (`[56, 20]` shape) from the 30-min P7 session
- Memory stays bounded, with no V8 string limit error
- Training script consumes the paired output successfully end-to-end
- Reviewer: spot-check that no schema fields were dropped

🤖 Generated with claude-flow

Two blockers discovered while running ADR-079 P7→P8 end-to-end against a 30-minute paired session (39,088 GT frames + 45,625 CSI frames):

1. `readFileSync(_, 'utf8').split('\n')` hit Node's `String.MaxLength` (~512 MB) on the 750 MB CSI recording. Result: `Error: Cannot create a string longer than 0x1fffffe8 characters`. Replaced `loadJsonl` with a 1 MiB byte-buffer streaming reader that decodes line-by-line, so memory use stays bounded by the largest single record.

2. The sensing-server has long since switched from the legacy `raw_csi` / `feature` typed records to a single `sensing_update` record per tick (with `nodes[].amplitude` and top-level `features`). The aligner filtered on the old types and produced 0 frames every time. Added a `sensing_update` branch that projects each tick into `rawCsi`/`features` entries the existing windowing code can consume, and updated `extractCsiMatrix` to use already-extracted amplitudes when `iqHex` is absent. `timestamp` is now accepted as either ISO string (legacy) or numeric float-seconds (current).

End-to-end verified: produces 1,077 paired samples at `--min-confidence 0.3 --window-frames 20` from the full 30-min recording; downstream `train-wiflow-supervised.js` runs to completion.

See follow-up #640 for the PCK gap (data + GPU needed); those are training concerns, not aligner concerns.

ruvnet merged commit ef20a72 into main May 19, 2026
13 checks passed

ruvnet deleted the fix/align-ground-truth-streaming-and-sensing-update branch May 19, 2026 18:51

ruvnet mentioned this pull request May 19, 2026

ADR-079 P8 follow-up: PCK@20 3.0% → ≥35% requires more paired data + multi-room framing #645 (Open, 5 tasks)
