feat(cog-person-count): train count_v1.safetensors — honest v0.0.1 (ADR-103) by ruvnet · Pull Request #695 · ruvnet/RuView

ruvnet · 2026-05-21T22:56:49Z

Phase 2 of ADR-103. Trained count head on the existing 1,077 paired samples (same data that produced pose_v1 yesterday).

Honest result

Metric	Value
Best eval accuracy	65.1%
Within ±1	100% (labels span {0,1}, trivially satisfied)
MAE	0.349
Class 0 (empty) accuracy	100% (140 samples)
Class 1 (person present) accuracy	0% (75 samples)
Confidence↔correctness Spearman	0.023

The model overfit by about epoch 100. The 'best' checkpoint scores 100% on empty windows and 0% on person-present windows, so its 65.1% accuracy equals the empty-class share of the eval window (140/215) rather than reflecting a real classifier. This is the same data-bound failure mode as pose_v1 (#645). v0.0.1 ships the pipeline, a working artifact, and honest numbers; usable counts wait on multi-room paired data.
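The table's headline numbers can be reproduced from the class counts alone. The sketch below assumes a predictor that outputs class 0 for every eval window; that assumption is consistent with the per-class accuracies reported here, but the checkpoint's actual predictions are not part of this PR description.

```python
# Eval-window label counts taken from the results table.
n_empty, n_person = 140, 75
labels = [0] * n_empty + [1] * n_person

# Assumption: the checkpoint behaves like a constant class-0 predictor.
preds = [0] * len(labels)

n = len(labels)
accuracy = sum(p == y for p, y in zip(preds, labels)) / n
mae = sum(abs(p - y) for p, y in zip(preds, labels)) / n
within_one = sum(abs(p - y) <= 1 for p, y in zip(preds, labels)) / n


def class_accuracy(cls):
    """Accuracy restricted to samples whose true label is `cls`."""
    idx = [i for i, y in enumerate(labels) if y == cls]
    return sum(preds[i] == cls for i in idx) / len(idx)


print(f"accuracy   = {accuracy:.3f}")    # 140/215
print(f"MAE        = {mae:.3f}")         # 75/215
print(f"within +-1 = {within_one:.3f}")
print(f"class 0    = {class_accuracy(0):.3f}")
print(f"class 1    = {class_accuracy(1):.3f}")
```

Because 140/215 rounds to 65.1% and 75/215 rounds to 0.349, both the accuracy and the MAE rows match a majority-class baseline, which is why the result should not be read as learned counting.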

What v0.0.1 still validates

PyTorch → safetensors → Candle Rust loads cleanly. cog-person-count health reports backend: candle-cpu (not stub); architecture parity between train-count.py and src/inference.rs::CountNet is bit-exact.
ONNX export bit-clean (16 KB, opset 18, dynamic batch).
Training wall time: 5.6 s for 400 epochs on RTX 5080.
All 15 tests still pass.
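One cheap way to sanity-check the safetensors hand-off is to read the file header and compare tensor names, dtypes, and shapes against what the Rust side expects. The sketch below relies only on the published safetensors layout (8-byte little-endian header length, JSON header, raw tensor bytes). The tensor names `head.weight` and `head.bias` are hypothetical; the real names in count_v1.safetensors are not given in this PR. It checks layout only, not numerical parity.

```python
import json
import struct


def read_header(blob: bytes) -> dict:
    """Parse the JSON header of a safetensors byte string."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + header_len].decode("utf-8"))


def tensor_layout(header: dict) -> dict:
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (meta["dtype"], tuple(meta["shape"]))
        for name, meta in header.items()
        if name != "__metadata__"
    }


def build_demo_blob() -> bytes:
    """Build a tiny in-memory safetensors blob with hypothetical tensor names."""
    specs = {"head.weight": [2, 4], "head.bias": [2]}
    header, payload, offset = {}, b"", 0
    for name, shape in specs.items():
        count = 1
        for dim in shape:
            count *= dim
        raw = struct.pack(f"<{count}f", *([0.0] * count))
        header[name] = {
            "dtype": "F32",
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        payload += raw
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


# Hypothetical expectation, standing in for the layout the Rust loader wants.
expected = {"head.weight": ("F32", (2, 4)), "head.bias": ("F32", (2,))}
actual = tensor_layout(read_header(build_demo_blob()))
print("layout matches:", actual == expected)
```

To inspect a real artifact, pass the bytes of the file (for example `open(path, "rb").read()`) to `read_header` instead of the demo blob.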

Files

scripts/align-ground-truth.js — extended to emit n_persons_mode + n_persons_max per window. Backwards-compatible additive fields.
scripts/train-count.py — new. Mirrors CountNet exactly; CE+BCE+Brier loss; safetensors+ONNX export.
v2/.../cog/artifacts/{count_v1.safetensors, count_v1.onnx, count_train_results.json} — the artifacts.
v2/.../cog/README.md — Status updated with v0.0.1 numbers + honest-caveat section.
docs/benchmarks/person-count-cog.md — new benchmark log mirroring the pose-cog format.
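The two label fields added to each window can be illustrated as below. This is a Python sketch of the idea only: the real script is JavaScript, and the input shape (a list of per-frame person counts), the tie-breaking rule, and the `t0` field are assumptions, not taken from align-ground-truth.js.

```python
from collections import Counter


def window_count_labels(per_frame_counts):
    """Return the mode and max of the per-frame person counts in one window."""
    if not per_frame_counts:
        return None
    tally = Counter(per_frame_counts)
    top = max(tally.values())
    # Assumption: ties are broken toward the smaller count.
    mode = min(value for value, freq in tally.items() if freq == top)
    return {"n_persons_mode": mode, "n_persons_max": max(per_frame_counts)}


def add_count_labels(window, per_frame_counts):
    """Additive merge: existing window fields are kept, new ones are appended."""
    labels = window_count_labels(per_frame_counts)
    if labels is None:
        return dict(window)
    return {**window, **labels}


window = {"t0": 0.0}  # hypothetical existing field
print(add_count_labels(window, [0, 1, 1, 0, 1]))
```

Keeping the merge additive means consumers that only read the older fields, such as the pose training path, are unaffected by the extra keys.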

🤖 Generated with claude-flow

…DR-103)

Phase 2 of ADR-103: trained count head on the existing 1,077 paired samples (the same data that produced pose_v1 yesterday).

Honest result: 65.1% eval accuracy / 100% within ±1 / MAE 0.349 on the held-out time-window. Per-class: 100% on "empty room" / 0% on "1 person". The model overfit by epoch 100 (train_acc → 1.0, eval_loss climbed 0.67 → 7.8) and the "best" checkpoint is the snapshot that happened to predict the eval window's class distribution (140/215 = 65.1%, matches eval_acc exactly). Confidence head Spearman = 0.023 ⇒ uncalibrated.

Same data-bound failure mode as pose_v1 (#645), bounded by single-session training data; same fix path (multi-room).

What v0.0.1 still validates end-to-end:

* PyTorch → safetensors → Candle Rust loads cleanly on first try. `cog-person-count health` reports `backend: candle-cpu` and emits real per-frame predictions instead of the stub backend's hard-coded {1 person, 0 confidence}. Architecture parity between train-count.py and src/inference.rs::CountNet is bit-exact.
* ONNX export bit-clean (16 KB, opset 18, dynamic batch axis).
* Training wall time: 5.6 s for 400 epochs on RTX 5080.
* Binary size unchanged (2.36 MB stripped), model loads via mmap at runtime.

This commit ships:

* scripts/align-ground-truth.js: extended to emit n_persons_mode + n_persons_max per window so the training pipeline has count labels. Backwards-compatible (additive fields).
* scripts/train-count.py: new. Mirrors CountNet architecture exactly, loads paired.jsonl, trains 400 epochs with CE+BCE+Brier loss, exports safetensors + ONNX + per-epoch JSON.
* v2/.../cog/artifacts/{count_v1.safetensors,count_v1.onnx, count_train_results.json}: the trained artifacts.
* v2/.../cog/README.md: Status table updated with the v0.0.1 numbers + an Honest Caveat section explaining the data-bound result.
* docs/benchmarks/person-count-cog.md: new. Full v0.0.1 benchmark log mirroring the format docs/benchmarks/pose-estimation-cog.md established. Includes comparison to ADR-103 v0.1.0 acceptance gates and per-class breakdown.

Still pending:

* `run` subcommand wiring (long-running polling loop, same as pose)
* Cross-compile + sign + GCS upload (mirror of pose cog pipeline)
* Live install on cognitum-v0
* v0.2.0: re-train on multi-room data, LoRA per-room adapters, Stoer-Wagner min-cut clip in fusion stage

…al (#697)

Phase 4 of ADR-103. Adds the long-running polling loop so the cog's fourth verb (`run`) does real work, completing the ADR-100 runtime contract end-to-end:

cog-person-count version → "person-count 0.3.0"
cog-person-count manifest → JSON skeleton
cog-person-count health → loads weights + 1-shot infer + emit
cog-person-count run --config → long-running per-frame emit ← THIS

What ships:

* src/runtime.rs (new): `run_loop` polls sensing_url every poll_ms, slides a [56, 20] CSI window, runs InferenceEngine::infer, emits publisher::person_count events. Same shape as cog-pose-estimation::runtime: fetch_frame extracts amplitudes from `snapshot.nodes[0].amplitude[]`, fails open on connect errors with a WARN log rather than crashing.
* src/lib.rs: registers the runtime module.
* src/main.rs: cmd_run now loads RunConfig from a JSON file, builds the InferenceEngine (with weights if cfg.model_path is set, otherwise auto-discover), emits a run.started event, and hands off to the Tokio multi-thread runtime's block_on(run_loop).

Single-node fusion is a no-op for N=1 today; v0.2.0 will append predictions from sibling nodes and call fusion::fuse_confidence_weighted before emit.

Verified locally:

cargo check -p cog-person-count --no-default-features → clean
cargo test -p cog-person-count → 15/15 pass (no regressions)
cargo build -p cog-person-count --release → 2.36 MB unchanged
./cog-person-count run --config bad-config.json:
line 1: {"event":"run.started","fields":{"cog":"person-count", "sensing_url":"http://127.0.0.1:9999/...",poll_ms:100, "model_path":"(auto-discover)"}}
line 2: WARN sensing-server fetch failed error=Connection Failed: Connect error: actively refused
(loop alive; exits cleanly on SIGTERM, no crash, no NaN)

Also adds a "Relationship to the in-process score_to_person_count heuristic" section to cog/README.md explaining the dual-emitter design (sensing-server keeps emitting the PR #491 slot heuristic; the cog runs out-of-process and emits person.count events from the learned model). Operators choose by installing the cog or not; no sensing-server rebuild is required.

ADR-103 §"Migration" status:

1. Land ADR + scaffold ........... done (#693, #694)
2. Train count_v1 ................ done (#695)
3. Cross-compile + sign + GCS .... done (#696)
4. Server-side wiring ............ done (out-of-process design means no rewire needed; this cog is the wiring)
5. v0.2.0 multi-room + LoRA ...... data-bound (#645)
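The polling behaviour described for `run_loop` (slide a fixed-size CSI window, infer once it is full, fail open on connection errors) can be sketched as follows. This is a Python illustration, not the Rust code in src/runtime.rs: the helper names, the choice of 56 as the subcarrier axis and 20 as the frame axis, the zero-padding of short frames, and the `max_polls` bound are all assumptions made so the example terminates.

```python
import time
from collections import deque

N_SUBCARRIERS, N_FRAMES = 56, 20  # [56, 20] per the commit message; axis roles assumed


class CsiWindow:
    """Fixed-length sliding window over the most recent amplitude frames."""

    def __init__(self, n_frames=N_FRAMES, n_subcarriers=N_SUBCARRIERS):
        self.frames = deque(maxlen=n_frames)
        self.n_subcarriers = n_subcarriers

    def push(self, amplitudes):
        # Assumption: truncate long frames and zero-pad short ones.
        row = list(amplitudes)[: self.n_subcarriers]
        row += [0.0] * (self.n_subcarriers - len(row))
        self.frames.append(row)

    def ready(self):
        return len(self.frames) == self.frames.maxlen

    def as_matrix(self):
        # Transpose stored frames into [subcarriers][frames].
        return [list(col) for col in zip(*self.frames)]


def run_loop(fetch_frame, infer, emit, max_polls, poll_ms=0):
    """Poll, slide the window, infer when full; log and continue on fetch errors."""
    window = CsiWindow()
    for _ in range(max_polls):
        try:
            amplitudes = fetch_frame()
        except OSError as err:  # covers connection-refused errors
            emit({"level": "WARN", "msg": "sensing-server fetch failed", "error": str(err)})
        else:
            window.push(amplitudes)
            if window.ready():
                emit({"event": "person.count", "fields": infer(window.as_matrix())})
        time.sleep(poll_ms / 1000)


# Demo with a fake sensing server that refuses one connection.
state = {"calls": 0}


def fake_fetch():
    state["calls"] += 1
    if state["calls"] == 3:
        raise ConnectionRefusedError("actively refused")
    return [1.0] * N_SUBCARRIERS


events = []
run_loop(fake_fetch, lambda m: {"rows": len(m), "cols": len(m[0])}, events.append, max_polls=25)
print(len(events), "events emitted")
```

Catching the fetch error inside the loop body is what keeps the process alive when the sensing server is down, matching the WARN-and-continue behaviour shown in the local verification output.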

ruvnet merged commit 6b4994e into main May 21, 2026
12 checks passed

ruvnet deleted the feat/cog-person-count-train branch May 21, 2026 22:56

ruvnet merged 1 commit into main from feat/cog-person-count-train
