ADR-079 P8 follow-up: PCK@20 3.0% → ≥35% requires more paired data + multi-room framing

## Status (after today's cog v0.0.1 ship — see PRs #642, #643, #644)

ADR-079 P7 (data collection), P8 (alignment + train + eval) and **cog packaging end-to-end** all ran today. The pipeline is validated and a signed `cog-pose-estimation@0.0.1` binary is live at `gs://cognitum-apps/cogs/{arm,x86_64}/`, installed on cognitum-v0. The remaining work for a *useful* model is data-bound.

| Lever | Current (v0.0.1) | Target |
|-------|------------------|--------|
| Paired sample count | 1,077 | 30,000+ (multiple 30-min sessions × full-body framing) |
| Camera framing | torso-up at desk (avg n_visible 14.3/17) | full-body, varied movements, multiple rooms |
| Avg detection confidence | 0.476 | ≥ 0.7 |
| Training epochs | 400 (Candle CUDA, 2.1 s on RTX 5080) | 1000+ if needed (still seconds on the GPU) |
| **PCK@20** | **3.0%** | **≥ 35%** |
| **PCK@50** | **18.5%** | ≥ 60% |
| MPJPE (normalized) | 0.093 | < 0.05 |

## What the v0.0.1 numbers tell us

Per-joint PCK@50 ranks show the model is learning where the camera lets it:

```
r_hip       76.9%   ← excellent (right side most consistently in frame)
r_knee      35.2%
l_hip       27.3%
l_elbow     26.4%
l_wrist     24.1%
l_knee      20.8%
r_shoulder  19.9%
...
r_ankle       9.3%
l_ankle       7.9%
nose          5.1%   ← essentially random (face joints at desk-level zoom)
```

The asymmetry reflects the seated-at-desk camera framing rather than a model defect. CSI at 56 subcarriers × 20 frames carries enough spatial information for proximal joints with consistent visibility, but not enough for fine-grained extremities. More data won't remove that subcarrier-density bottleneck for fingertips and face joints, but multi-room full-body data should close the gap for the 11 joints that already show some signal today.
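
For reference, the per-joint numbers above can be reproduced with a metric along the following lines. This is a minimal sketch, not the pipeline's evaluation code: the function names are hypothetical, coordinates are assumed to be normalized 2D points, and because the issue does not state how the PCK@20 / PCK@50 cutoff is normalized, `threshold` is left as a plain parameter. Joints the camera did not label are masked out, which is why poorly framed joints end up with few samples.

```python
import math

def per_joint_pck(pred, truth, visible, threshold):
    """Fraction of visible samples per joint whose error is <= threshold.

    pred, truth: list of samples, each a list of (x, y) tuples, one per joint.
    visible: list of samples, each a list of bools, one per joint.
    threshold: distance cutoff in the same units as the coordinates (assumption).
    Returns one value per joint, or None for a joint that was never visible.
    """
    n_joints = len(truth[0])
    hits = [0] * n_joints
    counts = [0] * n_joints
    for p, t, v in zip(pred, truth, visible):
        for j in range(n_joints):
            if not v[j]:
                continue  # no camera label for this joint in this sample
            counts[j] += 1
            if math.dist(p[j], t[j]) <= threshold:
                hits[j] += 1
    return [h / c if c else None for h, c in zip(hits, counts)]

def mpjpe(pred, truth, visible):
    """Mean Euclidean error over all visible joints, or None if there are none."""
    errors = [
        math.dist(p[j], t[j])
        for p, t, v in zip(pred, truth, visible)
        for j in range(len(t))
        if v[j]
    ]
    return sum(errors) / len(errors) if errors else None
```

Reporting the visible-sample count per joint next to each PCK value would make it easier to tell a framing problem (few labeled samples) from a signal problem (many samples, low accuracy).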

## Suggested data-collection plan

1. **3 × 30-min sessions** with the camera backed up so head→ankles fits in frame. Use different rooms (or different times of day in the same room) to give the model spatial diversity. Vary movements: walking patterns, arm raises, sit/stand transitions, squats, reaches, lying down.
2. Re-run `scripts/align-ground-truth.js` (now streaming-loader-safe per #641) to produce a multi-session paired set.
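3. Train via the existing Candle pipeline on ruvultra's RTX 5080. Expected wall time: a few minutes at most. Scaling the v0.0.1 run linearly (1,077 samples × 400 epochs in 2.1 s) gives about 146 s for 30K samples × 1000 epochs; treat that as a rough estimate, since fixed overhead and batch efficiency are not accounted for.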
4. Re-evaluate. PCK@20 should approach the 35% target if the framing + variety land.
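
Steps 2 to 4 imply two bookkeeping operations on the paired set: dropping low-confidence pairs and holding out an entire session for evaluation. The sketch below shows one way to do that. It is an illustration only: the record layout (dicts with `session` and `conf` keys) and the function name are assumptions, since the output format of `scripts/align-ground-truth.js` is not described here.

```python
from collections import defaultdict

def filter_and_split(samples, min_conf=0.7, held_out_session=None):
    """Drop low-confidence pairs, then hold out one whole session for eval.

    samples: list of dicts with hypothetical keys 'session' and 'conf'.
    Returns (train, test) lists.
    """
    kept = [s for s in samples if s["conf"] >= min_conf]
    by_session = defaultdict(list)
    for s in kept:
        by_session[s["session"]].append(s)
    if not by_session:
        return [], []
    if held_out_session is None:
        # Default choice (arbitrary): hold out the last session by sorted name.
        held_out_session = sorted(by_session)[-1]
    train = [
        s
        for name, group in by_session.items()
        if name != held_out_session
        for s in group
    ]
    test = list(by_session.get(held_out_session, []))
    return train, test
```

Splitting by session rather than by random sample keeps temporally adjacent, near-duplicate frames from landing on both sides of the split, which would inflate PCK.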

## Optimizations available within the pipeline (do not require new data)

- LoRA cross-environment fine-tune (per ADR-079 P9). Today's encoder was randomly initialized because the HF presence encoder's MLP architecture didn't match. This step only becomes useful once multi-room data exists: with it we can train a real shared encoder first and then per-room LoRA adapters.
- Subcarrier attention weighting is already enabled (top-5: [33, 47, 50, 19, 16]).
- Stoer-Wagner min-cut multi-person separation is already enabled.
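
To make the per-room LoRA idea concrete, here is a minimal sketch of the standard low-rank update on a single linear layer, written in plain Python for readability. It is not the Candle implementation, and the shapes and names are illustrative assumptions. The shared weight `w` stays frozen, while each room would own a small pair of matrices `a` and `b`.

```python
def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in m]

def lora_forward(w, a, b, x, alpha=1.0):
    """Compute y = W x + (alpha / r) * B (A x).

    w: d_out x d_in frozen base weight (shared across rooms).
    a: r x d_in and b: d_out x r, the trainable low-rank pair for one room.
    """
    r = len(a)
    base = matvec(w, x)
    delta = matvec(b, matvec(a, x))
    scale = alpha / r
    return [y + scale * d for y, d in zip(base, delta)]
```

With `b` initialized to zeros the adapter is a no-op, so a new room starts from the shared encoder's behaviour and only drifts as its own adapter is trained.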

## Artifacts shipped today (for context)

- Signed v0.0.1 binaries: `gs://cognitum-apps/cogs/arm/cog-pose-estimation-arm` + `.../x86_64/...`.
- Trained model: `models/wifi-densepose-pretrained.safetensors` → `pose_v1.safetensors` (507 KB) + `pose_v1.onnx` (12 KB).
- Benchmark log: `docs/benchmarks/pose-estimation-cog.md`.
- Live install: `/var/lib/cognitum/apps/pose-estimation/` on cognitum-v0.

## Acceptance criteria for closing this issue

- [ ] Multi-room paired dataset ≥ 30K samples at avg conf ≥ 0.7 produced.
- [ ] PCK@20 ≥ 35% on a held-out time window from a different session.
- [ ] PCK@50 ≥ 60%.
- [ ] Per-joint PCK@50 ≥ 30% for at least 13 of 17 joints (face joints can lag).
- [ ] Re-release `cog-pose-estimation@0.1.0` with the new weights (no code change required — same Candle inference path, just better weights).
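
The numeric criteria above can be checked mechanically after each re-evaluation. The helper below is a hypothetical sketch (not part of the repo) that takes metrics as fractions and reports which criteria pass. The held-out-session requirement and the re-release step still need a manual check.

```python
def check_acceptance(n_samples, avg_conf, pck20, pck50, per_joint_pck50):
    """Return {criterion: passed} for the numeric acceptance criteria.

    Percentages are passed as fractions (0.35 means 35%).
    per_joint_pck50: dict of joint name -> PCK@50 fraction (17 entries expected).
    """
    joints_ok = sum(1 for v in per_joint_pck50.values() if v >= 0.30)
    return {
        "dataset >= 30K samples at avg conf >= 0.7": (
            n_samples >= 30_000 and avg_conf >= 0.7
        ),
        "PCK@20 >= 35%": pck20 >= 0.35,
        "PCK@50 >= 60%": pck50 >= 0.60,
        "per-joint PCK@50 >= 30% for >= 13 joints": joints_ok >= 13,
    }
```

Fed the v0.0.1 figures from the status table (1,077 samples, 0.476 confidence, 3.0% and 18.5% PCK), every entry comes back `False`, which matches the open checkboxes.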

🤖 Generated with [claude-flow](https://github.com/ruvnet/claude-flow)
