feat(ai): combined Netflix + KoNViD-1k tiny-AI trainer driver #180
Merged
Concatenates `NetflixFrameDataset` and `KoNViDPairDataset` into one training matrix, reusing `_build_model`, `_train_loop`, and `export_onnx` from `ai/train/train.py` so the model factory and ONNX layout stay identical to the canonical baselines.

Five validation modes via `--val-mode`:

* `netflix-source` (default) — mirrors ADR-0203 (Tennis hold-out).
* `konvid-holdout` — deterministic 10 % of KoNViD clip keys, whole-clip granularity (no frame leakage).
* `netflix-source-and-konvid-holdout` — union of both.
* `netflix-only` / `konvid-only` — single-corpus fallbacks.

Addresses Research-0023 §5: the FoxBird-class outlier needs a broader content distribution; KoNViD-1k adds 1 200 UGC clips on top of the existing 70 Netflix dis-pairs (~17× the clip count).

Stacks on PR #178 (KoNViD loader bridge); rebase order is 0073 → 0074.

The smoke test (5 cases, no libvmaf required) covers the key splitter, the `--epochs 0` export path, and missing-data fallbacks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
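The `konvid-holdout` mode above hinges on splitting at *clip-key* granularity so that no frame of a held-out clip ever appears in training. A minimal sketch of one common way to get a deterministic whole-clip split is to hash the clip key; the function name `split_konvid_keys` and the exact hashing scheme are illustrative assumptions, not the PR's actual API:

```python
import hashlib

def split_konvid_keys(clip_keys, holdout_frac=0.10):
    """Partition clip keys into (train, holdout) lists deterministically.

    Hashing the clip key (rather than shuffling frame rows) guarantees
    whole-clip granularity: every frame of a clip lands on the same side
    of the split, so no held-out frame leaks into training, and the split
    is identical across runs and machines with no RNG state to carry.
    """
    train, holdout = [], []
    for key in sorted(set(clip_keys)):  # sort for run-to-run determinism
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
        (holdout if bucket < holdout_frac else train).append(key)
    return train, holdout
```

With ~1 200 KoNViD clip keys this lands close to, but not exactly at, 10 %, since the fraction is realised per-key rather than by taking the first N of a shuffle.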
bd5785c to
7783fab
Compare
lusoris pushed a commit that referenced this pull request on Apr 28, 2026:
Empirical close of Research-0023 §5's open question on the FoxBird per-fold outlier (LOSO PLCC ≈ 0.93 vs ≥ 0.99 on the other 8 Netflix sources).

A canonical combined-trainer run (mlp_small, 30 epochs, val = Tennis + 10 % KoNViD holdout, seed = 0) on the union of the Netflix Public 9-source corpus and the 1 200-clip KoNViD-1k parquet produces an ONNX model whose FoxBird metrics dramatically improve over the Netflix-only baselines:

* FoxBird PLCC: 0.9936 (vs 0.9632 for the `vmaf_tiny_v1.onnx` baseline) — +3.04 percentage points absolute, moving FoxBird from a 0.93-class outlier to a 0.99+-class clip.
* FoxBird RMSE: 17.296 → 3.216 (5.4× lower).
* No regression on Netflix-native sources: PLCC ≥ 0.998 on 7/9 clips; Tennis (the formal validation source) at 0.9966.

Validates the PR #178 (KoNViD acquisition) and PR #180 (combined trainer driver) infrastructure end-to-end. Closes the Research-0023 §5 unblocker question — KoNViD-1k is sufficient for this failure mode; there is no need to acquire BVI-DVC or AOM-CTC.

Caveats: the per-clip numbers are training-fit, not held-out generalisation. Proper validation (LOSO on the combined corpus with each Netflix source held out) is the natural follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
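For reference, the two metrics quoted above are standard: PLCC is the Pearson linear correlation coefficient between predicted and reference VMAF scores (so the FoxBird delta 0.9936 − 0.9632 = 0.0304 is the "+3.04 percentage points"), and RMSE is the root-mean-square error in VMAF units. A plain-Python sketch, independent of the trainer's actual implementation:

```python
import math

def plcc(pred, ref):
    """Pearson linear correlation coefficient between two score lists."""
    n = len(pred)
    mp, mr = sum(pred) / n, sum(ref) / n
    cov = sum((p - mp) * (r - mr) for p, r in zip(pred, ref))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_r = sum((r - mr) ** 2 for r in ref)
    return cov / math.sqrt(var_p * var_r)

def rmse(pred, ref):
    """Root-mean-square error, in the same units as the scores."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))
```

Note that PLCC is scale-invariant while RMSE is not, which is why the commit reports both: a clip can correlate well yet still be offset in absolute VMAF terms.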
lusoris added a commit that referenced this pull request on Apr 28, 2026:
…ombined training (#183)

* docs(research): Research-0025 — FoxBird outlier resolved via KoNViD

* docs(research): Research-0025 — add LOSO-on-combined sweep section (held-out validation)

  The §"Per-clip result" table was training-fit (FoxBird in the train set). Adds a new §"LOSO sweep on combined corpus" with the proper held-out 9-fold sweep on the combined corpus (each Netflix source held out for its fold, plus 90 % of KoNViD shared, seed = 0).

  Headline numbers (held-out, not training-fit):

  * Mean PLCC across 9 folds: **0.9966 ± 0.0038** (vs the Research-0023 Netflix-only LOSO: 0.9808 ± 0.0214 — std 5.6× tighter)
  * FoxBird held-out fold PLCC: **0.9932** (vs the Research-0023 mlp_small Netflix-only LOSO ≈ 0.93)
  * Mean SROCC: 0.9984 ± 0.0014 (vs 0.9848 ± 0.0176)

  The 5.6× drop in PLCC standard deviation across folds is the most significant finding — adding KoNViD-1k eliminates content-distribution variance, not just the FoxBird outlier specifically.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Lusoris <lusoris@pm.me>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
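The fold structure behind the LOSO sweep is simple to state: each of the 9 folds holds out exactly one Netflix source, trains on the other 8 plus the shared 90 % KoNViD slice, and evaluates on the held-out source. A minimal sketch of the fold generator (the function name `make_loso_folds` and the list of source names are assumptions for illustration, not the repo's actual code):

```python
def make_loso_folds(netflix_sources):
    """Yield (train_sources, held_out_source) pairs, one fold per source.

    In the sweep described above, the training set of every fold would
    additionally include the same deterministic 90 % KoNViD-1k split;
    only the Netflix source rotates out, so the held-out clip is never
    seen during that fold's training.
    """
    for held_out in netflix_sources:
        train = [s for s in netflix_sources if s != held_out]
        yield train, held_out
```

Because KoNViD rows are identical across folds, the per-fold PLCC spread (the 0.0038 std above) isolates how much each Netflix source's absence hurts, which is exactly what the "5.6× tighter" comparison against the Netflix-only sweep measures.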