Skip to content

feat(ai): combined Netflix + KoNViD-1k tiny-AI trainer driver#180

Merged
lusoris merged 1 commit intomasterfrom
feat/combined-trainer-netflix-konvid
Apr 28, 2026
Merged

feat(ai): combined Netflix + KoNViD-1k tiny-AI trainer driver#180
lusoris merged 1 commit intomasterfrom
feat/combined-trainer-netflix-konvid

Conversation

@lusoris
Copy link
Copy Markdown
Owner

@lusoris lusoris commented Apr 28, 2026

Summary

  • New `ai/train/train_combined.py` that concatenates `NetflixFrameDataset` (9-source Netflix Public corpus) and `KoNViDPairDataset` (KoNViD-1k synthetic-distortion FR pairs) into one training matrix and feeds it through the same `_build_model` + `_train_loop` + `export_onnx` pipeline as `ai/train/train.py`. Model factory and ONNX layout stay identical to the canonical mlp_small / mlp_medium / linear baselines.
  • Five `--val-mode` options: `netflix-source` (default; mirrors ADR-0203), `konvid-holdout` (deterministic 10% of KoNViD clip keys, whole-clip granularity so no frame leakage), `netflix-source-and-konvid-holdout` (union), and the single-corpus `netflix-only` / `konvid-only` fallbacks.
  • Addresses Research-0023 §5: the FoxBird-class outlier (LOSO PLCC ~0.94 vs 0.99+ on the other 8 sources) is a content-distribution problem. KoNViD-1k adds 1 200 UGC clips (~17× clip count) on top of the existing 70 Netflix dis-pairs to broaden the high-motion / heavy-grain coverage.
  • Stacks on PR feat(ai): KoNViD-1k → VMAF-pair acquisition + loader bridge #178 (KoNViD loader bridge). Base is `feat/konvid-1k-loader`; merge target flips to `master` once feat(ai): KoNViD-1k → VMAF-pair acquisition + loader bridge #178 lands.

Deep-dive deliverables (ADR-0108)

  • no digest needed: covered by Research-0023 §5 (FoxBird-class variance source).
  • no alternatives: only-one-way fix (pure engineering follow-up under ADR-0203's roof).
  • AGENTS.md invariant note — captured in `docs/rebase-notes.md` entry 0074.
  • Reproducer / smoke-test command — `pytest ai/tests/test_train_combined_smoke.py` (5 cases, no libvmaf required).
  • CHANGELOG.md entry — Unreleased § Added.
  • Rebase note — `docs/rebase-notes.md` entry 0074; documents the stack-on-feat(ai): KoNViD-1k → VMAF-pair acquisition + loader bridge #178 ordering.

Test plan

  • `pytest ai/tests/test_train_combined_smoke.py -v` — 5 passed (key-splitter determinism + disjointness + `--epochs 0` paths under three corpus-availability scenarios), no libvmaf or real corpus required.
  • `pre-commit run --files ` — black + isort + ruff + shellcheck + secret detection green.
  • Full training run — gated on PR feat(ai): KoNViD-1k → VMAF-pair acquisition + loader bridge #178's KoNViD acquisition completing (1200-clip parquet currently being built at `/tmp/konvid_acquire.log`, ETA ~30 min). Once done: `python ai/train/train_combined.py --epochs 30 --model-arch mlp_small --val-mode netflix-source-and-konvid-holdout` and report PLCC / SROCC on the union val split.
  • Cross-corpus FoxBird recheck — once full run lands, evaluate the combined model against the 9-fold LOSO eval harness (PR feat(ai): tiny-AI 3-arch LOSO evaluation harness + Research-0023 #176) to see whether mixing in KoNViD reduces the FoxBird outlier delta.

🤖 Generated with Claude Code

@lusoris lusoris force-pushed the feat/konvid-1k-loader branch from e9128b8 to e26db2e Compare April 28, 2026 19:54
@lusoris lusoris force-pushed the feat/combined-trainer-netflix-konvid branch from aee3b93 to 5883efa Compare April 28, 2026 19:55
@lusoris lusoris force-pushed the feat/konvid-1k-loader branch 2 times, most recently from 94fc8ae to 6c1ee5c Compare April 28, 2026 20:21
@lusoris lusoris force-pushed the feat/combined-trainer-netflix-konvid branch from 5883efa to 425aced Compare April 28, 2026 20:22
@lusoris lusoris force-pushed the feat/konvid-1k-loader branch from 6c1ee5c to ca1e775 Compare April 28, 2026 20:56
Base automatically changed from feat/konvid-1k-loader to master April 28, 2026 21:28
@lusoris lusoris force-pushed the feat/combined-trainer-netflix-konvid branch from 425aced to bd5785c Compare April 28, 2026 21:28
Concatenates NetflixFrameDataset and KoNViDPairDataset into one
training matrix, reusing _build_model + _train_loop + export_onnx
from ai/train/train.py so the model factory and ONNX layout stay
identical to the canonical baselines.

Five validation modes via --val-mode:

* netflix-source (default) — mirrors ADR-0203 (Tennis hold-out).
* konvid-holdout — deterministic 10 % of KoNViD clip keys, whole-
  clip granularity (no frame leakage).
* netflix-source-and-konvid-holdout — union of both.
* netflix-only / konvid-only — single-corpus fallbacks.

Addresses Research-0023 §5: the FoxBird-class outlier needs a
broader content distribution; KoNViD-1k adds 1 200 UGC clips on
top of the existing 70 Netflix dis-pairs (~17× clip count).

Stacks on PR #178 (KoNViD loader bridge); rebase order is
0073 → 0074. Smoke test (5 cases, no libvmaf required) covers the
key splitter, --epochs 0 export path, and missing-data fallbacks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lusoris lusoris force-pushed the feat/combined-trainer-netflix-konvid branch from bd5785c to 7783fab Compare April 28, 2026 22:09
@lusoris lusoris merged commit a143e25 into master Apr 28, 2026
49 checks passed
@lusoris lusoris deleted the feat/combined-trainer-netflix-konvid branch April 28, 2026 22:28
@github-actions github-actions Bot mentioned this pull request Apr 28, 2026
lusoris pushed a commit that referenced this pull request Apr 28, 2026
Empirical close of Research-0023 §5's open question on the
FoxBird per-fold outlier (LOSO PLCC ≈ 0.93 vs ≥ 0.99 on the
other 8 Netflix sources).

Canonical combined-trainer run (mlp_small, 30 epochs, val=Tennis
+ 10% KoNViD-holdout, seed=0) on the union of the Netflix Public
9-source corpus and the 1200-clip KoNViD-1k parquet produces an
ONNX whose FoxBird metrics dramatically improve over Netflix-only
baselines:

* FoxBird PLCC: 0.9936 (vs 0.9632 vmaf_tiny_v1.onnx baseline) —
  +3.04 percentage points absolute, moving FoxBird from a 0.93-
  class outlier to a 0.99+-class clip.
* FoxBird RMSE: 17.296 → 3.216 (5.4× lower).
* No regression on Netflix-native sources: PLCC ≥ 0.998 on 7/9
  clips, Tennis (formal val) at 0.9966.

Validates PR #178 (KoNViD acquisition) + PR #180 (combined
trainer driver) infrastructure end-to-end. Closes Research-0023
§5 unblocker question — KoNViD-1k is sufficient for this failure
mode; no need to acquire BVI-DVC or AOM-CTC.

Caveats: per-clip numbers are training-fit, not held-out
generalisation. Proper validation (LOSO on combined corpus with
each Netflix source held out) is the natural follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lusoris pushed a commit that referenced this pull request Apr 28, 2026
Empirical close of Research-0023 §5's open question on the
FoxBird per-fold outlier (LOSO PLCC ≈ 0.93 vs ≥ 0.99 on the
other 8 Netflix sources).

Canonical combined-trainer run (mlp_small, 30 epochs, val=Tennis
+ 10% KoNViD-holdout, seed=0) on the union of the Netflix Public
9-source corpus and the 1200-clip KoNViD-1k parquet produces an
ONNX whose FoxBird metrics dramatically improve over Netflix-only
baselines:

* FoxBird PLCC: 0.9936 (vs 0.9632 vmaf_tiny_v1.onnx baseline) —
  +3.04 percentage points absolute, moving FoxBird from a 0.93-
  class outlier to a 0.99+-class clip.
* FoxBird RMSE: 17.296 → 3.216 (5.4× lower).
* No regression on Netflix-native sources: PLCC ≥ 0.998 on 7/9
  clips, Tennis (formal val) at 0.9966.

Validates PR #178 (KoNViD acquisition) + PR #180 (combined
trainer driver) infrastructure end-to-end. Closes Research-0023
§5 unblocker question — KoNViD-1k is sufficient for this failure
mode; no need to acquire BVI-DVC or AOM-CTC.

Caveats: per-clip numbers are training-fit, not held-out
generalisation. Proper validation (LOSO on combined corpus with
each Netflix source held out) is the natural follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lusoris added a commit that referenced this pull request Apr 28, 2026
…ombined training (#183)

* docs(research): Research-0025 — FoxBird outlier resolved via KoNViD

Empirical close of Research-0023 §5's open question on the
FoxBird per-fold outlier (LOSO PLCC ≈ 0.93 vs ≥ 0.99 on the
other 8 Netflix sources).

Canonical combined-trainer run (mlp_small, 30 epochs, val=Tennis
+ 10% KoNViD-holdout, seed=0) on the union of the Netflix Public
9-source corpus and the 1200-clip KoNViD-1k parquet produces an
ONNX whose FoxBird metrics dramatically improve over Netflix-only
baselines:

* FoxBird PLCC: 0.9936 (vs 0.9632 vmaf_tiny_v1.onnx baseline) —
  +3.04 percentage points absolute, moving FoxBird from a 0.93-
  class outlier to a 0.99+-class clip.
* FoxBird RMSE: 17.296 → 3.216 (5.4× lower).
* No regression on Netflix-native sources: PLCC ≥ 0.998 on 7/9
  clips, Tennis (formal val) at 0.9966.

Validates PR #178 (KoNViD acquisition) + PR #180 (combined
trainer driver) infrastructure end-to-end. Closes Research-0023
§5 unblocker question — KoNViD-1k is sufficient for this failure
mode; no need to acquire BVI-DVC or AOM-CTC.

Caveats: per-clip numbers are training-fit, not held-out
generalisation. Proper validation (LOSO on combined corpus with
each Netflix source held out) is the natural follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(research): Research-0025 — add LOSO-on-combined sweep section (held-out validation)

The §"Per-clip result" table was training-fit (FoxBird in train set).
Adds a new §"LOSO sweep on combined corpus" with the proper held-out
9-fold sweep on the combined corpus (each Netflix source held out for
its fold + 90% of KoNViD shared, seed=0).

Headline numbers (held-out, not training-fit):

* Mean PLCC across 9 folds: **0.9966 ± 0.0038**
  (vs Research-0023 Netflix-only LOSO: 0.9808 ± 0.0214 — std 5.6×
  tighter)
* FoxBird held-out fold PLCC: **0.9932**
  (vs Research-0023 mlp_small Netflix-only LOSO ≈ 0.93)
* Mean SROCC: 0.9984 ± 0.0014 (vs 0.9848 ± 0.0176)

The 5.6× drop in PLCC standard deviation across folds is the most
significant finding — adding KoNViD-1k eliminates content-distribution
variance, not just the FoxBird outlier specifically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Lusoris <lusoris@pm.me>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant