feat(tools): vmaf-tune — VVenC + NNVC adapter (AI-augmented H.266)#368
Merged
lusoris pushed a commit that referenced this pull request on May 3, 2026
…s 17 adapters) Refactors `tools/vmaf-tune/src/vmaftune/encode.py` away from the Phase A hard-coded `libx264` `-c:v / -preset / -crf` argv. `run_encode` now looks up the codec adapter via `codec_adapters.get_adapter(req.encoder)` and asks it for the FFmpeg argv slice via `adapter.ffmpeg_codec_args(preset, quality)` plus an optional `adapter.extra_params()`. Adapters that don't yet expose `ffmpeg_codec_args` fall back silently to the legacy x264-CRF shape so partial in-flight adapter PRs stay drivable end-to-end. `parse_versions(stderr, encoder=...)` selects a per-codec version probe (libx264, libx265, libsvtav1, libvpx-vp9, libaom-av1, libvvenc, NVENC, QSV, AMF, VideoToolbox); unknown encoders return "unknown" rather than raising. The `EncodeRequest.crf` field is preserved unchanged for the SCHEMA_VERSION=1 row contract; a `quality` property mirrors it for adapter-side codec-agnostic vocabulary. Existing 13-test x264 suite still green; new 19-test multi-codec suite covers 9 representative codec shapes plus the unknown-codec / missing-method fallback paths. Unblocks 17 in-flight codec adapter PRs (#360 libaom, #362 libx265, #364 NVENC, #366 AMF, #367 QSV, #368 libvvenc, #370 libsvtav1, #373 VideoToolbox, plus follow-on waves) which can now drive end-to-end encodes without copying or mutating the harness. Ships ADR-0294 + research digest 0054, vmaf-tune.md "Codec adapter contract" section, rebase-notes #228 invariant, CHANGELOG entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds tools/vmaf-tune/src/vmaftune/codec_adapters/vvenc.py as a Phase A codec adapter on the ADR-0237 contract. Drives Fraunhofer HHI's VVC / H.266 encoder via FFmpeg's -c:v libvvenc wrapper. Quality knob "qp" with informative range (17, 50), default 32. The harness's canonical 7-name preset vocabulary compresses onto VVenC's native 5-level scale (faster / fast / medium / slow / slower) via a static map matching the rule used by the parallel HEVC / AV1 adapter PRs.

First-class NN-VC (neural-network video coding) plumbing: nnvc_intra toggle (default off) emits -vvenc-params IntraNN=1 to enable VVC's learned 5x5 / 7x7 / 9x9 conv intra-prediction. Typical effect at 1080p natural content: ~1-3% bitrate gain at iso-VMAF, ~5-10x slower intra encode time. NN loop filter and NN super-resolution toggles deferred to follow-up ADRs once Phase B has a corpus to estimate their cost / quality curves separately.

VVenC + NN-VC is the closest thing the open-source video stack has to a "neural-augmented codec" today and is the natural counterpart to the fork's existing tiny-AI measurement surface (vmaf_tiny_v2, fr_regressor_v1, nr_metric_v1); measurement and generation now share the same vmaf-tune harness so future Phase B / C predictors can learn when the NN-VC tools are worth their compute cost.

Test seam: tests mock subprocess so neither libvvenc nor a libvvenc-enabled FFmpeg is required for the unit gate. 22 tests pass; ruff + black + isort clean. ADR-0285 covers the design; companion docs update under docs/usage/vmaf-tune.md adds a "VVenC (H.266 / VVC + NNVC)" section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
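The test seam mentioned in this commit message can be sketched roughly as follows. This is an illustration only, not the PR's test code: `run_ffmpeg` and `check_encode_runs_without_real_ffmpeg` are hypothetical names, and the argv simply reuses the flags quoted in the message. Patching `subprocess.run` is what lets the check pass on a machine with no FFmpeg or libvvenc installed.

```python
import subprocess
from unittest import mock


def run_ffmpeg(argv: list[str]) -> int:
    """Hypothetical stand-in for the harness's encode call (not the PR's API)."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=False)
    return completed.returncode


def check_encode_runs_without_real_ffmpeg() -> list[str]:
    # Replace subprocess.run so no ffmpeg binary is ever executed.
    fake = subprocess.CompletedProcess(args=[], returncode=0, stdout="", stderr="")
    with mock.patch("subprocess.run", return_value=fake) as patched:
        rc = run_ffmpeg(["ffmpeg", "-c:v", "libvvenc", "-vvenc-params", "IntraNN=1"])
    # The seam lets the test inspect the argv that would have been executed.
    (argv,), _kwargs = patched.call_args
    assert rc == 0
    assert "libvvenc" in argv
    return argv
```

Because the mock records its call arguments, assertions can target the composed command line rather than encoder output, which keeps the unit gate fast and hermetic.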
957cb78 to 28b7aef
Pull request overview
Adds a new vmaf-tune codec adapter for FFmpeg's libvvenc encoder so the Phase A corpus tooling can target VVC/H.266 and document an initial NN-VC surface. This extends the fork-local multi-codec adapter layer under tools/vmaf-tune/ and updates the surrounding tests and documentation.
Changes:
- Adds `VVenCAdapter` and registers `libvvenc` in the codec adapter registry.
- Adds adapter-focused tests and relaxes the corpus registry test to allow multiple codecs.
- Adds user/docs artifacts for the new adapter, including an ADR, rebase note, changelog entry, and ADR index row.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tools/vmaf-tune/tests/test_corpus.py | Relaxes the codec-registry assertion from x264-only to x264-present. |
| tools/vmaf-tune/tests/test_codec_adapter_vvenc.py | Adds unit tests for registry wiring, preset mapping, validation, NNVC params, and command construction. |
| tools/vmaf-tune/src/vmaftune/codec_adapters/vvenc.py | Introduces the new VVenC adapter, preset projection map, quality metadata, and NNVC extra-param builder. |
| tools/vmaf-tune/src/vmaftune/codec_adapters/__init__.py | Registers and exports VVenCAdapter. |
| docs/usage/vmaf-tune.md | Documents VVenC usage, preset/QP behavior, and NN-VC concepts. |
| docs/rebase-notes.md | Adds a rebase-tracking entry for the new adapter work. |
| docs/adr/README.md | Adds an ADR index row for ADR-0285. |
| docs/adr/0285-vmaf-tune-vvenc-nnvc.md | Adds the ADR describing the adapter decision and tradeoffs. |
| CHANGELOG.md | Adds an Unreleased changelog entry for the adapter. |
Comments suppressed due to low confidence (2)
docs/adr/README.md:265
`docs/adr/README.md` is a generated file in this repo: `docs/adr/_index_fragments/README.md:3-30` requires adding a fragment under `docs/adr/_index_fragments/` and appending the slug to `_order.txt`, then regenerating the index. This PR only edits the rendered README, so `scripts/docs/concat-adr-index.sh --check` will report drift and the new row will be lost the next time the index is regenerated.
| [ADR-0272](0272-fr-regressor-v2-codec-aware-scaffold.md) | `fr_regressor_v2` codec-aware scaffold — first downstream consumer of the vmaf-tune Phase A JSONL corpus ([ADR-0237](0237-quality-aware-encode-automation.md)). Ships [`ai/scripts/train_fr_regressor_v2.py`](../../ai/scripts/train_fr_regressor_v2.py), a smoke ONNX (`fr_regressor_v2.onnx` registered with `smoke: true`), sidecar JSON, and full doc surface ([model card](../ai/models/fr_regressor_v2.md), [research digest](../research/0058-fr-regressor-v2-feasibility.md)). Two-input ONNX: 6 canonical libvmaf features (`adm2`, `vif_scale0..3`, `motion2`, StandardScaler-normalised) + 8-D codec block (6-way encoder one-hot + preset_norm + crf_norm, both in `[0, 1]`). MLP shape `6 -> 16 -> 16 -> 1` with codec block concatenated before the first dense layer (matches the existing `FRRegressor(num_codecs=8)` plumbing landed by [ADR-0235](0235-codec-aware-fr-regressor.md)). Registry row stays `smoke: true` until a follow-up PR (T7-FR-REGRESSOR-V2-PROD) re-runs training on a real Phase A corpus and clears v1's 0.95 LOSO PLCC ship gate with the ≥0.005 multi-codec lift required by ADR-0235. | Proposed | ai, dnn, tiny-ai, fr-regressor, codec-aware, vmaf-tune, fork-local |
CHANGELOG.md:33
- The Unreleased section is generated from `changelog.d/*` fragments in this repo (`changelog.d/README.md:3-29`), but this PR edits `CHANGELOG.md` directly and does not add a matching fragment under `changelog.d/added/`. Regenerating the changelog will drop this entry, and CI's changelog drift check will fail until the fragment source is added.
- **`fr_regressor_v2` codec-aware scaffold — first downstream consumer
of the vmaf-tune Phase A JSONL corpus (ADR-0272, prereq for
Phase B).** Ships
[`ai/scripts/train_fr_regressor_v2.py`](ai/scripts/train_fr_regressor_v2.py)
— a scaffold-only trainer that consumes the JSONL corpus emitted by
`vmaf-tune corpus` (ADR-0237 Phase A) and trains the codec-aware
variant of the v1 FR regressor. Two-input ONNX (`features` shape
`(N, 6)` canonical-6 + `codec` shape `(N, 8)` block —
`[encoder_onehot(6), preset_norm, crf_norm]`); reuses the existing
`FRRegressor(num_codecs=8)` class plumbed by ADR-0235. A `--smoke`
mode synthesises 100 fake corpus rows and trains 1 epoch so the
pipeline is end-to-end exercisable in CI without hours of encode
time. Registers `fr_regressor_v2` in `model/tiny/registry.json`
with `smoke: true` until a follow-up PR runs production training on
a real Phase A corpus and clears the ADR-0235 ship gate (≥0.005
multi-codec PLCC lift over v1's 0.95 LOSO floor). Doc surface:
[model card](docs/ai/models/fr_regressor_v2.md),
[research digest](docs/research/0058-fr-regressor-v2-feasibility.md),
[ADR-0272](docs/adr/0272-fr-regressor-v2-codec-aware-scaffold.md),
`ai/AGENTS.md` invariant note pinning the codec block layout and
encoder vocabulary. Smoke validated locally (`python
ai/scripts/train_fr_regressor_v2.py --smoke` produces a valid
opset-17 two-input ONNX, op-allowlist clean, torch-vs-ORT roundtrip
Comment on lines +110 to +158

        # NNVC tool toggles. Default off so Phase A grids stay deterministic
        # and reasonably fast; flipping any of these shifts the encoder's
        # rate-distortion curve and is recorded in the corpus row's
        # ``extra_params`` for downstream predictor conditioning.
        nnvc_intra: bool = False

        def native_preset(self, preset: str) -> str:
            """Return the native VVenC preset for a 7-name canonical preset.

            Raises ``ValueError`` for unknown names. Pure function — no
            I/O — so the search loop can pre-compute the projection.
            """
            if preset not in _PRESET_MAP:
                raise ValueError(
                    f"unknown libvvenc preset {preset!r}; expected one of " f"{tuple(_PRESET_MAP)}"
                )
            return _PRESET_MAP[preset]

        def validate(self, preset: str, qp: int) -> None:
            """Raise ``ValueError`` if ``(preset, qp)`` is unsupported."""
            if preset not in _PRESET_MAP:
                raise ValueError(
                    f"unknown libvvenc preset {preset!r}; expected one of " f"{tuple(_PRESET_MAP)}"
                )
            lo, hi = self.quality_range
            if not lo <= qp <= hi:
                raise ValueError(f"qp {qp} outside libvvenc range [{lo}, {hi}]")

        def extra_params(self) -> tuple[str, ...]:
            """FFmpeg ``-c:v libvvenc`` arg suffix for the NNVC toggles.

            Returns an immutable tuple so callers can safely concatenate
            into ``EncodeRequest.extra_params``. Empty when no NNVC tool
            is enabled.

            FFmpeg's ``libvvenc`` wrapper forwards opaque ``-vvenc-params
            key=value:key=value`` strings down to the underlying VVenC
            config object, which is the surface VVenC's CLI documents
            for NNVC toggles.
            """
            toggles: list[str] = []
            if self.nnvc_intra:
                # ``IntraNN`` is the VVenC config-key for the learned
                # intra-prediction tool. Value 1 enables the 5×5 / 7×7 /
                # 9×9 conv ladder; value 0 keeps the handcrafted modes.
                toggles.append("IntraNN=1")
            if not toggles:
                return ()
            return ("-vvenc-params", ":".join(toggles))
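To illustrate how the tuple returned by `extra_params` composes with the rest of an argv, here is a self-contained sketch. `ToyVVenCAdapter` and `codec_argv` are hypothetical reductions written for this example, and the `-c:v` / `-preset` / `-qp` layout is an assumption; in the PR the real command line comes from `build_ffmpeg_command`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyVVenCAdapter:
    """Hypothetical reduction of the adapter: only the NNVC toggle logic."""

    nnvc_intra: bool = False

    def extra_params(self) -> tuple[str, ...]:
        toggles: list[str] = []
        if self.nnvc_intra:
            toggles.append("IntraNN=1")
        if not toggles:
            return ()
        # Multiple toggles would be joined as key=value:key=value.
        return ("-vvenc-params", ":".join(toggles))


def codec_argv(adapter: ToyVVenCAdapter, preset: str, qp: int) -> tuple[str, ...]:
    # Assumed flag layout, for illustration only.
    base = ("-c:v", "libvvenc", "-preset", preset, "-qp", str(qp))
    # Tuples concatenate without mutating either operand.
    return base + adapter.extra_params()


print(codec_argv(ToyVVenCAdapter(), "medium", 32))
print(codec_argv(ToyVVenCAdapter(nnvc_intra=True), "slower", 27))
```

Returning an empty tuple when no toggle is set means the caller never has to special-case the default configuration: concatenation is a no-op.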
Comment on lines 34 to 36

    from .h264_qsv import H264QsvAdapter
    from .hevc_amf import HEVCAMFAdapter
    from .hevc_nvenc import HevcNvencAdapter
Comment on lines +51 to +57

    # Compress the fork's canonical 7-name preset vocabulary onto VVenC's
    # 5-level scale. The 7-name vocabulary is the union of x264's 10
    # presets minus duplicates and is the one the search loop emits;
    # every adapter decides locally how to project onto its native scale.
    # Anything strictly slower than ``slow`` (placebo / slowest / slower)
    # pins to VVenC's deepest preset; anything strictly faster than
    # ``fast`` pins to ``faster``. ``medium`` is the default.
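The clamping rule in that comment can be sketched as below. The seven canonical names are a guess reconstructed from the comment and are labeled hypothetical; the PR's actual `_PRESET_MAP` is not shown in this hunk and may use different names or a hand-written dict instead of an index clamp.

```python
# Hypothetical canonical names, ordered fastest to slowest. The real
# vocabulary lives in the PR's _PRESET_MAP and may differ.
CANONICAL = ("faster", "fast", "medium", "slow", "slower", "slowest", "placebo")
# VVenC's native 5-level scale as listed in the PR description.
NATIVE = ("faster", "fast", "medium", "slow", "slower")


def project(preset: str) -> str:
    """Clamp a canonical preset onto the 5-level native scale."""
    if preset not in CANONICAL:
        raise ValueError(f"unknown preset {preset!r}; expected one of {CANONICAL}")
    # Names past the end of the native scale pin to the deepest native preset.
    index = min(CANONICAL.index(preset), len(NATIVE) - 1)
    return NATIVE[index]


# Precompute the static map once, as the pure-function docstring suggests.
PRESET_MAP_SKETCH = {name: project(name) for name in CANONICAL}
```

Under these assumed names, `slower`, `slowest` and `placebo` all land on the native `slower`, while the four faster names map to themselves.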
Comment on lines +90 to +107

    def test_vvenc_ffmpeg_command_carries_native_preset_and_qp():
        # The harness validates / projects via the adapter, then composes the
        # ffmpeg argv via build_ffmpeg_command. We check the wired surface.
        a = VVenCAdapter(nnvc_intra=True)
        a.validate("slower", 27)
        req = EncodeRequest(
            source=Path("/tmp/ref.yuv"),
            width=1920,
            height=1080,
            pix_fmt="yuv420p",
            framerate=24.0,
            encoder=a.encoder,
            preset=a.native_preset("slower"),
            crf=27,  # encoder-agnostic name; carries QP for VVenC
            output=Path("/tmp/out.mkv"),
            extra_params=a.extra_params(),
        )
        cmd = build_ffmpeg_command(req)
lusoris pushed a commit that referenced this pull request on May 5, 2026
…s 17 adapters) Refactors `tools/vmaf-tune/src/vmaftune/encode.py` away from the Phase A hard-coded `libx264` `-c:v / -preset / -crf` argv. `run_encode` now looks up the codec adapter via `codec_adapters.get_adapter(req.encoder)` and asks it for the FFmpeg argv slice via `adapter.ffmpeg_codec_args(preset, quality)` plus an optional `adapter.extra_params()`. Adapters that don't yet expose `ffmpeg_codec_args` fall back silently to the legacy x264-CRF shape so partial in-flight adapter PRs stay drivable end-to-end. `parse_versions(stderr, encoder=...)` selects a per-codec version probe (libx264, libx265, libsvtav1, libvpx-vp9, libaom-av1, libvvenc, NVENC, QSV, AMF, VideoToolbox); unknown encoders return "unknown" rather than raising. The `EncodeRequest.crf` field is preserved unchanged for the SCHEMA_VERSION=1 row contract; a `quality` property mirrors it for adapter-side codec-agnostic vocabulary. Existing 13-test x264 suite still green; new 19-test multi-codec suite covers 9 representative codec shapes plus the unknown-codec / missing-method fallback paths. Unblocks 17 in-flight codec adapter PRs (#360 libaom, #362 libx265, #364 NVENC, #366 AMF, #367 QSV, #368 libvvenc, #370 libsvtav1, #373 VideoToolbox, plus follow-on waves) which can now drive end-to-end encodes without copying or mutating the harness. Ships ADR-0294 + research digest 0054, vmaf-tune.md "Codec adapter contract" section, rebase-notes #228 invariant, CHANGELOG entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lusoris added a commit that referenced this pull request on May 5, 2026
…s 17 adapters) (#376) * feat(tools): vmaf-tune encode.py — codec-agnostic dispatcher (unblocks 17 adapters) Refactors `tools/vmaf-tune/src/vmaftune/encode.py` away from the Phase A hard-coded `libx264` `-c:v / -preset / -crf` argv. `run_encode` now looks up the codec adapter via `codec_adapters.get_adapter(req.encoder)` and asks it for the FFmpeg argv slice via `adapter.ffmpeg_codec_args(preset, quality)` plus an optional `adapter.extra_params()`. Adapters that don't yet expose `ffmpeg_codec_args` fall back silently to the legacy x264-CRF shape so partial in-flight adapter PRs stay drivable end-to-end. `parse_versions(stderr, encoder=...)` selects a per-codec version probe (libx264, libx265, libsvtav1, libvpx-vp9, libaom-av1, libvvenc, NVENC, QSV, AMF, VideoToolbox); unknown encoders return "unknown" rather than raising. The `EncodeRequest.crf` field is preserved unchanged for the SCHEMA_VERSION=1 row contract; a `quality` property mirrors it for adapter-side codec-agnostic vocabulary. Existing 13-test x264 suite still green; new 19-test multi-codec suite covers 9 representative codec shapes plus the unknown-codec / missing-method fallback paths. Unblocks 17 in-flight codec adapter PRs (#360 libaom, #362 libx265, #364 NVENC, #366 AMF, #367 QSV, #368 libvvenc, #370 libsvtav1, #373 VideoToolbox, plus follow-on waves) which can now drive end-to-end encodes without copying or mutating the harness. Ships ADR-0294 + research digest 0054, vmaf-tune.md "Codec adapter contract" section, rebase-notes #228 invariant, CHANGELOG entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(docs): renumber encode-multi-codec ADR 0294→0297 + research 0069→0070 --------- Co-authored-by: Lusoris <lusoris@pm.me> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
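The dispatcher fallback described in this commit message could look something like the following sketch. Everything here is hypothetical except the method names `ffmpeg_codec_args` and `extra_params`, which the message itself mentions; the `getattr`-based dispatch is one plausible way to implement a silent fallback, not necessarily what the merged `encode.py` does.

```python
from typing import Any


class LegacyAdapter:
    """Hypothetical adapter that predates ffmpeg_codec_args."""

    encoder = "libx264"


class QpAdapter:
    """Hypothetical adapter exposing both optional hooks."""

    encoder = "libvvenc"

    def ffmpeg_codec_args(self, preset: str, quality: int) -> list[str]:
        return ["-c:v", self.encoder, "-preset", preset, "-qp", str(quality)]

    def extra_params(self) -> tuple[str, ...]:
        return ("-vvenc-params", "IntraNN=1")


def codec_argv(adapter: Any, preset: str, quality: int) -> list[str]:
    build = getattr(adapter, "ffmpeg_codec_args", None)
    if callable(build):
        argv = list(build(preset, quality))
    else:
        # Legacy shape: the original hard-coded -c:v / -preset / -crf triple.
        argv = ["-c:v", adapter.encoder, "-preset", preset, "-crf", str(quality)]
    # extra_params is optional; adapters without it contribute nothing.
    extra = getattr(adapter, "extra_params", None)
    if callable(extra):
        argv.extend(extra())
    return argv
```

The design point is that an adapter missing either hook still yields a usable argv, which is what lets partially finished adapter PRs run end to end.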
This was referenced May 6, 2026
Summary
- `tools/vmaf-tune/src/vmaftune/codec_adapters/vvenc.py`: Phase A codec adapter on the ADR-0237 contract for Fraunhofer HHI's VVC / H.266 encoder, plus first-class NN-VC (neural-network video coding) plumbing.
- Drives the encoder via FFmpeg's `-c:v libvvenc` wrapper. Quality knob `qp` with informative range `(17, 50)`, default `32`. The harness's canonical 7-name preset vocabulary compresses onto VVenC's native 5-level scale (faster / fast / medium / slow / slower) via a static map matching the rule used by the parallel HEVC / AV1 adapter PRs.
- `nnvc_intra: bool = False` emits `-vvenc-params IntraNN=1` to enable VVC's learned 5×5 / 7×7 / 9×9 conv intra-prediction (~1-3% bitrate gain at iso-VMAF, ~5-10× slower intra encode). NN loop filter and NN super-resolution deferred to follow-up ADRs.

Why this matters for the AI angle

VVenC + NN-VC is the closest thing the open-source video stack has to a "neural-augmented codec" today. It is the natural counterpart to the fork's existing tiny-AI measurement surface (`vmaf_tiny_v2`, `fr_regressor_v1`, `nr_metric_v1`). Putting measurement and generation behind the same `vmaf-tune` harness lets future Phase B / C predictors learn when the NN-VC tools are worth their compute cost.

ADR-0108 deliverables

- `docs/adr/0285-vmaf-tune-vvenc-nnvc.md` § Alternatives considered: covers direct `vvencapp` driver, full-NNVC-surface bundling, full QP range, `-qp` vs `-crf`.
- `tools/vmaf-tune/AGENTS.md` already documents the adapter contract and the `CORPUS_ROW_KEYS` schema invariant; this PR does not change that schema.
- `python -m pytest tools/vmaf-tune/tests/` (22/22 pass, mocks subprocess so no libvvenc / FFmpeg required locally).

Test plan

- `python -m pytest tools/vmaf-tune/tests/`: 22 passed
- `ruff check tools/vmaf-tune/`: clean
- `black --check`: clean
- `isort --check`: clean (post pre-commit autofix)
- `libvvenc`-enabled FFmpeg (out of scope for this PR; FFmpeg-libvvenc CI runner not configured yet)

Companion docs

- `docs/adr/0285-vmaf-tune-vvenc-nnvc.md`
- `docs/usage/vmaf-tune.md`: new "VVenC (H.266 / VVC + NNVC)" section explaining the NN-tool semantics.

Coordination

- `tools/vmaf-tune/tests/test_corpus.py`: relaxed `known_codecs() == ("libx264",)` to `"libx264" in known_codecs()` so the registry can now span multiple codecs.

🤖 Generated with Claude Code