
feat(tools): vmaf-tune — VVenC + NNVC adapter (AI-augmented H.266)#368

Merged
lusoris merged 1 commit into master from
feat/vmaf-tune-codec-adapter-vvenc
May 5, 2026

Conversation

@lusoris
Owner

@lusoris lusoris commented May 3, 2026

Summary

  • Adds tools/vmaf-tune/src/vmaftune/codec_adapters/vvenc.py — Phase A codec adapter on the ADR-0237 contract for Fraunhofer HHI's VVC / H.266 encoder, plus first-class NN-VC (neural-network video coding) plumbing.
  • Drives FFmpeg's -c:v libvvenc wrapper. Quality knob qp with informative range (17, 50), default 32. The harness's canonical 7-name preset vocabulary compresses onto VVenC's native 5-level scale (faster / fast / medium / slow / slower) via a static map matching the rule used by the parallel HEVC / AV1 adapter PRs.
  • One Phase A NN-VC toggle: nnvc_intra: bool = False emits -vvenc-params IntraNN=1 to enable VVC's learned 5×5 / 7×7 / 9×9 conv intra-prediction (~1-3% bitrate gain at iso-VMAF, ~5-10× slower intra encode). NN loop filter and NN super-resolution deferred to follow-up ADRs.

Why this matters for the AI angle

VVenC + NN-VC is the closest thing the open-source video stack has to a "neural-augmented codec" today. It is the natural counterpart to the fork's existing tiny-AI measurement surface (vmaf_tiny_v2, fr_regressor_v1, nr_metric_v1):

  • Tiny-AI = end-to-end measurement (learned VMAF / quality predictors).
  • NN-VC = end-to-end generation (learned intra prediction / loop filter / super-resolution inside the codec).

Putting both behind the same vmaf-tune harness lets future Phase B / C predictors learn when the NN-VC tools are worth their compute cost.

ADR-0108 deliverables

  • (1) Research digest: not needed. The adapter is a thin one-file drop on the existing ADR-0237 contract, and the option space is already covered by Research-0044.
  • (2) Decision matrix: in docs/adr/0285-vmaf-tune-vvenc-nnvc.md § Alternatives considered — covers direct vvencapp driver, full-NNVC-surface bundling, full QP range, -qp vs -crf.
  • (3) AGENTS.md invariant note: existing tools/vmaf-tune/AGENTS.md already documents the adapter contract and the CORPUS_ROW_KEYS schema invariant; this PR does not change that schema.
  • (4) Reproducer / smoke-test command: python -m pytest tools/vmaf-tune/tests/ (22/22 pass, mocks subprocess so no libvvenc / FFmpeg required locally).
  • (5) CHANGELOG fragment: "Unreleased — lusoris fork" entry added.
  • (6) Rebase note: entry 0229 added.

Test plan

  • python -m pytest tools/vmaf-tune/tests/ — 22 passed
  • ruff check tools/vmaf-tune/ — clean
  • black --check — clean
  • isort --check — clean (post pre-commit autofix)
  • Conventional Commits hook — passed
  • CI: integration smoke gated to runner with libvvenc-enabled FFmpeg (out of scope for this PR — FFmpeg-libvvenc CI runner not configured yet)

Companion docs

Coordination

  • Sibling PRs (parallel, landing 2026-05-03): x264 / x265 / svt-av1 / libaom + NVENC / QSV / AMF / VideoToolbox adapter PRs. This one is the 17th codec adapter; the FR-regressor v2 schema-expansion PR (ADR-0235) coordinates the codec one-hot — VVenC takes the next free slot.
  • Tests touched: tools/vmaf-tune/tests/test_corpus.py — relaxed known_codecs() == ("libx264",) to "libx264" in known_codecs() so the registry can now span multiple codecs.
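To illustrate why the registry assertion had to be relaxed, here is a small sketch with a hypothetical in-memory registry. The names `_REGISTRY` and `register` are assumptions for illustration; only `known_codecs()` appears in the PR.

```python
from __future__ import annotations

# Hypothetical registry; the real one lives in vmaftune.codec_adapters.
_REGISTRY: dict[str, object] = {}


def register(name: str, adapter: object) -> None:
    """Add an adapter under its FFmpeg encoder name."""
    _REGISTRY[name] = adapter


def known_codecs() -> tuple[str, ...]:
    """Return registered encoder names in insertion order."""
    return tuple(_REGISTRY)


register("libx264", object())
register("libvvenc", object())

# The old equality check breaks as soon as a second adapter registers.
old_assertion_holds = known_codecs() == ("libx264",)
# The relaxed membership check keeps holding as the registry grows.
new_assertion_holds = "libx264" in known_codecs()
```

The membership form also avoids coupling the test to registration order, which matters while many sibling adapter PRs land in parallel.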

🤖 Generated with Claude Code

lusoris pushed a commit that referenced this pull request May 3, 2026
…s 17 adapters)

Refactors `tools/vmaf-tune/src/vmaftune/encode.py` away from the Phase A
hard-coded `libx264` `-c:v / -preset / -crf` argv. `run_encode` now
looks up the codec adapter via `codec_adapters.get_adapter(req.encoder)`
and asks it for the FFmpeg argv slice via
`adapter.ffmpeg_codec_args(preset, quality)` plus an optional
`adapter.extra_params()`. Adapters that don't yet expose
`ffmpeg_codec_args` fall back silently to the legacy x264-CRF shape so
partial in-flight adapter PRs stay drivable end-to-end.
`parse_versions(stderr, encoder=...)` selects a per-codec version probe
(libx264, libx265, libsvtav1, libvpx-vp9, libaom-av1, libvvenc, NVENC,
QSV, AMF, VideoToolbox); unknown encoders return "unknown" rather than
raising. The `EncodeRequest.crf` field is preserved unchanged for the
SCHEMA_VERSION=1 row contract; a `quality` property mirrors it for
adapter-side codec-agnostic vocabulary.
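A minimal sketch of the dispatch described in this paragraph, assuming simplified shapes for the request and adapters. `VVenCLike`, `LegacyAdapter`, `_ADAPTERS`, and `codec_argv` are hypothetical names for illustration; only `get_adapter`, `ffmpeg_codec_args`, `extra_params`, and the `crf` / `quality` pairing come from the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EncodeRequest:
    """Simplified request; the real class carries source, geometry, etc."""
    encoder: str
    preset: str
    crf: int  # field name kept for the SCHEMA_VERSION=1 row contract

    @property
    def quality(self) -> int:
        # Codec-agnostic alias that mirrors ``crf``.
        return self.crf


class VVenCLike:
    """Hypothetical adapter that implements the new method."""
    encoder = "libvvenc"

    def ffmpeg_codec_args(self, preset: str, quality: int) -> tuple[str, ...]:
        return ("-c:v", self.encoder, "-preset", preset, "-qp", str(quality))


class LegacyAdapter:
    """Hypothetical adapter that predates ``ffmpeg_codec_args``."""
    encoder = "libx264"


_ADAPTERS = {a.encoder: a for a in (VVenCLike(), LegacyAdapter())}


def get_adapter(name: str):
    return _ADAPTERS[name]


def codec_argv(req: EncodeRequest) -> tuple[str, ...]:
    """Ask the adapter for its argv slice, falling back to the x264-CRF shape."""
    adapter = get_adapter(req.encoder)
    build = getattr(adapter, "ffmpeg_codec_args", None)
    if build is None:
        args = ("-c:v", req.encoder, "-preset", req.preset, "-crf", str(req.quality))
    else:
        args = tuple(build(req.preset, req.quality))
    extra = getattr(adapter, "extra_params", None)  # optional per the commit message
    return args + (tuple(extra()) if callable(extra) else ())
```

The `getattr` probe is one way to get the silent fallback the commit describes; a stricter harness might log when the legacy path is taken.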

Existing 13-test x264 suite still green; new 19-test multi-codec suite
covers 9 representative codec shapes plus the unknown-codec /
missing-method fallback paths. Unblocks 17 in-flight codec adapter PRs
(#360 libaom, #362 libx265, #364 NVENC, #366 AMF, #367 QSV, #368
libvvenc, #370 libsvtav1, #373 VideoToolbox, plus follow-on waves)
which can now drive end-to-end encodes without copying or mutating the
harness.

Ships ADR-0294 + research digest 0054, vmaf-tune.md "Codec adapter
contract" section, rebase-notes #228 invariant, CHANGELOG entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lusoris lusoris marked this pull request as ready for review May 5, 2026 09:54
Copilot AI review requested due to automatic review settings May 5, 2026 09:54
Adds tools/vmaf-tune/src/vmaftune/codec_adapters/vvenc.py as a Phase A
codec adapter on the ADR-0237 contract. Drives Fraunhofer HHI's VVC /
H.266 encoder via FFmpeg's -c:v libvvenc wrapper. Quality knob "qp"
with informative range (17, 50), default 32. The harness's canonical
7-name preset vocabulary compresses onto VVenC's native 5-level scale
(faster / fast / medium / slow / slower) via a static map matching
the rule used by the parallel HEVC / AV1 adapter PRs.

First-class NN-VC (neural-network video coding) plumbing: nnvc_intra
toggle (default off) emits -vvenc-params IntraNN=1 to enable VVC's
learned 5x5 / 7x7 / 9x9 conv intra-prediction. Typical effect at
1080p natural content: ~1-3% bitrate gain at iso-VMAF, ~5-10x slower
intra encode time. NN loop filter and NN super-resolution toggles
deferred to follow-up ADRs once Phase B has a corpus to estimate
their cost / quality curves separately.

VVenC + NN-VC is the closest thing the open-source video stack has to
a "neural-augmented codec" today and is the natural counterpart to
the fork's existing tiny-AI measurement surface (vmaf_tiny_v2,
fr_regressor_v1, nr_metric_v1) — measurement and generation now
share the same vmaf-tune harness so future Phase B / C predictors
can learn when the NN-VC tools are worth their compute cost.

Test seam: tests mock subprocess so neither libvvenc nor a libvvenc-
enabled FFmpeg is required for the unit gate. 22 tests pass; ruff +
black + isort clean.
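The test seam can be sketched with `unittest.mock`. The `run_encode` function below is a hypothetical stand-in, not the harness's real function; the point is only that patching `subprocess.run` lets the unit gate pass on a machine with no FFmpeg binary.

```python
import subprocess
from unittest import mock


def run_encode(argv):
    """Hypothetical stand-in for the harness: run FFmpeg, return its exit code."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return proc.returncode


def test_run_encode_without_ffmpeg():
    # Canned result, so no real process is ever spawned.
    fake = subprocess.CompletedProcess(args=[], returncode=0, stdout="", stderr="")
    argv = ["ffmpeg", "-c:v", "libvvenc", "-preset", "medium", "-qp", "32"]
    with mock.patch("subprocess.run", return_value=fake) as run:
        rc = run_encode(argv)
    assert rc == 0
    run.assert_called_once()
    # The first positional argument is the argv the harness composed.
    assert run.call_args.args[0] == argv
```

This only verifies argv composition; whether a given FFmpeg build accepts the arguments still needs the integration smoke mentioned in the test plan.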

ADR-0285 covers the design; companion docs update under
docs/usage/vmaf-tune.md adds a "VVenC (H.266 / VVC + NNVC)" section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lusoris lusoris force-pushed the feat/vmaf-tune-codec-adapter-vvenc branch from 957cb78 to 28b7aef Compare May 5, 2026 09:55

Copilot AI left a comment

Pull request overview

Adds a new vmaf-tune codec adapter for FFmpeg's libvvenc encoder so the Phase A corpus tooling can target VVC/H.266 and document an initial NN-VC surface. This extends the fork-local multi-codec adapter layer under tools/vmaf-tune/ and updates the surrounding tests and documentation.

Changes:

  • Adds VVenCAdapter and registers libvvenc in the codec adapter registry.
  • Adds adapter-focused tests and relaxes the corpus registry test to allow multiple codecs.
  • Adds user/docs artifacts for the new adapter, including an ADR, rebase note, changelog entry, and ADR index row.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tools/vmaf-tune/tests/test_corpus.py | Relaxes the codec-registry assertion from x264-only to x264-present. |
| tools/vmaf-tune/tests/test_codec_adapter_vvenc.py | Adds unit tests for registry wiring, preset mapping, validation, NNVC params, and command construction. |
| tools/vmaf-tune/src/vmaftune/codec_adapters/vvenc.py | Introduces the new VVenC adapter, preset projection map, quality metadata, and NNVC extra-param builder. |
| tools/vmaf-tune/src/vmaftune/codec_adapters/__init__.py | Registers and exports VVenCAdapter. |
| docs/usage/vmaf-tune.md | Documents VVenC usage, preset/QP behavior, and NN-VC concepts. |
| docs/rebase-notes.md | Adds a rebase-tracking entry for the new adapter work. |
| docs/adr/README.md | Adds an ADR index row for ADR-0285. |
| docs/adr/0285-vmaf-tune-vvenc-nnvc.md | Adds the ADR describing the adapter decision and tradeoffs. |
| CHANGELOG.md | Adds an Unreleased changelog entry for the adapter. |
Comments suppressed due to low confidence (2)

docs/adr/README.md:265

  • docs/adr/README.md is a generated file in this repo: docs/adr/_index_fragments/README.md:3-30 requires adding a fragment under docs/adr/_index_fragments/ and appending the slug to _order.txt, then regenerating the index. This PR only edits the rendered README, so scripts/docs/concat-adr-index.sh --check will report drift and the new row will be lost the next time the index is regenerated.
| [ADR-0272](0272-fr-regressor-v2-codec-aware-scaffold.md) | `fr_regressor_v2` codec-aware scaffold — first downstream consumer of the vmaf-tune Phase A JSONL corpus ([ADR-0237](0237-quality-aware-encode-automation.md)). Ships [`ai/scripts/train_fr_regressor_v2.py`](../../ai/scripts/train_fr_regressor_v2.py), a smoke ONNX (`fr_regressor_v2.onnx` registered with `smoke: true`), sidecar JSON, and full doc surface ([model card](../ai/models/fr_regressor_v2.md), [research digest](../research/0058-fr-regressor-v2-feasibility.md)). Two-input ONNX: 6 canonical libvmaf features (`adm2`, `vif_scale0..3`, `motion2`, StandardScaler-normalised) + 8-D codec block (6-way encoder one-hot + preset_norm + crf_norm, both in `[0, 1]`). MLP shape `6 -> 16 -> 16 -> 1` with codec block concatenated before the first dense layer (matches the existing `FRRegressor(num_codecs=8)` plumbing landed by [ADR-0235](0235-codec-aware-fr-regressor.md)). Registry row stays `smoke: true` until a follow-up PR (T7-FR-REGRESSOR-V2-PROD) re-runs training on a real Phase A corpus and clears v1's 0.95 LOSO PLCC ship gate with the ≥0.005 multi-codec lift required by ADR-0235. | Proposed | ai, dnn, tiny-ai, fr-regressor, codec-aware, vmaf-tune, fork-local |

CHANGELOG.md:33

  • The Unreleased section is generated from changelog.d/* fragments in this repo (changelog.d/README.md:3-29), but this PR edits CHANGELOG.md directly and does not add a matching fragment under changelog.d/added/. Regenerating the changelog will drop this entry, and CI's changelog drift check will fail until the fragment source is added.
- **`fr_regressor_v2` codec-aware scaffold — first downstream consumer
  of the vmaf-tune Phase A JSONL corpus (ADR-0272, prereq for
  Phase B).** Ships
  [`ai/scripts/train_fr_regressor_v2.py`](ai/scripts/train_fr_regressor_v2.py)
  — a scaffold-only trainer that consumes the JSONL corpus emitted by
  `vmaf-tune corpus` (ADR-0237 Phase A) and trains the codec-aware
  variant of the v1 FR regressor. Two-input ONNX (`features` shape
  `(N, 6)` canonical-6 + `codec` shape `(N, 8)` block —
  `[encoder_onehot(6), preset_norm, crf_norm]`); reuses the existing
  `FRRegressor(num_codecs=8)` class plumbed by ADR-0235. A `--smoke`
  mode synthesises 100 fake corpus rows and trains 1 epoch so the
  pipeline is end-to-end exercisable in CI without hours of encode
  time. Registers `fr_regressor_v2` in `model/tiny/registry.json`
  with `smoke: true` until a follow-up PR runs production training on
  a real Phase A corpus and clears the ADR-0235 ship gate (≥0.005
  multi-codec PLCC lift over v1's 0.95 LOSO floor). Doc surface:
  [model card](docs/ai/models/fr_regressor_v2.md),
  [research digest](docs/research/0058-fr-regressor-v2-feasibility.md),
  [ADR-0272](docs/adr/0272-fr-regressor-v2-codec-aware-scaffold.md),
  `ai/AGENTS.md` invariant note pinning the codec block layout and
  encoder vocabulary. Smoke validated locally (`python
  ai/scripts/train_fr_regressor_v2.py --smoke` produces a valid
  opset-17 two-input ONNX, op-allowlist clean, torch-vs-ORT roundtrip

Comment on lines +110 to +158
    # NNVC tool toggles. Default off so Phase A grids stay deterministic
    # and reasonably fast; flipping any of these shifts the encoder's
    # rate-distortion curve and is recorded in the corpus row's
    # ``extra_params`` for downstream predictor conditioning.
    nnvc_intra: bool = False

    def native_preset(self, preset: str) -> str:
        """Return the native VVenC preset for a 7-name canonical preset.

        Raises ``ValueError`` for unknown names. Pure function — no
        I/O — so the search loop can pre-compute the projection.
        """
        if preset not in _PRESET_MAP:
            raise ValueError(
                f"unknown libvvenc preset {preset!r}; expected one of " f"{tuple(_PRESET_MAP)}"
            )
        return _PRESET_MAP[preset]

    def validate(self, preset: str, qp: int) -> None:
        """Raise ``ValueError`` if ``(preset, qp)`` is unsupported."""
        if preset not in _PRESET_MAP:
            raise ValueError(
                f"unknown libvvenc preset {preset!r}; expected one of " f"{tuple(_PRESET_MAP)}"
            )
        lo, hi = self.quality_range
        if not lo <= qp <= hi:
            raise ValueError(f"qp {qp} outside libvvenc range [{lo}, {hi}]")

    def extra_params(self) -> tuple[str, ...]:
        """FFmpeg ``-c:v libvvenc`` arg suffix for the NNVC toggles.

        Returns an immutable tuple so callers can safely concatenate
        into ``EncodeRequest.extra_params``. Empty when no NNVC tool
        is enabled.

        FFmpeg's ``libvvenc`` wrapper forwards opaque ``-vvenc-params
        key=value:key=value`` strings down to the underlying VVenC
        config object, which is the surface VVenC's CLI documents
        for NNVC toggles.
        """
        toggles: list[str] = []
        if self.nnvc_intra:
            # ``IntraNN`` is the VVenC config-key for the learned
            # intra-prediction tool. Value 1 enables the 5×5 / 7×7 /
            # 9×9 conv ladder; value 0 keeps the handcrafted modes.
            toggles.append("IntraNN=1")
        if not toggles:
            return ()
        return ("-vvenc-params", ":".join(toggles))
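As a usage sketch for the excerpt above, the following shows how the tuple returned by `extra_params` might be appended to a codec argv slice. `VVenCAdapterSketch` and `codec_args` are hypothetical names, the `-preset` / `-qp` layout is an assumption about how the harness composes arguments, and the `IntraNN=1` key is taken from the PR as-is rather than verified against VVenC.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VVenCAdapterSketch:
    """Cut-down stand-in for the adapter; only the NNVC toggle is modelled."""
    encoder: str = "libvvenc"
    nnvc_intra: bool = False

    def extra_params(self) -> tuple[str, ...]:
        toggles: list[str] = []
        if self.nnvc_intra:
            toggles.append("IntraNN=1")  # key name as quoted in the PR
        # Multiple toggles would be joined as key=value:key=value.
        return ("-vvenc-params", ":".join(toggles)) if toggles else ()


def codec_args(adapter: VVenCAdapterSketch, native_preset: str, qp: int) -> tuple[str, ...]:
    """Compose the codec slice of an FFmpeg argv (assumed layout)."""
    base = ("-c:v", adapter.encoder, "-preset", native_preset, "-qp", str(qp))
    return base + adapter.extra_params()


with_nn = codec_args(VVenCAdapterSketch(nnvc_intra=True), "slower", 27)
without_nn = codec_args(VVenCAdapterSketch(), "medium", 32)
```

With the toggle off the slice carries no `-vvenc-params` pair at all, which keeps default Phase A grids identical to a plain libvvenc invocation.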
Comment on lines 34 to 36
from .h264_qsv import H264QsvAdapter
from .hevc_amf import HEVCAMFAdapter
from .hevc_nvenc import HevcNvencAdapter
Comment on lines +51 to +57
# Compress the fork's canonical 7-name preset vocabulary onto VVenC's
# 5-level scale. The 7-name vocabulary is the union of x264's 10
# presets minus duplicates and is the one the search loop emits;
# every adapter decides locally how to project onto its native scale.
# Anything strictly slower than ``slow`` (placebo / slowest / slower)
# pins to VVenC's deepest preset; anything strictly faster than
# ``fast`` pins to ``faster``. ``medium`` is the default.
Comment on lines +90 to +107
def test_vvenc_ffmpeg_command_carries_native_preset_and_qp():
    # The harness validates / projects via the adapter, then composes the
    # ffmpeg argv via build_ffmpeg_command. We check the wired surface.
    a = VVenCAdapter(nnvc_intra=True)
    a.validate("slower", 27)
    req = EncodeRequest(
        source=Path("/tmp/ref.yuv"),
        width=1920,
        height=1080,
        pix_fmt="yuv420p",
        framerate=24.0,
        encoder=a.encoder,
        preset=a.native_preset("slower"),
        crf=27,  # encoder-agnostic name; carries QP for VVenC
        output=Path("/tmp/out.mkv"),
        extra_params=a.extra_params(),
    )
    cmd = build_ffmpeg_command(req)
@lusoris lusoris merged commit 2316846 into master May 5, 2026
54 of 57 checks passed
@lusoris lusoris deleted the feat/vmaf-tune-codec-adapter-vvenc branch May 5, 2026 10:14
lusoris pushed a commit that referenced this pull request May 5, 2026
…s 17 adapters)

lusoris added a commit that referenced this pull request May 5, 2026
…s 17 adapters) (#376)

* feat(tools): vmaf-tune encode.py — codec-agnostic dispatcher (unblocks 17 adapters)


* chore(docs): renumber encode-multi-codec ADR 0294→0297 + research 0069→0070

---------

Co-authored-by: Lusoris <lusoris@pm.me>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>