
feat(tools): vmaf-tune fast — proxy-based recommend (research + scaffold) #355

Merged
lusoris merged 2 commits into master from feat/vmaf-tune-fast-path
May 4, 2026

lusoris commented May 3, 2026

Summary

Phase A.5 scaffold of tools/vmaf-tune/ — adds an opt-in
vmaf-tune fast subcommand that combines VMAF-proxy + Bayesian
search + GPU-accelerated verify to collapse the recommendation
use case from the Phase A grid's hours-long wall-time to
seconds-to-minutes. Slow Phase A grid path stays canonical as the
ground-truth corpus generator (ADR-0237 contract); fast-path is
opt-in via pip install vmaf-tune[fast].

  • New ADR docs/adr/0276-vmaf-tune-fast-path.md (Proposed). Cites parent
    ADR-0237 + ADR-0272 (fr_regressor_v2).
  • New research digest
    docs/research/0060-vmaf-tune-fast-path.md — bottleneck
    analysis, five candidate acceleration levers, speedup model
    (≈20–50× without NVENC, ≈100–500× with NVENC follow-up),
    decision matrix.
  • New scaffold module
    tools/vmaf-tune/src/vmaftune/fast.py — production-shape
    fast_recommend(...) entry point, Optuna TPE search loop,
    synthetic-predictor --smoke mode that exercises the pipeline
    without ffmpeg / ONNX / GPU, lazy-imported optional Optuna dep
    gated behind a new [fast] install extra.
  • CLI wiring in
    tools/vmaf-tune/src/vmaftune/cli.py: new fast subcommand
    with --target-vmaf, --encoder, --crf-lo/--crf-hi,
    --n-trials, --time-budget-s, --smoke.
  • 5 new tests in
    tools/vmaf-tune/tests/test_fast.py; full
    tools/vmaf-tune/tests/ suite is 18/18 green.
  • User docs extended in docs/usage/vmaf-tune.md with a new
    Phase A.5 section (install, smoke quick start, CLI flags,
    speedup model, "what's needed for production" checklist).
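
The lazy-import gate for the optional Optuna dependency could look roughly like the sketch below. This is an illustration only: `require_optional` and the wording of the error message are hypothetical, not taken from `fast.py`.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency on first use, or explain how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Hypothetical message format; the real scaffold may word this differently.
        raise ImportError(
            f"'{module_name}' is required for this feature; "
            f"install it with: pip install vmaf-tune[{extra}]"
        ) from exc
```

Calling `require_optional("optuna", "fast")` inside the search function, rather than at module import time, keeps `import vmaftune` working on hosts that never installed the `[fast]` extra.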

What's deferred (follow-up PR)

Per ADR-0276 §Consequences and Research-0060 §"What is deferred":

  • Real fr_regressor_v2.onnx weights (gated on PR #347, "feat(ai):
    fr_regressor_v2 codec-aware scaffold (Phase B prereq)", plus Phase A
    corpus generation).
  • ONNX Runtime wiring for the proxy inference call (the scaffold
    exposes a predictor= injection seam).
  • Sample-chunk encode loop + canonical-6 extract pipeline.
  • GPU verify pass (CUDA / Vulkan / SYCL auto-detection).
  • NVENC / QSV / AMF auto-detection (lever C, ≈10× extra speedup).
  • Per-shot parallelisation (lever D, integrates with TransNet V2).
  • Recommendation-quality benchmark gating ADR Acceptance.
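
As a rough illustration of the deferred hardware-encoder auto-detection (lever C), the sketch below parses the output of `ffmpeg -hide_banner -encoders` and falls back to libx264. The helper names and the preference order are assumptions; the follow-up PR may detect encoders differently.

```python
import shutil
import subprocess

# ffmpeg's H.264 hardware encoder names; the preference order is an assumption.
HW_H264_ENCODERS = ("h264_nvenc", "h264_qsv", "h264_amf")


def pick_encoder(encoders_listing: str, fallback: str = "libx264") -> str:
    """Return the first hardware H.264 encoder named in `ffmpeg -encoders` output."""
    available = set()
    for line in encoders_listing.splitlines():
        parts = line.split()
        # Video encoder rows start with a flags column such as "V....D",
        # followed by the encoder name.
        if len(parts) >= 2 and parts[0].startswith("V"):
            available.add(parts[1])
    for name in HW_H264_ENCODERS:
        if name in available:
            return name
    return fallback


def detect_encoder(fallback: str = "libx264") -> str:
    """Query the local ffmpeg binary; fall back when it is missing or fails."""
    if shutil.which("ffmpeg") is None:
        return fallback
    try:
        proc = subprocess.run(
            ["ffmpeg", "-hide_banner", "-encoders"],
            capture_output=True, text=True, check=False, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return fallback
    return pick_encoder(proc.stdout, fallback)
```

An encoder being listed only means ffmpeg was built with it; a production check would likely also need a short trial encode to confirm the hardware is usable.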

Test plan

  • pytest tools/vmaf-tune/tests/ — 18/18 pass.
  • vmaf-tune fast --smoke --target-vmaf 92 — recommends
    CRF=18 / predicted VMAF≈92.4 / 4121 kbps.
  • vmaf-tune fast --smoke --target-vmaf 70 — recommends
    higher CRF (search responds to target).
  • pre-commit run --files <touched> — all gates pass
    (black / isort / ruff / semgrep / secrets).
  • Production-loop integration tests — deferred to follow-up
    PR (gated on real fr_regressor_v2 weights existing).
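
The "search responds to target" check can be pictured with a toy model. The sketch below is not the scaffold's synthetic predictor: the logistic curve, its constants, and the default CRF bounds are all hypothetical, and an exhaustive scan stands in for the Optuna search.

```python
import math


def synthetic_vmaf(crf: int) -> float:
    """Hypothetical logistic CRF-to-VMAF curve: near 100 at low CRF, falling as CRF rises."""
    return 100.0 / (1.0 + math.exp((crf - 40.0) / 6.0))


def recommend_crf(target_vmaf: float, crf_lo: int = 10, crf_hi: int = 51) -> int:
    """Return the highest CRF whose predicted VMAF still meets the target.

    Falls back to crf_lo when no CRF in the range reaches the target.
    """
    feasible = [
        crf for crf in range(crf_lo, crf_hi + 1)
        if synthetic_vmaf(crf) >= target_vmaf
    ]
    return max(feasible) if feasible else crf_lo
```

Because the curve is monotonically decreasing in CRF, lowering the target can only raise the recommended CRF, which is the behaviour the two smoke invocations above are checking.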

ADR-0108 deliverables (six)

Hard rules honoured

  • No claim of Netflix-equivalent quality; the digest's speedup
    numbers are explicit upper bounds and ADR-0276 gates Acceptance
    on a recommendation-quality benchmark against the slow grid.
  • No baked-in gpu_encoder choice that fails on hosts without
    NVENC; the scaffold defaults to libx264, lever C is
    follow-up-only.
  • Phase A grid path is untouched — fast-path is opt-in.

🤖 Generated with Claude Code

lusoris force-pushed the feat/vmaf-tune-fast-path branch from a41f22e to 56e810b on May 4, 2026 21:06
…old)

Phase A.5 of `tools/vmaf-tune/` (ADR-0276 Proposed, Research-0060). Adds
an opt-in `vmaf-tune fast` subcommand that combines three acceleration
levers — VMAF proxy via `fr_regressor_v2` (ADR-0272), Bayesian search
via Optuna's TPE sampler, GPU-accelerated VMAF verify (ADR-0157,
ADR-0186) — to collapse the recommendation use case from the Phase A
grid's hours-long wall-time to seconds-to-minutes (~20-50× without
NVENC, ~100-500× with NVENC follow-up).

Slow Phase A grid stays canonical as the ground-truth corpus generator
(ADR-0237 contract); fast-path is opt-in via `pip install
vmaf-tune[fast]`. This PR ships the scaffold only — Optuna search loop,
smoke-mode synthetic predictor, CLI subcommand, production-shape entry
point. Real encode + ONNX inference + GPU verify wiring is a follow-up
PR gated on Phase A corpus existence and `fr_regressor_v2` weights
training (PR #347).

Smoke test: `vmaf-tune fast --smoke --target-vmaf 92` — runs Optuna
over a synthetic x264-shaped CRF→VMAF curve without ffmpeg, ONNX
Runtime, or a GPU. 5 new tests in `tests/test_fast.py`; full
`tools/vmaf-tune/tests/` suite is 18/18 green.
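
One plausible shape for the search loop, assuming the objective is "lowest bitrate that still meets the VMAF target", is a penalised loss fed to Optuna's TPE sampler. The names `make_objective` and `search_crf`, the `Predictor` signature, the penalty weight, and the default bounds are all hypothetical; only the Optuna calls themselves are the library's real API.

```python
from typing import Callable

# Assumed predictor shape: CRF -> (predicted VMAF, predicted bitrate in kbps).
Predictor = Callable[[int], tuple[float, float]]


def make_objective(
    predictor: Predictor, target_vmaf: float, penalty: float = 1e4
) -> Callable[[int], float]:
    """Build a loss to minimise: bitrate plus a penalty per VMAF point of shortfall."""

    def loss(crf: int) -> float:
        vmaf, kbps = predictor(crf)
        shortfall = max(0.0, target_vmaf - vmaf)
        return kbps + penalty * shortfall

    return loss


def search_crf(
    predictor: Predictor,
    target_vmaf: float,
    crf_range: tuple[int, int] = (10, 51),
    n_trials: int = 50,
) -> int:
    """Run a seeded TPE search over integer CRF values and return the best one."""
    import optuna  # lazy import: only needed when a search actually runs

    optuna.logging.set_verbosity(optuna.logging.WARNING)
    loss = make_objective(predictor, target_vmaf)
    study = optuna.create_study(
        direction="minimize",
        sampler=optuna.samplers.TPESampler(seed=0),
    )
    study.optimize(
        lambda trial: loss(trial.suggest_int("crf", *crf_range)),
        n_trials=n_trials,
        show_progress_bar=False,
    )
    return study.best_params["crf"]
```

Passing the predictor in as a callable mirrors the `predictor=` injection seam the PR describes: a synthetic curve can drive the loop in smoke mode, and an ONNX-backed callable could replace it later without touching the search code.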

ADR-0108 deliverables:
- (1) Research digest: `docs/research/0060-vmaf-tune-fast-path.md`
- (2) Decision matrix: ADR-0276 §Alternatives considered
- (3) AGENTS.md invariants: `tools/vmaf-tune/AGENTS.md` (fast-path is
      opt-in; Optuna stays lazy-imported)
- (4) Reproducer: `vmaf-tune fast --smoke --target-vmaf 92` (in PR body)
- (5) CHANGELOG fragment: `changelog.d/added/vmaf-tune-fast-path-scaffold.md`
- (6) Rebase-notes entry: 0229 (no upstream impact; entirely fork-local)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lusoris force-pushed the feat/vmaf-tune-fast-path branch from 56e810b to d8409c1 on May 4, 2026 21:37
lusoris marked this pull request as ready for review May 4, 2026 21:38
Copilot AI review requested due to automatic review settings May 4, 2026 21:38
Copilot AI left a comment

Pull request overview

Adds a new opt-in Phase A.5 “fast” recommendation scaffold to tools/vmaf-tune/, intended to drastically reduce CRF recommendation wall-time using a proxy + Bayesian search loop (production encode/proxy/verify wiring is explicitly deferred).

Changes:

  • Introduces vmaftune.fast.fast_recommend() with Optuna-driven TPE search and a --smoke synthetic predictor path.
  • Wires a new vmaf-tune fast CLI subcommand and adds an optional [fast] extra (Optuna) in pyproject.toml.
  • Adds smoke-focused unit tests plus accompanying ADR/research/docs/changelog/rebase-note entries.
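
For orientation, the `fast` subcommand wiring might resemble the argparse sketch below. The flag names, the `libx264` default, and the 300-second time budget come from this PR; `build_parser` and the CRF and trial-count defaults are hypothetical, and the source-video argument is omitted because its CLI spelling is not shown here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser exposing only the `fast` subcommand."""
    parser = argparse.ArgumentParser(prog="vmaf-tune")
    sub = parser.add_subparsers(dest="command", required=True)

    fast = sub.add_parser("fast", help="proxy-based CRF recommendation (opt-in)")
    fast.add_argument("--target-vmaf", type=float, required=True)
    fast.add_argument("--encoder", default="libx264")
    fast.add_argument("--crf-lo", type=int, default=10)    # hypothetical default
    fast.add_argument("--crf-hi", type=int, default=51)    # hypothetical default
    fast.add_argument("--n-trials", type=int, default=50)  # hypothetical default
    fast.add_argument("--time-budget-s", type=int, default=300)
    fast.add_argument("--smoke", action="store_true")
    return parser
```

With this sketch, `build_parser().parse_args(["fast", "--smoke", "--target-vmaf", "92"])` yields a namespace whose `smoke` attribute is `True` and whose `encoder` is `"libx264"`.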

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tools/vmaf-tune/src/vmaftune/fast.py | New fast-path scaffold: Optuna search loop + smoke-mode predictor + production-shaped API. |
| tools/vmaf-tune/src/vmaftune/cli.py | Adds fast subcommand and JSON output plumbing; lazy-imports fast module. |
| tools/vmaf-tune/tests/test_fast.py | New tests validating smoke mode + predictor injection behavior (Optuna-gated). |
| tools/vmaf-tune/pyproject.toml | Adds optional dependency extras (fast) and includes Optuna in dev. |
| tools/vmaf-tune/AGENTS.md | Documents invariants: fast-path is opt-in; Optuna must remain optional/lazy-imported. |
| docs/usage/vmaf-tune.md | Documents Phase A.5 fast scaffold usage, flags, and production checklist. |
| docs/research/0060-vmaf-tune-fast-path.md | Research digest motivating proxy+Bayesian+GPU-verify fast-path. |
| docs/adr/0276-vmaf-tune-fast-path.md | New proposed ADR defining the fast-path scope and non-goals. |
| docs/adr/_index_fragments/0276-vmaf-tune-fast-path.md | ADR index row fragment for ADR-0276. |
| docs/adr/_index_fragments/_order.txt | Adds ADR-0276 to ADR index ordering. |
| changelog.d/added/vmaf-tune-fast-path-scaffold.md | Changelog fragment announcing the new fast-path scaffold. |
| docs/rebase-notes.md | Adds rebase-note entry for the new fast-path scaffold. |

Comment on lines +149 to +156
def fast_recommend(
    src: Path | None,
    target_vmaf: float,
    encoder: str = "libx264",
    time_budget_s: int = 300,  # noqa: ARG001 — production only
    crf_range: tuple[int, int] = (DEFAULT_CRF_LO, DEFAULT_CRF_HI),
    n_trials: int = SMOKE_N_TRIALS,
    smoke: bool = False,
Comment on lines +167 to +173
    - In ``smoke=False`` mode, raises ``NotImplementedError`` with a
      pointer to the follow-up issue.

    Parameters
    ----------
    src
        Path to the source video. ``None`` only in smoke mode.
Comment on lines +224 to +231
    # Suppress Optuna's default INFO-level chatter; the CLI is the
    # right place to surface progress (follow-up PR).
    optuna.logging.set_verbosity(optuna.logging.WARNING)
    study = optuna.create_study(
        direction="minimize",
        sampler=optuna.samplers.TPESampler(seed=0),
    )
    study.optimize(objective, n_trials=n_trials, show_progress_bar=False)

        "--encoder",
        default="libx264",
        choices=list(known_codecs()),
        help="codec adapter (Phase A.5: libx264 only; defaults to host's available)",
Comment thread docs/rebase-notes.md
# Expected pre-PR-346 (current master): 42/48 mismatches at higher ratio.
# If the count drops below 5/48 on NVIDIA, ADR-0273 should record the
# delta and consider closing T-VK-CIEDE-F32-F64.
### 0229 — `tools/vmaf-tune fast` Phase A.5 scaffold (ADR-0276)
Co-Authored-By: Claude <noreply@anthropic.com>