feat(tools): vmaf-tune Phase D — per-shot CRF tuning (transnet_v2)#369

Merged
lusoris merged 3 commits into master from feat/vmaf-tune-phase-d-per-shot on May 5, 2026

Conversation

@lusoris
Owner

@lusoris lusoris commented May 3, 2026

Summary

Phase D scaffold for vmaf-tune — per-shot CRF tuning. Closes the
orchestration layer for Bucket #1 of Research-0061's vmaf-tune
capability audit (the Netflix-style table-stakes per-shot encoding
feature, ranked High impact / M effort).

  • Adds tools/vmaf-tune/src/vmaftune/per_shot.py:
    detect_shots() (wraps the C-side vmaf-perShot binary —
    ADR-0222 / TransNet V2 ADR-0223 — with a single-shot fallback),
    tune_per_shot() (pluggable predicate seam Phase B's bisect
    drops into; default predicate returns the codec adapter's
    default CRF), merge_shots() (emits per-segment FFmpeg argv +
    a concat-demuxer command).
  • New CLI subcommand: vmaf-tune tune-per-shot --src VIDEO --target-vmaf 92 ...
    The plan is emitted as JSON to stdout (or written via
    --plan-out / --script-out).
  • Scaffold-only — no encodes are run, no native per-codec emission
    (--qpfile / --zones / SVT-AV1 segment tables) yet. Per-codec
    native emission lands per-codec alongside each new adapter.
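
The predicate seam described above can be sketched roughly as follows. This is a minimal illustration, not the code in per_shot.py: the `Predicate` type alias, `make_default_predicate`, `tune_shots`, and the tuple-based shot representation are all hypothetical names chosen for the example, and the real predicate signature is whatever tools/vmaf-tune/AGENTS.md pins down.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical shapes for illustration only.
ShotRange = Tuple[int, int]  # half-open [start_frame, end_frame)
Predicate = Callable[[ShotRange, float], Tuple[int, float]]  # -> (crf, predicted_vmaf)


def make_default_predicate(default_crf: int) -> Predicate:
    """Build a predicate that ignores the shot and returns a fixed CRF."""

    def predicate(shot: ShotRange, target_vmaf: float) -> Tuple[int, float]:
        # Nothing is measured here; the target is echoed back as a placeholder.
        return default_crf, target_vmaf

    return predicate


def tune_shots(
    shots: Sequence[ShotRange], target_vmaf: float, predicate: Predicate
) -> List[Dict[str, float]]:
    """Ask the predicate for one CRF per shot and collect plan rows."""
    rows = []
    for start_frame, end_frame in shots:
        crf, predicted_vmaf = predicate((start_frame, end_frame), target_vmaf)
        rows.append(
            {
                "start_frame": start_frame,
                "end_frame": end_frame,
                "crf": crf,
                "predicted_vmaf": predicted_vmaf,
            }
        )
    return rows
```

Under this assumed contract, a bisect-based predicate only has to be a callable with the same signature; the loop that walks the shots does not change.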

First per-phase split off ADR-0237; sibling to in-flight Phase B (PR #347).

Type

  • feat — new fork-local feature scaffold

Bug-status hygiene

  • no state delta: scaffold-only orchestration layer, no bug interaction.

Netflix golden-data gate

  • Did not modify any assertAlmostEqual score.

Deep-dive deliverables (ADR-0108)

  • Research digest — no digest needed: Research-0061 covers Phase D scope (already merged via PR #393);
    Bucket #1 ranks per-shot tuning High impact, M effort. Re-cited in
    ADR-0276 §References.
  • Decision matrix — captured in ADR-0276 §Alternatives
    considered: five alternatives (pluggable scaffold chosen,
    wait-for-deps, native --qpfile/--zones from day one, inline
    ONNX shot detector, Phase B bisect inlined into Phase D).
  • AGENTS.md invariant note — added a "Phase D
    rebase-sensitive invariants" block to tools/vmaf-tune/AGENTS.md:
    predicate signature contract, half-open Shot ranges,
    vmaf-perShot as canonical detector.
  • Reproducer / smoke-test command — see "Reproducer" below.
  • CHANGELOG fragment
    changelog.d/added/ADR-0276-vmaf-tune-phase-d-per-shot.md
    (per ADR-0221 fragment pattern).
  • Rebase note — entry 0228 in docs/rebase-notes.md.

Reproducer

Mocked smoke run (no ffmpeg, vmaf, or vmaf-perShot required):

python -m pytest tools/vmaf-tune/tests/test_per_shot.py -q
# 16 new tests pass; total vmaf-tune suite: 29 tests
python tools/vmaf-tune/vmaf-tune tune-per-shot --help

End-to-end shape (with mocked vmaf-perShot via the runner seam, see
test_cli_tune_per_shot_smoke in
tools/vmaf-tune/tests/test_per_shot.py) emits a JSON plan with
per-shot (start_frame, end_frame, crf, predicted_vmaf) rows plus
per-segment FFmpeg argv and a concat-demuxer command.

Out of scope

  • Phase B's bisect predicate (PR #347) — wired in via the
    predicate argument once it lands.
  • Per-codec native per-shot emission (--qpfile for x264,
    --zones for x265, SVT-AV1 segment tables) — lands per-codec
    alongside each new adapter PR.
  • Held-out per-shot validation corpus and any numerical quality
    claims — separate research item, likely BVI-DVC + Netflix
    Public + KoNViD subsets re-encoded through Phase A's grid sweep.
  • Native ONNX-Runtime-from-Python shot detection path —
    vmaf-perShot stays the canonical detector to avoid two parallel
    truth sources.

@lusoris lusoris force-pushed the feat/vmaf-tune-phase-d-per-shot branch from 95c379b to 45fd4f2 May 3, 2026 19:39
@lusoris lusoris marked this pull request as ready for review May 5, 2026 10:14
Copilot AI review requested due to automatic review settings May 5, 2026 10:14
Closes the orchestration layer for Bucket #1 of Research-0061's
`vmaf-tune` capability audit (the Netflix-style table-stakes per-shot
encoding feature). Ships `tools/vmaf-tune/src/vmaftune/per_shot.py`
plus the `vmaf-tune tune-per-shot` CLI subcommand:

* `detect_shots()` wraps the C-side `vmaf-perShot` binary
  (ADR-0222 / TransNet V2 ADR-0223) with a single-shot fallback
  when the binary is unavailable or fails.
* `tune_per_shot()` exposes a pluggable predicate seam Phase B's
  bisect (PR #347) drops into. Default predicate returns the codec
  adapter's default CRF so the scaffold round-trips before Phase B
  lands as code.
* `merge_shots()` emits one `ffmpeg` argv per shot (`-ss` +
  `-frames:v`) plus a final concat-demuxer command.

Scaffold-only — does not run encodes, does not yet emit native
per-codec mechanisms (`--qpfile` for x264, `--zones` for x265,
SVT-AV1 segment tables); per-segment + concat is the portable
fallback. Per-codec native emission lands per-codec alongside each
new adapter. 16 new tests pass with mocked `vmaf-perShot` + mocked
encoder; total `vmaf-tune` suite is 29 tests, zero binaries
required.

First per-phase split off ADR-0237. Updates ADR index, CHANGELOG,
docs/usage/vmaf-tune.md (new "Phase D" section + flag table + plan
JSON schema), tools/vmaf-tune/AGENTS.md (per-shot rebase invariants),
and docs/rebase-notes.md (entry 0228).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lusoris lusoris force-pushed the feat/vmaf-tune-phase-d-per-shot branch from 45fd4f2 to 2c16e6a May 5, 2026 10:15

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds the Phase D scaffold for vmaf-tune to support per-shot CRF tuning orchestration (shot detection → per-shot recommendation → FFmpeg segment+concat plan), including a tune-per-shot CLI subcommand, docs, and smoke tests.

Changes:

  • Introduces vmaftune.per_shot with detect_shots, tune_per_shot, merge_shots, and plan rendering/persistence helpers.
  • Adds vmaf-tune tune-per-shot CLI subcommand that emits a JSON encoding plan (optionally to files).
  • Documents Phase D scaffold behavior and adds unit/smoke tests with mocked subprocess seams.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tools/vmaf-tune/src/vmaftune/per_shot.py | New per-shot orchestration module: shot detection wrapper, per-shot tuning loop, FFmpeg plan emission. |
| tools/vmaf-tune/src/vmaftune/cli.py | Adds tune-per-shot subcommand and JSON/shell-script plan output plumbing. |
| tools/vmaf-tune/tests/test_per_shot.py | New tests covering fallback/JSON parsing/tuning/clamping/plan emission and CLI smoke. |
| tools/vmaf-tune/AGENTS.md | Captures Phase D invariants (predicate signature, half-open shots, canonical detector). |
| docs/usage/vmaf-tune.md | Documents Phase D scaffold usage, flags, and JSON plan schema. |
| docs/rebase-notes.md | Adds rebase note entry for Phase D scaffold touchpoints and invariants. |
| docs/adr/README.md | Registers ADR-0276 in the ADR index. |
| docs/adr/0276-vmaf-tune-phase-d-per-shot.md | New ADR describing the Phase D scaffold decision and tradeoffs. |
| CHANGELOG.md | Adds changelog entry for the new scaffold and CLI subcommand. |

Comment on lines +391 to +399
def _shell_join(parts: Iterable[str]) -> str:
    """Quote-aware join — minimum viable, no exotic shell escaping.

    Stops short of full ``shlex.quote`` because the scaffold's argv is
    constructed in-process and does not contain user-controlled
    metacharacters; the helper exists for human-readable output, not
    for safe shell evaluation.
    """
    return " ".join(parts)
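
For comparison, a quoting-safe variant of this helper could lean on the standard library. The sketch below assumes Python 3.8+ (where `shlex.join` is available); the name `shell_join_quoted` is hypothetical and this is not the change that was merged.

```python
import shlex
from typing import Iterable


def shell_join_quoted(parts: Iterable[str]) -> str:
    """Join argv into one shell-safe string, quoting each part as needed."""
    # shlex.join applies shlex.quote to every element, so paths containing
    # spaces or quotes survive a round trip through a POSIX shell.
    return shlex.join(parts)
```

A path such as `my clip.mp4` then renders as `'my clip.mp4'` in the emitted script, and `shlex.split` on the result recovers the original argv.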
Comment on lines +348 to +364
start_seconds = shot.start_frame / framerate
return (
    ffmpeg_bin,
    "-y",
    "-hide_banner",
    "-ss",
    f"{start_seconds:.6f}",
    "-i",
    str(source),
    "-frames:v",
    str(shot.length),
    "-c:v",
    encoder,
    "-crf",
    str(crf),
    str(output),
)
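
The frame arithmetic behind `-ss` and `-frames:v` in this excerpt can be illustrated with a small stand-alone sketch. It assumes half-open shot ranges as stated in the AGENTS.md invariants; the `Shot` dataclass and `seek_args` helper here are hypothetical stand-ins, not the definitions in per_shot.py.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Shot:
    """Hypothetical shot record covering frames [start_frame, end_frame)."""

    start_frame: int
    end_frame: int  # exclusive

    @property
    def length(self) -> int:
        # Half-open range: the frame count is a plain difference.
        return self.end_frame - self.start_frame


def seek_args(shot: Shot, framerate: float) -> Tuple[str, str, str, str]:
    """Return the seek offset and frame-count arguments for one segment."""
    start_seconds = shot.start_frame / framerate
    return ("-ss", f"{start_seconds:.6f}", "-frames:v", str(shot.length))
```

With half-open ranges, adjacent shots share a boundary value (one shot's `end_frame` is the next shot's `start_frame`), so the segment lengths add up to the total frame count without double-counting.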
Comment on lines +301 to +302
# concat-demuxer expects POSIX-style escaped paths.
listing_lines.append(f"file '{seg_path.as_posix()}'")
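
If segment paths can ever contain a single quote, the listing line needs extra escaping. A minimal sketch, assuming FFmpeg's documented quoting rules (a quoted string cannot contain `'`, so the quote is closed, an escaped `\'` is emitted, and the quote is reopened); `concat_file_line` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def concat_file_line(path: str) -> str:
    """Render one `file` directive for an FFmpeg concat-demuxer listing."""
    posix = PurePosixPath(path).as_posix()
    # Close the quote, emit an escaped quote, reopen: ' becomes '\''
    escaped = posix.replace("'", "'\\''")
    return f"file '{escaped}'"
```

Paths without quotes come out unchanged from the current form, so the extra step only matters for unusual file names.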
Comment on lines +293 to +298
per_shot.add_argument(
    "--encoder",
    default="libx264",
    choices=list(known_codecs()),
    help="codec adapter (Phase D scaffold: libx264 only)",
)
Comment thread docs/usage/vmaf-tune.md
Comment on lines 18 to 26
-This doc covers **Phase A** of the six-phase roadmap: a multi-codec grid
-sweep that produces the corpus the later phases consume. Phases B (target-VMAF
-bisect), C (per-title CRF predictor), D (per-shot dynamic CRF), E (Pareto ABR
+This doc covers **Phase A** of the six-phase roadmap (a `libx264` grid
+sweep that produces the corpus the later phases consume) and the
+**Phase D scaffold** (per-shot CRF tuning, see
+[ADR-0276](../adr/0276-vmaf-tune-phase-d-per-shot.md)). Phases B
+(target-VMAF bisect), C (per-title CRF predictor), E (Pareto ABR
+ladder) and F (MCP tools) are not implemented yet — see ADR-0237.
bitdepth: int = 8,
total_frames: int | None = None,
per_shot_bin: str = "vmaf-perShot",
runner: object | None = None,
Comment on lines +153 to +156
runner_fn = runner or subprocess.run
completed = runner_fn(  # type: ignore[operator]
    cmd, capture_output=True, text=True, check=False
)
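
The runner seam in this excerpt is what lets the tests run without any binaries. The sketch below shows one way such a seam and the single-shot fallback might fit together. `detect_with_fallback`, its `parse` callback, and `failing_runner` are hypothetical; the real `vmaf-perShot` output format is not shown in this PR, so parsing is left to the caller.

```python
import subprocess
from typing import Callable, List, Optional, Sequence, Tuple

ShotRange = Tuple[int, int]  # half-open [start_frame, end_frame)


def detect_with_fallback(
    cmd: Sequence[str],
    total_frames: int,
    parse: Callable[[str], List[ShotRange]],
    runner: Optional[Callable[..., subprocess.CompletedProcess]] = None,
) -> List[ShotRange]:
    """Run a detector command; fall back to one shot spanning the whole clip."""
    runner_fn = runner or subprocess.run
    single_shot = [(0, total_frames)]
    try:
        completed = runner_fn(list(cmd), capture_output=True, text=True, check=False)
    except OSError:
        # Binary missing or not executable.
        return single_shot
    if completed.returncode != 0:
        return single_shot
    # An empty parse result is also treated as "no usable detection".
    return parse(completed.stdout) or single_shot


def failing_runner(cmd, **kwargs):
    """Test double that mimics a detector exiting with an error."""
    return subprocess.CompletedProcess(args=cmd, returncode=1, stdout="", stderr="boom")
```

Passing `runner=failing_runner` exercises the fallback path without touching the filesystem or spawning a process.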
@lusoris lusoris merged commit 0ae78ac into master May 5, 2026
54 checks passed
@lusoris lusoris deleted the feat/vmaf-tune-phase-d-per-shot branch May 5, 2026 10:38
@lusoris lusoris restored the feat/vmaf-tune-phase-d-per-shot branch May 6, 2026 17:42
2 participants