feat(tools): vmaf-tune Phase D — per-shot CRF tuning (transnet_v2)#369
Merged
feat(tools): vmaf-tune Phase D — per-shot CRF tuning (transnet_v2)#369
Conversation
95c379b to
45fd4f2
Compare
Closes the orchestration layer for Bucket #1 of Research-0061's `vmaf-tune` capability audit (the Netflix-style table-stakes per-shot encoding feature). Ships `tools/vmaf-tune/src/vmaftune/per_shot.py` plus the `vmaf-tune tune-per-shot` CLI subcommand: * `detect_shots()` wraps the C-side `vmaf-perShot` binary (ADR-0222 / TransNet V2 ADR-0223) with a single-shot fallback when the binary is unavailable or fails. * `tune_per_shot()` exposes a pluggable predicate seam Phase B's bisect (PR #347) drops into. Default predicate returns the codec adapter's default CRF so the scaffold round-trips before Phase B lands as code. * `merge_shots()` emits one `ffmpeg` argv per shot (`-ss` + `-frames:v`) plus a final concat-demuxer command. Scaffold-only — does not run encodes, does not yet emit native per-codec mechanisms (`--qpfile` for x264, `--zones` for x265, SVT-AV1 segment tables); per-segment + concat is the portable fallback. Per-codec native emission lands per-codec alongside each new adapter. 16 new tests pass with mocked `vmaf-perShot` + mocked encoder; total `vmaf-tune` suite is 29 tests, zero binaries required. First per-phase split off ADR-0237. Updates ADR index, CHANGELOG, docs/usage/vmaf-tune.md (new "Phase D" section + flag table + plan JSON schema), tools/vmaf-tune/AGENTS.md (per-shot rebase invariants), and docs/rebase-notes.md (entry 0228). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
45fd4f2 to
2c16e6a
Compare
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds the Phase D scaffold for vmaf-tune to support per-shot CRF tuning orchestration (shot detection → per-shot recommendation → FFmpeg segment+concat plan), including a tune-per-shot CLI subcommand, docs, and smoke tests.
Changes:
- Introduces
vmaftune.per_shotwithdetect_shots,tune_per_shot,merge_shots, and plan rendering/persistence helpers. - Adds
vmaf-tune tune-per-shotCLI subcommand that emits a JSON encoding plan (optionally to files). - Documents Phase D scaffold behavior and adds unit/smoke tests with mocked subprocess seams.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/vmaf-tune/src/vmaftune/per_shot.py | New per-shot orchestration module: shot detection wrapper, per-shot tuning loop, FFmpeg plan emission. |
| tools/vmaf-tune/src/vmaftune/cli.py | Adds tune-per-shot subcommand and JSON/shell-script plan output plumbing. |
| tools/vmaf-tune/tests/test_per_shot.py | New tests covering fallback/JSON parsing/tuning/clamping/plan emission and CLI smoke. |
| tools/vmaf-tune/AGENTS.md | Captures Phase D invariants (predicate signature, half-open shots, canonical detector). |
| docs/usage/vmaf-tune.md | Documents Phase D scaffold usage, flags, and JSON plan schema. |
| docs/rebase-notes.md | Adds rebase note entry for Phase D scaffold touchpoints and invariants. |
| docs/adr/README.md | Registers ADR-0276 in the ADR index. |
| docs/adr/0276-vmaf-tune-phase-d-per-shot.md | New ADR describing the Phase D scaffold decision and tradeoffs. |
| CHANGELOG.md | Adds changelog entry for the new scaffold and CLI subcommand. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+391
to
+399
| def _shell_join(parts: Iterable[str]) -> str: | ||
| """Quote-aware join — minimum viable, no exotic shell escaping. | ||
|
|
||
| Stops short of full ``shlex.quote`` because the scaffold's argv is | ||
| constructed in-process and does not contain user-controlled | ||
| metacharacters; the helper exists for human-readable output, not | ||
| for safe shell evaluation. | ||
| """ | ||
| return " ".join(parts) |
Comment on lines
+348
to
+364
| start_seconds = shot.start_frame / framerate | ||
| return ( | ||
| ffmpeg_bin, | ||
| "-y", | ||
| "-hide_banner", | ||
| "-ss", | ||
| f"{start_seconds:.6f}", | ||
| "-i", | ||
| str(source), | ||
| "-frames:v", | ||
| str(shot.length), | ||
| "-c:v", | ||
| encoder, | ||
| "-crf", | ||
| str(crf), | ||
| str(output), | ||
| ) |
Comment on lines
+301
to
+302
| # concat-demuxer expects POSIX-style escaped paths. | ||
| listing_lines.append(f"file '{seg_path.as_posix()}'") |
Comment on lines
+293
to
+298
| per_shot.add_argument( | ||
| "--encoder", | ||
| default="libx264", | ||
| choices=list(known_codecs()), | ||
| help="codec adapter (Phase D scaffold: libx264 only)", | ||
| ) |
Comment on lines
18
to
26
| This doc covers **Phase A** of the six-phase roadmap: a multi-codec grid | ||
| sweep that produces the corpus the later phases consume. Phases B (target-VMAF | ||
| bisect), C (per-title CRF predictor), D (per-shot dynamic CRF), E (Pareto ABR | ||
| This doc covers **Phase A** of the six-phase roadmap (a `libx264` grid | ||
| sweep that produces the corpus the later phases consume) and the | ||
| **Phase D scaffold** (per-shot CRF tuning, see | ||
| [ADR-0276](../adr/0276-vmaf-tune-phase-d-per-shot.md)). Phases B | ||
| (target-VMAF bisect), C (per-title CRF predictor), E (Pareto ABR | ||
| ladder) and F (MCP tools) are not implemented yet — see ADR-0237. |
| bitdepth: int = 8, | ||
| total_frames: int | None = None, | ||
| per_shot_bin: str = "vmaf-perShot", | ||
| runner: object | None = None, |
Comment on lines
+153
to
+156
| runner_fn = runner or subprocess.run | ||
| completed = runner_fn( # type: ignore[operator] | ||
| cmd, capture_output=True, text=True, check=False | ||
| ) |
12 tasks
This was referenced May 6, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase D scaffold for
vmaf-tune— per-shot CRF tuning. Closes theorchestration layer for Bucket #1 of Research-0061's
vmaf-tunecapability audit (the Netflix-style table-stakes per-shot encoding
feature, ranked High impact / M effort).
tools/vmaf-tune/src/vmaftune/per_shot.py:detect_shots()(wraps the C-sidevmaf-perShotbinary —ADR-0222 / TransNet V2 ADR-0223 — with a single-shot fallback),
tune_per_shot()(pluggable predicate seam Phase B's bisectdrops into; default predicate returns the codec adapter's
default CRF),
merge_shots()(emits per-segment FFmpeg argv +a concat-demuxer command).
vmaf-tune tune-per-shot --src VIDEO --target-vmaf 92 .... Plan emitted as JSON to stdout (or--plan-out/--script-out).(
--qpfile/--zones/ SVT-AV1 segment tables) yet. Per-codecnative emission lands per-codec alongside each new adapter.
First per-phase split off
ADR-0237;
sibling to in-flight Phase B (PR #347).
Type
feat— new fork-local feature scaffoldBug-status hygiene
Netflix golden-data gate
assertAlmostEqualscore.Deep-dive deliverables (ADR-0108)
Bucket chore: release master #1 ranks per-shot tuning High impact, M effort. Re-cited in
ADR-0276 §References.
considered: five alternatives (pluggable scaffold chosen,
wait-for-deps, native
--qpfile/--zonesfrom day one, inlineONNX shot detector, Phase B bisect inlined into Phase D).
AGENTS.mdinvariant note — added a "Phase Drebase-sensitive invariants" block to
tools/vmaf-tune/AGENTS.md:predicate signature contract, half-open
Shotranges,vmaf-perShotas canonical detector.changelog.d/added/ADR-0276-vmaf-tune-phase-d-per-shot.md(per ADR-0221 fragment pattern).
docs/rebase-notes.md.Reproducer
Mocked smoke run (no
ffmpeg,vmaf, orvmaf-perShotrequired):python -m pytest tools/vmaf-tune/tests/test_per_shot.py -q # 16 new tests pass; total vmaf-tune suite: 29 tests python tools/vmaf-tune/vmaf-tune tune-per-shot --helpEnd-to-end shape (with mocked vmaf-perShot via the runner seam, see
test_cli_tune_per_shot_smokeintools/vmaf-tune/tests/test_per_shot.py) emits a JSON plan withper-shot
(start_frame, end_frame, crf, predicted_vmaf)rows plusper-segment FFmpeg argv and a concat-demuxer command.
Out of scope
predicateargument once it lands.--qpfilefor x264,--zonesfor x265, SVT-AV1 segment tables) — lands per-codecalongside each new adapter PR.
claims — separate research item, likely BVI-DVC + Netflix
Public + KoNViD subsets re-encoded through Phase A's grid sweep.
vmaf-perShotstays the canonical detector to avoid two paralleltruth sources.