
feat(ai,tools): hw_encoder_corpus.py — Phase A real-corpus runner #392

Merged
lusoris merged 2 commits into master from feat/phase-a-hw-encoder-corpus-runner on May 5, 2026

Conversation

@lusoris
Owner

@lusoris lusoris commented May 4, 2026

Summary

fr_regressor_v2 needs per-frame canonical-6 + VMAF + codec metadata to actually train. The existing vmaf-tune corpus CLI (ADR-0237, Phase A) emits only pooled VMAF — a scope-cut.

This PR ships scripts/dev/hw_encoder_corpus.py: encodes a raw YUV with NVENC / QSV / VAAPI / libx264 at a CRF/CQ grid, decodes back to raw YUV, scores with libvmaf (CUDA backend), and emits one JSONL row per (source, encoder, cq, frame) carrying integer_adm2, integer_vif_scale0..3, integer_motion2 (canonical-6) + per-frame VMAF + encoder / cq / enc_bytes / enc_time_ms.
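To make the row contract concrete, here is a minimal sketch of a reader that checks one JSONL line against the fields described above. The key names (`adm2`, `vif_scale0`, and so on) follow the row dict quoted in the review excerpts further down this page; the `parse_row` helper and every value in the sample row are hypothetical and only illustrate the shape.

```python
import json

# Feature keys as they appear in an emitted row (the canonical-6 set).
CANONICAL_6_KEYS = (
    "adm2", "vif_scale0", "vif_scale1", "vif_scale2", "vif_scale3", "motion2",
)
# Per-frame score plus encode metadata carried on every row.
META_KEYS = ("src", "encoder", "cq", "enc_bytes", "enc_time_ms", "frame_index", "vmaf")


def parse_row(line: str) -> dict:
    """Parse one JSONL line and fail loudly if an expected key is absent."""
    row = json.loads(line)
    missing = [k for k in CANONICAL_6_KEYS + META_KEYS if k not in row]
    if missing:
        raise ValueError(f"row is missing keys: {missing}")
    return row


# Hypothetical sample row; the numbers are placeholders, not measured data.
sample = json.dumps({
    "src": "example_source", "encoder": "h264_nvenc", "cq": 28,
    "enc_bytes": 123456, "enc_time_ms": 42.0, "frame_index": 0, "vmaf": 90.0,
    "adm2": 0.9, "vif_scale0": 0.5, "vif_scale1": 0.8,
    "vif_scale2": 0.9, "vif_scale3": 0.95, "motion2": 1.0,
})
row = parse_row(sample)
print(row["encoder"], row["cq"], row["frame_index"])
```

A strict check like this is one way to catch schema drift early; a training loader might prefer to skip bad rows instead of raising.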

Smoke evidence

Local run on this fork (RTX 4090 + Arc A380):

| pipeline | encoder | rows | VMAF range | mean |
|---|---|---|---|---|
| NVENC | h264_nvenc | 5,640 | 16.21–100.00 | 86.43 |
| NVENC | hevc_nvenc | 5,640 | 18.35–100.00 | 87.85 |
| NVENC | av1_nvenc | 5,640 | 44.97–100.00 | 95.09 |
| Arc QSV | h264_qsv | 5,640 | 28.75–100.00 | 88.34 |
| Arc QSV | hevc_qsv | 5,640 | 28.41–100.00 | 88.05 |
| Arc QSV | av1_qsv | 5,640 | 26.10–100.00 | 88.21 |

33,840 rows in ~5 min wall — both pipelines run concurrently because NVENC and Arc QSV use different hardware engines on different cards. Output lands in runs/phase_a/ (gitignored); rerun to reproduce.
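The per-encoder figures above (row count, VMAF range, mean) can be recomputed from the JSONL output. The following is a minimal sketch, assuming each row carries the `encoder` and `vmaf` keys shown in the review excerpts further down; the `summarize` helper is hypothetical and the demo rows are made-up values, not data from the smoke run.

```python
import json
from collections import defaultdict


def summarize(lines):
    """Return {encoder: (rows, min_vmaf, max_vmaf, mean_vmaf)} for JSONL lines."""
    scores = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        if row.get("vmaf") is None:
            # Rows without a per-frame score cannot contribute to the stats.
            continue
        scores[row["encoder"]].append(float(row["vmaf"]))
    return {
        enc: (len(vals), min(vals), max(vals), sum(vals) / len(vals))
        for enc, vals in scores.items()
    }


# Made-up demo rows, only to show the call shape.
demo = [
    json.dumps({"encoder": "h264_nvenc", "vmaf": 80.0}),
    json.dumps({"encoder": "h264_nvenc", "vmaf": 100.0}),
    json.dumps({"encoder": "av1_qsv", "vmaf": 95.5}),
    json.dumps({"encoder": "av1_qsv", "vmaf": None}),
]
for enc, (n, lo, hi, mean) in summarize(demo).items():
    print(f"{enc}: rows={n} range={lo:.2f}-{hi:.2f} mean={mean:.2f}")
```

To run it against real output, pass an open file handle for whichever JSONL file the run wrote under runs/phase_a/ (the exact file name is not stated in this PR).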

Companion doc

docs/development/intel-arc-vaapi-driver-priority.md captures the LIBVA_DRIVER_NAME=iHD gotcha for multi-card hosts. NVIDIA's libva-driver-nvidia shim shadows Intel's iHD by default, so QSV calls against /dev/dri/renderD129 (Arc) silently route through NVIDIA's translator and the MFX session fails with -9. Forcing LIBVA_DRIVER_NAME=iHD fixes it.
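One way to apply the workaround from Python is to pass an explicit environment to the child process. This is a minimal sketch with a hypothetical helper name, `qsv_env`; it is not claimed to be how the merged script sets the variable (the rebase note quoted later says the calling shell sets it).

```python
import os


def qsv_env(base=None):
    """Return a copy of the environment with the Intel iHD VA-API driver forced."""
    env = dict(os.environ if base is None else base)
    env["LIBVA_DRIVER_NAME"] = "iHD"
    return env


# Example call shape (cmd is whatever ffmpeg argv the caller built):
#   subprocess.run(cmd, env=qsv_env(), capture_output=True, text=True, check=False)
print(qsv_env({"PATH": "/usr/bin"})["LIBVA_DRIVER_NAME"])
```

Copying the environment instead of mutating `os.environ` keeps the override scoped to the QSV child process, so a concurrent NVENC run in the same parent is unaffected.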

ADR-0108 deliverables

  • (1) Research digest
    no digest needed: tooling, not a research result; ADR-0237 + future fr_regressor_v2 docs cover the why
  • (2) Decision matrix
    no alternatives: only-one-way fix — vmaf-tune corpus's pooled-only schema can't be extended in-place without breaking its row contract
  • (3) AGENTS.md invariant note: docs/rebase-notes.md entry 0234 documents the LIBVA_DRIVER_NAME=iHD invariant
  • (4) Reproducer / smoke-test command: see Smoke evidence + docs/rebase-notes.md 0234
  • (5) CHANGELOG fragment: changelog.d/added/hw-encoder-corpus-runner.md
  • (6) Rebase note: docs/rebase-notes.md entry 0234

Test plan

  • Local NVENC pipeline (RTX 4090) — 16,920 rows produced
  • Local Arc QSV pipeline (Arc A380) — 16,920 rows produced
  • Per-frame canonical-6 schema verified
  • CI matrix (will run when promoted from draft)

🤖 Generated with Claude Code

Encodes a raw YUV with NVENC / QSV / VAAPI / libx264 at a CRF/CQ
grid, decodes back to raw YUV, scores with libvmaf (CUDA backend),
and emits one JSONL row per (source, encoder, cq, frame) carrying
canonical-6 features + per-frame VMAF + encode metadata.

Local smoke produced 33,840 per-frame rows in ~5 min wall time
across 9 Netflix refs x 6 hardware codecs (h264/hevc/av1 on both
NVIDIA NVENC and Intel Arc QSV) x 4 CQ values.

Companion doc docs/development/intel-arc-vaapi-driver-priority.md
captures the LIBVA_DRIVER_NAME=iHD gotcha for multi-card hosts.

Output landing in runs/phase_a/ is gitignored — rerun to reproduce.

Co-Authored-By: Claude <noreply@anthropic.com>
@lusoris lusoris marked this pull request as ready for review May 5, 2026 03:29
Copilot AI review requested due to automatic review settings May 5, 2026 03:29

Copilot AI left a comment


Pull request overview

Adds a fork-local “Phase A real-corpus runner” that generates per-frame canonical-6 + VMAF + encoder metadata rows (JSONL) by encoding/decoding via ffmpeg (NVENC/QSV/VAAPI/libx264) and scoring via libvmaf’s CUDA path, plus accompanying documentation and changelog notes.

Changes:

  • Introduces scripts/dev/hw_encoder_corpus.py to produce per-frame training rows for fr_regressor_v2.
  • Documents the Intel Arc multi-card VAAPI/QSV driver-selection pitfall (LIBVA_DRIVER_NAME=iHD).
  • Records the change in rebase notes and adds a changelog fragment.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| scripts/dev/hw_encoder_corpus.py | New dev script to run hw-encode → decode → CUDA-VMAF scoring and emit per-frame JSONL rows. |
| docs/rebase-notes.md | Adds a new rebase-note entry for the runner (but includes apparent paste/formatting issues). |
| docs/development/intel-arc-vaapi-driver-priority.md | New doc explaining the LIBVA_DRIVER_NAME=iHD workaround on multi-GPU systems. |
| changelog.d/added/hw-encoder-corpus-runner.md | Changelog fragment announcing the new runner + companion doc. |


Comment on lines +63 to +108
    pre_args: list[str] = ["ffmpeg", "-y", "-loglevel", "error"]
    if encoder.endswith("_qsv") and qsv_device is not None:
        pre_args += [
            "-init_hw_device",
            f"qsv:hw,child_device={qsv_device}",
        ]
    if encoder.endswith("_vaapi") and vaapi_device is not None:
        pre_args += [
            "-init_hw_device",
            f"vaapi=va:{vaapi_device}",
        ]
    pre_args += [
        "-f",
        "rawvideo",
        "-pix_fmt",
        pix_fmt,
        "-s",
        f"{width}x{height}",
        "-r",
        str(framerate),
        "-i",
        str(source),
    ]
    if encoder.endswith("_nvenc"):
        post = ["-c:v", encoder, "-cq", str(cq), "-preset", "p4"]
    elif encoder.endswith("_qsv"):
        post = ["-c:v", encoder, "-global_quality", str(cq), "-preset", "medium"]
    elif encoder.endswith("_vaapi"):
        post = [
            "-vf",
            "format=nv12,hwupload=extra_hw_frames=16",
            "-c:v",
            encoder,
            "-qp",
            str(cq),
        ]
    else:
        # CPU fallback (libx264) — the corpus may want a CPU baseline row.
        post = ["-c:v", encoder, "-crf", str(cq), "-preset", "medium"]
    if extra:
        post += extra
    cmd = pre_args + post + [str(out_mp4)]

    t0 = time.monotonic()
    p = subprocess.run(cmd, capture_output=True, text=True, check=False)  # noqa: S603
    elapsed_ms = (time.monotonic() - t0) * 1000.0
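The per-encoder flag selection in the excerpt above is easier to test in isolation when it lives in a pure function that never launches ffmpeg. A minimal sketch follows, using a hypothetical helper name, `quality_args`; the flags are copied from the excerpt, and nothing here is claimed to match how the merged script is structured.

```python
def quality_args(encoder: str, cq: int) -> list[str]:
    """Map an ffmpeg encoder name and quality value to output-side arguments."""
    if encoder.endswith("_nvenc"):
        return ["-c:v", encoder, "-cq", str(cq), "-preset", "p4"]
    if encoder.endswith("_qsv"):
        return ["-c:v", encoder, "-global_quality", str(cq), "-preset", "medium"]
    if encoder.endswith("_vaapi"):
        # VAAPI needs frames uploaded to the device before encoding.
        return [
            "-vf", "format=nv12,hwupload=extra_hw_frames=16",
            "-c:v", encoder, "-qp", str(cq),
        ]
    # CPU fallback (for example libx264) uses CRF.
    return ["-c:v", encoder, "-crf", str(cq), "-preset", "medium"]


for name in ("h264_nvenc", "hevc_qsv", "h264_vaapi", "libx264"):
    print(name, quality_args(name, 28))
```

Because the helper only builds a list, a unit test can assert on the exact argv for each encoder family without any GPU present.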
Comment on lines +184 to +202
    for fr in payload.get("frames", []):
        m = fr.get("metrics", {})
        if not all(k in m for k in CANONICAL_6):
            continue
        row = {
            "src": src,
            "encoder": encoder,
            "cq": cq,
            "enc_bytes": enc_bytes,
            "enc_time_ms": enc_time_ms,
            "frame_index": fr.get("frameNum", len(rows)),
            "vmaf": m.get("vmaf"),
            "adm2": m["integer_adm2"],
            "vif_scale0": m["integer_vif_scale0"],
            "vif_scale1": m["integer_vif_scale1"],
            "vif_scale2": m["integer_vif_scale2"],
            "vif_scale3": m["integer_vif_scale3"],
            "motion2": m["integer_motion2"],
        }
Comment on lines +242 to +245
    src_stem = args.source.stem
    written = 0
    with args.out.open("a", encoding="utf-8") as fh:
        for cq in args.cq:
Comment on lines +261 to +287
    if rc != 0 or sz == 0:
        print(
            f"[skip] {src_stem} {args.encoder} cq{cq}: encode rc={rc}", file=sys.stderr
        )
        continue
    yuv = td_path / f"{src_stem}_{args.encoder}_cq{cq}.yuv"
    if decode_to_raw(mp4, yuv, args.pix_fmt) != 0 or not yuv.exists():
        print(
            f"[skip] {src_stem} {args.encoder} cq{cq}: decode failed", file=sys.stderr
        )
        continue
    json_out = td_path / "vmaf.json"
    if (
        score_cuda(
            args.vmaf_bin,
            args.source,
            yuv,
            args.width,
            args.height,
            args.pix_fmt,
            json_out,
        )
        != 0
        or not json_out.exists()
    ):
        print(f"[skip] {src_stem} {args.encoder} cq{cq}: score failed", file=sys.stderr)
        continue
Comment thread docs/rebase-notes.md

- **Touches**:
- `libvmaf/src/vulkan/kernel_template.h` — fork-local; new
- `libvmaf/src/vulkan/kernel_template.h` — fork-local. Output landing in `runs/phase_a/` is gitignored — rerun the script to reproduce.
Comment thread docs/rebase-notes.md
Comment on lines +7172 to +7175
- **Touches**: new `scripts/dev/hw_encoder_corpus.py` (no existing
caller; opt-in tooling). Output landing in `runs/phase_a/` is gitignored — rerun the script to reproduce.
`docs/development/intel-arc-vaapi-driver-priority.md`. Output landing in `runs/phase_a/` is gitignored — rerun the script to reproduce.
stratified sample, 58 KiB).
Comment thread docs/rebase-notes.md
Comment on lines +7176 to +7182
- **Invariant**: the script's QSV path forces
`env['LIBVA_DRIVER_NAME']='iHD'` (set by the calling shell, not
inside the script) when targeting `/dev/dri/renderD129` on a
multi-card host that has NVIDIA's libva-driver-nvidia shim
installed. Without that, libva picks up NVIDIA's NVDEC-VAAPI
translation and the MFX session handshake fails with -9. See the
companion doc for the failure mode + fix.
Comment on lines +30 to +32
VAAPI encoders against the Arc card. The fork's
[`scripts/dev/hw_encoder_corpus.py`](../../scripts/dev/hw_encoder_corpus.py)
forces this in its QSV invocation path.