feat(engine): closed-GOP VP9 encoder args + concat-copy smoke test#950
Conversation
miguel-heygen
left a comment
There was a problem hiding this comment.
APPROVE — correct closed-GOP VP9 encoder args + solid smoke test.
Verified:
-g <gopSize>+-keyint_min <gopSize>forces keyframes at exact intervals — correct for concat-copy-auto-alt-ref 0disables alternate reference frames that would break cross-chunk concat — load-bearing VP9-specific arg-cpu-used 2pins speed/quality tradeoff across workers for visual consistency at seams- Correct to omit
-sc_threshold 0(unsupported by libvpx-vp9) and-bf 0(VP9's reference model differs from H.264) - Alpha + closed-GOP deduplication emits
-auto-alt-ref 0exactly once — test asserts this - Smoke test is well-structured: 60 frames → 4 chunks → concat-copy → 3 verification passes (ffprobe, decode-to-null, frame count)
buildEncoderArgspublic export needed for smoke test and PR #951
All CI green.
vanceingalls
left a comment
There was a problem hiding this comment.
Solid PR 8.1 — a well-scoped gating experiment that gives 8.2 a clean evidence-based decision between Path A and Path B. The closed-GOP VP9 extension is conservative, defaulted-off, and well-tested at the unit level. Comments throughout chunkEncoder.ts are exemplary — every flag has a documented "why" tied to a specific failure mode.
Calibrated strengths
chunkEncoder.ts:300-309— the alpha +lockGopVp9double-push avoidance is the kind of nit that's easy to miss; the regression test atchunkEncoder.test.ts:601-626pins it down. Both paths want-auto-alt-ref 0for different reasons (closed-GOP boundary safety vs. libvpx-vp9 alpha incompat), and the code correctly emits it exactly once.chunkEncoder.test.ts:644-660—vp9 + lockGopForChunkConcat=true throws on missing gopSizemirrors the libx264/libx265 contract exactly. Eager surfacing of caller error is the right call.packages/producer/tests/distributed/_smoke/webm-concat-copy.test.ts:230-258— three independent verifications (-show_streams,-f null -decode,-count_frames) layered with failure-fingerprint error messages is exactly the kind of forensic test 8.2 needs to make an architectural decision. The "Path A vs Path B" framing in the failure messages is the test contract this PR exists to establish.
Findings
important — smoke test doesn't exercise the alpha (yuva420p) path. The smoke test pins pixelFormat: "yuv420p" in the encode-chunks step (line 156), but the production webm use case is alpha — getEncoderPreset("standard", "webm") returns yuva420p (chunkEncoder.ts:51-56). Alpha + concat-copy has a different failure surface than yuv420p: the alpha_mode=1 metadata must survive the concat, and libvpx-vp9's alpha channel encoding is structurally a second hidden stream whose keyframe placement must also align at chunk seams. PR 8.2 will need to know whether Path A works for alpha, not just for opaque content. Recommend adding a second describe block (webm VP9 + alpha concat-copy smoke) that re-runs the same three verifications with pixelFormat: "yuva420p" and an additional ffprobe check that the output's pix_fmt is still yuva420p. If alpha breaks concat-copy, 8.2 must take Path B regardless of what the opaque path shows — better to find that out now.
important — stderr.length > 0 decode-error gate is brittle. At webm-concat-copy.test.ts:281-285, the assertion result.exitCode !== 0 || result.stderr.length > 0 will flake on ffmpeg builds that emit non-error stderr lines at -v error (DTS warnings, container-quirk notes, etc.). The exit code is the load-bearing signal; treat stderr as diagnostic context, not a gate. Suggest: gate only on result.exitCode !== 0, and surface stderr in the message either way. Otherwise this test will flake in CI when libavformat picks up a chatty warning in a future ffmpeg upgrade.
nit — closed-GOP VP9 lacks an explicit scenecut analog to the H.264 path. The libx264/libx265 branch belt-and-suspenders with -sc_threshold 0 + -force_key_frames expr:... + x264-params: scenecut=0:open-gop=0:repeat-headers=1. The VP9 branch relies only on -g == -keyint_min to suppress mid-GOP keyframes. In practice that usually works for libvpx-vp9, but libvpx is known to insert auto keyframes on hard scene cuts even with keyint_min == keyint in some builds — and the smoke test's testsrc2 source is animated-but-smooth so it won't trigger that path. If a real composition has a hard cut at frame 7 of a 15-frame chunk, you may still end up with a non-IDR boundary. Either (a) add -sc_threshold 0 (libvpx-vp9 accepts the flag at the demuxer level, though its effect is encoder-dependent), or (b) add a smoke variant that injects a hard scene cut mid-chunk and verifies the chunk still starts with a keyframe via ffprobe -show_frames -select_streams v:0 -read_intervals %+#1. Not a blocker for the gating experiment, but worth flagging for 8.2 to handle before production composition content lands on this path.
nit — -cpu-used 2 planHash framing in the code comment is misleading. chunkEncoder.ts:271-276 says cpu-used is locked for "planHash round-trip deterministic." But planHash (planHash.ts:76-90) pins ffmpegVersion + encoderConfigCanonicalJson and doesn't observe cpu-used at all. The real reason -cpu-used 2 is good — cross-worker visual consistency when workers have different ffmpeg-build defaults — is in the same comment, and is correct. Just drop the planHash sentence; it'll confuse future readers who go looking for cpu-used in the planHash input shape.
nit — -cpu-used 2 is a hardcoded magic number. If a worker pool ever wants cpu-used 4 for faster encodes (at quality cost), there's no knob. Fine for an MVP gating PR; flag for 8.2+ if the value needs tuning. EncoderOptions.vp9CpuUsed?: number would be the natural extension.
nit — spawnSync without timeout. The 4-chunk VP9 encodes at webm-concat-copy.test.ts:142-167 are synchronous and unbounded. ffmpeg hangs are rare but possible (e.g. lavfi input deadlock on certain build configurations); a hung smoke test will block the runner with no diagnostic. Pass timeout: 30_000 (or similar) to spawnSync for hygiene.
Verdict
Verdict: APPROVE
Reasoning: The closed-GOP VP9 args are correctly scoped, defaulted-off, and unit-tested. The smoke test design is the right shape for a gating experiment — three independent verifications with forensic failure messages. The important findings (missing alpha-path coverage, brittle stderr gate) reduce confidence in PR 8.2's Path A decision but are addressable as follow-up commits on this PR or in a small follow-up before 8.2 ships. None are correctness blockers for what 8.1 itself merges (the encoder args + smoke harness, both inert until 8.2 wires them in). Ship it; recommend folding the alpha smoke and stderr fix in before 8.2.
Review by Vai
PR 8.1 of the WebM distributed-rendering plan (v1.5 backlog #1; see DISTRIBUTED-RENDERING-PLAN.md §7.2). Gates whether PR 8.2 ships Path A (concat-copy) or Path B (re-encode-in-assemble) for webm distributed mode. Two changes: 1. Extend buildEncoderArgs to lay closed-GOP args on libvpx-vp9 when lockGopForChunkConcat=true: -g <chunkSize>, -keyint_min <chunkSize>, -auto-alt-ref 0, -cpu-used 2. Mirrors the existing libx264/libx265 branches. Alt-ref disabling is the critical bit — libvpx-vp9's default non-displayable alt-ref frames can reach across chunk seams and break concat-copy. cpu-used=2 pins the speed/quality tradeoff so chunks encoded on workers with different libvpx-vp9 defaults produce visually consistent output across seams. 2. Smoke test at packages/producer/tests/distributed/_smoke/webm-concat-copy.test.ts. Generates 60 PNGs via lavfi testsrc2, encodes them as 4 VP9 chunks of 15 frames using buildEncoderArgs with lockGopForChunkConcat=true, concat-copies via ffmpeg -f concat -c copy, then runs three independent verifications: ffprobe -show_streams, ffmpeg -f null - decode test, and ffprobe -count_frames. Each verification surfaces its failure fingerprint in the error message so PR 8.2 has the data it needs to pick the right architectural path. Smoke test result locally: ALL 6 tests pass. Path A (concat-copy) works for libvpx-vp9 with the new closed-GOP args. PR 8.2 will ship the plan-time-error-removal path. Also exports buildEncoderArgs from @hyperframes/engine so adapters / tests can construct args without re-implementing the contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4573e4e to
1865b39
Compare
|
Thanks for the thorough reviews! Filed follow-up fixups on this stack — relevant ones for 8.1:
|
Merge activity
|

Description
PR 1 of 4 in the WebM (VP9) distributed-rendering series. A gating
experiment that proves closed-GOP libvpx-vp9 chunks survive
ffmpeg -f concat -c copylosslessly, so the rest of the stack canship Path A (concat-copy) rather than the slower
re-encode-in-assemble fallback.
Two changes:
Closed-GOP VP9 encoder args.
buildEncoderArgsnow lays-g <chunkSize>,-keyint_min <chunkSize>,-auto-alt-ref 0, and-cpu-used 2on libvpx-vp9 whenlockGopForChunkConcat=true.Mirrors the existing libx264/libx265 branches. The alt-ref disable
is load-bearing — libvpx-vp9's default non-displayable alt-ref
frames can reach across chunk seams and break concat-copy.
-cpu-used 2pins the speed/quality tradeoff so chunks encoded onworkers with different libvpx-vp9 defaults produce visually
consistent output across seams. Default (
lockGopForChunkConcatunset) preserves the existing in-process VP9 path unchanged.
Concat-copy smoke test at
packages/producer/tests/distributed/_smoke/webm-concat-copy.test.ts.Generates 60 PNGs via lavfi
testsrc2, encodes them as 4 VP9 chunksof 15 frames using
buildEncoderArgswithlockGopForChunkConcat=true, concat-copies viaffmpeg -f concat -c copy, then runs three independent verifications:ffprobe -show_streams,ffmpeg -f null -decode test, andffprobe -count_frames. Each verification surfaces its failurefingerprint in the error message.
Smoke test passes 6/6 locally → Path A works; the rest of the stack
takes it.
Also exports
buildEncoderArgsfrom@hyperframes/enginesoadapters / tests can construct args without re-implementing the
contract.
Testing
bunx vitest run --root packages/engine src/services/chunkEncoder.test.ts— 62/62 pass (new VP9 closed-GOP tests included)bun test packages/producer/tests/distributed/_smoke/webm-concat-copy.test.ts— passesbunx oxlint+bunx oxfmt --checkon all changed files — cleanbunx tsc --noEmit -p packages/engine/tsconfig.json— clean🤖 Generated with Claude Code