Skip to content

fix: render robustness — sub-comp src paths, alpha tag case, encoder + matter improvements#627

Merged
jrusso1020 merged 10 commits intomainfrom
fix/sub-comp-video-path-resolution-and-render-fixes
May 5, 2026
Merged

fix: render robustness — sub-comp src paths, alpha tag case, encoder + matter improvements#627
jrusso1020 merged 10 commits intomainfrom
fix/sub-comp-video-path-resolution-and-render-fixes

Conversation

@jrusso1020
Copy link
Copy Markdown
Collaborator

Summary

Six focused commits stacked on this branch. Each addresses a "silent disagreement between two layers" — bugs where the studio preview rendered correctly but production renders looked wrong with no error in stdout. The smoking gun for each was a real production composition that worked in studio and broke at render.

Bugs fixed (in dependency order)

1. Sub-comp <video src="../assets/..."> silently freezes at frame 0

fix(engine,producer)path.join(projectDir, "../assets/foo") escapes the project root and existsSync returns false → extraction skipped → no per-frame injection → the <video> element's first decoded frame paints for every screenshot. Browser URL parsing clamps .. at origin root; Node's path.join doesn't. Adds resolveProjectRelativeSrc that mirrors browser clamping as a fallback, plus a loud stderr warning on miss. Mirrored in audioMixer.ts and renderOrchestrator.ts HDR probe loop.

2. ALPHA_MODE uppercase tag misclassified as no-alpha

fix(engine) — newer libavformat builds (and our own hyperframes remove-background) write the VP9-alpha sidecar tag as ALPHA_MODE. ffprobe.ts only checked alpha_mode (lowercase). Result: alpha-having webms extracted as JPGs, injected <img> was a fully opaque rectangle covering every static element below it on the z-stack. Studio preview rendered fine via native <video> playback; production renders covered headlines and captions with the avatar overlay. This was the WYSIWYG bug. Test pinning case-insensitivity sits next to the existing lowercase test.

3. h264 negative DTS freezes some players

fix(engine) — libx264 with B-frames produces negative DTS at stream start. VS Code preview, several browsers, and some HW decoders freeze on the first frame and audio plays alone. -bf 0 for h264 + -avoid_negative_ts make_zero at encoder + mux. ~5–10% larger files at the same CRF — worthwhile for "the file plays everywhere." Mirrored in streamingEncoder.ts.

4. Black first frame when two videos with different codecs are present

fix(engine)frameCapture polled readyState >= 1 (HAVE_METADATA — only dimensions known). When an h264 mp4 and a VP9 webm decode at different rates the faster passes readiness while the slower hasn't painted, producing a black "first frame" for the slower clip. Bumped to >= 2 (HAVE_CURRENT_DATA — first frame rasterized).

5. Sharp 3-channel output silently scanlined the alpha mask

fix(cli) — Sharp upcasts 1-channel raw input to RGB-interleaved during .resize(), so fullMask[i] was reading R,G,B,R,G,B... of pixels 0..691199 instead of the alpha for all 2,073,600 pixels. Symptom: every transparent webm out of remove-background had a horizontal scanline alpha pattern; the avatar appeared semi-transparent throughout. Forced single-channel with .toColourspace("b-w").

6. Generation-loss / color-shift on the cutout overlay

fix(cli) — added BT.709 + limited-range encoder tags so Chrome's YUV→RGB matches the source mp4 (without these, ffmpeg's default RGB→YUV is BT.601 and skin tones drift visibly when the cutout is overlaid). Added Quality preset type (fast/balanced/best → CRF 30/18/12), default raised from 30 → 18 so the most common compositing pattern works out of the box. New --quality CLI flag.

What you'll notice immediately

  • Existing projects with <video src="../assets/..."> in sub-compositions: now actually animate in render (were silent first-frame freezes before).
  • Compositions that use remove-background output: text-behind-subject works in production renders, not just in studio preview.
  • All h264 mp4 outputs play in VS Code preview, browser <video>, and HW decoders that previously froze.
  • npx hyperframes remove-background outputs no longer have the scanline artifact.

Tests

  • engine: 525 → 532 passing (+7 — resolver, BT.709, quality presets, ALPHA_MODE regression).
  • cli: 240 passing (the 2 GPU-config failures are pre-existing flakes on clean main, unrelated).
  • producer + core: green.

Skill / docs updates

  • skills/hyperframes-cli/SKILL.md: 3-pattern compositing table + the two non-obvious rules (wrap video in non-timed div for opacity control, both videos data-start=0 for sync).
  • skills/hyperframes/patterns.md: worked text-behind-subject example.
  • docs/guides/remove-background.mdx: compositing pitfalls section, quality preset table, --quality flag.

Test plan

  • All engine + cli + core + producer tests pass locally
  • Render a composition with <video src="../assets/foo"> in a sub-comp and confirm it animates (vs. freezing on frame 0 on main)
  • Render a composition that uses remove-background output as an overlay and confirm the headline-behind-silhouette effect renders correctly (vs. covered by the avatar on main)
  • Open the rendered mp4 in VS Code preview and confirm playback (vs. frozen first frame on main)

🤖 Generated with Claude Code

Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verdict: Approve

Solid, well-motivated bugfix PR. Each change traces back to a real production symptom with a clear root cause. No blockers — a few things worth discussing for follow-ups.

Concerns

-bf 0 only covers software h264 paths (chunkEncoder.ts:142-163, streamingEncoder.ts:217-236)
The B-frame disable is inside the libx264/libx265 branch only. GPU h264 paths (nvenc, qsv, vaapi) bypass it — nvenc emits B-frames by default. The unconditional -avoid_negative_ts make_zero at the mux level mostly covers it, but if the goal is "files play everywhere," GPU-encoded h264 can still hit the same negative-DTS freeze. At minimum, a comment acknowledging the mux-level fallback would help the next person. Follow-up candidate: push -bf 0 into GPU branches too.

resolveProjectRelativeSrc strips all leading ../ unconditionally (videoFrameExtractor.ts:~476-489)
The regex src.replace(/^(\.\.[\\/])+/, "") clamps every leading ../ segment. If a monorepo legitimately has ../assets/ as a valid sibling directory, and a same-named file exists inside the project, the resolver silently prefers the clamped path. Low probability since existsSync + literal-join-first mitigate it, but worth a comment documenting the precedence choice. Also: a path like assets/../../foo (escapes after the first segment) won't be caught — path.join collapses it past baseDir.

readyState >= 2 may surface new timeouts (frameCapture.ts)
Bumping from HAVE_METADATA to HAVE_CURRENT_DATA is correct — you need actual frame data, not just dimensions. But compositions with many HEVC/VP9 sources on slow CI cold-starts may now hit the videoDeadline (reuses playerReadyTimeout) where they previously passed by getting lucky with a black frame 0. Worth monitoring after merge.

Suggestions

Case-insensitive tag lookup (ffprobe.ts:280-284)
tags.alpha_mode ?? tags.ALPHA_MODE covers the two known cases. A more future-proof approach:

const alphaMode = Object.entries(videoStream.tags ?? {})
  .find(([k]) => k.toLowerCase() === "alpha_mode")?.[1] ?? "";

Handles any future casing variant as ffmpeg/libav evolves. Current fix is pragmatic — not blocking on this.

Consider deduping stderr warnings (videoFrameExtractor.ts:540-547)
The new stderr warning fires per-video inside parallel iteration. A comp with 50 broken sub-comp videos = 50 identical warnings. Consider collecting unique video.src values and logging a single summary.

resolveProjectRelativeSrc debug logging — The function warns on miss but is silent on success. A debug-level log showing which candidate was resolved would help authors troubleshoot path issues without needing to add their own logging.

BT.709 tags on ProRes path (pipeline.ts)
The BT.709 colorspace tags are only added to the webm encoder path. ProRes typically carries its own color metadata, but worth confirming ProRes 4444 defaults to BT.709 in the ffmpeg version used — same color shift could occur with .mov output otherwise.

Nits

  • pipeline.ts: Quality / QUALITIES / QUALITY_CRF / DEFAULT_QUALITY / isQuality is a lot of surface for three strings. const QUALITY_CRF = { fast: 30, balanced: 18, best: 12 } as const with Quality = keyof typeof QUALITY_CRF would collapse it. Non-blocking.
  • videoFrameExtractor.test.ts: dynamic require("node:fs") calls inside an ES module test work via Vitest but are inconsistent with the rest of the file's imports.
  • SKILL.md and patterns.md both contain near-identical text-behind-subject examples — reference one from the other to avoid drift.
  • renderOrchestrator.ts:2320 still uses v.src.startsWith("/") instead of isAbsolute(v.src) — the same Windows footgun the audioMixer comment warns about. Pre-existing, but since you're centralizing path resolution, worth catching in this pass.

What's good

  • Root-cause analysis in every commit is textbook. Each commit tells a complete story — symptom, cause, fix, verification. The kind of commit messages that make git blame useful two years from now.
  • The Sharp .toColourspace("b-w") fix is elegant. Single line fixing a fundamental buffer-layout misunderstanding, with a comment explaining exactly why Sharp upcasts. Maximum signal-to-code ratio.
  • resolveProjectRelativeSrc is a clean abstraction that kills duplication across 3 call sites (videoFrameExtractor, audioMixer, renderOrchestrator). Browser-clamping semantics are well-documented in JSDoc.
  • Test coverage is appropriate — regression tests for alpha tag casing, path resolver (5 cases including precedence), BT.709 tags, quality presets. Not over-tested, not under-tested.
  • The -bf 0 + -avoid_negative_ts make_zero belt-and-suspenders is the right default for a tool producing files consumed by arbitrary players. Correctness over bitrate efficiency.
  • Documentation is practical — the three-pattern compositing table and "two non-obvious rules" read like they were written by someone who hit these problems and wants to save the next person hours.

Review by Magi

Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verdict — Approve with nits

Important

  • chunkEncoder.ts / streamingEncoder.ts: -bf 0 is added inside the SW (libx264/libx265) branch only. HW h264 paths (nvenc, qsv, vaapi) bypass it — nvenc in particular emits B-frames by default. If the goal is "files play everywhere," GPU-encoded h264 outputs can still hit the same negative-DTS freeze. The -avoid_negative_ts make_zero you added unconditionally below mostly covers it at the mux level, but tighter to also push -bf 0 for h264 in the GPU branches (nvenc: -bf 0, qsv: -b_strategy 0 -bf 0, vaapi: -bf 0). At minimum, leave a comment that we're relying on -avoid_negative_ts to cover GPU h264. (packages/engine/src/services/chunkEncoder.ts:142-163, streamingEncoder.ts:217-236)

  • resolveProjectRelativeSrc only clamps a leading ../. A path like assets/../../foo (escapes only after the first segment) won't be caught — path.join collapses it to <parentOfBase>/foo which silently escapes baseDir the same way. Low likelihood in practice, but worth either documenting the limitation or doing one extra check: if the joined result is outside projectDir, attempt a clamp. (packages/engine/src/services/videoFrameExtractor.ts:476-489)

  • The new resolver is shared by audioMixer, videoFrameExtractor, and renderOrchestrator, but each call site still has its own absolute/HTTP guards inline. renderOrchestrator.ts:2320 still uses v.src.startsWith("/") instead of isAbsolute(v.src) — the same Windows footgun the audioMixer comment explicitly warns about. Either keep the guards consistent (use isAbsolute) or move them inside the resolver. (packages/producer/src/services/renderOrchestrator.ts:2320)

Nits

  • ffprobe.ts:280-284: case-insensitive read is good; consider folding into a single helper readTagCI(tags, "alpha_mode") — pattern will recur for other sidecar tags as libavformat versions them.
  • videoFrameExtractor.ts:540-547: the new stderr warning is great, but it fires per-video inside Promise.all-style iteration. If a comp has 50 broken sub-comp videos = 50 writes. Consider deduping by video.src.
  • pipeline.ts: Quality / QUALITIES / QUALITY_CRF / DEFAULT_QUALITY / isQuality is a lot of surface for three strings. A single const QUALITY_CRF = { fast: 30, balanced: 18, best: 12 } as const with Quality = keyof typeof QUALITY_CRF would do it. Non-blocking.
  • frameCapture.ts: bumping readyState >= 1 to >= 2 is correct, but in the begin-frame loop the videoDeadline reuses playerReadyTimeout — if that's tight (default ~5s), HEVC/VP9 cold-starts on slow CI may now time out where they previously passed. Worth a CI watch.

What's good

  • Each commit is a real production bug with a clean root-cause explanation in the comments — the ALPHA_MODE case fix and the Sharp b-w colorspace fix in particular are the kind of "silent disagreement between two layers" bugs that're hard to spot without the smoking gun. Comments on disk are excellent.
  • BT.709 + limited-range tags on the cutout encoder is the right fix and is pinned by tests (pipeline.test.ts:46-77).
  • Test coverage on the resolver (5 cases incl. compiledDir precedence + miss returning baseDir join) and on the uppercase ALPHA_MODE regression is exactly what these silent-data-bug fixes need.
  • Loud stderr warning when a video src can't resolve — turns the worst class of bug here (silent first-frame freeze) into something authors will actually see.

Review by Vai

@mintlify
Copy link
Copy Markdown

mintlify Bot commented May 5, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
hyperframes 🟢 Ready View Preview May 5, 2026, 2:38 AM

💡 Tip: Enable Workflows to automatically generate PRs for you.

jrusso1020 added a commit that referenced this pull request May 5, 2026
- engine/chunkEncoder, engine/streamingEncoder: extend `-bf 0` to GPU h264
  paths (nvenc, qsv, vaapi) and `-b_strategy 0` for qsv so GPU-encoded
  outputs avoid negative-DTS freezes too — not just SW libx264.
- engine/videoFrameExtractor: detect mid-path traversal (e.g.
  `assets/../../foo.mp4`) by normalizing first and re-anchoring at the
  project root. Adds a regression test.
- engine/videoFrameExtractor: dedupe stderr "src not resolvable" warnings
  by `video.src` so a comp with N broken sources logs once, not N times.
- engine/videoFrameExtractor.test: drop dynamic `require("node:fs")`,
  use ES `import { writeFileSync } from "node:fs"`.
- engine/ffprobe: extract `readTagCI` helper for case-insensitive ffprobe
  tag reads (will recur for other libavformat-versioned sidecar tags).
- cli/background-removal/pipeline: collapse Quality / QUALITIES /
  QUALITY_CRF / DEFAULT_QUALITY / isQuality surface using
  `Quality = keyof typeof QUALITY_CRF`.
- producer/renderOrchestrator: replace `v.src.startsWith("/")` with
  `isAbsolute(v.src)` in the HDR probe path so Windows absolute paths
  (`C:\...`) aren't treated as relative — matches the audioMixer guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jrusso1020 and others added 8 commits May 4, 2026 20:27
…remove-background

- inference.ts: force `.toColourspace("b-w")` on the resized mask. Sharp upcasts
  the 1-channel raw input to RGB-interleaved during resize, so `fullMask[i]`
  was reading R,G,B,R,G,B... of pixels 0..691199 instead of the alpha for all
  2,073,600 pixels. Visible symptom: horizontal scanline alpha artifact in
  every transparent webm — the avatar appeared semi-transparent throughout.
- pipeline.ts: add BT.709 + limited-range colorspace tags so Chrome's YUV→RGB
  matches the source mp4 (without these, ffmpeg's default RGB→YUV is BT.601
  and skin tones drift visibly when the cutout is overlaid on its source).
- pipeline.ts: add Quality preset type ("fast"/"balanced"/"best" → CRF 30/18/12).
  Default raised from CRF 30 → 18 ("balanced") so the most common pattern
  (text-behind-subject) works out of the box without visible doubling.
- remove-background.ts: wire `--quality` flag with validation, +2 examples.
- Tests: BT.709 tags present, quality preset → CRF mapping, default is balanced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lay in every player

Three related render robustness fixes:

1. frameCapture.ts: bump videos-ready check from `readyState >= 1`
   (HAVE_METADATA — only dimensions known) to `>= 2` (HAVE_CURRENT_DATA —
   first frame is rasterized). Without this, when two `<video>` elements
   with different codecs (h264 mp4 + VP9 webm) decode at different rates,
   the faster one passes readiness while the slower one still hasn't
   painted, producing a black "first frame" for the slower clip.

2. chunkEncoder.ts (libx264 path) + streamingEncoder.ts: disable B-frames
   for h264 (`-bf 0`). Standard libx264 with B-frames produces negative
   DTS at stream start (the first B-frame's decode order is "before" the
   first I-frame's presentation time). VS Code preview, several browser
   <video> implementations, and some HW decoders freeze on the first
   frame and only audio plays. -bf 0 makes PTS == DTS at every frame,
   eliminating the issue at the source. Quality cost is ~5–10% larger
   files at the same CRF — worthwhile for "the file plays everywhere".

3. chunkEncoder.ts (encoder + mux paths): add `-avoid_negative_ts make_zero`
   as belt-and-suspenders against negative DTS sneaking back in via
   `-c:v copy` mux passes when audio/video PTS bases differ.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… extraction misses

A <video src='../assets/foo.mp4'> inside a sub-composition silently dropped
from extraction; the rendered output froze on the first decoded frame for
the entire clip, with no error in stdout.

Root cause: browser URL resolver clamps '..' at origin root (studio preview
loads fine), but path.join(projectDir, '../assets/foo.mp4') normalizes to
parent-of-project/assets/foo.mp4, which usually doesn't exist. existsSync
returns false, extraction is skipped, no frame lookup is built, the
per-frame injector has nothing to swap, and the <video> element's first
decoded frame paints every screenshot.

- Adds resolveProjectRelativeSrc in videoFrameExtractor that mirrors browser
  clamping (literal join first, then leading '..' stripped).
- Surfaces a loud stderr warning when the resolver misses.
- Mirrors fix in audioMixer.ts (same bug for <audio src='../'>) and
  renderOrchestrator HDR probe loop.
- +6 regression tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Skill (hyperframes-cli): three-pattern table (cutout-over-different-scene
vs over-its-own-source vs over-different-take) + the two non-obvious rules
(wrap video in non-timed div for opacity control, both videos data-start=0
for sync). Skill (hyperframes/patterns): worked text-behind-subject example.
Docs: --quality flag, compositing pitfalls section, quality preset table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Newer libavformat builds write the VP9-alpha sidecar tag as 'ALPHA_MODE'
(uppercase); older builds write 'alpha_mode'. ffprobe.ts only checked the
lowercase form, so files produced by recent ffmpeg encoders (including the
output of 'hyperframes remove-background' itself) were misclassified as
having no alpha channel. Knock-on effect: the producer extracted them as
JPGs (no alpha), the injected <img> overlays were fully opaque rectangles,
and any element below them on the z-stack (text, captions, other layers)
silently disappeared from the rendered output — even though the studio
preview rendered the same composition correctly via native <video> playback.

Symptom in our repro: a text-behind-subject composition showed the
headline correctly in studio preview but the production render covered
the headline entirely with the opaque avatar image.

Fix: read videoStream.tags.alpha_mode OR videoStream.tags.ALPHA_MODE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Locks in the case-insensitive behavior alongside the existing alpha_mode
(lowercase) test. If either path regresses, the producer would silently
extract alpha-having webms as opaque JPGs and the injected <img> overlays
would cover every element below them on the z-stack — a bug that doesn't
surface in the studio preview, only in production renders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ng on tags

Tag-based alpha detection (alpha_mode / ALPHA_MODE / pix_fmt yuva*) is
fundamentally brittle. Failure modes seen in the wild:
- case-sensitivity across ffmpeg versions (alpha_mode vs ALPHA_MODE)
- older muxers that omit the sidecar tag entirely
- mp4-as-webm rewraps that drop the tag
- ffprobe reporting yuv420p for VP9-with-alpha because the alpha plane
  lives in a Matroska BlockAdditional sidecar, not the main pix_fmt

Each of those silently strips alpha at extraction time. The bug doesn't
surface until the rendered output is missing layers — frustrating to debug,
silent in stdout. The previous case-insensitive fix patched one of the
failure modes; this commit removes the class.

The robust alternative is codec-based: any bitstream that CAN carry alpha
(VP9, VP8, ProRes 4444) gets the alpha-aware decoder and PNG output by
default, regardless of what the tag says. The cost is a small file-size
increase on opaque VP9/VP8 sources (cached PNGs vs JPGs); the benefit is
no class of silent alpha loss from tag misdetection.

- Adds codecMayHaveAlpha() + decoderForCodec() helpers and exports them.
- Updates extractVideoFramesRange to force libvpx-vp9 / libvpx for VP9 / VP8
  unconditionally (was: only when metadata.hasAlpha).
- Updates resolveFrameFormat to default to PNG for any alpha-capable codec
  (was: only when metadata.hasAlpha).
- +4 unit tests covering the codec table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- engine/chunkEncoder, engine/streamingEncoder: extend `-bf 0` to GPU h264
  paths (nvenc, qsv, vaapi) and `-b_strategy 0` for qsv so GPU-encoded
  outputs avoid negative-DTS freezes too — not just SW libx264.
- engine/videoFrameExtractor: detect mid-path traversal (e.g.
  `assets/../../foo.mp4`) by normalizing first and re-anchoring at the
  project root. Adds a regression test.
- engine/videoFrameExtractor: dedupe stderr "src not resolvable" warnings
  by `video.src` so a comp with N broken sources logs once, not N times.
- engine/videoFrameExtractor.test: drop dynamic `require("node:fs")`,
  use ES `import { writeFileSync } from "node:fs"`.
- engine/ffprobe: extract `readTagCI` helper for case-insensitive ffprobe
  tag reads (will recur for other libavformat-versioned sidecar tags).
- cli/background-removal/pipeline: collapse Quality / QUALITIES /
  QUALITY_CRF / DEFAULT_QUALITY / isQuality surface using
  `Quality = keyof typeof QUALITY_CRF`.
- producer/renderOrchestrator: replace `v.src.startsWith("/")` with
  `isAbsolute(v.src)` in the HDR probe path so Windows absolute paths
  (`C:\...`) aren't treated as relative — matches the audioMixer guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jrusso1020 jrusso1020 force-pushed the fix/sub-comp-video-path-resolution-and-render-fixes branch from 66ef751 to f4ecf96 Compare May 5, 2026 03:30
…person video

The model removes background from any video with a person — we tested
with avatars because they were convenient, but anyone can bring a
talking-head clip, presenter footage, vlog, etc. Replace avatar-specific
filenames (avatar.mp4 / brandon.mp4) with neutral subject.mp4 (or
presenter.mp4 in the text-behind-subject example) and rephrase
copy that read as if avatars were the only use case.

Touches docs/guides/remove-background.mdx, hyperframes-media SKILL.md,
and hyperframes/patterns.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ed+snapshot

Four regression tests (font-variant-numeric, many-cuts, missing-host-comp-id,
variables-prod) failed on this PR with `Unable to parse PSNR output at <last
checkpoint>s`. Root cause: the harness derived all 100 checkpoints from the
*rendered* video's container duration, then asked ffmpeg's PSNR filter to
compare the same frame index from both videos.

The encoder changes earlier in this PR add `-avoid_negative_ts make_zero` to
the mux step. With AAC audio that shifts the first audio sample to t=0
instead of the encoder-delay offset, extending reported container duration
by ~20ms without changing video frame count. For the four failing tests,
the i=99 checkpoint then landed on a frame index that exists in the rendered
video but not in the snapshot baseline (e.g. round(2.98998 * 24) = 72 in a
72-frame baseline). ffmpeg's PSNR filter ran on zero matched frames and
emitted no `average:` line, so the parser threw.

Fix: probe both videos and use min(rendered, snapshot) duration when
spreading checkpoints. This is the correct semantics for symmetric PSNR
comparison anyway — both videos must have a frame at every sampled time.
The change is local to the harness; no encoder behavior changes, no
baselines regenerated.

Other regression tests with audio (chat, sub-composition-video,
vignelli-stacking) passed because their checkpoint-99 frame index landed
inside the baseline's frame range with several frames of slack. The four
failing tests had round-number durations where a 20ms drift was enough to
push the last checkpoint past `nb_frames - 1`.
@jrusso1020 jrusso1020 merged commit 0973229 into main May 5, 2026
38 checks passed
@jrusso1020 jrusso1020 deleted the fix/sub-comp-video-path-resolution-and-render-fixes branch May 5, 2026 04:38
@jrusso1020
Copy link
Copy Markdown
Collaborator Author

Regression CI is now green — all 10 regression shards + preview-regression + the rollup regression job all SUCCESS on 7d1d8ead.

What the fix does

Changed only packages/producer/src/regression-harness.ts (8 lines): probe both rendered and snapshot videos, sample the 100 PSNR checkpoints from min(rendered, snapshot) duration instead of the rendered duration alone.

Why the fix is the right cause-fix, not a workaround

Root cause was a harness bug, not an encoder bug. The encoder commit (0e541673) added -avoid_negative_ts make_zero which correctly normalizes PTS for downstream player compatibility (VS Code preview, browsers, HW decoders). A side effect is that the AAC encoder delay shifts to t=0, extending container metadata duration by ~8-10ms without changing video frame count.

The harness was sampling timestamps from the rendered video's duration and asking ffmpeg's PSNR filter for that frame from both videos. When durations diverged, the i=99 checkpoint landed on a frame index that existed in one video but not the other (round(2.98998 * 24) = 72 in a 72-frame snapshot). ffmpeg matched zero frames → no average: line → parser threw.

Symmetric PSNR comparison must sample within the duration both videos cover. The fix encodes that.

Empirical evidence (collected before pushing)

I regenerated the 3 failing baselines via bun run docker:test:update and compared against the originals:

Suite Old baseline New baseline Δ frames
font-variant-numeric 72 frames, 3.000s 72 frames, 3.008s 0
many-cuts 168 frames, 5.600s 168 frames, 5.610s 0
missing-host-comp-id 72 frames, 3.000s 72 frames, 3.008s 0

Frame counts identical. Only container metadata duration drifted. No frames are being dropped — the ~20ms shift is purely the audio-delay normalization.

I also ran the 3 tests locally with the harness fix against the existing baselines (no regen):

Suite Visual Audio correlation
font-variant-numeric ✅ 0 failed frames, ~46.4 dB avg PSNR 1.000
many-cuts ✅ 0 failed frames, ~38 dB avg PSNR 0.994
missing-host-comp-id ✅ 0 failed frames, ~36.6 dB avg PSNR 0.976

All 100 checkpoints in each test compared cleanly. CI replicates this. The existing baselines are still valid ground truth — no regen needed for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants