fix: skip metadata waits for injected video frames #575

Merged
miguel-heygen merged 1 commit into main from fix/reused-video-metadata on Apr 30, 2026

miguel-heygen (Collaborator) commented Apr 30, 2026

Problem

Closes #574.

On Windows with cached headless-shell Chrome, a composition that reuses the same video file in three timeline clips can fail before frame capture starts:

<video id="video1" src="1.mp4" data-start="0" muted data-duration="4" data-track-index="0" data-media-start="0"></video>
<video id="video2" src="1.mp4" data-start="4" muted data-duration="4" data-track-index="0" data-media-start="4"></video>
<video id="video3" src="1.mp4" data-start="8" muted data-duration="4" data-track-index="0" data-media-start="8"></video>

The reported render reaches video frame extraction, then dies at frame-capture initialization with:

[FrameCapture] video metadata not ready after 45000ms. Video elements must load metadata before capture starts.

The important detail is that by this stage HyperFrames has already extracted video pixels through FFmpeg. Native Chromium video metadata is only being waited on for DOM layout stability, not because Chromium is the source of rendered pixels.

Root Cause

The render pipeline has two separate media responsibilities:

  • FFmpeg extracts video frames and audio from declared media.
  • Chromium owns DOM layout and capture, while injected FFmpeg frames supply the video pixels before each captured frame.

Before this PR, every capture session still waited for every DOM <video> to reach readyState >= 1 unless the element was a native HDR exception. That made native browser media metadata a hard render prerequisite even when the browser would not decode or provide the final video pixels.

That is why the issue fails at the "25% Starting frame capture" stage: FFmpeg extraction has already succeeded, but capture initialization blocks on repeated native <video src="1.mp4"> metadata loading in cached Windows headless-shell Chrome.

There was a second constraint: the readiness wait also prevents first-frame layout bugs. If a skipped <video> has no native metadata, Chromium can use the default 300x150 intrinsic video size, which breaks layouts such as width: 100%; height: auto before the first injected frame. The fix therefore must not simply skip all video readiness waits; it must provide dimensions for any skipped videos.
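
The gating described above can be sketched as a filter over the page's videos. This is a minimal illustration and not the code in frameCapture.ts: `VideoState` and `videosBlockingCapture` are hypothetical names, and only the `readyState >= 1` threshold and the idea of a skip list come from the description.

```typescript
// Hypothetical snapshot of a DOM <video> as seen by the readiness poll.
interface VideoState {
  id: string;
  readyState: number; // HTMLMediaElement.readyState; 1 means HAVE_METADATA
}

// Returns the IDs of videos that still block capture start.
// Videos in skipIds are excluded because their pixels are injected out of band.
function videosBlockingCapture(
  videos: readonly VideoState[],
  skipIds: ReadonlySet<string>,
): string[] {
  return videos
    .filter((v) => !skipIds.has(v.id) && v.readyState < 1)
    .map((v) => v.id);
}

// Issue shape: three clips reuse 1.mp4 and never reach HAVE_METADATA.
const stuckVideos: VideoState[] = [
  { id: "video1", readyState: 0 },
  { id: "video2", readyState: 0 },
  { id: "video3", readyState: 0 },
];

// Without a skip list, all three keep blocking until the timeout fires.
const blockedBefore = videosBlockingCapture(stuckVideos, new Set<string>());

// With all three IDs skipped (FFmpeg already extracted them), nothing blocks.
const blockedAfter = videosBlockingCapture(
  stuckVideos,
  new Set<string>(["video1", "video2", "video3"]),
);
```

Skipping alone would reintroduce the 300x150 layout problem, which is why the real change pairs the skip with dimension hints.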

What This Fixes

  • Treats videos with successfully extracted FFmpeg frames and usable dimensions as out-of-band rendered video sources.
  • Skips native browser metadata readiness waits for those extracted videos because Chromium is not responsible for their pixels.
  • Passes FFmpeg-probed dimensions into capture as videoMetadataHints.
  • Applies those hints before the readiness wait in both screenshot and BeginFrame initialization paths.
  • Sets missing width / height attributes and an explicit aspect-ratio only when the element does not already provide one, preserving author styles where present.
  • Keeps native HDR video IDs in the skip list, preserving the existing HEVC/HDR behavior where Chrome may not decode the source but FFmpeg/native HDR compositing can still render it.
  • Uses one buildCaptureOptions() helper so calibration, HDR DOM capture, streaming capture, parallel capture, and sequential capture receive the same skip IDs and metadata hints.
  • Adds tests for the skip-list and metadata-hint contract.
  • Adds a Windows CI regression that reproduces the issue shape after the canary render warms the cached-browser path.
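
A rough sketch of the "fill in only what is missing" rule for hints follows. It is an assumption-laden illustration, not the PR's applyVideoMetadataHints(): `VideoHint`, `VideoLike`, `applyHints`, and `fakeVideo` are hypothetical, and this version inspects only attributes and inline style, whereas the real code may also consult stylesheet-provided values.

```typescript
// Hypothetical hint shape; the PR's CaptureVideoMetadataHint may differ.
interface VideoHint {
  id: string;
  width: number;
  height: number;
}

// Structural stand-in for the parts of HTMLVideoElement used here,
// so the sketch also runs outside a browser.
interface VideoLike {
  id: string;
  hasAttribute(name: string): boolean;
  setAttribute(name: string, value: string): void;
  style: { aspectRatio: string };
}

function applyHints(videos: readonly VideoLike[], hints: readonly VideoHint[]): void {
  const byId = new Map(hints.map((h) => [h.id, h] as const));
  for (const video of videos) {
    const hint = byId.get(video.id);
    if (!hint) continue;
    // Author-provided values win; only unset ones are filled in.
    if (!video.hasAttribute("width")) video.setAttribute("width", String(hint.width));
    if (!video.hasAttribute("height")) video.setAttribute("height", String(hint.height));
    if (video.style.aspectRatio === "") {
      video.style.aspectRatio = `${hint.width} / ${hint.height}`;
    }
  }
}

// Test double standing in for a DOM element (hypothetical helper).
function fakeVideo(
  id: string,
  attrs: Record<string, string> = {},
): VideoLike & { attrs: Record<string, string> } {
  return {
    id,
    attrs,
    hasAttribute: (name: string) => name in attrs,
    setAttribute: (name: string, value: string) => {
      attrs[name] = value;
    },
    style: { aspectRatio: "" },
  };
}

const bareVideo = fakeVideo("video1");
const sizedVideo = fakeVideo("video2", { width: "640" });
applyHints([bareVideo, sizedVideo], [
  { id: "video1", width: 320, height: 180 },
  { id: "video2", width: 320, height: 180 },
]);
```

For scale: a metadata-less <video> styled width: 100%; height: auto in a 1920px-wide container falls back to the 300x150 default ratio and lays out 960px tall, while a 16:9 source should be 1080px tall. Supplying the ratio up front avoids that first-frame difference.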

Reviewer Map

Primary files:

  • packages/producer/src/services/renderOrchestrator.ts
    • collectVideoReadinessSkipIds() includes native HDR IDs plus extracted videos that have finite positive FFmpeg dimensions.
    • collectVideoMetadataHints() converts extracted FFmpeg metadata into capture hints.
    • buildCaptureOptions() threads skipReadinessVideoIds and videoMetadataHints into every capture path.
  • packages/engine/src/services/frameCapture.ts
    • applyVideoMetadataHints() runs in the page before video readiness polling.
    • Both screenshot and BeginFrame initialization call it before checking non-skipped videos for readyState >= 1.
  • packages/engine/src/types.ts
    • Adds CaptureVideoMetadataHint and documents that readiness skips should be paired with metadata hints when layout may depend on intrinsic dimensions.
  • packages/producer/src/services/renderOrchestrator.test.ts
    • Covers that extracted videos with dimensions are skipped, invalid dimensions are not, native HDR IDs are preserved, and hints are stable/sorted.
  • .github/workflows/windows-render.yml
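
The skip-list and hint contract described for renderOrchestrator.ts can be sketched roughly as below. The `ExtractedVideo` shape, its field names, and the function names `collectSkipIds` / `collectHints` are assumptions for illustration; the PR's actual collectVideoReadinessSkipIds() and collectVideoMetadataHints() signatures are not shown on this page.

```typescript
// Hypothetical shape of one FFmpeg-extracted video.
interface ExtractedVideo {
  videoId: string;
  width: number;
  height: number;
}

interface MetadataHint {
  id: string;
  width: number;
  height: number;
}

// "Usable" here means finite and strictly positive in both dimensions.
function hasUsableDimensions(v: ExtractedVideo): boolean {
  return (
    Number.isFinite(v.width) &&
    Number.isFinite(v.height) &&
    v.width > 0 &&
    v.height > 0
  );
}

// Native HDR IDs always stay in the list; extracted videos join only
// when their probed dimensions are usable.
function collectSkipIds(
  nativeHdrIds: Iterable<string>,
  extracted: readonly ExtractedVideo[],
): string[] {
  const ids = new Set<string>(nativeHdrIds);
  for (const v of extracted) {
    if (hasUsableDimensions(v)) ids.add(v.videoId);
  }
  return Array.from(ids).sort();
}

// Hints are sorted by ID so the output is stable across runs.
function collectHints(extracted: readonly ExtractedVideo[]): MetadataHint[] {
  return extracted
    .filter(hasUsableDimensions)
    .map((v) => ({ id: v.videoId, width: v.width, height: v.height }))
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}

const extractedSample: ExtractedVideo[] = [
  { videoId: "video2", width: 320, height: 180 },
  { videoId: "video1", width: 320, height: 180 },
  { videoId: "broken", width: Number.NaN, height: 180 },
  { videoId: "empty", width: 0, height: 0 },
];

const skipIds = collectSkipIds(["hdr1"], extractedSample);
const hints = collectHints(extractedSample);
```

In this sketch, "broken" and "empty" stay out of both results, so the browser readiness wait would still apply to them.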

Why This Is Safe

The skip is intentionally gated:

  • A standard video is skipped only after extractAllVideoFrames() succeeded for that video and returned usable dimensions.
  • Videos with invalid dimensions are not skipped, so the old browser readiness guard still applies.
  • DOM videos are still present for layout and element bounds; only the native metadata wait is skipped for sources whose pixels come from FFmpeg injection.
  • Metadata hints are applied conservatively: existing width, height, and explicit aspect-ratio are not overwritten.
  • Non-extracted videos, images, fonts, page readiness, and window.__hf readiness keep the existing waits.
  • The fix is not limited to the sequential path from the issue; it is threaded through calibration, HDR DOM capture, streaming encode, parallel capture, and sequential capture.
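
The "one helper for every capture path" idea might look like the sketch below. Only the field names skipReadinessVideoIds and videoMetadataHints come from the PR description; `buildOptions`, `SharedCaptureFields`, and the path-specific fields (`mode`, `workers`) are hypothetical.

```typescript
interface Hint {
  id: string;
  width: number;
  height: number;
}

// The two fields every capture path is meant to receive identically.
interface SharedCaptureFields {
  skipReadinessVideoIds: string[];
  videoMetadataHints: Hint[];
}

// Merges path-specific options with copies of the shared fields, so no
// path can mutate another path's skip list or hints.
function buildOptions<T extends object>(
  pathSpecific: T,
  shared: SharedCaptureFields,
): T & SharedCaptureFields {
  return {
    ...pathSpecific,
    skipReadinessVideoIds: [...shared.skipReadinessVideoIds],
    videoMetadataHints: shared.videoMetadataHints.map((h) => ({ ...h })),
  };
}

const sharedFields: SharedCaptureFields = {
  skipReadinessVideoIds: ["video1", "video2", "video3"],
  videoMetadataHints: [{ id: "video1", width: 320, height: 180 }],
};

// Two hypothetical capture paths built from the same shared data.
const sequentialOptions = buildOptions({ mode: "sequential" }, sharedFields);
const parallelOptions = buildOptions({ mode: "parallel", workers: 4 }, sharedFields);
```

Routing every path through one function means a later path cannot silently miss the skip list, which is the failure mode a sequential-only fix would have left open.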

A first local revision skipped readiness too broadly and caused overlay-montage-prod first-frame layout shrinkage. The current version fixes that by pairing skips with FFmpeg metadata hints; overlay-montage-prod now passes and is listed in verification below.

Verification

Root-Cause Reproduction Before Fix

The reporter did not attach the actual 1.mp4, so the regression uses the exact issue markup and a deterministic generated 12s H.264 file named 1.mp4.

I reproduced the failure in GitHub Actions by running this branch's new Windows workflow against unpatched main:

gh workflow run windows-render.yml --repo heygen-com/hyperframes --ref fix/reused-video-metadata -f ref=main

That means the workflow contains the new issue #574 regression, but the code under test is main without this fix.

Baseline failure:

This is the same failure class as the issue, on Windows, in cached-browser mode, before the fix.

Fixed Windows Regression

The same regression passes on this PR branch:

Local Checks

  • bun run build:hyperframes-runtime
  • bunx vitest run packages/producer/src/services/renderOrchestrator.test.ts
  • bun run --filter @hyperframes/producer typecheck
  • bun run --filter @hyperframes/engine typecheck
  • bunx oxlint packages/engine/src/services/frameCapture.ts packages/engine/src/types.ts packages/engine/src/index.ts packages/producer/src/services/renderOrchestrator.ts packages/producer/src/services/renderOrchestrator.test.ts
  • bunx oxfmt --check .github/workflows/windows-render.yml packages/engine/src/services/frameCapture.ts packages/engine/src/types.ts packages/engine/src/index.ts packages/producer/src/services/renderOrchestrator.ts packages/producer/src/services/renderOrchestrator.test.ts
  • git diff --check
  • Lefthook pre-commit: lint, format, typecheck where applicable
  • Lefthook commit-msg: commitlint

Local Render Checks

  • Created /tmp/hf-issue-574-repro with the issue shape: three clips using the same 1.mp4, data-media-start=0/4/8, 12s total.
  • PRODUCER_PLAYER_READY_TIMEOUT_MS=5000 bun packages/cli/src/cli.ts render /tmp/hf-issue-574-repro --workers 1 --quality draft --fps 30 --output /tmp/hf-issue-574-h264-fixed-v2.mp4 -> completed.
  • Created /tmp/hf-issue-574-prores with the same three-clip shape using one FFmpeg-readable ProRes .mov, which exercises the browser-metadata failure class because Chromium should not be needed to decode the source.
  • PRODUCER_PLAYER_READY_TIMEOUT_MS=3000 bun packages/cli/src/cli.ts render /tmp/hf-issue-574-prores --workers 1 --quality draft --fps 30 --output /tmp/hf-issue-574-prores-fixed-v2.mp4 -> completed.
  • bun run --filter @hyperframes/producer test --sequential --keep-temp overlay-montage-prod -> passed; this guards against skipped metadata shrinking height:auto video layout before the first injected frame.
  • ffmpeg -v error -i /tmp/hf-issue-574-prores-fixed-v2.mp4 -f null -
  • ffmpeg -v error -i /tmp/hf-issue-574-h264-fixed-v2.mp4 -f null -
  • ffprobe -v error -show_entries format=duration:stream=codec_name,width,height,r_frame_rate -of json /tmp/hf-issue-574-h264-fixed-v2.mp4 -> H.264, 320x180, 30fps, 12.0s.
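
For scripting the ffprobe check above, the JSON output could be reduced to the reported summary with something like the following. This is a sketch: `parseProbe` and `ProbeSummary` are hypothetical names, and the sample JSON is hand-written to match the values reported above, not captured from a real run.

```typescript
interface ProbeSummary {
  codec: string;
  width: number;
  height: number;
  fps: number;
  durationSeconds: number;
}

// Parses the JSON printed by `ffprobe -of json` with the show_entries above.
function parseProbe(json: string): ProbeSummary {
  const data = JSON.parse(json) as {
    streams?: Array<{
      codec_name?: string;
      width?: number;
      height?: number;
      r_frame_rate?: string;
    }>;
    format?: { duration?: string };
  };
  const stream = data.streams?.find(
    (s) => typeof s.width === "number" && typeof s.height === "number",
  );
  if (!stream || typeof stream.width !== "number" || typeof stream.height !== "number") {
    throw new Error("no video stream with dimensions");
  }
  // r_frame_rate is a rational string such as "30/1".
  const parts = (stream.r_frame_rate ?? "0/1").split("/").map(Number);
  const num = parts[0] ?? 0;
  const den = parts[1] ?? 1;
  return {
    codec: stream.codec_name ?? "unknown",
    width: stream.width,
    height: stream.height,
    fps: den > 0 ? num / den : 0,
    durationSeconds: Number(data.format?.duration ?? "0"),
  };
}

// Hand-written sample in ffprobe's JSON layout (not real captured output).
const sampleProbeJson = JSON.stringify({
  streams: [{ codec_name: "h264", width: 320, height: 180, r_frame_rate: "30/1" }],
  format: { duration: "12.000000" },
});

const probeSummary = parseProbe(sampleProbeJson);
```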

Current PR Checks

Browser Verification

  • Used agent-browser to open file:///tmp/hf-issue-574-h264-fixed-v2.mp4 and verify the rendered output displays in Chromium.
  • Screenshot: .debug/issue-574/h264-output-page.png
  • Agent-browser recording: .debug/issue-574/h264-output-playback.webm

Notes / Caveats

  • The reporter's exact 1.mp4 was not attached to issue #574 ("[FrameCapture] video metadata not ready after 45000ms"). The committed Windows regression uses a generated deterministic H.264 file with the same filename and the exact markup from the issue.
  • The exact H.264 issue shape did not reproduce the timeout on this macOS/system-Chrome machine before the fix; it rendered successfully locally. The GitHub Actions baseline above reproduces it on Windows/cache without the fix.
  • The Windows fixture intentionally runs after the existing canary render so the browser path is Browser: cache, matching the reporter's environment.
  • The generated fixture emits sparse-keyframe warnings. Those warnings are expected and are not the failure being fixed; the baseline failure occurs before any frame capture because native browser video metadata never becomes ready.
  • Browser proof artifacts are local-only under .debug/issue-574/ and intentionally not committed.

miguel-heygen force-pushed the fix/reused-video-metadata branch 2 times, most recently from f6d2a0a to 79d6b41 on April 30, 2026 15:35
jrusso1020 (Collaborator) left a comment

LGTM — well-scoped fix that generalizes the existing HDR-skip path and adds proper layout-hint compensation.

Root cause analysis: the readiness wait at frameCapture.ts:initializeSession was waiting for readyState >= 1 on every <video> element. For videos whose pixels come from out-of-band FFmpeg extraction (HDR + standard), the browser never plays them — the <video> element is kept only for layout. Waiting on those is unnecessary and on Windows with multiple instances of the same MP4 source it actually times out at 45s ("video metadata not ready"). The previous HDR-only skip was the right idea but too narrow.

Fix shape:

  • Generalize the skip via collectVideoReadinessSkipIds — combines nativeHdrVideoIds with all extracted videos that have usable dimensions.
  • Compensate for the lost layout signal via videoMetadataHints + applyVideoMetadataHints — sets width/height HTML attributes and CSS aspect-ratio from FFmpeg-probed dimensions, but only when the author hasn't already set them. So author CSS/HTML wins.
  • Rename buildHdrCaptureOptions to buildCaptureOptions since it's no longer HDR-specific.

Cross-platform safety: Linux/macOS used to wait for native metadata then derive layout from it. Now they skip the wait but get explicit width/height/aspect-ratio hints. Equivalent layout outcome, faster path.

CI gate: the new step in windows-render.yml scaffolds the exact issue #574 repro (3× same MP4 source) with PRODUCER_PLAYER_READY_TIMEOUT_MS=15000. If the bug regresses, the timeout fires fast and CI catches it. Solid regression-gate shape.

One nit, non-blocking: CaptureVideoMetadataHint.durationSeconds is collected but not consumed in applyVideoMetadataHints. Probably reserved for future use (programmatic <video>.duration setting?), but worth either using or dropping in a follow-up to avoid dead-data.

Tests: the two new tests on collectVideoMetadataHints and collectVideoReadinessSkipIds cover the right cases (filtering bad dimensions, sort-stability, native HDR id passthrough). Both pass locally. Note: there's an unrelated pre-existing failure in the file ("rejects a maliciously crafted key that tries to escape compileDir" — fails on origin/main too, looks like Linux-runner sensitivity to a Windows-path test).

— Review by Rames Jusso

miguel-heygen force-pushed the fix/reused-video-metadata branch from 79d6b41 to 79671dd on April 30, 2026 16:29
@miguel-heygen miguel-heygen merged commit 6a59ef6 into main Apr 30, 2026
37 checks passed
@miguel-heygen miguel-heygen deleted the fix/reused-video-metadata branch April 30, 2026 16:51