perf(engine): reduce init overhead in headless capture sessions #1718
Merged
Two changes to reduce the ~2.5x captureAvgMs regression between v0.6.42 and v0.7.5 (issue #1715 / #1653):
1. Synchronous GSAP proxy flush: expose flushPendingOperations() as window.__hfFlushSync in the early stub, then call it from initializeSession before pollHfReady. In headless mode there's no UI responsiveness concern, so draining the proxy queue instantly eliminates the dominant init cost for tween-heavy compositions (previously 100 ops/rAF-tick at 33ms intervals).
2. Parallel init polls: run pollVideosReady, pollImagesReady + decodeAllImages, document.fonts.ready, and waitForOptionalTailwindReady concurrently via Promise.all instead of sequentially. These are independent DOM queries that don't depend on each other's completion.
Both optimizations apply to screenshot and BeginFrame capture modes.
Closes #1715
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The initial sync flush drained the GSAP proxy queue but left the "timelines built" signal deferred via setTimeout(0). Now the flush also force-publishes immediately when the queue is empty, so the readiness cascade (__hfTimelinesBuilding → __renderReady → __hf.duration) can complete without waiting for a deferred macrotask + poll cycle.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
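The flush-then-publish behavior described in the commit messages above can be illustrated with a small, self-contained sketch. This is not the stub's actual code: `ProxyQueueSketch`, `drainBatch`, `enqueue`, and the `timelinesBuilt` flag are hypothetical names chosen for illustration, and only `flushPendingOperations` is taken from the PR text. The sketch assumes the built signal should be withheld if an operation re-queues more work during the flush.

```typescript
// Hypothetical stand-in for the early stub's proxy queue. Only the name
// flushPendingOperations comes from the PR text; the rest is assumed.
type Operation = () => void;

class ProxyQueueSketch {
  private pending: Operation[] = [];
  // Stand-in for the "timelines built" readiness signal.
  timelinesBuilt = false;

  enqueue(op: Operation): void {
    this.pending.push(op);
  }

  get size(): number {
    return this.pending.length;
  }

  // Batched path: run at most `batchSize` operations, as one rAF tick would.
  drainBatch(batchSize: number): void {
    for (const op of this.pending.splice(0, batchSize)) op();
  }

  // Fast path: run everything queued right now, then publish the built
  // signal immediately, but only if nothing was re-queued in the meantime.
  flushPendingOperations(): void {
    for (const op of this.pending.splice(0)) op();
    if (this.pending.length === 0) this.timelinesBuilt = true;
  }
}
```

In this model a caller that does not care about UI responsiveness calls `flushPendingOperations()` once instead of waiting for repeated `drainBatch(100)` ticks, and can read the readiness flag right after the call returns.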
miguel-heygen (Collaborator) approved these changes on Jun 28, 2026 and left a comment:
Reviewed the current head f7797418 and the full three-file diff.
Specific checks:
- `packages/producer/stubs/hf-early-stub.ts:399` exposes a guarded synchronous `__hfFlushSync` that drains the queued GSAP timeline operations and immediately publishes the built signal only when the queue is empty. This preserves the existing deferred/rAF path for normal page execution while giving headless capture an explicit fast path.
- `packages/engine/src/services/frameCapture.ts:970` and `:1106` call the flush after `page.goto(..., domcontentloaded)` and before `pollHfReady` in both screenshot and BeginFrame modes, so readiness still gates rendering after the forced drain.
- `packages/engine/src/services/frameCapture.ts:987` and `:1129` parallelize only independent readiness waits; the previous image/video warning behavior and `decodeAllImages` step are preserved.
- Generated inline stub is updated with the source stub change.
CI is fully green, including Build, Test, Typecheck, CLI smoke, CodeQL, Windows render/tests, perf, preview parity, and all regression shards. No blockers found.
Verdict: APPROVE
Reasoning: The patch addresses the init-time bottleneck without changing frame-seek semantics, preserves both capture-mode readiness contracts, and has full CI/regression coverage green.
— Magi
Problem
Per-frame capture time (`captureAvgMs`) rose roughly 2.5x between v0.6.42 and v0.7.5: p50 went from ~64ms to ~162ms in production benchmarks (#1715, #1653). The regression is init-time overhead being amortized into per-frame timing, not actual per-frame cost.
Root cause
Two bottlenecks in `initializeSession`:
1. GSAP proxy queue drain via rAF: The `HF_EARLY_STUB` batches GSAP timeline operations (`.to()`, `.from()`, etc.) and drains them 100 ops per rAF tick. In BeginFrame mode, rAF ticks at the warmup loop's 33ms interval. For a composition with 8000 tweens: 80 ticks × 33ms = ~2.6 seconds of drain time. In screenshot mode, rAF ticks at its native ~16ms interval, but the drain is still the dominant init cost.
2. Sequential init polls: `pollVideosReady`, `pollImagesReady` + `decodeAllImages`, `document.fonts.ready`, and `waitForOptionalTailwindReady` ran sequentially despite being independent DOM queries.
Fix
1. Synchronous GSAP proxy flush: Expose `flushPendingOperations()` as `window.__hfFlushSync` in the early stub. Call it from `initializeSession` before `pollHfReady`. In headless mode there's no UI responsiveness concern, so draining the queue instantly eliminates the largest init-time cost for tween-heavy compositions.
2. Parallel init polls: Run the four independent readiness checks concurrently via `Promise.all` instead of sequentially. Wall-clock time drops to that of the slowest individual check instead of the sum of all four.
Both optimizations apply to screenshot and BeginFrame capture modes.
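As a rough illustration of the parallel-polls change, the sketch below contrasts awaiting checks one by one with starting them all and awaiting `Promise.all`. Every name here (`ReadinessCheck`, `makeCheck`, `runSequentially`, `runConcurrently`) is a hypothetical stand-in, and the timers merely simulate readiness waits; none of this is the engine's real code.

```typescript
// All names below are hypothetical stand-ins, not the engine's real helpers.
type ReadinessCheck = () => Promise<void>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Builds a fake check that records when it starts and when it ends.
function makeCheck(name: string, ms: number, log: string[]): ReadinessCheck {
  return async () => {
    log.push(`start:${name}`);
    await sleep(ms);
    log.push(`end:${name}`);
  };
}

// Before: each check waits for the previous one, so total time is the sum.
async function runSequentially(checks: ReadinessCheck[]): Promise<void> {
  for (const check of checks) {
    await check();
  }
}

// After: every check starts immediately, so total time is the slowest one.
async function runConcurrently(checks: ReadinessCheck[]): Promise<void> {
  await Promise.all(checks.map((check) => check()));
}
```

Steps that must stay ordered relative to each other, such as polling images and then decoding them, can be wrapped in a single check so their internal order is kept while the group runs alongside the others. Note that `Promise.all` rejects as soon as any check rejects, so a check that should only warn would need to catch its own errors; how the engine handles this is not shown in the PR text.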
Expected impact
For a composition with 8000 GSAP tweens on an AWS c6a.4xlarge (the production bench environment):
The exact improvement depends on composition complexity (tween count, media element count), but the GSAP flush alone should recover the majority of the regression for tween-heavy compositions at p50.
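To see how drain time scales with tween count, the figure quoted in the root cause (80 ticks × 33ms) can be reproduced with a simple model: batches of a fixed size, one batch per tick. The function name `estimateDrainMs` and its parameters are made up for this sketch, and the model ignores the cost of running the operations themselves.

```typescript
// Rough model (an assumption, not engine code): the queue drains in
// fixed-size batches, one batch per tick, and op execution time is ignored.
function estimateDrainMs(
  opCount: number,
  opsPerTick: number,
  tickIntervalMs: number,
): number {
  const ticks = Math.ceil(opCount / opsPerTick);
  return ticks * tickIntervalMs;
}

// 8000 tweens at 100 ops per tick is 80 ticks.
const beginFrameMs = estimateDrainMs(8000, 100, 33); // 80 * 33 = 2640
const screenshotMs = estimateDrainMs(8000, 100, 16); // 80 * 16 = 1280
console.log(beginFrameMs, screenshotMs);
```

Under this model the batched drain grows linearly with the number of queued operations, which is why the synchronous flush matters most for tween-heavy compositions.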
Testing
- `oxlint` + `oxfmt --check` clean on changed files
- `browserManager.test.ts` tests pass (21/21)
- `__hfFlushSync` call is guarded with `?.()`: no-op if the stub isn't present (e.g., non-producer page loads)
Closes #1715
— Miga