Skip to content

feat(engine,cli): drawElementImage fast-capture behind --experimental-fast-capture#1295

Open
vanceingalls wants to merge 1 commit into
mainfrom
drawelement-fast-capture
Open

feat(engine,cli): drawElementImage fast-capture behind --experimental-fast-capture#1295
vanceingalls wants to merge 1 commit into
mainfrom
drawelement-fast-capture

Conversation

@vanceingalls

@vanceingalls vanceingalls commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Summary

Adds an experimental frame-capture mode that reads DOM paint records directly via Chrome's canvas.drawElementImage API instead of Page.captureScreenshot. On a GPU it is ~46% faster than screenshot capture and is pixel-identical (PSNR = ∞). It is fully opt-in behind a new CLI flag — default behaviour is byte-unchanged.

hyperframes render --experimental-fast-capture --output out.mp4

How it works

  • New capture mode "drawelement" (alongside "beginframe" / "screenshot"). A <canvas layoutsubtree> is injected around the composition root after page-ready; each frame clears the canvas, calls drawElementImage(root, 0, 0), and reads it back.
  • Format-aware encode — emits JPEG for opaque output (mp4) and PNG for transparent output (mov/webm/png-sequence). This matters: the producer's streaming encoder pipes frames to ffmpeg as mjpeg, so opaque frames must be JPEG.
  • SwiftShader routingdrawElementImage on a transparent canvas is broken on SwiftShader (Chromium bug; promoted sub-layers are dropped). On detecting SwiftShader (Docker), transparent renders automatically fall back to Page.captureScreenshot. Opaque renders use drawElement on SwiftShader fine.
  • Page-side compositing — drawElement and page-side shader compositing are mutually incompatible capture strategies, so resolveConfig auto-disables enablePageSideCompositing when fast-capture is on (shader transitions fall back to the Node-side layered blend rather than dropping).

Wiring

CLI flag → env → engine config:

--experimental-fast-capturePRODUCER_EXPERIMENTAL_FAST_CAPTUREEngineConfig.useDrawElementresolveConfig → capture session. Threaded through the Docker render path (buildDockerRunArgs) as well.

Testing

  • Unit (drawElementService.test.ts, config.test.ts): SwiftShader detection, capture-mode routing, env resolution, page-side auto-disable. Full engine suite green (697 pass).
  • Docker / SwiftShader (validated with a local harness): T1 opaque drawElement vs screenshot baseline — PSNR = ∞; T2 transparent + SwiftShader → screenshot fallback (no blank frames); opaque + SwiftShader → drawelement.
  • End-to-end: a real producer render (--experimental-fast-capture → mp4) logs drawElement canvas injected and produces a valid mp4 — exercises env → resolveConfig → captureCfg → createCaptureSession → drawelement → ffmpeg encode. This e2e caught (and this PR fixes) the PNG-vs-JPEG encode bug above.

Scope / follow-up

  • Validation harnesses live under spikes/ (untracked, per repo convention) — not committed.
  • Cloud (cloud/render.ts) and Lambda (lambda/render.ts) render surfaces do not carry the flag yet — wired in a stacked follow-up PR. The flag silently no-ops there until then.

🤖 Generated with Claude Code

Copy link
Copy Markdown
Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

…-fast-capture

Add an experimental frame-capture mode that reads DOM paint records directly
via Chrome's canvas.drawElementImage API instead of Page.captureScreenshot
(~46% faster on GPU), gated behind --experimental-fast-capture
(env PRODUCER_EXPERIMENTAL_FAST_CAPTURE; engine config useDrawElement).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vanceingalls vanceingalls force-pushed the drawelement-fast-capture branch from e8cec72 to d2ef1f2 Compare June 9, 2026 08:27

@miguel-heygen miguel-heygen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid experimental feature. The fallback routing logic (SwiftShader detection, transparent + GPU vs Docker paths, supersampling guard, page-side compositing auto-disable) is well thought through, and the format-aware JPEG/PNG encode choice is the right call — I can see from the PR description that the PNG-vs-JPEG bug was caught by the e2e run, which is exactly why it matters. A few small gaps worth addressing before this graduates out of --experimental.

P2 — captureDrawElementFrame: split(",") is fragile

packages/engine/src/services/drawElementService.ts, the base64 extraction:

const base64 = dataUrl.split(",")[1];

Base64 can't contain commas, so this is practically safe for a toDataURL response — but the intent is to grab everything after the first comma, and split(",")[1] silently truncates if the payload ever has one. Prefer:

const commaIdx = dataUrl.indexOf(",");
if (commaIdx === -1) throw new Error("drawElement: toDataURL returned malformed data URL");
const base64 = dataUrl.slice(commaIdx + 1);

Worth fixing before graduation: it's one line and removes any ambiguity.

P3 — BeginFrame response discarded in drawelement branch

packages/engine/src/services/frameCapture.ts, captureFrameCore:

await client.send("HeadlessExperimental.beginFrame", { ... });
// response ignored

Discarding the response is intentional (we don't rely on BeginFrame for the screenshot here), but beginFrameHasDamageCount/beginFrameNoDamageCount won't increment for drawElement renders. Diagnostics and any tooling that inspects those counters post-render will silently under-report. At minimum log hasDamage so render telemetry stays accurate; ideally update the session counters so the existing diagnostic surface works.

P3 — injectDrawElementCanvas idempotency path uncovered

drawElementService.test.ts mocks page.evaluate but never exercises the early-return branch (document.getElementById("__hf_de_canvas") already exists). Add a mock test: call the function twice and assert page.evaluate was called only once.

P3 — Supersampling fallback untested

The initDrawElementOrTransparentBackground path for useDrawElement=true + deviceScaleFactor>1 (logs the warning and falls back to screenshot capture) has no unit coverage. Since the function is private, this would need a thin integration shim or an indirect test via initializeSession. Minor, but worth noting before promotion.


Cloud/Lambda follow-up is fine — the flag silently no-ops there and the description is clear about it.

→ Approve with comments. Fallback logic and encoding are correct; the gaps above are small and appropriate to address before the flag loses the --experimental prefix.

@james-russo-rames-d-jusso james-russo-rames-d-jusso left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed at d2ef1f2d. Big experimental capture-mode add (449/23, 10 files). Net: architecture is sound, opt-in plumbing is correct, the SwiftShader and page-side-compositing incompatibility cases are handled with two layers of defense (resolveConfig force-off + per-frame skip in prepareFrameForCapture), and the PSNR=∞ result vs captureScreenshot is strong evidence the visual output is byte-identical on the harnessed compositions. No blockers from my pass.

Where I have non-blocker concerns: coverage shape (CI's regression-shards run with the flag OFF, so drawelement is not exercised across the composition matrix), the Docker env-only path silently no-opping, and the <canvas layoutsubtree> HTML-in-canvas pattern (worth flagging for Miguel given his prior investigation, though this use case is render-output capture rather than interactive UI).

Concerns

  • Regression suite runs with fast-capture OFF. The PSNR=∞ result + e2e drawElement canvas injected log come from a css-spinner → mp4 run plus the SwiftShader T1/T2 harness. The CI's regression-shards matrix is great for verifying nothing breaks when the flag is off but doesn't exercise the new drawelement mode across the diverse compositions (sub-compositions, iframes, HDR, font-variation, raf-ball, etc.). The composition-root reparenting in injectDrawElementCanvas (parent.insertBefore(canvas, root); canvas.appendChild(root);) changes the root's layout-parent — compositions with position: fixed descendants, viewport-relative units referencing the prior ancestor, or that contain iframes (the iframe-render-compat style is the canonical case) could subtly diverge in a way the single-composition harness wouldn't catch. Is the plan to add a fast-capture variant of the regression matrix in a follow-up before promoting beyond EXPERIMENTAL, or is the PSNR=∞ on css-spinner considered representative? Either is defensible — just want to set expectations on coverage.

  • Docker render silently drops env-only opt-in. buildDockerRunArgs builds positional argv only — it does not forward host env vars to the container. The CLI plumbing at render.ts:610 uses args["experimental-fast-capture"] === true, which is false for an absent flag even when PRODUCER_EXPERIMENTAL_FAST_CAPTURE=true is set in the host shell. So an operator who sets only the env var and runs hyperframes render against the Docker path gets a normal screenshot render. The flag description ("Env: PRODUCER_EXPERIMENTAL_FAST_CAPTURE.") doesn't qualify that the env fallback is in-process only. This is consistent with the --low-memory-mode idiom the PR cites, so it may be intentional — but the symmetric --low-memory-mode description has the same gap. Two reasonable resolutions: (a) document that the env fallback is in-process only and Docker needs the CLI flag, or (b) read env in the Docker option-prep path and synthesize the flag when present. Either is fine; (a) is cheaper if the existing pattern is the intent.

  • <canvas layoutsubtree> is HTML-in-canvas. Surfacing for Miguel given his prior investigation on avoiding the pattern. This use case is render-output capture (no DOM events traverse the canvas, no a11y surface, no interactive UI nested inside), so the concerns from that thread probably don't apply directly — but worth a 👀 from him given the team bias against the pattern shape.

Nits

  • captureDrawElementFrame hard-codes the JPEG quality default at 80 (drawElementService.ts:106). The downstream caller is captureFrameCore which passes options.quality ?? 80. If options.quality is ever undefined from an alternate entry point, the default could diverge from pageScreenshotCapture's. Probably fine — the integration harness exercises the real call chain — but a one-line "tracks pageScreenshotCapture default" comment would pin the parity.

  • logInitPhase interpolates session.captureMode at the time of log (frameCapture.ts:997). Pre-helper phases log [initSession:beginframe] or [initSession:screenshot]; the new helper flips to drawelement, and post-helper phases log [initSession:drawelement]. The drawElement canvas injected log fires after the flip, which is the clear signal — but a future log-reader chasing a render issue may misread the prefix change mid-stream as a session restart. A one-line comment on logInitPhase calling out the mid-init flip would save 10 minutes of confused log-grepping later.

  • drawElementService.integration.test.ts is a describe.skip documenting what was validated locally. Discoverable but adds zero CI coverage and can drift silently. Two paths: (a) leave as a validation record (current shape, fine as a doc), (b) gate behind a RUN_BROWSER_INTEGRATION_TESTS=1 env and add a nightly workflow that flips it on. Your call — current shape is honest about what's validated, so this is preference territory.

  • render.ts:610 uses === true for the inline option pass; render.ts:408-412 uses != null for the env override. Asymmetry is correct (env preserves on silence, option forwards false on silence), and it tracks the --low-memory-mode idiom — but a one-liner on the option-pass line explaining why the two checks differ would help the next person threading a new flag.

Questions

  • Composition coverage of the harness. Was the validation harness exercised against sub-composition (sub-composition-video), iframe (iframe-render-compat), HDR (hdr-regression), or font-variation compositions, or only css-spinner? The composition-root reparenting is the place I'd want fidelity evidence on those styles before flipping the flag on for any production-ish surface.

  • Cloud / Lambda env state. The PR body says cloud/render.ts and lambda/render.ts "silently no-op" — but resolveConfig reads PRODUCER_EXPERIMENTAL_FAST_CAPTURE globally, so if those infra surfaces happen to have the env var set (via deploy config), fast capture does activate. Is the "silently no-op" claim about the CLI flag wiring specifically (env-via-deploy-config is a supported operator escape hatch until the follow-up PR lands), or should those surfaces actively gate the env off until then?

  • macOS path. On Mac, headlessShell is null → preMode resolves to "screenshot" regardless. The new helper then flips captureMode to "drawelement" if useDrawElement is on and no SwiftShader. Per-frame branch gates BeginFrame on beginFrameTimeTicks > 0, which stays 0 in the screenshot-launched path — so the compositor advances naturally and the comment at frameCapture.ts:1402-1404 matches the behavior. Has the validation harness exercised macOS GPU, or only the Docker SwiftShader cases + the GPU-host T3?

What I didn't verify

  • Windows render compat. Render on windows-latest + Tests on windows-latest are still in-flight at review time. drawelement doesn't gate on platform, so a Windows session with the flag set would land in the same screenshot-launched + drawelement-mode path as macOS. If Windows CI surfaces something, this'll need a look.

  • useDrawElement: true with forceScreenshot: true simultaneously. resolveConfig doesn't appear to reconcile these — forceScreenshot is honored at preMode resolution, useDrawElement is honored at init-time. A session could land in drawelement mode on a screenshot-launched browser. That's actually the design for macOS GPU, so it's not necessarily wrong — flagging as "haven't dug into the operator-confusion surface."

  • --enable-features=CanvasDrawElement interaction with multiple feature-flag groups. Verified the flag is added globally in buildChromeArgs:554 and is the only --enable-features= on the command line (the other flag is --disable-features=… which is a separate category). Chrome's behavior for multiple --enable-features= flags is version-dependent, but since there's only one, this is unambiguous.

  • The integration record's PSNR=∞ claims — taken at the integration test file's word.

Clean execution; leaving as a comment.

Review by Rames D Jusso

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants