Skip to content

fix(cli): pre-flight FFmpeg check and propagate render failure stage#1149

Merged
miguel-heygen merged 2 commits into
mainfrom
fix/cli-render-preflight
Jun 1, 2026
Merged

fix(cli): pre-flight FFmpeg check and propagate render failure stage#1149
miguel-heygen merged 2 commits into
mainfrom
fix/cli-render-preflight

Conversation

@miguel-heygen
Copy link
Copy Markdown
Collaborator

Summary

  • Adds an FFmpeg availability check at the start of renderLocal() — the render now fails immediately with a platform-specific install hint instead of crashing mid-pipeline with a cryptic ENOENT
  • Threads job.failedStage through to trackRenderError() so render failure telemetry includes which pipeline stage actually failed (capture, encode, assemble, etc.) — previously this was always undefined for CLI renders
  • Does not add the FFmpeg check to renderDocker() since Docker containers bundle their own FFmpeg

Test plan

  • Run hyperframes render on a machine without FFmpeg installed — should see a clear error box with install instructions instead of a stack trace
  • Run hyperframes render with FFmpeg installed — should work as before
  • Trigger a render failure (e.g., invalid composition) and verify failed_stage appears in the render_error telemetry event

…r errors

Add an early FFmpeg availability check in renderLocal() so users get a
clear error message before the render starts instead of a cryptic ENOENT
mid-render. Also thread job.failedStage through handleRenderError into
the render_error telemetry event so we can attribute failures to a
specific pipeline stage.
Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed at 976965eb. Both halves of the fix are well-motivated and the dashboard-pattern they unblock is real, but the FFmpeg pre-flight is added in a second location while the existing one stays, creating a duplicate check. One non-blocking test-environment concern alongside it.

failedStage threading — clean

Verified end-to-end:

  • RenderJob.failedStage?: string already exists on renderOrchestrator.ts:392
  • Producer already sets it at renderOrchestrator.ts:2119 (job.failedStage = job.currentStage)
  • trackRenderError already has failedStage?: string on its props (telemetry/events.ts:93) and forwards as failed_stage to the event payload (:106)
  • Studio path already passes it through (studioRenderTelemetry.ts:87,99 + a test asserting failedStage === "encode" at studioRenderTelemetry.test.ts:149)

This PR fills the symmetric gap on the CLI path. Previously CLI render-error events landed in PostHog with failed_stage: undefined. After this PR they'll carry the same stage label that Studio renders already do, so the dashboard's stage-breakdown filter starts working for both surfaces. Good fix. ✓

Duplicate pre-flight — likely a real issue

The PR adds:

// renderLocal(), line 791
const producer = await loadProducer();

if (!findFFmpeg()) {
  errorBox("FFmpeg not found", ..., getFFmpegInstallHint());
  process.exit(1);
}

But the existing run() handler already has the same check at line 440 (the function that calls renderLocal() 100 lines later):

// run(), line 440
if (!useDocker) {
  const { findFFmpeg, getFFmpegInstallHint } = await import("../browser/ffmpeg.js");
  if (!findFFmpeg()) {
    errorBox("FFmpeg not found", "Rendering requires FFmpeg for video encoding.", `Install: ${getFFmpegInstallHint()}`);
    process.exit(1);
  }
}

renderLocal() is called from render.ts:542 — inside run() — so the existing pre-flight already protects the user-facing CLI path. The new in-renderLocal() check runs the same which ffmpeg again seconds later.

Two ways to resolve:

  1. Move the pre-flight INTO renderLocal (delete the one at line 440). This is the right shape if you treat renderLocal as a public API surface that should self-validate (the export name + test file at render.test.ts both suggest it is). Then every caller of renderLocal — current and future — gets the ffmpeg validation automatically.
  2. Drop the new check (keep the line-440 one). Simpler diff if renderLocal is treated as an internal implementation detail of run().

Either works; the duplicate is the part to fix. I'd lean toward option 1 — the error-message shape on the new check is slightly nicer ("FFmpeg is required to encode video. The render cannot proceed without it." vs. "Rendering requires FFmpeg for video encoding."), so deleting the line-440 check + keeping the new one preserves the better copy.

Test environment hidden coupling

renderLocal()'s unit tests (render.test.ts:27) don't mock findFFmpeg. The new pre-flight calls execSync('which ffmpeg') directly. If a CI runner (or dev's machine) doesn't have ffmpeg on PATH, every renderLocal test would call errorBox + process.exit(1) mid-test, which terminates the vitest worker.

This is currently fine because .github/actions/install-ffmpeg-windows/action.yml exists (and the Linux runner presumably has ffmpeg via apt). But it creates a hidden coupling: "renderLocal tests pass" now depends on "ffmpeg is on PATH in the test runner." A new contributor running bun test locally without ffmpeg installed would see render.test.ts fail with an unhelpful error.

Non-blocking fix: mock findFFmpeg in render.test.ts via vi.mock("../browser/ffmpeg.js", () => ({ findFFmpeg: () => "/usr/bin/ffmpeg", getFFmpegInstallHint: () => "..." })) so the tests are hermetic.

Smaller observations

  • findFFmpeg() only checks PATH presence, not that the binary actually works (no ffmpeg -version verification). A broken/truncated/version-mismatched FFmpeg would pass the pre-flight + still fail mid-pipeline. The dashboard correlation would still show the failure, just at a later stage. Not a blocker for this PR — the PATH check catches the most common case (FFmpeg uninstalled entirely) — but hyperframes doctor could grow a --strict mode that runs execSync('ffmpeg -version') separately. Follow-up territory.
  • No automated test pinning the new behavior. The PR's test plan is three unchecked manual-run boxes. The pre-flight is easy to unit test (mock findFFmpeg); the failedStage propagation is already covered by the studio side at studioRenderTelemetry.test.ts:149, so a symmetric CLI-side assertion would be cheap. Non-blocking — the failedStage change is trivial enough that the studio test coverage transfers — but worth pinning at least the pre-flight branch.

Severity walk

Without this fix:

  • FFmpeg missing: user runs hyperframes render, gets ENOENT mid-pipeline (or worse, a captured-frames-but-no-encode partial state). The line-440 check already protects this for run()-invoked paths; the duplicate concern is about additional callers + the cleaner UX. Severity-of-the-bug-being-fixed is medium for line-440 callers (covered), high for any direct-renderLocal consumer (uncovered).
  • failed_stage always undefined: dashboard can't filter CLI render errors by pipeline stage. Diagnostic-only bug; users don't see it. Severity-medium for the team's observability, not the user.

Verdict

Materially LGTM on failedStage threading. The new pre-flight should consolidate with the existing one rather than ship alongside it. Test hermeticism (mock findFFmpeg) is nice-to-have. Stamp held — James gates.

— Rames Jusso (hyperframes)

Remove the duplicate findFFmpeg() check from run() — renderLocal()
already validates FFmpeg availability before starting. Single source of
truth.
@miguel-heygen miguel-heygen merged commit 5697e4a into main Jun 1, 2026
33 of 34 checks passed
@miguel-heygen miguel-heygen deleted the fix/cli-render-preflight branch June 1, 2026 20:44
Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failedStage threading is clean and correct. The pre-flight move is the right structural call. One confirmed blocker: the Test CI job is red.


failedStage threading — LGTM ✓

Verified end-to-end:

  • RenderJob.failedStage?: string is set at renderOrchestrator.ts:2646 (job.failedStage = job.currentStage) before the error is re-thrown — field is populated when the CLI catch block reads it.
  • trackRenderError already has the failedStage? prop and forwards it as failed_stage in the event payload.
  • Previously CLI render-error events always landed with failed_stage: undefined. After this PR they carry the same stage label Studio renders already do. Correct, minimal blast radius.

Pre-flight placement — correct move, wrong test coverage ← BLOCKER

Moving the FFmpeg check into renderLocal() is the right shape — self-validating at entry means every caller gets it automatically. The error message copy is also better.

But the Test CI job is failing because render.test.ts calls renderLocal() directly without mocking findFFmpeg. The Linux CI runner doesn't have FFmpeg on PATH, so findFFmpeg() returns undefined and process.exit(1) fires mid-test — 4+ unrelated tests cascade-fail.

Fix — add a vi.mock to render.test.ts (same pattern as the existing loadProducer and telemetry mocks):

vi.mock("../browser/ffmpeg.js", () => ({
  findFFmpeg: vi.fn(() => "/usr/bin/ffmpeg"),
  getFFmpegInstallHint: vi.fn(() => "brew install ffmpeg"),
}));

Once the mock is in, a dedicated test for the pre-flight branch is cheap and worth adding.


Smaller observations (non-blocking)

Duplicate pre-flightrun() still has the original findFFmpeg() check from before this PR. Now that renderLocal() validates itself, that copy is dead code on the renderLocal path — worth removing to avoid two-sources-of-truth.

renderDocker catch doesn't thread failedStage — acceptable, no RenderJob object on the Docker path. A comment marking the asymmetry intentional would help future readers.

normalizeErrorMessage upgrade — quiet defensive improvement to handleRenderError. Good cleanup.


The failedStage half is merge-ready on its own. The pre-flight half is structurally correct but breaks CI. Add the findFFmpeg mock, confirm Test goes green, and this ships.

— Vai

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants