feat(distributed): add optional cfr flag for exact constant frame rate output#1054
Conversation
Distributed-render output today uses -c:v copy through concat → mux → faststart, which means PTS timestamps from each chunk pass through unchanged. Container r_frame_rate is exact (#1040 + this PR's parent), but stream-level avg_frame_rate stays PTS-derived and can land on fractional rationals like 27648000/921677 over a 60s render. Same for sub-ms duration drift. This is the achievable bar within -c copy stream-copy concat. For most consumers (browser playback, YouTube, etc.) the difference is invisible. For downstream tools that strict-check avg_frame_rate or ms-precision duration (broadcast workflows, frame-accurate compositors, some third-party transcoders), it matters. Adds an opt-in cfr config flag (default false). When true, the assemble step's final pass re-encodes with -fps_mode cfr -r <fps> instead of -c copy, producing exact CFR output. Trade-off: ~2-5x the stitch time for a 60s 1080p clip; second-generation H.264 quality loss is negligible at -crf 18 but is non-zero.
miguel-heygen
left a comment
There was a problem hiding this comment.
Review: Magi
Well-designed opt-in feature — flag threading, format guard, test coverage, and default preservation are all solid. Two issues worth addressing before merge:
Blocking: cfr re-encode hardcodes libx264, silently transcodes h265 to h264
The cfr re-encode step hardcodes -c:v libx264. DistributedRenderConfig supports codec: "h265" which encodes chunks with libx265. When a user configures { format: "mp4", codec: "h265", cfr: true }, the chunks are h265 but the cfr step transcodes them to h264 with no warning.
Simplest fix — throw a typed error when cfr + h265, similar to the webm/mov guard:
const encoderJson = JSON.parse(readFileSync(join(planDir, "meta", "encoder.json"), "utf-8"));
if (options?.cfr && encoderJson.encoder === "libx265-software") {
throw new Error(
`[assemble] cfr=true with libx265 is not yet supported. Use codec="h264" or cfr=false.`
);
}Non-blocking: missing -an flag in cfr re-encode
The cfr args specify -c:v libx264 but no explicit -an. The concat output is currently video-only so this is safe, but if a future change introduced audio in the intermediate, ffmpeg would try default audio encoding. Adding -an makes the video-only intent explicit.
Everything else looks good
-crf 18is appropriate for a second-gen re-encode (< 0.5 dB PSNR loss, below perceptual threshold)-preset mediumis the right balance for a one-shot stitch step-fps_mode cfr+-r <fps>is the correct flag combo for exact constant frame timing- mp4-only guard is correct and well-placed
planHashexclusion is correct — chunks render identically regardless of cfr- Tests assert both
r_frame_rateandavg_frame_rate==="30/1"with ms-precision duration
— Magi
The cfr re-encode pass hardcodes `-c:v libx264`. Pairing it with `codec: "h265"` would silently transcode the h265 chunks to h264. Detect the encoder discriminant in `meta/encoder.json` and throw a typed error parallel to the existing non-mp4 format guard, so callers surface the conflict instead of producing a wrong-codec deliverable. — Rames Jusso
|
@miguel-heygen — addressed the blocking item in 270b891.
Leaving the non-blocking — Rames Jusso |
miguel-heygen
left a comment
There was a problem hiding this comment.
Re-review: Magi
Both issues from my first review are addressed:
-
h265 guard (was blocking) — Fixed.
assemble()now readsmeta/encoder.jsonand throws a typed error whencfr=true+libx265-software, preventing silent h264 transcode. The error message is clear and actionable. Test pins it with a plan staged aslibx265-software.buildPlanDirhelper extended withencoderparam — clean. -
-anflag (was non-blocking) — Not added, which is fine. The concat intermediate is video-only by construction.
Everything else from the first review still holds: crf 18, preset medium, fps_mode cfr, format guard, planHash exclusion, threading — all solid.
LGTM ✓
— Magi
miguel-heygen
left a comment
There was a problem hiding this comment.
Re-approving after rebase — same verdict as my re-review. h265 guard, format guard, cfr threading, tests all look good. LGTM.
— Magi
What
Adds an optional
cfrboolean to the render config. Whentrue, the assemble step re-encodes with-fps_mode cfrinstead of-c copy, producing output with exactavg_frame_rate+ exact duration. Defaultfalsepreserves current behavior.Why
The v0.6.39 fix (#1040) tightened
r_frame_rate(container metadata) at the concat-demuxer-copy boundary, which is the achievable bar within stream-copy concat. Stream-levelavg_frame_rate(PTS-derived) and sub-ms duration drift are inherent to-c copy-chunk-stitching and can't be eliminated without re-encoding.For most consumers — browser playback, video-sharing platforms, human-facing pipelines — the difference between
r_frame_rate=30/1, avg_frame_rate=27648000/921677andr_frame_rate=30/1, avg_frame_rate=30/1is invisible. For consumers that strict-checkavg_frame_rate(broadcast workflows, frame-accurate compositors, some third-party transcoders), it matters.This PR makes the trade-off opt-in:
cfr: false): current behavior. Fast stitch (~1-3s for 60s 1080p),-c:v copyretention, PTS-level drift accepted.cfr: true): re-encode at the assemble stage. ~2-5x the stitch time (~6-18s for 60s 1080p), exact CFR output, second-generation H.264 quality loss negligible at-crf 18but non-zero.How
cfr?: booleanfield onDistributedRenderConfig(producer) andAssembleEvent(aws-lambda).assemble()accepts a newoptions.cfrflag.cfr: true, after the concat / single-chunk step, an additional ffmpeg pass re-encodes with-c:v libx264 -preset medium -crf 18 -pix_fmt yuv420p -fps_mode cfr -r <fps>before audio mux and faststart.avg_frame_rate, so cfr on those formats throws a typed error rather than silently re-encoding.event.Cfr === trueand forwards to the primitive.Test plan
cfr: trueoutput hasavg_frame_rate === r_frame_rateexact (and equals the requested fps), and duration matchesframes / fpswithin 1ms.cfr: trueon non-mp4 formats throws a clear error.cfr: false/ unset tests still pass.Notes
This PR pairs with #1053 (single-chunk pass-through
-r <fps>fix), which ensuresr_frame_rateis exact even on 1-chunk renders. With both PRs landed, the achievable exactness across chunk counts + opt-in modes is:r_frame_rateavg_frame_ratecfr: false) — multi-chunkcfr: false) — 1-chunkcfr: true) — any chunk count