fix(moq-gst): follow catalog updates dynamically in moqsrc#1627
Conversation
moqsrc read the hang catalog exactly once and built all its pads from that single snapshot. Reactive publishers (the browser via @moq/hang) announce an initial catalog *before* their encoder has configured, so the first frame can carry zero renditions, with the populated catalog arriving a beat later as a new group. moqsrc latched the empty snapshot, created no pads, and produced no output until the pipeline timed out -- which is exactly why every `js-* -> gst` cell in the interop smoke matrix failed while `rust -> gst` / `python -> gst` (fully-populated catalog up front) passed. Loop on catalog updates and wait for the first catalog that actually advertises a video or audio rendition, and select on shutdown so we don't hang if the publisher closes without announcing a track. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthroughrun_session is rewritten to continuously consume catalog updates and reconcile the set of desired renditions to an active pump map keyed by rendition name. New renditions get new pad ids; disappeared or changed renditions cancel and (if needed) recreate pumps with fresh ids. Pad handles are evicted when pads drop. spawn_track_pump/run_track_pump accept a rendition-specific cancel watch and a done sender; pumps exit on cancel or global shutdown and send TrackDone(rendition name, id) back to the session for completion reconciliation. 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Build on the catalog-wait fix: instead of reading the catalog once and freezing that snapshot, follow it for the whole session and reconcile pads against every update. This handles the empty-initial-catalog race (browser/@moq/hang) and also renditions that appear, disappear, or change codec/resolution mid-stream. - Reconcile loop diffs the catalog's renditions against the live pumps each update: spawn pads for new renditions, tear down vanished ones, and recreate any whose caps or container changed. - Pads are now named video_<id>/audio_<id> from a process-unique counter (so a recreated rendition never collides with the pad still being torn down) and the Drop handler evicts its map entry so long sessions don't leak handles. - Per-track cancel + a completion channel (correlated by pad id) let pumps be stopped individually and report when they end, without a stale completion from a replaced rendition evicting its successor. - Honor the catalog's per-track container (legacy/cmaf/loc) instead of assuming legacy, via the existing moq_mux Container::try_from. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@rs/moq-gst/src/source/imp.rs`:
- Around line 475-503: When the catalog track closes in run_session
(catalog_consumer.next() -> None) do not cancel active track pumps; instead stop
reconciling new catalogs and exit the catalog-processing loop without iterating
the active.drain() cancel path so existing pumps (run_track_pump) can reach
their natural Ok(None) -> PadMessage::Eos path; keep reacting to global
shutdown/pump_shutdown/done_rx so pumps still stop on those signals. Concretely:
modify the branch that currently breaks and then drains/cancels active to simply
break from the catalog loop (or set a flag like catalog_closed) and remove the
unconditional active.drain() cancel loop; ensure run_track_pump still handles
cancel.changed() producing PadMessage::Drop only for explicit cancellation, and
only send cancel.send(true) when doing a full shutdown. Also ensure reconcile
and any control_tx logic remain functional so pads can still be dropped/added
while pumps finish.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4aa606e0-f5a8-48de-866d-4865dbe7908b
📒 Files selected for processing (1)
rs/moq-gst/src/source/imp.rs
Previously a closed catalog track (catalog_consumer.next() -> None) broke the session loop and fell into the unconditional drain that cancels every pump, sending PadMessage::Drop without the PadMessage::Eos that the natural Ok(None) end path emits -- and catalog close can race ahead of the media tracks ending, so downstream could lose a clean EOS. Now catalog close just sets a flag (guarded so the closed track doesn't spin the loop) and stops reconciling; the loop keeps reacting to pump completions and global shutdown until the pumps drain themselves via their EOS path. Explicit cancellation is reserved for full shutdown and for renditions reconcile removes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
moqsrc follows the catalog for the life of the session, so a later update can add a rendition (and thus a pad) at any time -- emitting no_more_pads after the first catalog promises a complete pad set we can't honor. The Sometimes pads link on pad-added without it, and the only point we'd genuinely know no more pads are coming (the catalog track closing) is also when the streams EOS, so it carries no useful signal. Drop the emission and the now-unused ControlMessage::NoMorePads variant/handler. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g the session A rendition with an unsupported codec (or malformed CMAF init) now logs a warning and is skipped during reconcile, rather than propagating an error that tears down the whole session. Since moqsrc follows the catalog for the session's life, one bad rendition in a later catalog update could otherwise kill playback of the renditions already working. Caps and the wire container are resolved up front per rendition, so a bad one is rejected before a pad is ever requested. Also drop two em dashes from a doc comment. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ba6edd0 to
d399080
Compare
Verified end-to-end: this fixes
|
Problem
The interop smoke matrix (moq-dev/smoke) is red on every channel:
js-vite -> gst,js-esbuild -> gst, andjs-jsdelivr -> gstall fail, whilerust -> gstandpython -> gstpass and every other subscriber reads the browser broadcasts fine. Thegstcells fail as a silent ~30s timeout (no bytes), not a fast error.Root cause
moqsrc(rs/moq-gst/src/source/imp.rs) read the hang catalog exactly once and built all its pads from that single snapshot, then never looked at the catalog again:Reactive publishers — the browser via
@moq/hang— announce an initial catalog before the WebCodecs encoder has configured (camera/display dimensions unresolved), so the first catalog frame carries zero renditions; the populatedvideo/hd+video/sdcatalog arrives a beat later as a new group. Every other consumer follows catalog updates and recovers;moqsrclatched the empty snapshot, created no pads, and produced no output → the pipeline sits with no data until it times out.Matches all the evidence:
rust/pythonemit a fully-populated catalog up front (pass); browser publishers race and lose (fail); ~30s timeout not fast-fail (the catalog read succeeds, there are just no renditions); identical across apt/brew/nix/cargo (platform-independent logic bug). Independent of themoq-gst 0.2.4release, which was the zigbuild/glib packaging fix that finally ships the plugin — themoqsrclogic was unchanged.Fix
Rather than reading the catalog once,
moqsrcnow follows the catalog for the whole session and reconciles its pads against every update. This fixes the reported race and also makes the source handle catalogs changing in general:no_more_padsuntil the first real one), so the browser's empty-first-frame is handled naturally.video_<id>/audio_<id>from a process-unique counter (matching the%utemplates) instead of after the track name, so a recreated rendition never collides with the pad still being torn down. TheDrophandler also evicts its map entry so long sessions don't leak handles.moq_muxContainer::try_from, instead of hardcodingLegacy.Test
cargo check -p moq-gst✅cargo clippy -p moq-gst✅ (clean)cargo fmt -p moq-gst --check✅js-* -> gstcells (needs system GStreamer + the headless-Chromium publisher); not run locally.🤖 Generated with Claude Code