
fix(moq-gst): follow catalog updates dynamically in moqsrc #1627

Merged
kixelated merged 5 commits into main from fix/moqsrc-catalog-wait-for-renditions
Jun 6, 2026

@kixelated
Collaborator

@kixelated kixelated commented Jun 4, 2026

Problem

The interop smoke matrix (moq-dev/smoke) is red on every channel: js-vite -> gst, js-esbuild -> gst, and js-jsdelivr -> gst all fail, while rust -> gst and python -> gst pass and every other subscriber reads the browser broadcasts fine. The gst cells fail as a silent ~30s timeout (no bytes), not a fast error.

Root cause

moqsrc (rs/moq-gst/src/source/imp.rs) read the hang catalog exactly once and built all its pads from that single snapshot, then never looked at the catalog again:

let catalog = catalog.next().await?.context("catalog missing")?.clone();
for (track_name, config) in catalog.video.renditions {}

Reactive publishers — the browser via @moq/hang — announce an initial catalog before the WebCodecs encoder has configured (camera/display dimensions unresolved), so the first catalog frame carries zero renditions; the populated video/hd + video/sd catalog arrives a beat later as a new group. Every other consumer follows catalog updates and recovers; moqsrc latched the empty snapshot, created no pads, and produced no output → the pipeline sits with no data until it times out.
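
A minimal sketch of the race, using only the standard library. `Catalog`, `read_once`, `follow`, and `browser_like_updates` are hypothetical stand-ins for illustration, not the moq-gst or hang API: a plain channel plays the role of the catalog track, and the first snapshot is empty just as a browser publisher's would be.

```rust
use std::sync::mpsc::{channel, Receiver};

/// Hypothetical stand-in for a catalog snapshot: only rendition names.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Catalog {
    pub video: Vec<String>,
    pub audio: Vec<String>,
}

impl Catalog {
    pub fn is_empty(&self) -> bool {
        self.video.is_empty() && self.audio.is_empty()
    }
}

/// Old behaviour: take the first snapshot and never look again.
pub fn read_once(updates: &Receiver<Catalog>) -> Option<Catalog> {
    updates.recv().ok()
}

/// New behaviour (simplified): hand every snapshot to the caller until
/// the update stream closes.
pub fn follow(updates: &Receiver<Catalog>, mut on_update: impl FnMut(&Catalog)) {
    while let Ok(catalog) = updates.recv() {
        on_update(&catalog);
    }
}

/// Simulates a reactive publisher: an empty catalog first, then the
/// populated one as a later update.
pub fn browser_like_updates() -> Receiver<Catalog> {
    let (tx, rx) = channel();
    tx.send(Catalog::default()).unwrap();
    tx.send(Catalog {
        video: vec!["video/hd".to_string(), "video/sd".to_string()],
        audio: Vec::new(),
    })
    .unwrap();
    // The sender is dropped on return, so the stream ends after two updates.
    rx
}
```

With this setup `read_once` returns the empty snapshot and would build no pads, while `follow` also observes the second update carrying `video/hd` and `video/sd`.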

Matches all the evidence: rust/python emit a fully-populated catalog up front (pass); browser publishers race and lose (fail); ~30s timeout not fast-fail (the catalog read succeeds, there are just no renditions); identical across apt/brew/nix/cargo (platform-independent logic bug). Independent of the moq-gst 0.2.4 release, which was the zigbuild/glib packaging fix that finally ships the plugin — the moqsrc logic was unchanged.

Fix

Rather than reading the catalog once, moqsrc now follows the catalog for the whole session and reconciles its pads against every update. This fixes the reported race and also makes the source handle catalogs changing in general:

  • Reconcile loop diffs the catalog's renditions against the live pumps on each update: spawn pads for newly announced renditions, tear down vanished ones, and recreate any whose caps or container changed. An empty catalog simply yields no pads (and we hold off no_more_pads until the first real one), so the browser's empty-first-frame is handled naturally.
  • Unique pad ids: pads are named video_<id>/audio_<id> from a process-unique counter (matching the %u templates) instead of after the track name, so a recreated rendition never collides with the pad still being torn down. The Drop handler also evicts its map entry so long sessions don't leak handles.
  • Per-track teardown: each pump gets its own cancel signal and reports completion over a channel correlated by pad id, so a pump can be stopped individually and a stale completion from a replaced rendition can't evict its successor.
  • Honor the per-track container (legacy/cmaf/loc) from the catalog via the existing moq_mux Container::try_from, instead of hardcoding Legacy.
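
The reconcile step and the unique pad ids can be sketched together as below. This is an illustration under assumptions, not the code in imp.rs: `Desired`, `Active`, `Action`, and `reconcile` are hypothetical names, caps and container are plain strings, and real pad creation and teardown are reduced to returned actions.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Process-unique pad id source (hypothetical; mirrors the counter idea).
static NEXT_PAD_ID: AtomicU64 = AtomicU64::new(0);

fn next_pad_id() -> u64 {
    NEXT_PAD_ID.fetch_add(1, Ordering::Relaxed)
}

/// What must stay equal for a rendition to keep its existing pump.
#[derive(Clone, Debug, PartialEq)]
pub struct Desired {
    pub caps: String,      // stand-in for GStreamer caps
    pub container: String, // "legacy", "cmaf" or "loc"
}

/// A live pump, stored in the active map under its rendition name.
#[derive(Debug)]
pub struct Active {
    pub pad_id: u64,
    pub desired: Desired,
}

#[derive(Debug, PartialEq)]
pub enum Action {
    Spawn { name: String, pad_id: u64 },
    Teardown { name: String, pad_id: u64 },
}

/// Diff the catalog's renditions against the live pumps.
pub fn reconcile(
    active: &mut HashMap<String, Active>,
    desired: &HashMap<String, Desired>,
) -> Vec<Action> {
    let mut actions = Vec::new();

    // Tear down renditions that vanished or whose caps/container changed.
    let stale: Vec<String> = active
        .iter()
        .filter(|(name, pump)| desired.get(*name) != Some(&pump.desired))
        .map(|(name, _)| name.clone())
        .collect();
    for name in stale {
        if let Some(old) = active.remove(&name) {
            actions.push(Action::Teardown { name, pad_id: old.pad_id });
        }
    }

    // Spawn every desired rendition that has no live pump; a changed
    // rendition lands here too and receives a fresh pad id.
    for (name, want) in desired {
        if !active.contains_key(name) {
            let pad_id = next_pad_id();
            active.insert(name.clone(), Active { pad_id, desired: want.clone() });
            actions.push(Action::Spawn { name: name.clone(), pad_id });
        }
    }
    actions
}
```

An empty `desired` map against an empty `active` map yields no actions, which is how the empty first catalog falls out naturally. Because a recreated rendition always draws a new id from the counter, its pad name cannot clash with the pad that is still being torn down.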

Test

  • cargo check -p moq-gst
  • cargo clippy -p moq-gst ✅ (clean)
  • cargo fmt -p moq-gst --check
  • End-to-end confirmation is the smoke matrix's js-* -> gst cells (needs system GStreamer + the headless-Chromium publisher); not run locally.

🤖 Generated with Claude Code

moqsrc read the hang catalog exactly once and built all its pads from that
single snapshot. Reactive publishers (the browser via @moq/hang) announce an
initial catalog *before* their encoder has configured, so the first frame can
carry zero renditions, with the populated catalog arriving a beat later as a
new group. moqsrc latched the empty snapshot, created no pads, and produced no
output until the pipeline timed out -- which is exactly why every `js-* -> gst`
cell in the interop smoke matrix failed while `rust -> gst` / `python -> gst`
(fully-populated catalog up front) passed.

Loop on catalog updates and wait for the first catalog that actually advertises
a video or audio rendition, and select on shutdown so we don't hang if the
publisher closes without announcing a track.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented Jun 4, 2026

Walkthrough

run_session is rewritten to continuously consume catalog updates and reconcile the set of desired renditions to an active pump map keyed by rendition name. New renditions get new pad ids; disappeared or changed renditions cancel and (if needed) recreate pumps with fresh ids. Pad handles are evicted when pads drop. spawn_track_pump/run_track_pump accept a rendition-specific cancel watch and a done sender; pumps exit on cancel or global shutdown and send TrackDone(rendition name, id) back to the session for completion reconciliation.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix(moq-gst): follow catalog updates dynamically in moqsrc' directly describes the main change: moqsrc now follows catalog updates instead of reading once. |
| Description check | ✅ Passed | The description comprehensively explains the problem (catalog read once, empty renditions on browser publishers), root cause, fix implementation details, and testing performed. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Build on the catalog-wait fix: instead of reading the catalog once and freezing
that snapshot, follow it for the whole session and reconcile pads against every
update. This handles the empty-initial-catalog race (browser/@moq/hang) and also
renditions that appear, disappear, or change codec/resolution mid-stream.

- Reconcile loop diffs the catalog's renditions against the live pumps each
  update: spawn pads for new renditions, tear down vanished ones, and recreate
  any whose caps or container changed.
- Pads are now named video_<id>/audio_<id> from a process-unique counter (so a
  recreated rendition never collides with the pad still being torn down) and the
  Drop handler evicts its map entry so long sessions don't leak handles.
- Per-track cancel + a completion channel (correlated by pad id) let pumps be
  stopped individually and report when they end, without a stale completion from
  a replaced rendition evicting its successor.
- Honor the catalog's per-track container (legacy/cmaf/loc) instead of assuming
  legacy, via the existing moq_mux Container::try_from.
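
To illustrate why completions are correlated by pad id, here is a small sketch. `TrackDone` and `handle_done` are hypothetical shapes (the walkthrough only says the pump reports a rendition name and an id), and the active map is reduced to rendition name to current pad id.

```rust
use std::collections::HashMap;

/// Completion report sent by a pump when it ends (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct TrackDone {
    pub name: String,
    pub pad_id: u64,
}

/// Evict the active entry only when the report comes from the pump that
/// currently owns the rendition name. Returns true if an entry was evicted.
pub fn handle_done(active: &mut HashMap<String, u64>, done: &TrackDone) -> bool {
    let is_current = active.get(&done.name) == Some(&done.pad_id);
    if is_current {
        active.remove(&done.name);
    }
    // A report carrying an older pad id is stale (its pump was replaced)
    // and is ignored, so the successor keeps its entry.
    is_current
}
```

If `video/hd` was recreated and now owns pad id 1, a late `TrackDone` for pad id 0 leaves the map untouched, while a `TrackDone` for pad id 1 removes the entry.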

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@kixelated kixelated changed the title from "fix(moq-gst): wait for a non-empty catalog in moqsrc" to "fix(moq-gst): follow catalog updates dynamically in moqsrc" Jun 4, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rs/moq-gst/src/source/imp.rs`:
- Around line 475-503: When the catalog track closes in run_session
(catalog_consumer.next() -> None) do not cancel active track pumps; instead stop
reconciling new catalogs and exit the catalog-processing loop without iterating
the active.drain() cancel path so existing pumps (run_track_pump) can reach
their natural Ok(None) -> PadMessage::Eos path; keep reacting to global
shutdown/pump_shutdown/done_rx so pumps still stop on those signals. Concretely:
modify the branch that currently breaks and then drains/cancels active to simply
break from the catalog loop (or set a flag like catalog_closed) and remove the
unconditional active.drain() cancel loop; ensure run_track_pump still handles
cancel.changed() producing PadMessage::Drop only for explicit cancellation, and
only send cancel.send(true) when doing a full shutdown. Also ensure reconcile
and any control_tx logic remain functional so pads can still be dropped/added
while pumps finish.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4aa606e0-f5a8-48de-866d-4865dbe7908b

📥 Commits

Reviewing files that changed from the base of the PR and between 0cf7625 and f543129.

📒 Files selected for processing (1)
  • rs/moq-gst/src/source/imp.rs

kixelated and others added 3 commits June 4, 2026 17:01
Previously a closed catalog track (catalog_consumer.next() -> None) broke the
session loop and fell into the unconditional drain that cancels every pump,
sending PadMessage::Drop without the PadMessage::Eos that the natural
Ok(None) end path emits -- and catalog close can race ahead of the media tracks
ending, so downstream could lose a clean EOS.

Now catalog close just sets a flag (guarded so the closed track doesn't spin the
loop) and stops reconciling; the loop keeps reacting to pump completions and
global shutdown until the pumps drain themselves via their EOS path. Explicit
cancellation is reserved for full shutdown and for renditions reconcile removes.
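
The control flow described above can be modelled as a synchronous event loop. This is a simplified sketch with hypothetical `Event`, `Exit`, and `Outcome` types; the real session selects over async channels, and here "cancelled" stands for the Drop path and "ended_naturally" for the EOS path.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    CatalogUpdate(Vec<String>),
    CatalogClosed,
    PumpDone(String),
    Shutdown,
}

#[derive(Debug, PartialEq)]
pub enum Exit {
    Drained,
    Shutdown,
    EventsExhausted,
}

#[derive(Debug)]
pub struct Outcome {
    pub exit: Exit,
    pub cancelled: Vec<String>,       // pumps stopped explicitly
    pub ended_naturally: Vec<String>, // pumps that reached their own end
}

pub fn run_session(events: impl IntoIterator<Item = Event>) -> Outcome {
    let mut active: Vec<String> = Vec::new();
    let mut catalog_closed = false;
    let mut cancelled = Vec::new();
    let mut ended_naturally = Vec::new();

    for event in events {
        match event {
            // Reconcile only while the catalog track is still open.
            Event::CatalogUpdate(names) if !catalog_closed => {
                for name in active.iter().filter(|n| !names.contains(*n)) {
                    cancelled.push(name.clone());
                }
                active = names;
            }
            Event::CatalogUpdate(_) => {}
            // Closing the catalog sets a flag; running pumps are left alone.
            Event::CatalogClosed => catalog_closed = true,
            Event::PumpDone(name) => {
                if let Some(pos) = active.iter().position(|n| *n == name) {
                    active.remove(pos);
                    ended_naturally.push(name);
                }
            }
            // Full shutdown is the only place every pump is cancelled.
            Event::Shutdown => {
                cancelled.extend(active.drain(..));
                return Outcome { exit: Exit::Shutdown, cancelled, ended_naturally };
            }
        }
        if catalog_closed && active.is_empty() {
            return Outcome { exit: Exit::Drained, cancelled, ended_naturally };
        }
    }
    Outcome { exit: Exit::EventsExhausted, cancelled, ended_naturally }
}
```

Feeding it a catalog update, then `CatalogClosed`, then one `PumpDone` per rendition ends with `Exit::Drained` and an empty `cancelled` list, which corresponds to downstream receiving a clean EOS rather than a dropped pad.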

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
moqsrc follows the catalog for the life of the session, so a later update can
add a rendition (and thus a pad) at any time -- emitting no_more_pads after the
first catalog promises a complete pad set we can't honor. The Sometimes pads
link on pad-added without it, and the only point we'd genuinely know no more
pads are coming (the catalog track closing) is also when the streams EOS, so
it carries no useful signal. Drop the emission and the now-unused
ControlMessage::NoMorePads variant/handler.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g the session

A rendition with an unsupported codec (or malformed CMAF init) now logs a warning
and is skipped during reconcile, rather than propagating an error that tears down
the whole session. Since moqsrc follows the catalog for the session's life, one bad
rendition in a later catalog update could otherwise kill playback of the renditions
already working.

Caps and the wire container are resolved up front per rendition, so a bad one is
rejected before a pad is ever requested. Also drop two em dashes from a doc comment.
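
A sketch of the skip-instead-of-fail behaviour, under assumptions: `Container`, `Resolved`, `resolve`, and `resolve_all` are hypothetical, the codec-to-caps mapping is illustrative only, and warnings are collected into a vector rather than logged.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Container {
    Legacy,
    Cmaf,
    Loc,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Resolved {
    pub name: String,
    pub caps: String,
    pub container: Container,
}

/// Resolve caps and container for one rendition before any pad exists.
fn resolve(name: &str, codec: &str, container: &str) -> Result<Resolved, String> {
    // Illustrative mapping only; the real plugin derives caps from the
    // catalog's codec configuration.
    let caps = match codec {
        "avc1" => "video/x-h264",
        "opus" => "audio/x-opus",
        other => return Err(format!("{name}: unsupported codec {other}")),
    };
    let container = match container {
        "legacy" => Container::Legacy,
        "cmaf" => Container::Cmaf,
        "loc" => Container::Loc,
        other => return Err(format!("{name}: unknown container {other}")),
    };
    Ok(Resolved { name: name.to_string(), caps: caps.to_string(), container })
}

/// Resolve every rendition as (name, codec, container); failures become
/// warnings instead of aborting the whole session.
pub fn resolve_all(renditions: &[(&str, &str, &str)]) -> (Vec<Resolved>, Vec<String>) {
    let mut ok = Vec::new();
    let mut warnings = Vec::new();
    for (name, codec, container) in renditions {
        match resolve(name, codec, container) {
            Ok(resolved) => ok.push(resolved),
            Err(reason) => warnings.push(reason),
        }
    }
    (ok, warnings)
}
```

Only the entries in the first returned vector would go on to request pads, so one bad rendition in a later update cannot disturb the pumps that are already playing.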

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@kixelated
Collaborator Author

Verified end-to-end: this fixes browser → gst

Ran the actual js-* → gst cells locally (the ones the PR description couldn't run), using the in-tree smoke harness from #1639 with MOQ_GST_PLUGIN_DIR pointed at a moqsrc built from this branch. Same machine, same nix GStreamer 1.26.11, same published @moq/publish browser packages, same relay/cli — the only variable is moqsrc.

Baseline (main, no fix):

rust -> gst       PASS
js-vite -> gst     FAIL   (silent ~30s timeout, no bytes)

This branch (d399080f):

rust -> gst        PASS
js-vite -> gst     PASS
js-esbuild -> gst  PASS
js-jsdelivr -> gst PASS

Matches the diagnosis exactly: the reconcile loop picks up the populated catalog the browser announces a beat after its empty first frame, so moqsrc now builds pads and produces output instead of latching the empty snapshot. The rust -> gst control still passes, so nothing regressed for publishers that emit a full catalog up front.

Once this merges and a moq-gst release ships, browser -> gst goes green in the nightly matrix (#1639), and that cell can come back into the PR-smoke set.

(Written by Claude)

@kixelated kixelated merged commit 28d9423 into main Jun 6, 2026
2 checks passed
@kixelated kixelated deleted the fix/moqsrc-catalog-wait-for-renditions branch June 6, 2026 21:35
@moq-bot moq-bot Bot mentioned this pull request Jun 6, 2026