Summary
Two related A/V sync gaps in moq-rtc, both verified on main:
- Ingest ignores RTCP Sender Reports. IngestClock anchors each track to the arrival Instant of its first frame and thereafter follows that track's own RTP deltas. str0m surfaces MediaData::last_sender_info, but it is never read. Audio (48 kHz) and video (90 kHz) are cross-aligned purely by first-packet arrival time, so differential network jitter between the first audio packet and the first (large, paced) video keyframe is baked in as a fixed lip-sync offset for the entire session, typically tens of ms, with no re-sync ever.
- Egress uses dequeue time as the RTCP SR wallclock. session.rs calls egress::dispatch(&mut rtc, req, Instant::now()) (main rs/moq-rtc/src/session.rs:222), and that now() becomes the wallclock str0m uses to build Sender Reports. Bursty MoQ group delivery dispatches several frames back-to-back at nearly the same now() while their rtp_time spans 100+ ms, skewing the receiver's SR-derived clock model.
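To make the ingest gap concrete, here is a minimal sketch of arrival anchoring. ArrivalAnchor and lip_sync_offset are hypothetical names invented for this illustration, not the actual IngestClock code. The sketch assumes both tracks' first frames were captured at the same moment and ignores RTP timestamp rollover.

```rust
use std::time::Duration;

/// Hypothetical stand-in for per-track arrival anchoring; not the real IngestClock.
struct ArrivalAnchor {
    first_arrival: Duration, // arrival of the track's first frame, relative to session start
    first_rtp: u32,          // RTP timestamp carried by that first frame
    clock_rate: u32,         // 48_000 for audio, 90_000 for video
}

impl ArrivalAnchor {
    /// Presentation time = first arrival + this track's own RTP delta.
    fn presentation(&self, rtp: u32) -> Duration {
        let ticks = u64::from(rtp.wrapping_sub(self.first_rtp));
        let nanos = ticks * 1_000_000_000 / u64::from(self.clock_rate);
        self.first_arrival + Duration::from_nanos(nanos)
    }
}

/// Absolute A/V offset for two frames captured at the same moment,
/// `elapsed_secs` after each track's first frame.
/// Assumption: both first frames were captured simultaneously.
fn lip_sync_offset(audio: &ArrivalAnchor, video: &ArrivalAnchor, elapsed_secs: u32) -> Duration {
    let a_rtp = audio.first_rtp.wrapping_add(elapsed_secs.wrapping_mul(audio.clock_rate));
    let v_rtp = video.first_rtp.wrapping_add(elapsed_secs.wrapping_mul(video.clock_rate));
    let a = audio.presentation(a_rtp);
    let v = video.presentation(v_rtp);
    if v > a { v - a } else { a - v }
}
```

With audio first arriving at 10 ms and the video keyframe at 50 ms, lip_sync_offset returns 40 ms after one second and still 40 ms after ten minutes. Nothing in the per-track deltas can shrink the offset.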
Suggested fix
On ingest, correlate each track's RTP timeline to NTP via last_sender_info when available (fall back to arrival anchoring until the first SR). On egress, derive the SR wallclock from the frame's presentation time against a session epoch instead of dispatch time.
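A rough sketch of both halves of the fix, using only std types. SrAnchor, TrackClock, capture_time and egress_wallclock are hypothetical names for illustration; they are not str0m's or moq-rtc's API. The sketch also assumes the SR's NTP timestamp has already been converted to a Duration from some shared epoch.

```rust
use std::time::{Duration, Instant};

/// Hypothetical (NTP, RTP) pair taken from the latest Sender Report.
/// Assumption: `ntp` is the sender wallclock as an offset from a shared epoch.
#[derive(Clone, Copy)]
struct SrAnchor {
    ntp: Duration,
    rtp: u32,
}

/// Hypothetical per-track ingest mapping from RTP timestamps to sender time.
struct TrackClock {
    clock_rate: u32,
    sr: Option<SrAnchor>,
}

impl TrackClock {
    /// Sender-side capture time for `rtp`. Returns None until the first SR
    /// arrives, so the caller can fall back to arrival anchoring.
    fn capture_time(&self, rtp: u32) -> Option<Duration> {
        let sr = self.sr?;
        // Signed wrapping difference: frames slightly older than the SR still map.
        let ticks = i64::from(rtp.wrapping_sub(sr.rtp) as i32);
        let nanos = ticks * 1_000_000_000 / i64::from(self.clock_rate);
        if nanos >= 0 {
            Some(sr.ntp + Duration::from_nanos(nanos as u64))
        } else {
            sr.ntp.checked_sub(Duration::from_nanos(nanos.unsigned_abs()))
        }
    }
}

/// Egress: derive the SR wallclock from the frame's presentation time
/// against a fixed session epoch, instead of from the dequeue time.
fn egress_wallclock(session_epoch: Instant, presentation: Duration) -> Instant {
    session_epoch + presentation
}
```

Because every track maps onto the same sender wallclock, audio and video realign on each SR rather than staying pinned to first-packet arrival. On egress, frames dispatched in one burst get wallclocks spaced by their presentation times, so the (wallclock, rtp_time) pairs stay consistent with each other.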
Found during a full review of the dev branch.
(Written by Claude Fable 5)