Simulcasting to RtpReceiver and switched stream rapid switches and clipping #175

Closed
robin-raymond opened this Issue Feb 7, 2015 · 7 comments

2 participants

@robin-raymond
Contributor

Issues:
A single RtpReceiver can be configured to receive a simulcast stream (i.e. it potentially receives from multiple stream sources, but only one is received and rendered at a time by the RtpReceiver, which outputs a single Media Stream Track).

When an RtpReceiver is configured for simulcast, certain configurations can lead to a rapid switch from one RtpSender's stream to another and then back, which can leave the receiver unsure how to demux the RTP packets properly.

Scenario A:

  1. RtpReceiver is set to receive a simulcasting stream for SSRC 5 and SSRC 6 (kind might be audio or video)
  2. RtpReceiver starts receiving from SSRC 5 and renders the stream.
  3. RtpReceiver starts receiving from SSRC 6 (and not SSRC 5 as a switch has occurred) and renders that stream.
  4. RtpReceiver starts receiving SSRC 5 again after a short period of time (due to a switch back to SSRC 5).

Scenario B:

  1. RtpReceiver is set to receive a simulcasting stream for SSRC 5 and SSRC 6 (kind might be audio or video)
  2. RtpReceiver starts receiving from SSRC 5 and renders the stream.
  3. RtpReceiver starts receiving from SSRC 6 (and not SSRC 5 as a switch has occurred) and renders that stream.
  4. RtpReceiver starts receiving SSRC 5 again after a short period because some backlogged network packets arrived later from SSRC 5.

The problem:
The difference between scenarios A and B is difficult to determine from an engine perspective. Are the packets arriving on the original SSRC 5 because of a switch back to 5, or because they were late to arrive? Also, if SSRC 5 still had some late packets in flight when the switch to SSRC 6 occurred, the remaining SSRC 5 packets would get clipped (potentially cutting the end of an audio stream slightly short).

Possible solutions:
(a) Have a set of timing rules that can be applied to determine scenario A versus B to resolve the ambiguity.
(b) Render in two (or more) hidden RtpReceivers with individual tracks being output from each where the simulcast RtpReceiver is the rendered output of the combined audio for audio and the active video for video.
(c) Do not allow simulcasting and require separate RtpReceivers where the Media Stream Tracks indicate their activity (active/inactive) state allowing switching from an application between the streams (as well as sending all audio to render so that it doesn't matter which stream is output).
(d) Do a simple method of "last packet wins" and watch the jitter happen :)
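
To make the concern with (d) concrete, here is a minimal sketch of "last packet wins" fed with a Scenario B arrival order. The `(ssrc, timestamp)` tuples and the `last_packet_wins` helper are hypothetical, and the timestamps are assumed to sit on one shared timeline purely for readability (real RTP streams use independent per-SSRC timestamp offsets).

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical simplified packet: (ssrc, timestamp on an assumed shared timeline).
Packet = Tuple[int, int]


def last_packet_wins(packets: Iterable[Packet]) -> List[int]:
    """Return every SSRC the receiver switches to, in order of switching."""
    switches: List[int] = []
    active: Optional[int] = None
    for ssrc, _timestamp in packets:
        if ssrc != active:
            # Any packet from a different SSRC immediately becomes the active stream.
            active = ssrc
            switches.append(ssrc)
    return switches


# Scenario B: SSRC 5, a switch to SSRC 6, then one backlogged SSRC 5 packet.
arrivals = [(5, 1000), (5, 2000), (6, 4000), (5, 3000), (6, 5000)]
switches = last_packet_wins(arrivals)
print(switches)  # [5, 6, 5, 6]
```

The single late SSRC 5 packet produces two extra switches (6 to 5 and back to 6), which is the flapping that option (d) accepts.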

Personally, I think (c) must always be an option for programmers who prefer to switch between active Media Stream Tracks manually in the application layer. For that reason I don't see dropping support for simulcast in the RtpReceiver as advantageous, which eliminates (c) as the "solution" for me. I assume that Media Stream Track already has the ability to fire an "active / inactive" state change to indicate when a stream is actively receiving or has become inactive (i.e. the sending party has stopped transmitting), but I have not verified that such an event actually exists.

I don't like (d) because I think the user experience will be bad.

I think (b) is a lot of additional work for a marginal improvement in rendering where (c) could already be used if an application programmer cared about those marginal improvements.

That leaves (a) for me. The question is what set of rules / timings would need to be used?
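
As one illustration of what a timing rule for (a) could look like, the sketch below uses a hold-down window: after a switch, packets from the previously active SSRC are treated as late arrivals until the window expires. This is not a rule proposed in this thread or taken from any implementation; the `HoldDownSwitcher` class, its method names and the 0.5 second default are all assumptions chosen for the example.

```python
from typing import Optional


class HoldDownSwitcher:
    """Hypothetical timing rule: ignore the previous SSRC for a while after a switch."""

    def __init__(self, hold_down_s: float = 0.5) -> None:
        self.hold_down_s = hold_down_s  # arbitrary value, for illustration only
        self.active: Optional[int] = None
        self.previous: Optional[int] = None
        self.switched_at: float = 0.0

    def on_packet(self, ssrc: int, arrival_s: float) -> str:
        """Return "render" or "drop" for a packet that arrived at arrival_s seconds."""
        if self.active is None:
            self.active, self.switched_at = ssrc, arrival_s
            return "render"
        if ssrc == self.active:
            return "render"
        in_hold_down = arrival_s - self.switched_at < self.hold_down_s
        if ssrc == self.previous and in_hold_down:
            # Assumed to be a backlogged packet (Scenario B); dropping it is the clipping.
            return "drop"
        # Otherwise treat it as a genuine switch (Scenario A or a first switch).
        self.previous, self.active, self.switched_at = self.active, ssrc, arrival_s
        return "render"


switcher = HoldDownSwitcher(hold_down_s=0.5)
arrivals = [(5, 0.00), (5, 0.02), (6, 0.04), (5, 0.06), (6, 0.08), (5, 1.00)]
decisions = [switcher.on_packet(ssrc, t) for ssrc, t in arrivals]
print(decisions)  # ['render', 'render', 'render', 'drop', 'render', 'render']
```

The trade-off is visible in the output: the SSRC 5 packet at 0.06 s is clipped, and a genuine switch back to SSRC 5 inside the window would be delayed until it expires.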

[cross posting to ortc list so please reply on the list]

@robin-raymond robin-raymond added the 1.1 label Feb 7, 2015
@aboba
Contributor
aboba commented Oct 30, 2015

For WebRTC 1.0, simulcast functionality has been restricted to the situation where an RtpSender sends multiple streams, but the RtpReceiver only receives a single stream. The assumption is that the SFU receives multiple simulcast streams from participants and selects between the simulcast streams, providing the receiver with a single stream that switches resolutions and/or frame rates based on SFU switching decisions. To date, my understanding is that SFUs supporting the interoperable WebRTC video codecs (e.g. VP8, VP9, H.264/AVC) all support this model.

@aboba aboba added the PR exists label May 10, 2016
@aboba aboba added a commit that referenced this issue May 10, 2016
@aboba aboba Non-support for simulcast reception and SVC codecs with MRST
Fix for Issue #175
ccbf133
@aboba
Contributor
aboba commented Jun 3, 2016

@robin-raymond In practice, support for RFC 6051 is often used to align the timestamps of the received streams. This allows the receiver to distinguish Scenarios A and B. The other (codec-specific) way that the streams can be aligned is through the Decoding Order Number (DON) field in H.264/SVC.

Without RFC 6051 or DON, heuristics can be used, but they will be prone to error. For example, the implementation could switch from SSRC 5 to SSRC 6 and assume that the last received SSRC 5 timestamp represents the switching point. However, if delayed packets subsequently arrive on SSRC 5, they might need to be thrown away. With RFC 6051, by contrast, the implementation might realize that there is a gap between the last received packets on SSRC 5 and the newly received packets on SSRC 6, and wait for that gap to be filled in by late-arriving SSRC 5 packets.
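
The distinction can be sketched as follows, assuming each packet already carries a `media_time` on a timeline shared by both SSRCs (obtained, for instance, via RFC 6051 or DON; that mapping is not shown). The `Packet` type and `classify` function are hypothetical names for this example, and the rule is a simplification rather than a description of any implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Packet:
    ssrc: int
    media_time: int  # assumed: position on a timeline common to all simulcast streams


def classify(packets: Iterable[Packet]) -> List[str]:
    """Label each packet, in arrival order, as start, active, switch or late."""
    labels: List[str] = []
    active: Optional[int] = None
    switch_point = 0  # media_time of the first packet of the active stream
    for p in packets:
        if active is None:
            active, switch_point = p.ssrc, p.media_time
            labels.append("start")
        elif p.ssrc == active:
            labels.append("active")
        elif p.media_time < switch_point:
            # Belongs before the switching point: a late arrival (Scenario B).
            labels.append("late")
        else:
            # At or after the switching point: a real switch (Scenario A).
            active, switch_point = p.ssrc, p.media_time
            labels.append("switch")
    return labels


arrivals = [Packet(5, 1000), Packet(5, 2000), Packet(6, 4000),
            Packet(5, 3000), Packet(6, 5000), Packet(5, 6000)]
labels = classify(arrivals)
print(labels)  # ['start', 'active', 'switch', 'late', 'active', 'switch']
```

A packet labelled `late` fills the gap before the switching point and can still be rendered in order, whereas the timestamp-only heuristic would have to discard it.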

@robin-raymond
Contributor

ORTC Lib solved this internally: it locks onto the latest active stream, and obsolete packets do not cause switching.

@robin-raymond
Contributor

ORTC Lib does a mixture of a) and b).

@robin-raymond
Contributor

TODO: add a note to the spec describing how a good implementation should handle this... RFC 6051 is great if possible, but a) and b) can be done in its absence.

@aboba
Contributor
aboba commented Jul 13, 2016

Edge supports the H.264/SVC DON field to help line up the received multiple streams without support for RFC 6051 rapid sync, but that is a codec-specific approach (e.g. no equivalent of DON in VP8/VP9).

@aboba aboba self-assigned this Jul 13, 2016
@aboba aboba added a commit that referenced this issue Jul 14, 2016
@aboba aboba Simulcasting to RtpReceiver and switched stream rapid switches and cl…
…ipping

Fix for Issue #175
25b7d13
@robin-raymond
Contributor

Fixed with a note that RFC 6051 was not used in any implementation, plus a note describing how ORTC Lib handles this particular case as an example.

@aboba aboba was unassigned by dontcallmedom Aug 23, 2016