
How does a developer decide on a value for playoutDelay? #46

Closed
padenot opened this issue Jul 17, 2020 · 23 comments · Fixed by #160

Comments


padenot commented Jul 17, 2020

In general, APIs that take a numerical value for a tradeoff are bad, because it's hard to determine the threshold values between the various use-cases. How does a developer find what the best value is for a particular use-case?

If the best value for a use-case is fixed (which seems to be the case, looking at #8), then an enum is better. If not, then other APIs that allow determining this value for a particular environment must be available.


jan-ivar commented Jul 17, 2020

I think these are valid concerns. There's also a precedent here in latencyHint. Following that model we could imagine:

```webidl
enum RTCRtpReceiverLatencyCategory {
  "balanced",
  "interactive",
  "playback",
};

partial interface RTCRtpReceiver {
  attribute (RTCRtpReceiverLatencyCategory or double) playoutDelay;
};
```

This should be more web compatible, and remove the onus on web developers to discover the minimum allowed delay of each browser and work around it with tables of different values for well-known browsers.

Happy to bikeshed enum names based on use-cases. This should also hopefully produce a healthy discussion about what the default value should be (presumably web conferencing is "interactive" not "balanced", right?)
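
For illustration, usage of the proposed union might look like this (a sketch against the proposal above; `pc` is assumed to be an RTCPeerConnection, and none of this is a shipped API):

```js
// Sketch against the proposed union type above (not a shipped API).
const [receiver] = pc.getReceivers();
receiver.playoutDelay = "interactive"; // let the UA pick a low-latency target
// Apps needing finer control could still assign an explicit number of seconds:
receiver.playoutDelay = 0.25;
```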


henbos commented Jul 20, 2020

I think the number you would use is the number of seconds of delay you are happy with under good network conditions; it translates very well into how interactive you want the experience to be. I don't remember the details of the discussions since this was quite a long time ago now, but we flip-flopped between enums and numbers several times, and at the end of the day we concluded that a number is more well-defined than an enum, because an enum begs the question "how interactive is 'interactive'?", and then we'd have to arbitrarily pick a number of seconds that corresponds to something being "interactive".

I think one of the major interop issues is with the implementation: for audio tracks it can take several minutes before the delay you ask for is achieved. I didn't realize this when I wrote the spec; I had assumed that you get what you ask for within a few seconds.

I don't think an enum is more web compatible, because "interactive" could then mean milliseconds on one browser but a second on another browser?


henbos commented Jul 20, 2020

That said, I think there are good arguments for doing things with an enum as well, but I do think it would most likely map to some number of seconds internally when you do the implementation.


henbos commented Jul 20, 2020

The pro of an enum is that the implementation could adapt the number of seconds over time as it becomes more confident about the stability of the network, but that seems like the 2.0 implementation of an API for playout delay.


padenot commented Jul 20, 2020

> That said, I think there are good arguments for doing things with an enum as well, but I do think it would most likely map to some number of seconds internally when you do the implementation.

Yes, but the browser knows the network and local resources better than the web app developer.


henbos commented Jul 20, 2020

I agree with that. To allow a more powerful implementation that makes tradeoffs, I think an enum helps make it clear that the UA can be flexible, but it also makes it less testable, which might be OK. playoutDelay was originally named playoutDelayHint to allow the UA to override the decision, but to make things more testable I think it evolved into a more explicit delay knob and was renamed playoutDelay.

If there is interest in implementing an enum I'd support that, or in otherwise revisiting the definition. But if implementations are a basic "delay by X seconds", I think the current API is more well-defined, despite the issue of not knowing how to pick the best number of seconds. @jan-ivar Is Firefox interested in an API like this?


padenot commented Jul 20, 2020

A numeric value is OK if there is a feedback mechanism to inform developers programmatically that the value has been used as-is, or (for example) that it's been clamped or changed. Is this the case? This would be useful for testing as well, of course. Knowing the depth of the jitter buffer is necessary for A/V sync (especially when it can be set to a very high value).

Short of having this, an enum is preferable, but having information about the amount of buffering is necessary anyway.

It needs to be a hint (which is fine if there is a way to know the value). Throwing for values over 4.0 is arbitrary, and is not discoverable programmatically.
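
For reference, the closest existing read-back is probably the stats API: the RTCInboundRtpStreamStats fields jitterBufferDelay and jitterBufferEmittedCount let an app estimate the delay actually applied. A minimal sketch (the helper name and delta bookkeeping are illustrative):

```js
// jitterBufferDelay is a running total in seconds; dividing its delta by the
// delta of jitterBufferEmittedCount approximates the recent average delay.
async function measuredJitterBufferDelay(receiver, prev = {}) {
  const report = await receiver.getStats();
  for (const stats of report.values()) {
    if (stats.type !== "inbound-rtp") continue;
    const dDelay = stats.jitterBufferDelay - (prev.jitterBufferDelay || 0);
    const dCount = stats.jitterBufferEmittedCount -
        (prev.jitterBufferEmittedCount || 0);
    return { prev: stats, seconds: dCount > 0 ? dDelay / dCount : NaN };
  }
  return { prev, seconds: NaN };
}
```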

@murillo128

My two cents.

I am currently working on a use case that requires synchronized playback of a remote stream on two different devices. We add a custom delay on the primary node via Web Audio, then adjust the secondary via playoutDelayHint based on the RTT. So, having a numeric value makes sense, at least for us.
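
A sketch of that kind of adjustment, assuming the RTT is read from the nominated candidate-pair stats and that primaryDelaySeconds is the fixed Web Audio delay on the primary node (the function name and the half-RTT compensation are illustrative, not the commenter's exact code):

```js
// Illustrative: line the secondary device's playout up with the primary's
// fixed Web Audio delay, compensating for half the round-trip time.
async function syncSecondaryDelay(pc, receiver, primaryDelaySeconds) {
  const report = await pc.getStats();
  for (const stats of report.values()) {
    if (stats.type === "candidate-pair" && stats.nominated &&
        stats.currentRoundTripTime !== undefined) {
      receiver.playoutDelayHint =
          Math.max(0, primaryDelaySeconds - stats.currentRoundTripTime / 2);
      return;
    }
  }
}
```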

Regarding the time required to adjust the delay to the new value: it is an implementation detail of NetEq. We have modified it so it converges faster (within 1 second) to the value set by JS, by adding silence or dropping packets instead of the default NetEq behavior.

On a side note, it would be awesome if we could add more parameters to control the jitter buffer behavior (or even completely replace it), as NetEq, at least, is not tuned correctly for several use cases.


padenot commented Jul 21, 2020

> My two cents.
>
> So, having a numeric value makes sense, at least for us.

Yes, but in your case you've answered the question: you're using the RTT to change the jitter buffer depth. For your use case it makes sense. With your changes that make it converge faster, it's probably OK for A/V sync as well (if needed). What is missing is a way for regular apps to determine the best value for the jitter buffer depth, based on something (network conditions, machine load, etc.), and to know the duration of the jitter buffer, for A/V sync. This is what this issue is about.

If it's fixed for an app (as it seems to be for e.g. Meet), then a numerical value is bad and an enum is superior.

> On a side note, it would be awesome if we could add more parameters to control the jitter buffer behavior (or even completely replace it), as NetEq, at least, is not tuned correctly for several use cases.

This will have to happen as a natural consequence of the lowering of the abstraction level that seems to be happening in 2.0.


murillo128 commented Jul 21, 2020 via email


padenot commented Jul 21, 2020

The numeric value is not the problem. The absence of a way to determine what value is best is the problem.


henbos commented Jul 21, 2020

> Regarding the time required to adjust the delay to the new value: it is an implementation detail of NetEq. We have modified it so it converges faster (within 1 second) to the value set by JS, by adding silence or dropping packets instead of the default NetEq behavior.
>
> On a side note, it would be awesome if we could add more parameters to control the jitter buffer behavior (or even completely replace it), as NetEq, at least, is not tuned correctly for several use cases.

That's interesting to hear. I would like to see Chrome's NetEq implementation converge that fast too, but I have no idea about the tradeoffs involved. @minyuel FYI, an external developer has made playoutDelay more responsive for audio receivers.


AndrewJDR commented Jul 24, 2020

I wanted to weigh in in favor of finer-grained control than just an enum. As an application developer, I may have a different opinion about what constitutes glitch resilience (which I guess would be the "playback" enum value above) than a browser developer does. Also, WebRTC has a stats API; it's not like app developers are flying completely blind.

What if I think >= 1 dropped frame and >= 10 NACKs every minute means that not enough resilience is being provided?
10 dropped frames and 100 NACKs?
An RTT of > 100 ms?
Or some other combination of these that I came up with through a lot of experimentation with my specific application (with its own specific resolution/fps/etc. parameters)?

And what if I'm willing to give up some latency for greater glitch resilience, but I have a hard cutoff on how much latency I'm willing to give up (e.g. 300 ms, 500 ms, 700 ms)?

And what if I identify a technique for scaling up the delay at a rate of my own choosing, also found through experiments with my own application's specific use case?

This smells like something that needs a scalar value.

Also providing an enum sounds fine, though.
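
As a sketch of the kind of heuristic described above (the thresholds, step sizes, and polling period are placeholders; as the comment says, the real numbers would come from per-application experimentation):

```js
// Illustrative heuristic: grow the playout delay quickly on signs of trouble,
// shrink it slowly when the stream looks clean, with a hard latency cap.
const MAX_DELAY_S = 0.5; // the app's hard cutoff on added latency
let prev = { framesDropped: 0, nackCount: 0 };

async function adjustDelay(receiver) {
  const report = await receiver.getStats();
  for (const stats of report.values()) {
    if (stats.type !== "inbound-rtp") continue;
    const dropped = (stats.framesDropped || 0) - prev.framesDropped;
    const nacks = (stats.nackCount || 0) - prev.nackCount;
    prev = { framesDropped: stats.framesDropped || 0,
             nackCount: stats.nackCount || 0 };
    const current = receiver.playoutDelayHint || 0;
    receiver.playoutDelayHint = (dropped >= 1 || nacks >= 10)
        ? Math.min(MAX_DELAY_S, current + 0.1) // trouble: grow fast
        : Math.max(0, current - 0.01);         // clean: shrink slowly
  }
}
// Poll once a minute to match the per-minute thresholds above:
// setInterval(() => adjustDelay(receiver), 60000);
```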

@AndrewJDR

P.S. If the assertion is that there is not enough data available to the application to build a good heuristic for adjusting this value, let's beef up the stats, not water down the ability to adjust the value!


henbos commented Jul 24, 2020

I don't think the intent of this API was ever to be that fine a control knob. Generally speaking, the internal engine is in the best position to know how to adjust the delay so as to minimize poor quality, and it is the one in control of the jitter buffer. The problem is that the assumption was previously always that we want to play out received media as soon as possible, because that optimizes interactiveness, even though a shorter buffer necessarily entails a risk of reduced quality when packets are dropped or don't arrive on time.

The intent of playoutDelay was to give the application the power to say: "you don't need to push the playout delay below this point, because the interactiveness requirements of my application's use case loosen these constraints... even if conditions are pretty good, I don't mind a bit of extra delay if it increases the odds of better quality".


henbos commented Jul 24, 2020

Example use case: I'm passively listening to a presentation. I don't care if I get the presentation a couple of seconds later, because I'm not interacting with that content in real time. playoutDelay = 2. Later, there's a Q&A session, and now interactiveness is important. playoutDelay = 0.
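
In code, that scenario is just two assignments over the lifetime of the session (assuming receiver is the relevant RTCRtpReceiver):

```js
// Passive presentation: a couple of seconds of buffering is fine.
receiver.playoutDelay = 2;
// Q&A starts: interactiveness matters again.
receiver.playoutDelay = 0;
```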


henbos commented Jul 24, 2020

If your application grades how interactive it requires the content to be, I think that is a better guide to what values to use for playoutDelay than trying to do the job of the internal engine and predict network quality.


AndrewJDR commented Jul 24, 2020

> The problem is that the assumption was previously always that we want to play out received media as soon as possible, because that optimizes interactiveness, even though a shorter buffer necessarily entails a risk of reduced quality when packets are dropped or don't arrive on time.

For what it's worth, this API has me excited, because in our application, while interactivity is quite important, smoothness is also fairly important. Typically it is important enough that we can usually tolerate 100-500 ms of latency if that's what it takes for smoothness, above which the latency is probably too much to be acceptable. At other, rarer moments, higher latencies of up to 2 seconds are acceptable in the name of smoothness. At still other rare moments, interactivity is crucial above all else, so playoutDelay = 0. This API seems to let me express that, and it maps well to what you described, so I think we're on the same page on that part, and I'm glad it supports that use case.

I still think that application developers, including myself, want the finer-grained control I described above. Ultimately, full control over the jitter buffer would be even better, but... baby steps. I think it's key to remember that WebRTC is being used for many things outside of video/audio conferencing, like remote rendering, gaming, AR/VR, animation playback, and remote production work. Developers out there are going to want to do things like grow their jitter buffer incredibly fast at the first sign of trouble (perhaps far faster than the browser's jitter buffer heuristic or enum presets would deem suitable) while still keeping it small when the connection is good. While I understand that kind of thing wasn't the ultimate goal of this API, I personally see it as a step in the right direction rather than as sullying the API. Folks doing cool, unexpected things with an API is not necessarily a bad thing.

@murillo128

IMHO this API would be quite useless if we allow setting a playoutDelayHint of 50 ms and end up with a jitter buffer delay of 2 s for several minutes. That would still be the case if we set it to "interactive" and get 2 s delays because internally NetEq decides it is better to converge slowly than to drop packets.

We could state that the hint is the minimum value that we want the jitter buffer to take, but again NetEq can decide to ramp up slowly and take several minutes until that delay is achieved.

While a bit more complicated, I think the best alternative is to be able to define min and max values for the jitter buffer that are strictly enforced. If jitter is lower than the min value, the receiver should buffer packets until the min is reached before starting playback. Also, if jitter is above the max, packets should be dropped so the delay is never bigger than the max value.
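
A hypothetical shape for that idea, with attribute names invented here purely for illustration (the min/max discussion is picked up in #28 below):

```js
// Hypothetical, strictly enforced bounds (not a shipped or specified API):
receiver.jitterBufferMinDelay = 0.05; // buffer at least 50 ms before playback
receiver.jitterBufferMaxDelay = 0.50; // drop packets rather than exceed 500 ms
```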


henbos commented Jul 24, 2020

Yeah, I definitely think the implementation needs to apply the playoutDelay faster. Maybe it is quicker at speeding things up again than slowing things down? Otherwise it would be quite "dangerous" when you become interactive again.


murillo128 commented Jul 24, 2020 via email

@jan-ivar

To help separate discussion, let's discuss min/max over in #28.

@dontcallmedom-bot

This issue was mentioned in WEBRTCWG-2023-04-18 (Page 81)
