Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rate estimation is fundamentally broken #3563

Closed
4 of 5 tasks
kanongil opened this issue Mar 4, 2021 · 3 comments · Fixed by #4825
Closed
4 of 5 tasks

Rate estimation is fundamentally broken #3563

kanongil opened this issue Mar 4, 2021 · 3 comments · Fixed by #4825

Comments

@kanongil
Copy link
Contributor

kanongil commented Mar 4, 2021

Low latency streams are permanently stuck in low bitrates due to bad rate estimation.

There are 2 underlying causes for this. First the fragment bitrate does not take latency (or server delay) into consideration. This could probably be fixed by using loading.first instead of loading.start for calculating transfer duration.

The other issue, is regarding the input for the the EWMAs. Here the logic is fundamentally flawed, as the transfer duration is used both for the weight as well as the value when taking a sample:

sample(durationMs: number, numBytes: number) {
durationMs = Math.max(durationMs, this.minDelayMs_);
const numBits = 8 * numBytes;
// weight is duration in seconds
const durationS = durationMs / 1000;
// value is bandwidth in bits/s
const bandwidthInBps = numBits / durationS;
this.fast_.sample(durationS, bandwidthInBps);
this.slow_.sample(durationS, bandwidthInBps);
}

Here both durationS and bandwidthInBps is a product of durationMs. This causes low duration value to not have any impact, and the EWMAs will effectively just use the old value (ie. fast transfers are heavily undervalued). An attempt to hack around this issue is done with the minDelayMs_ clamping, mitigating the issue a bit while causing underestimation for < 50ms transfers. Eg. a 10ms transfer will be underestimated by 5x.

I'm not sure exactly how to fix this later issue, but I know that the transfer duration should not be used for the weight. Probably segment/part duration is more appropriate. Regardless of how this is solved, it will mess with how the values of any abrEwma<X> options are used, and should likely be considered a breaking change.

This is probably related to the issue in #2173.

What version of Hls.js are you using?

1.0.0-rc2

What browser and OS are you using?

Any

Test stream:

https://stream.sob.m-dn.net/live/sb1-ll/index.m3u8

https://hls-js-dev.netlify.app/demo/?src=https%3A%2F%2Fstream.sob.m-dn.net%2Flive%2Fsb1-ll%2Findex.m3u8&demoConfig=eyJlbmFibGVTdHJlYW1pbmciOnRydWUsImF1dG9SZWNvdmVyRXJyb3IiOnRydWUsInN0b3BPblN0YWxsIjpmYWxzZSwiZHVtcGZNUDQiOmZhbHNlLCJsZXZlbENhcHBpbmciOi0xLCJsaW1pdE1ldHJpY3MiOi0xfQ==

Checklist

  • The stream has correct Access-Control-Allow-Origin headers (CORS)
  • There are no network errors such as 404s in the browser console when trying to play the stream

Steps to reproduce

  1. Play stream on a high bandwidth connection
  2. Wait

Expected behavior

Switch to highest quality.

Actual behavior

Stuck at level 1.

@robwalch
Copy link
Collaborator

robwalch commented Mar 4, 2021

Hi @kanongil,

Thanks for submitting this issue and weighing in on the abr-controller. I don't think estimation has changed in the new version. The current method is slow and far less effective with smaller fragments, so I understand the need to improve and appreciate the specifics you've pointed to.

That being said. It's not a priority or requirement for v1.0.0 at this time. I would gladly accept contributions. My resources are limited and I am focused on known issues and stability. If this is critical to you in the near future, please consider submitting a PR.

@kanongil
Copy link
Contributor Author

kanongil commented Mar 4, 2021

How is this not a bug? ABR LL-HLS content does not play as intended.

The implementation is indeed the same as before. The difference is that the small part sizes clearly exposes a fundamental shortcoming of the algorithm. To more clearly illustrate the issue, if the minDelayMs hack is removed, a sudden change to infinite bandwidth would effectively result in discarding the new measurement, since the transfer would take 0 seconds causing the weight to be 0. This is clearly not right.

I will gladly take a pass at fixing it, but I can’t do it a way that retains the current meaning of the abrEwma options, or that is thoroughly researched or validated.

@kanongil
Copy link
Contributor Author

kanongil commented Mar 6, 2021

I am working on improving this, but issues like #3578 make it really hard to validate any implementation.

@robwalch robwalch added the ABR label Apr 19, 2021
breunigs added a commit to breunigs/veloroute that referenced this issue Dec 4, 2021
robwalch added a commit that referenced this issue Aug 4, 2022
…ents and higher TTFB

Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283)
Fixes #3563 and Closes #3595 (special thanks to @kanongil)
robwalch added a commit that referenced this issue Aug 4, 2022
…ents and higher TTFB

Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283)
Fixes #3563 and Closes #3595 (special thanks to @kanongil)
robwalch added a commit that referenced this issue Aug 5, 2022
…ents and higher TTFB

Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283)
Fixes #3563 and Closes #3595 (special thanks to @kanongil)
robwalch added a commit that referenced this issue Nov 28, 2022
…ents and higher TTFB

Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283)
Fixes #3563 and Closes #3595 (special thanks to @kanongil)
silltho pushed a commit to silltho/hls.js that referenced this issue Dec 2, 2022
…ents and higher TTFB

Fixes video-dev#3578 (special thanks to @Oleksandr0xB for submitting video-dev#4283)
Fixes video-dev#3563 and Closes video-dev#3595 (special thanks to @kanongil)
robwalch added a commit that referenced this issue Jan 9, 2023
…ents and higher TTFB

Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283)
Fixes #3563 and Closes #3595 (special thanks to @kanongil)
robwalch added a commit that referenced this issue Jan 12, 2023
…ents and higher TTFB

Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283)
Fixes #3563 and Closes #3595 (special thanks to @kanongil)
robwalch added a commit that referenced this issue Jan 18, 2023
* Improve bandwidth estimation and adaptive switching with smaller segments and higher TTFB
Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283)
Fixes #3563 and Closes #3595 (special thanks to @kanongil)

* Load rate and loaded delay calculation fixes

* Convert ttfbEstimate to seconds

* Include main variant init segments in TTFB sampling

* Use ttfb estimate in abandon rules down-switch timing
robwalch added a commit that referenced this issue Jan 26, 2023
* Improve bandwidth estimation and adaptive switching with smaller segments and higher TTFB
Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283)
Fixes #3563 and Closes #3595 (special thanks to @kanongil)

* Load rate and loaded delay calculation fixes

* Convert ttfbEstimate to seconds

* Include main variant init segments in TTFB sampling

* Use ttfb estimate in abandon rules down-switch timing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants