Fix bandwidth sampling for small transfers and fast connections #3595
Conversation
Hi @kanongil,
Thanks for the change suggestion. I have to think about this one a bit more. I can't accept it based on how it would impact all streams. For now, a custom abrController is probably the best solution for your use-case until we can address the issue holistically.
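For readers looking for that workaround: hls.js accepts a replacement controller via the `abrController` config option. The sketch below is a rough, hypothetical illustration of that hook; the controller interface is internal to hls.js, so the override shown is an assumption, not a supported public API.

```typescript
import Hls from 'hls.js';

// The default controller class ships on Hls.DefaultConfig; subclassing it is
// one way to adjust behaviour for a specific use-case. The `nextAutoLevel`
// override below is a hypothetical example policy, not a documented API.
const BaseAbrController: any = Hls.DefaultConfig.abrController;

class CappedAbrController extends BaseAbrController {
  get nextAutoLevel(): number {
    // Example policy: never auto-select above level 4, whatever the estimate.
    return Math.min(super.nextAutoLevel, 4);
  }
  set nextAutoLevel(value: number) {
    super.nextAutoLevel = value;
  }
}

const hls = new Hls({ abrController: CappedAbrController });
```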
That is the point of this PR: to fix the most egregious issue of #3563. Did you see the detailed note in the commit, which gives an example of just how broken the current estimator is?
I.e. a sample with a 10x bandwidth increase can mean just a 1.35x increase in the estimate over a 5 second interval, while a 100x bandwidth increase only makes it 1.39x! An estimation rework is essential to ever get LL-HLS to work with ABR switching. This PR fixes the fundamental issue, as well as high-bandwidth estimation issues with the current implementation. I really hope you will prioritise a fix before 1.0.0. Note that the new default …
The estimator factored the transfer duration into the weight. This means that very fast transfers will have negligible influence on the measured average. The minDelayMs_ was used to cap this effect, but it also capped the bitrate. This meant that a 1000 byte transfer, completed in 10 ms, would register at a rate of 160 Kbps instead of 800 Kbps (on top of the low weight).
The fix is to use the duration of the transferred fragment / part for the weight. This preserves the effect of samples that match the line rate, while significantly increasing the effect of fast transfers (due to small size, or from a fast connection).
With 5 s segments of size 100,000 bytes:
Start: 10x 5 sec transfers => 160 Kbps (reference)
Rate change (old): 1x 0.5 sec transfer (1.6 Mbps) => 216 Kbps vs. 1x 0.05 sec transfer (16 Mbps) => 222 Kbps (10x faster is only 4% more!)
Rate change (new): 1x 0.5 sec transfer (1.6 Mbps) => 627 Kbps vs. 1x 0.05 sec transfer (16 Mbps) => 5.3 Mbps
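To make the mechanics concrete, here is a minimal EWMA sketch in TypeScript of the weighting change described above. The class and helper names are illustrative, and constants such as the 50 ms minimum delay are assumptions for illustration, not the actual hls.js source:

```typescript
// Minimal EWMA with time-based decay, similar in spirit to the estimator
// discussed here; names and constants are illustrative, not hls.js source.
class Ewma {
  private estimate = 0;
  private totalWeight = 0;
  // halfLife: accumulated weight after which a sample's influence halves.
  constructor(private halfLife: number) {}

  sample(weight: number, value: number): void {
    const alpha = Math.exp((Math.log(0.5) / this.halfLife) * weight);
    this.estimate = value * (1 - alpha) + alpha * this.estimate;
    this.totalWeight += weight;
  }

  getEstimate(): number {
    // Compensate for startup bias while little total weight has accumulated.
    const zeroFactor =
      1 - Math.exp((Math.log(0.5) / this.halfLife) * this.totalWeight);
    return zeroFactor > 0 ? this.estimate / zeroFactor : 0;
  }
}

// Old weighting (roughly): weight is the transfer time, so a 10 ms transfer
// barely moves the average, and minDelayMs clamps the computed rate too.
function sampleOld(ewma: Ewma, bytes: number, transferMs: number, minDelayMs = 50): void {
  const clampedSec = Math.max(transferMs, minDelayMs) / 1000;
  ewma.sample(clampedSec, (8 * bytes) / clampedSec); // bits per second
}

// New weighting: weight is the media duration of the fragment/part, so a
// fast transfer of a 5 s segment still contributes 5 s worth of evidence.
function sampleNew(ewma: Ewma, bytes: number, transferMs: number, mediaDurationSec: number): void {
  ewma.sample(mediaDurationSec, (8 * bytes) / (transferMs / 1000));
}
```

With these helpers, the 1000 byte / 10 ms example above computes to 160 Kbps under the old scheme and 800 Kbps under the new one.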
Force-pushed from a0028b0 to 3065413 (compare)
I had another look at the estimation and found that it can be simplified, and work better, if the weight is always a fixed value for each sample. My initial patch tried to use the fragment / part duration for the weight, which makes some sense, and certainly works a lot better than the current logic. However, it meant that the halfLife needed to be quite different for LL-HLS vs. normal content. I came to realise that there are essentially 2 modes when estimating, part loading vs. fragment loading, and both need to adjust for bandwidth changes in sample time. I.e. when close to the live edge, both modes have ~2 parts/fragments worth of time to react to a bandwidth change. Based on this realisation, I changed the … As part of this revision I also removed the …
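A rough sketch of what a fixed per-sample weight looks like, reusing the `Ewma` class from the sketch above; the half-life value here is an illustrative assumption:

```typescript
// With a fixed weight of 1 per sample, the half-life is effectively measured
// in sample counts, so part loading and fragment loading react to a bandwidth
// change after a similar number of loads, regardless of media duration.
const fast = new Ewma(3); // memory halves roughly every 3 samples

// Three steady samples at 160 Kbps, then a jump to 1.6 Mbps:
for (const bps of [160_000, 160_000, 160_000, 1_600_000]) {
  fast.sample(1, bps); // weight is always 1 per loaded fragment/part
}

// The estimate now reflects the jump after a single load, whether that load
// was a 0.33 s LL-HLS part or a 6 s fragment.
console.log(fast.getEstimate());
```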
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
This issue has been automatically closed because it has not had recent activity. If this issue is still valid, please ping a maintainer and ask them to label it accordingly.
What needs to happen is we must create separate estimates for round trip time (RTT or time to first byte depending on Browser JS API availability) and bandwidth. RTT will remain consistent between small and large segments from the same origin, and will inversely impact the estimate of time to load/append/play created by the ABR controller and used for variant switching. Splitting out these estimates based on available loader data and using them in our current stall/switch timing calculations should improve selection for streams with shorter (partial) segments.
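A sketch of how such a split could look, again reusing the `Ewma` class from the earlier sketch; the stats shape and helper names are assumptions for illustration, not the implementation that landed:

```typescript
// Split the measurement: RTT/TTFB is roughly constant per origin, while
// bandwidth should be computed from the body transfer time only.
interface LoadStats {
  bytes: number;
  requestStartMs: number;
  firstByteMs: number;
  loadEndMs: number;
}

function sampleLoad(stats: LoadStats, ttfb: Ewma, bandwidth: Ewma): void {
  ttfb.sample(1, stats.firstByteMs - stats.requestStartMs); // milliseconds
  const transferSec = (stats.loadEndMs - stats.firstByteMs) / 1000;
  if (transferSec > 0) {
    bandwidth.sample(transferSec, (8 * stats.bytes) / transferSec); // bits/s
  }
}

// Predicted fetch time for a candidate level: the RTT term dominates for
// small parts, the bandwidth term for large segments, which is why folding
// both into one estimate skews selection for streams with short segments.
function predictedLoadTimeSec(bytes: number, ttfb: Ewma, bandwidth: Ewma): number {
  return ttfb.getEstimate() / 1000 + (8 * bytes) / bandwidth.getEstimate();
}
```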
This PR has been replaced by #4825
* Improve bandwidth estimation and adaptive switching with smaller segments and higher TTFB. Fixes #3578 (special thanks to @Oleksandr0xB for submitting #4283). Fixes #3563 and Closes #3595 (special thanks to @kanongil)
* Load rate and loaded delay calculation fixes
* Convert ttfbEstimate to seconds
* Include main variant init segments in TTFB sampling
* Use ttfb estimate in abandon rules down-switch timing
This PR will...
Fix bandwidth sampling weight for small transfers and/or fast connections.
Why is this Pull Request needed?
The current logic is broken for fast transfers (due to small transfer sizes, or from a fast connection). See #3563 for details.
Are there any points in the code the reviewer needs to double check?
This changes the exact meaning of the `abrEwma<X>` options, which could be considered a breaking change. I have also had to significantly lower the `abrEwmaFastLive` default to accommodate LL-HLS level downswitches. The old value meant that it took too long to discover a newly lowered bitrate, and could cause multiple successive emergency level switches and stalls. Note that the new value could make a temporary connection issue more likely to cause temporary downswitches. It might make sense to only use this low value for LL-HLS content, but that is outside the scope of this patch. (An illustrative config sketch of these options follows below.)
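For reference, these knobs are set on the Hls config object; the values below are illustrative placeholders only, not the defaults proposed (or changed) by this PR:

```typescript
import Hls from 'hls.js';

const hls = new Hls({
  // EWMA half-lives used by the ABR bandwidth estimator; values illustrative.
  abrEwmaFastLive: 2, // fast estimator half-life for live content
  abrEwmaSlowLive: 9, // slow estimator half-life for live content
  abrEwmaFastVoD: 3, // fast estimator half-life for VoD content
  abrEwmaSlowVoD: 9, // slow estimator half-life for VoD content
  abrEwmaDefaultEstimate: 500_000, // initial bandwidth guess, in bits/s
});
```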
Resolves issues:
Partly fixes #3563. With this PR, the ABR algorithm is more likely to switch up from a low level (still only when there is sufficient bandwidth). Low latency content on high latency links is still unable to measure a suitable bitrate.
Checklist