Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Network stall detection #3227

Merged
merged 11 commits into from
Mar 30, 2021

Conversation

avelad
Copy link
Collaborator

@avelad avelad commented Mar 15, 2021

Closes: #1582

@TheModMaker
Copy link
Contributor

Joey's suggestion in that issue was if we went a certain time without getting more data, we'd stall. This cancels the timer after the first progress, so this is basically a connection timeout. If you want that, I'd suggest naming it connectionTimeout instead, or change to reset the timer on each progress to keep it a stall timeout.

Also, since the code is basically the same between the two handlers, I'd suggest moving to NetworkingEngine. This also allows app-provided handlers to get this too without having to implement it.

@avelad
Copy link
Collaborator Author

avelad commented Mar 15, 2021

I am going to implement both suggestions. Thanks!

@avelad
Copy link
Collaborator Author

avelad commented Mar 15, 2021

I have tried to move it to NetworkingEngine, but if I abort the operation from there, it had an unwanted recursion, I have not managed to fix it, so in the end I have left it in each plugin.

I have fixed the use of stallTimeout and added connectionTimeout.

@TheModMaker can you review again?

@avelad
Copy link
Collaborator Author

avelad commented Mar 17, 2021

This is really very beneficial, after testing internally for hours, we have seen that this actually makes buffering less times, so the user experience is better. Note: in our case 'streaming.retryParameters.maxAttempts' is 15.

@avelad
Copy link
Collaborator Author

avelad commented Mar 18, 2021

@TheModMaker can you review again?

Copy link
Member

@joeyparrish joeyparrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also like @TheModMaker's opinion again on doing this generically in NetworkingEngine vs in each plugin. If the plugin produces progress events, I feel as if you could do it generically. If the plugin does not produce progress events, though, it may always trigger connection or stall timeouts. Could NetworkingEngine tell which plugins would generate those events? My instinct is "no", but I haven't analyzed it.

@@ -15,6 +15,8 @@ identical:
```js
retryParameters: {
timeout: 0, // timeout in ms, after which we abort; 0 means never
stallTimeout: 0, // stall timeout in ms, after which we abort; 0 means never
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you set these in the docs to their default values, please?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand you correctly, you want the default value in the ShakaPlayer configuration to use 0 for both new configurations, is that correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, sorry. I mean that the docs should show the default values. You gave non-zero defaults, and I agree with those defaults.

Copy link
Collaborator Author

@avelad avelad Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking time to respond, Friday was bank holiday in Spain. I'll fix it now.

lib/net/http_fetch_plugin.js Outdated Show resolved Hide resolved
lib/net/http_xhr_plugin.js Outdated Show resolved Hide resolved
@@ -90,6 +90,8 @@ describe('NetworkingEngine', /** @suppress {accessControls} */ () => {
backoffFactor: 0,
fuzzFactor: 0,
timeout: 0,
stallTimeout: 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a new test case make sense?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I valued it, but the problem is that there is no timeout test to be able to compare :S

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand. It's tough sometimes to figure out new test cases without a similar example to work from. Please file an issue to follow-up on this with tests, and we'll try to add some later. Don't worry about it for now. The code looks clear and short enough to me. (I hope I don't regret saying that some day! 😛)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once merged I create the issue and link it here.

@avelad
Copy link
Collaborator Author

avelad commented Mar 22, 2021

@joeyparrish I have moved the logic to NetworkingEngine. There is nothing that is exported to know if the plugin emits progress events or not, but since both xhr and fetch emit it, I think that right now it is not necessary to add this logic.

@avelad avelad requested a review from joeyparrish March 22, 2021 08:04
@joeyparrish
Copy link
Member

There is nothing that is exported to know if the plugin emits progress events or not, but since both xhr and fetch emit it, I think that right now it is not necessary to add this logic.

That could still cause problems for custom plugins. For example, there are apps and other libraries which plug into the networking engine in order to load content over peer-to-peer connections. If those have not implemented progress events, we would see them as having stalled, correct?

Here's a nice compromise: We can keep the centralized code for stalls and timeouts, and add a flag to network plugins to declare that they do emit progress events. Then you can add this flag to the fetch & XHR plugins. This would allow us to have this feature without breaking existing plugins.

What do you think?

Copy link
Member

@joeyparrish joeyparrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good apart from the need for a flag to avoid breaking old plugins

@avelad
Copy link
Collaborator Author

avelad commented Mar 22, 2021

I think we can add a method in the plugins that tells us if they support progress events or not. Now in a few minutes I upload a commit with this change.

@avelad
Copy link
Collaborator Author

avelad commented Mar 22, 2021

@joeyparrish I have already added what you have commented, can you review again?

@avelad avelad requested a review from joeyparrish March 22, 2021 17:18
Copy link
Member

@joeyparrish joeyparrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking care of this!

lib/net/networking_engine.js Outdated Show resolved Hide resolved
lib/net/http_fetch_plugin.js Outdated Show resolved Hide resolved
@avelad avelad requested a review from joeyparrish March 22, 2021 17:51
@avelad
Copy link
Collaborator Author

avelad commented Mar 24, 2021

@joeyparrish Do you think you could have time today to approve and merge this PR?

@avelad
Copy link
Collaborator Author

avelad commented Mar 30, 2021

@joeyparrish can you run the Shaka-bot?

@michellezhuogg
Copy link
Contributor

@joeyparrish can you run the Shaka-bot?

Running that now :)

@shaka-bot
Copy link
Collaborator

All tests passed!

@avelad
Copy link
Collaborator Author

avelad commented Mar 30, 2021

Thanks @michellezhuogg !

@joeyparrish joeyparrish merged commit 863e345 into shaka-project:master Mar 30, 2021
@avelad avelad deleted the network-stall-detection branch March 31, 2021 09:41
@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Jul 25, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
status: archived Archived and locked; will not be updated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Network stall detection
6 participants