Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fix hung launcher on shutdown of Safari #38

Merged
merged 2 commits into from
Apr 20, 2022

Conversation

joeyparrish
Copy link
Member

In some cases (notably macOS Safari under GitHub Actions), a call to
forceKill() would be triggered during another call to forceKill().
This could cause the second Promise to go unresolved, leading to a
hang. On GitHub, eventually, the workflow would be cancelled.

This fixes the nested calls to forceKill by making them both resolve
on the same event (the shutdown triggered by the first call).

This also adds a timeout for shutting down a WebDriver session.
Although this does not appear to be the root cause of the hang we
were experiencing in GitHub Actions workflows, it should be safer to
have this timeout. If we can't stop a WebDriver session gracefully,
we will timeout after 5s and end the launcher anyway.

Closes #24

In some cases (notably macOS Safari under GitHub Actions), a call to
forceKill() would be triggered during another call to forceKill().
This could cause the second Promise to go unresolved, leading to a
hang.  On GitHub, eventually, the workflow would be cancelled.

This fixes the nested calls to forceKill by making them both resolve
on the same event (the shutdown triggered by the first call).

This also adds a timeout for shutting down a WebDriver session.
Although this does not appear to be the root cause of the hang we
were experiencing in GitHub Actions workflows, it should be safer to
have this timeout.  If we can't stop a WebDriver session gracefully,
we will timeout after 5s and end the launcher anyway.
@joeyparrish
Copy link
Member Author

joeyparrish commented Apr 20, 2022

I was led to this change by lots of console.log-based debugging and tracing through both Karma and the launcher. With the logging still in place, I can see that the mitigations are active.

With this change, I've been able to run Shaka's tests on Safari, in the GitHub Actions environment, 18 24 times in a row (and counting) without any hanging at the end.

index.js Show resolved Hide resolved
@@ -116,9 +119,18 @@ const LocalWebDriverBase = function(baseBrowserDecorator, args, logger) {
await this.stopWebdriver_();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say, what's the difference between the kill and forceKill operations? They both just call stopWebdriver_.
Is kill called in such a way that we don't need to worry about them being nested, or about a forceKIll being nested inside a kill?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill() and forceKill() are part of the launcher API, but in my logs, I don't see kill() being called. In a purely external-process-based launcher, the base classes would send different POSIX signals to the subprocess for kill() and forceKill() if I recall correctly.

I initially thought forceKill() should never fail or be infinitely delayed, so I tried adding a timeout at the level of forceKill(). But that didn't solve the issue. In the end, I added the timeout to the WebDriver quit command, since that's the only thing that can fail or be delayed.

So after all the debugging and logging, I concluded that kill() and forceKill() can still effectively be the same, except for the state tracking part.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I was going to suggest moving the new logic for saving the operation as a promise into stopWebdriver_, instead of forceKill. But if kill isn't being called I guess it doesn't matter.

@joeyparrish joeyparrish merged commit adc7898 into shaka-project:main Apr 20, 2022
@joeyparrish joeyparrish deleted the force-kill-timeout branch April 20, 2022 23:50
joeyparrish added a commit to shaka-project/shaka-player that referenced this pull request Apr 21, 2022
joeyparrish added a commit to shaka-project/shaka-player that referenced this pull request Apr 21, 2022
joeyparrish added a commit to shaka-project/shaka-player that referenced this pull request Apr 21, 2022
joeyparrish added a commit to shaka-project/shaka-player that referenced this pull request Apr 21, 2022
@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Aug 17, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
status: archived Archived and locked; will not be updated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Safari doesn't always shut down properly
2 participants