Potential lock up #36

Closed
mstoykov opened this issue Mar 27, 2023 · 0 comments · Fixed by #42
Labels
bug Something isn't working

Comments

@mstoykov
Collaborator

We have received reports that using this module sometimes leads to the k6 process not stopping after the test duration is over.

Unfortunately there isn't much concrete info. The only particularly interesting detail is that there are a lot of websocket connections per VU (500+).

I was thinking it might be related to

case <-ctx.Done():
// unfortunately it is really possible that k6 has been winding down the VU and the above code
// to close the connection will just never be called as the event loop no longer executes the callbacks.
// This ultimately needs a separate signal for when the eventloop will not execute anything after this.
// To try to prevent leaking goroutines here we will try to figure out if we need to close the `done`
// channel and wait a bit and close it if it isn't. This might be better off with more mutexes
timer := time.NewTimer(time.Millisecond * 250)
but looking at the code, there is no way it will fail to stop the execution. If anything, the problem is that it might stop it too soon.

But it might also be some other corner case, and unfortunately it might even be in some other part that just gets triggered (more easily) by this module.

But I guess the first step will be to make this reproducible at all.

mstoykov added the bug label Mar 27, 2023
mstoykov added a commit that referenced this issue Apr 3, 2023
mstoykov added a commit that referenced this issue Apr 3, 2023
mstoykov added a commit that referenced this issue Apr 4, 2023
Before the update of k6 it was possible for a websocket connection to start
stopping due to context Done. That uses the eventloop to synchronise
the update of the statuses and the closing of the done channel.

And everything works *except* if the eventloop was already
stopping due to an unhandled exception (for example).

In that case the eventloop will just not run the call to stop the
connection.

Detecting that is inherently hard and requires
a separate mechanism to close and stop the websocket connection that is
only used in this particular case.

The test provided also caught leakage of goroutines, which is now
fixed.

fixes #36
mstoykov added a commit that referenced this issue Apr 5, 2023
Before the update of k6 it was possible for a websocket connection to start
stopping due to context Done. That uses the eventloop to synchronise
the update of the statuses and the closing of the done channel.

And everything works *except* if the eventloop was already
stopping due to an unhandled exception (for example).

In that case the eventloop will just not run the call to stop the
connection.

Detecting that is inherently hard and requires
a separate mechanism to close and stop the websocket connection that is
only used in this particular case.

The test provided also caught leakage of goroutines, which is now
fixed.

fixes #36

Co-authored-by: Oleg Bespalov <oleg.bespalov@grafana.com>