-
Notifications
You must be signed in to change notification settings - Fork 18k
x/net/http2: a canceled request context can deadlock other requests on the same connection #52853
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
cc @neild |
@neild What is the current status here? This issue is currently in the 1.19 milestone. Should it move to 1.20? To Backlog? Thanks. |
We currently assume that Request.Body.Close will complete reasonably quickly. As demonstrated here, there are circumstances where we call Close with connection or stream mutexes held, which can cause general badness if Close hangs. We should probably avoid holding locks while calling Request.Body.Close. I think it's okay if a slow Close blocks some operations (for example, ClientConn.Close will block until all in-flight request bodies are closed, and that's probably okay), but one slow request body shouldn't interfere with other requests on a connection. |
Change https://go.dev/cl/424755 mentions this issue: |
This comment was marked as duplicate.
This comment was marked as duplicate.
do you know if the fix will be backported to Go 1.19 in a point release? |
You can use |
The existing implementation holds a lock on a connection which causes issues on a slow Request.Body.Close call. Unlock before Request.Body.Close call. The abortStream closes the request body after unlocking a mutex. The abortStreamLocked returns reqBody as io.ReadCloser if the caller needs to close the body after unlocking the mutex. Fixes golang/go#52853 Change-Id: I0b74ba5263f65393c0e69e1c645d10c4db048903 Reviewed-on: https://go-review.googlesource.com/c/net/+/424755 Reviewed-by: Damien Neil <dneil@google.com> Run-TryBot: Damien Neil <dneil@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes, this issue reproduces with
golang.org/x/net v0.0.0-20220425223048-2871e0cb64e4
(see reproduction below)What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
We enabled HTTP/2 for an internal proxy written in Go, and noticed that after 10 minutes, every connection into the proxy seemed to be hanging. A goroutine dump revealed that many goroutines were blocked on acquiring the ClientConn mutex (e.g. see https://github.com/golang/net/blob/2871e0cb64e47e47ba7466fad6e11562caf99563/http2/transport.go#L790).
One goroutine, however, had acquired the mutex and was blocked in
abortStreamLocked
trying to close the request body: https://github.com/golang/net/blob/2871e0cb64e47e47ba7466fad6e11562caf99563/http2/transport.go#L379. In our case this seemed to be indefinitely blocked trying to read from a network socket. We haven't figured out why yet, but that's a separate issue.I've knocked up a very crude and simple reproduction here - I'll try and come back and tidy this up a bit later: https://github.com/milesbxf/go-http2-request-close-deadlock/blob/main/deadlock_test.go
For this to occur:
Is this actually expected/intended behaviour? I.e. do we require that closing the request body should always be non-blocking? If so, I guess we could mitigate by increasing the concurrent connection count, but it feels like we could still hit the deadlock across multiple connections
What did you expect to see?
Other requests be able to make progress whilst the first request is blocked
What did you see instead?
All requests on the same client connection were blocked.
Other things
I'm very happy to attempt a fix, but I'm a bit of a novice in this part of the codebase!
The text was updated successfully, but these errors were encountered: