-
Notifications
You must be signed in to change notification settings - Fork 17.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http: bundled x/net/http2.responseWriter hangs forever #42534
Comments
Possible related issue #23559 |
cc @fraenkel |
@dtelyukh We are going to need something that can reproduce the issue. It would also help to enable http2 debug and a thread dump when it hangs. |
Your title says the bundled version of http2 has this issue. Are you implying that if you use the latest x/net/http2 you dont? |
It's not easy to prepare code, which can reproduce this problem with 100% guarantee, but I think I could try.
h2_bundle.go used by third-party code. I didn't try to use x/net/http2 directly. Do you mean that I should do that? |
Don't worry. I am going to need something that can reproduce this issue.
Notice the stream for the PUSH is 1511 but the above is the first time I see that Framer. And the rest are in the 300s. I don't exactly see how this happened. |
I truncated log-file after each successful request until it was hanged. Maybe that is why the log-file was broken. I attach here other log-file, which was made when I caught problem from the first time. This log-file was never truncated. |
@fraenkel, we prepared test application for problem reproducing. My apologies for so complicated app. We cannot extract some small piece of code, because we don't know where is the problem exactly.
To have more chances to catch the problem it should to remove proxy cache:
To patch or debug server:
|
@dtelyukh I did find a way to cause the hang locally, from my machine it would never happen. |
Never mind, I got it working again.... |
So one thing I did verify is that using the latest golang/x/net/http2 code does not cause the hang I see with my simple testcase. |
@fraenkel, what can I help? |
You can see the one line change I made to caddyserver with a |
@fraenkel, this new version is never hangs for us. And also we noticed that app become faster. 👍 Thank you! |
Full page load time (with all resources) is 2% less than with old http/2, and median absolute deviation is 1% less too. So it's both faster, and shows more stable performance. |
@fraenkel, should I close this ticket? |
yes, given there is a solution and this should be fixed in 1.16 although one should verify that is true. |
* implement default values for header directive closes #3804 * remove `set_default` header op and rely on "require" handler instead This has the following advantages over the previous attempt: - It does not introduce a new operation for headers, but rather nicely extends over an existing feature in the header handler. - It removes the need to specify the header as "deferred" because it is already implicitely deferred by the use of the require handler. This should be less confusing to the user. * add integration test for header directive in caddyfile * bubble up errors when parsing caddyfile header directive * don't export unnecessarily and don't canonicalize headers unnecessarily * fix response headers not passed in blocks * caddyfile: fix clash when using default header in block Each header is now set in a separate handler so that it doesn't clash with other headers set/added/deleted in the same block. * caddyhttp: New idle_timeout default of 5m * reverseproxy: fix random hangs on http/2 requests with server push (#3875) see golang/go#42534 * Refactor and cleanup with improvements * More specific link Co-authored-by: Matthew Holt <mholt@users.noreply.github.com> Co-authored-by: Денис Телюх <telyukh.denis@gmail.com>
@fraenkel, this bug is still exist. But we found more clear way to reproduce it. An issue in Caddy's repository: caddyserver/caddy#3896 How to reproduceIt depends on the proxied website and caddy config, and some random factors, thus it occurs with different frequency on different hardware. The steps are:
|
I reproduced a hang but its using the bundle http2 stack.
|
Write me, please, to fix it should bundle a recent version of http2-library? Or does this library itself need to be fixed? |
Looks like with this hack it never hangs |
The problem is not solved neither by 1.16rc1, nor by 1.15.8. Easy steps for reproducing are here: caddyserver/caddy#3896 WIthout explicit usage of /x/net/http2 |
due to lots of issues with x/net/http2, as well as the hundled h2_bundle.go in the go runtime should be avoided for now. golang/go#23559 golang/go#42534 golang/go#43989 golang/go#33425 golang/go#29246 With collection of such issues present, it make sense to remove HTTP2 support for now
due to lots of issues with x/net/http2, as well as the bundled h2_bundle.go in the go runtime should be avoided for now. golang/go#23559 golang/go#42534 golang/go#43989 golang/go#33425 golang/go#29246 With collection of such issues present, it make sense to remove HTTP2 support for now
due to lots of issues with x/net/http2, as well as the bundled h2_bundle.go in the go runtime should be avoided for now. golang/go#23559 golang/go#42534 golang/go#43989 golang/go#33425 golang/go#29246 With collection of such issues present, it make sense to remove HTTP2 support for now
* fix: metacache should only rename entries during cleanup (minio#11503) To avoid large delays in metacache cleanup, use rename instead of recursive delete calls, renames are cheaper move the content to minioMetaTmpBucket and then cleanup this folder once in 24hrs instead. If the new cache can replace an existing one, we should let it replace since that is currently being saved anyways, this avoids pile up of 1000's of metacache entires for same listing calls that are not necessary to be stored on disk. * turn off http2 for TLS setups for now (minio#11523) due to lots of issues with x/net/http2, as well as the bundled h2_bundle.go in the go runtime should be avoided for now. golang/go#23559 golang/go#42534 golang/go#43989 golang/go#33425 golang/go#29246 With collection of such issues present, it make sense to remove HTTP2 support for now * fix: save ModTime properly in disk cache (minio#11522) fix minio#11414 * fix: osinfos incomplete in case of warnings (minio#11505) The function used for getting host information (host.SensorsTemperaturesWithContext) returns warnings in some cases. Returning with error in such cases means we miss out on the other useful information already fetched (os info). If the OS info has been succesfully fetched, it should always be included in the output irrespective of whether the other data (CPU sensors, users) could be fetched or not. * fix: avoid timed value for network calls (minio#11531) additionally simply timedValue to have RWMutex to avoid concurrent calls to DiskInfo() getting serialized, this has an effect on all calls that use GetDiskInfo() on the same disks. Such as getOnlineDisks, getOnlineDisksWithoutHealing * fix: support IAM policy handling for wildcard actions (minio#11530) This PR fixes - allow 's3:versionid` as a valid conditional for Get,Put,Tags,Object locking APIs - allow additional headers missing for object APIs - allow wildcard based action matching * fix: in MultiDelete API return MalformedXML upon empty input (minio#11532) To follow S3 spec * Update yaml files to latest version RELEASE.2021-02-14T04-01-33Z * fix: multiple pool reads parallelize when possible (minio#11537) * Add support for remote tier management With this change MinIO's ILM supports transitioning objects to a remote tier. This change includes support for Azure Blob Storage, AWS S3 and Google Cloud Storage as remote tier storage backends. Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Krishna Srinivas <krishna@minio.io> Co-authored-by: Krishnan Parthasarathi <kp@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io> Co-authored-by: Poorna Krishnamoorthy <poornas@users.noreply.github.com> Co-authored-by: Shireesh Anjal <355479+anjalshireesh@users.noreply.github.com> Co-authored-by: Anis Elleuch <vadmeste@users.noreply.github.com> Co-authored-by: Minio Trusted <trusted@minio.io> Co-authored-by: Krishna Srinivas <krishna.srinivas@gmail.com> Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Krishna Srinivas <krishna@minio.io>
due to lots of issues with x/net/http2, as well as the bundled h2_bundle.go in the go runtime should be avoided for now. golang/go#23559 golang/go#42534 golang/go#43989 golang/go#33425 golang/go#29246 With collection of such issues present, it make sense to remove HTTP2 support for now
due to lots of issues with x/net/http2, as well as the bundled h2_bundle.go in the go runtime should be avoided for now. golang/go#23559 golang/go#42534 golang/go#43989 golang/go#33425 golang/go#29246 With collection of such issues present, it make sense to remove HTTP2 support for now
due to lots of issues with x/net/http2, as well as the bundled h2_bundle.go in the go runtime should be avoided for now. golang/go#23559 golang/go#42534 golang/go#43989 golang/go#33425 golang/go#29246 With collection of such issues present, it make sense to remove HTTP2 support for now
I was able to reproduce this with the reverse proxy Traefik, and can confirm that directly calling the |
👋 Any update on this issue? Any idea how and when this might be resolved? |
I hope it will help: #45435 |
I believe this is fixed with Go 1.17 considering that the bundled http2 library was updated to a snapshot of x/net/http2 from May of this year. @dtelyukh Can you give 1.17 a try with your test case (without manually using x/net/http2)? |
@ReillyBrogan , I tried 1.17.3 - the problem still exists. I think, 1.18 could help. |
Potentially fixed with #49921. It's part of the Go 1.17.6 release. |
This is still an issue on Go 1.18 |
I had the same problem using Go 1.19. |
any progress now?i had the same problems with golang.org/x/net v0.0.0-20210813160813-60bc85c4be6d in branch internal-branch.go1.18-vendor |
There seems to be a problem with the http.Server for http2 that could cause the stream to hang. - golang/go#42534 Proxy projects ([caddy](https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/app.go#L412)/[traefik](https://github.com/traefik/traefik/blob/bd93e224deb434b4cd8ffe5c85e52c8b0f54f57a/pkg/server/server_entrypoint_tcp.go#L678)) switched over to use `http2.Server` with success.
* implement default values for header directive closes #3804 * remove `set_default` header op and rely on "require" handler instead This has the following advantages over the previous attempt: - It does not introduce a new operation for headers, but rather nicely extends over an existing feature in the header handler. - It removes the need to specify the header as "deferred" because it is already implicitely deferred by the use of the require handler. This should be less confusing to the user. * add integration test for header directive in caddyfile * bubble up errors when parsing caddyfile header directive * don't export unnecessarily and don't canonicalize headers unnecessarily * fix response headers not passed in blocks * caddyfile: fix clash when using default header in block Each header is now set in a separate handler so that it doesn't clash with other headers set/added/deleted in the same block. * caddyhttp: New idle_timeout default of 5m * reverseproxy: fix random hangs on http/2 requests with server push (#3875) see golang/go#42534 * Refactor and cleanup with improvements * More specific link Co-authored-by: Matthew Holt <mholt@users.noreply.github.com> Co-authored-by: Денис Телюх <telyukh.denis@gmail.com>
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
So, let's begin
High-Level Problem Description
I'm creating reverse proxy with Caddy Server and my own plugin. I use http/2 and Server Push. Sometimes requests hang forever. Here is screenshot from Chrome DevTools:
Low-Level Problem Description
So, I started to debug this situation. I found that my code execution stuck at
(http.responseWriter).Write()
, which is an instance of http2responseWriter.With help of
pprof
I found that lockup happens in two functions: http2serverConn.writeHeaders and http2serverConn.writeDataFromHandler - endless waiting of data fromdone
channel.Here is an illustration from
pprof
:Next I built
go
from source with adding some debug messages and start to dive deeper.I found a problem with frames are sent to output. At this line
N
frames were pushed: https://github.com/golang/go/blob/go1.15.4/src/net/http/h2_bundle.go#L4692. Afterpush
-functionscheduleFrameWrite
-function is called. I watched into it and found that it often exit here: https://github.com/golang/go/blob/go1.15.4/src/net/http/h2_bundle.go#L4817. And onlyM
(M
<N
) frames were popped from queue here: https://github.com/golang/go/blob/go1.15.4/src/net/http/h2_bundle.go#L4837Pushed Frames
Popped Frames
What did you expect to see?
No lockups.
What did you see instead?
Random lockups.
The text was updated successfully, but these errors were encountered: