net/http: Client returns errors on POST if keep-alive connection closes at unfortunate time #22158
Comments
CC @tombergan |
Thanks for the playground code. The relevant RFC is here: You've already pointed out the main challenge, which is determining if the POST had reached the server. This is impossible to know precisely. It looks like Chrome assumes the POST had not reached the server if the connection was reused and gets a RST before any response bytes are received. We could do the same thing. |
Interesting, thank you for the RFC reference! The heuristic from Chrome seems fairly reasonable. Presumably if it has been in use there the risk is less. I also found #15723 and #18241 which sound similar in that they are about non-idempotent requests. From the latter I had the idea to try setting |
I think it's too late to change to the Chrome behavior for Go 1.10, but we should probably do it for Go 1.11. Leaving NeedsInvestigation but @tombergan please feel free to make a decision and switch to NeedsFix. |
Kindly pinging you @horgh @tombergan @bradfitz. Should we move this to Go 1.12 or might we be able to do something here? |
I'll move it to Go 1.12. |
In Go 1.12 we now have the fix for #19943 (comment) ....
That should be sufficient for this bug. See the docs on the Transport type at https://tip.golang.org/pkg/net/http/#Transport |
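As a rough sketch of what the Transport docs describe for Go 1.12 and later: a request is treated as idempotent if it carries an "Idempotency-Key" or "X-Idempotency-Key" header, and it is only retried if it has no body or has GetBody defined. The helper name newIdempotentPost, the URL, and the key value below are made up for illustration, and the example only builds the request without sending it.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newIdempotentPost builds a POST request that the Transport is allowed to
// retry after a network error, per the Transport documentation.
func newIdempotentPost(url, key, body string) (*http.Request, error) {
	// With a *strings.Reader body, http.NewRequest sets req.GetBody, so the
	// body can be replayed on a retry.
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	// Marks the request as idempotent for the Transport's retry logic.
	req.Header.Set("Idempotency-Key", key)
	return req, nil
}

func main() {
	// Hypothetical endpoint and key, for illustration only.
	req, err := newIdempotentPost("http://localhost:8080/orders", "order-123", `{"item":"book"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.GetBody != nil, req.Header.Get("Idempotency-Key"))
}
```

Setting the header only tells the client that a retry is acceptable. Whether a duplicate POST is actually harmless still depends on the server, for example by deduplicating on the key.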
Chromium checks whether the socket was reused before sending a retry request. I wonder why that is necessary for an ECONNRESET error: if the server is not listening on the port, you will get an ECONNREFUSED error instead, so maybe we can retry every time we run into an ECONNRESET error. Or is ECONNREFUSED just a Linux implementation detail? I didn't find any RFC that distinguishes it from ECONNRESET; they are both based on the TCP RST flag. Correct me if I'm wrong. |
Got an explanation from a Chromium dev
|
What version of Go are you using (go version)?
go version go1.9.1 linux/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
I performed an HTTP POST using an *http.Client that had keepalives enabled and there was an idle connection available for re-use.
Specifically, I encountered this by using an *http.Client with an http.Transport set to have an IdleConnTimeout higher than the server's. Even a very small duration above can cause it. From my testing, I believe the connection gets taken for re-use just as the server closes it.
This program demonstrates the problem: https://play.golang.org/p/GTj6FEFWk-
Note if we change the request to be a GET, we don't see a problem.
What did you expect to see?
The request to succeed. In particular, I would expect the *http.Client, being a connection pool and supporting connection re-use, would not raise an error when it finds a dead connection, but use a different connection. This assumes there was confidence the request indeed did not go through.
What did you see instead?
The sample program repeatedly makes HTTP requests. Sometimes they succeed, sometimes they fail with:
Or:
For example:
Further information
I searched for issues before reporting this. I found #8946 which I believe to be about this exact issue, except its example was an HTTP GET request. I believe the fix for #8946 was done for #4677 in 5dd372b.
From reading #4677, I believe the reason HTTP GET works is we retry in that case. Retrying an HTTP POST is unsafe depending on whether the request reached the server. But in the situation I'm describing, it didn't (although I don't know if we could reliably tell). See here in transport.go (comments closely below this line appear to be talking about this).
I noticed in #4677 that some of the basis for the behaviour was based on what Chromium does. I found its relevant code that handles what I think is going on here, and I don't see it talking about particular HTTP verbs (though of course I may not be looking in the right spot).
For my application, I've worked around this by ensuring the client's IdleConnTimeout is lower than the server's equivalent. I also added a retry at that level. I think a retry alone would not be sufficient, as presumably the retry could hit this very condition again.
It may be that nothing can be done. In that case, it might be good to document that this is expected, as it surprised me a little.
Thank you!