net/http: Client returns errors on POST if keep-alive connection closes at unfortunate time #22158

horgh · 2017-10-05T20:57:59Z

What version of Go are you using (`go version`)?

go version go1.9.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (`go env`)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/vagrant/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build936671241=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

I performed an HTTP POST using an *http.Client that had keepalives enabled and there was an idle connection available for re-use.

Specifically, I encountered this by using an *http.Client with an http.Transport set to have an IdleConnTimeout higher than the server's. Even a very small duration above can cause it. From my testing, I believe the connection gets taken for re-use just as the server closes it.

This program demonstrates the problem: https://play.golang.org/p/GTj6FEFWk-

Note if we change the request to be a GET, we don't see a problem.

What did you expect to see?

The request succeed. In particular, I would expect the *http.Client, being a connection pool and supporting connection re-use, would not raise an error when it finds a dead connection, but use a different connection. This assumes there was confidence the request indeed did not go through.

What did you see instead?

The sample program repeatedly makes HTTP requests. Sometimes they succeed, sometimes they fail with:

Post http://localhost:9999/test: EOF

Or:

Post http://localhost:9999/test: read tcp 127.0.0.1:36589->127.0.0.1:9999: read: connection reset by peer

For example:

$ ./test-connection-reuse 
2017/10/05 20:22:39 request succeeded
2017/10/05 20:22:40 request succeeded
2017/10/05 20:22:41 request succeeded
2017/10/05 20:22:42 request failed: error performing request: Post http://localhost:9999/test: EOF
2017/10/05 20:22:43 request succeeded
2017/10/05 20:22:44 request succeeded
2017/10/05 20:22:45 request failed: error performing request: Post http://localhost:9999/test: EOF
2017/10/05 20:22:46 request succeeded
2017/10/05 20:22:47 request succeeded
2017/10/05 20:22:48 request failed: error performing request: Post http://localhost:9999/test: read tcp 127.0.0.1:36563->127.0.0.1:9999: read: connection reset by peer
2017/10/05 20:22:49 request succeeded
2017/10/05 20:22:50 request failed: error performing request: Post http://localhost:9999/test: read tcp 127.0.0.1:36589->127.0.0.1:9999: read: connection reset by peer
2017/10/05 20:22:51 request succeeded

Further information

I searched for issues before reporting this. I found #8946 which I believe to be about this exact issue, except its example was an HTTP GET request. I believe the fix for #8946 was done for #4677 in 5dd372b.

From reading #4677, I believe the reason HTTP GET works is we retry in that case. Retrying an HTTP POST is unsafe depending on whether the request reached the server. But in the situation I'm describing, it didn't (although I don't know if we could reliably tell). See here in transport.go (comments closely below this line appear to be talking about this).

I noticed in #4677 that some of the basis for the behaviour was based on what Chromium does. I found its relevant code that handles what I think is going on here, and I don't see it talking about particular HTTP verbs (though of course I may not be looking in the right spot).

For my application, I've worked around this by ensuring the client's IdleConnTimeout is lower than the server's equivalent. I also added a retry at that level. I think a retry alone would not be sufficient, as presumably the retry could hit this very condition again.

It may be that nothing can be done. In that case, it might be good to document that this is expected, as it surprised me a little.

Thank you!

The text was updated successfully, but these errors were encountered:

ianlancetaylor · 2017-10-05T21:12:42Z

CC @tombergan

tombergan · 2017-10-06T23:13:07Z

Thanks for the playground code. The relevant RFC is here:
https://tools.ietf.org/html/rfc7230#section-6.3.1

You've already pointed out the main challenge, which is determining if the POST had reached the server. This is impossible to know precisely. It looks like Chrome assumes the POST had not reached the server if the connection was reused and gets a RST before any response bytes are received. We could do the same thing.

horgh · 2017-10-07T18:38:54Z

Interesting, thank you for the RFC reference!

The heuristic from Chrome seems fairly reasonable. Presumably if it has been in use there the risk is less.

I also found #15723 and #18241 which sound similar in that they are about non-idempotent requests. From the latter I had the idea to try setting GetBody on the request in the sample program. The behaviour is the same.

rsc · 2017-11-22T19:46:58Z

I think it's too late to change to the Chrome behavior for Go 1.10, but we should probably do it for Go 1.11. Leaving NeedsInvestigation but @tombergan please feel free to make a decision and switch to NeedsFix.

odeke-em · 2018-07-07T15:24:39Z

Kindly pinging you @horgh @tombergan @bradfitz. Should we move this to Go 1.12 or might we be able to do something here?

bradfitz · 2018-07-09T16:37:21Z

I'll move it to Go 1.12.

bradfitz · 2018-12-04T21:41:22Z

In Go 1.12 we now have the fix for #19943 (comment) ....

Change https://golang.org/cl/147457 mentions this issue: net/http: make Transport respect {X-,}Idempotency-Key header

That should be sufficient for this bug.

See the docs on the Transport type at https://tip.golang.org/pkg/net/http/#Transport

detailyang · 2019-01-04T03:00:30Z

more TCP layer detail, if some guys want to know the reason:)

See the server side send the FIN packet

themez · 2019-10-14T07:23:30Z

Chromium checked whether socket reused before sending retry request, I wonder why it is necessary for an ECONNRESET error, if the server is not handling on the port, you will get an ECONNREFUSED error instead, so maybe we can retry every time running into an ECONNRESET error. Or ECONNREFUSED is just Linux implementing detail? I didn't find any RFC distinguish it with ECONNRESET, they are both based on the TCP RST flag. Correct me if I'm wrong.

themez · 2019-10-18T03:14:47Z

Got explain from chromium dev

Servers can timeout idle connections at their leisure and restart at their leisure. As a result, even if network conditions are perfect, reusing a stale socket can get you ERR_CONNECTION_RESET - so if we ever want to be able to reuse sockets, we need to retry on resets. A server returning ERR_CONNECTION_RESET on a fresh socket, however, means that there is some sort of server (or middlebox) issue. And yes, we can get a reset at that point (buggy SSL implementations seem to be big fans of resetting sockets during SSL negotiation when they're sad, for instance).

ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 5, 2017

ianlancetaylor added this to the Go1.10 milestone Oct 5, 2017

tombergan self-assigned this Oct 6, 2017

cloudkite mentioned this issue Oct 9, 2017

ContentLength with Body length 0 errors apex/up#276

Closed

stephenchu mentioned this issue Nov 14, 2017

Getting 'Error finding Wavefront Alert' sometimes spaceapegames/terraform-provider-wavefront#16

Closed

rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

bradfitz changed the title ~~net/http: http.Client returns errors when re-using connections if the connection closes at the wrong time~~ net/http: Client returns errors on POST if keep-alive connection closes at unfortunate time Jul 9, 2018

bradfitz modified the milestones: Go1.11, Go1.12 Jul 9, 2018

bradfitz unassigned tombergan Jul 9, 2018

bradfitz closed this as completed Dec 4, 2018

themez mentioned this issue Oct 1, 2019

http: add reusedSocket property on client request nodejs/node#29715

Closed

4 tasks

golang locked and limited conversation to collaborators Oct 17, 2020

gopherbot added the FrozenDueToAge label Oct 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

net/http: Client returns errors on POST if keep-alive connection closes at unfortunate time #22158

net/http: Client returns errors on POST if keep-alive connection closes at unfortunate time #22158

horgh commented Oct 5, 2017

ianlancetaylor commented Oct 5, 2017

tombergan commented Oct 6, 2017

horgh commented Oct 7, 2017

rsc commented Nov 22, 2017

odeke-em commented Jul 7, 2018

bradfitz commented Jul 9, 2018

bradfitz commented Dec 4, 2018

detailyang commented Jan 4, 2019

themez commented Oct 14, 2019 •

edited

Loading

themez commented Oct 18, 2019

net/http: Client returns errors on POST if keep-alive connection closes at unfortunate time #22158

net/http: Client returns errors on POST if keep-alive connection closes at unfortunate time #22158

Comments

horgh commented Oct 5, 2017

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

Further information

ianlancetaylor commented Oct 5, 2017

tombergan commented Oct 6, 2017

horgh commented Oct 7, 2017

rsc commented Nov 22, 2017

odeke-em commented Jul 7, 2018

bradfitz commented Jul 9, 2018

bradfitz commented Dec 4, 2018

detailyang commented Jan 4, 2019

themez commented Oct 14, 2019 • edited Loading

themez commented Oct 18, 2019

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?

themez commented Oct 14, 2019 •

edited

Loading