
net/http: transport MaxIdleConnsPerHost does not work as expected #13801

Open
sschepens opened this Issue Jan 2, 2016 · 7 comments

@sschepens

sschepens commented Jan 2, 2016

I recently hit an issue where Go consumed all ephemeral ports and started failing to make requests.
The problem was triggered because I was making 10 concurrent requests with the default MaxIdleConnsPerHost, so Transport reused only 2 connections and closed the other 8 as soon as each request finished, just to reestablish them again.
This seems like really bad behavior and has never happened to me with any other language or HTTP client.
A connection is usually considered idle when it has not been used for a period of inactivity, perhaps a whole keep-alive period or so, not immediately after a request has completed successfully.
The behavior one would expect is to have 10 established connections and, after say 30 seconds without issuing any request, only 2 connections remaining, as per DefaultMaxIdleConnsPerHost.

Here is a gist that triggers this behavior.

I'm running go 1.5.2 on Ubuntu Precise 12.04, kernel: Linux e-0000e768 3.8.0-44-generic #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

@ianlancetaylor ianlancetaylor changed the title from HTTP Transport MaxIdleConnsPerHost does not work as expected to net/http: transport MaxIdleConnsPerHost does not work as expected Jan 3, 2016

@ianlancetaylor ianlancetaylor added this to the Go1.6Maybe milestone Jan 3, 2016

@ianlancetaylor


Contributor

ianlancetaylor commented Jan 3, 2016

@bradfitz


Member

bradfitz commented Jan 4, 2016

This has been like this forever, so it's not Go 1.6 material. It's also somewhat involved.

@bradfitz bradfitz modified the milestones: Go1.7, Go1.6Maybe Jan 4, 2016

@gopherbot


gopherbot commented Jan 5, 2016

CL https://golang.org/cl/18282 mentions this issue.

bradfitz added a commit that referenced this issue Jan 5, 2016

net/http: tighten protocol between Transport.roundTrip and persistConn.readLoop

In debugging the flaky test in #13825, I discovered that my previous
change to tighten and simplify the communication protocol between
Transport.roundTrip and persistConn.readLoop in
https://golang.org/cl/17890 wasn't complete.

This change simplifies it further: the buffered-vs-unbuffered
complexity goes away, and we no longer need to retry channel reads in
the select case, which was trying to prioritize channels when two were
readable at once. (It was only failing on the race builder because
race builds randomize select scheduling.)

The problem was that in the bodyless-response case we had to return
the idle connection before replying to roundTrip. But putIdleConn
previously both added it to the free list (which we wanted) and
closed the connection, which left the caller goroutine
(Transport.roundTrip) with two readable cases: pc.closech and the
response. We guarded against similar conditions in the caller's select
for two readable channels, but such a fix wasn't possible here and would
have been overly complicated.

Instead, switch to unbuffered channels. The buffered channels were only
there to prevent goroutine leaks, so address that differently: add a "callerGone"
channel closed by the caller on exit, and select on that during any unbuffered
sends.

As part of the fix, split putIdleConn into two halves: a part that
just returns to the freelist, and a part that also closes. Update the
four callers to the variants each wanted.

Incidentally, the connections were closing on return to the pool due
to MaxIdleConnsPerHost (somewhat related: #13801), but this bug
could've manifested for plenty of other reasons.

Fixes #13825

Change-Id: I6fa7136e2c52909d57a22ea4b74d0155fdf0e6fa
Reviewed-on: https://go-review.googlesource.com/18282
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
@bradfitz


Member

bradfitz commented Jan 20, 2016

Related, probably to be addressed at the same time: #13957 (limiting number of dials in-flight at once per host)

@bradfitz


Member

bradfitz commented Feb 2, 2016

Related also: we should close idle conns in some defined order, e.g. the oldest-used one gets closed first.

@bradfitz


Member

bradfitz commented May 19, 2016

Related also: we should close idle conns in some defined order, e.g. the oldest-used one gets closed first.

That happened in Go 1.7 at least.

The rest of this bug didn't happen, though.

@bradfitz


Member

bradfitz commented Nov 1, 2016

Also not happening for Go 1.8.

@bradfitz bradfitz modified the milestones: Go1.9, Go1.8 Nov 1, 2016

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9 May 24, 2017

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

bboreham added a commit to bboreham/prometheus that referenced this issue Dec 17, 2017

Set MaxIdleConnsPerHost alongside MaxIdleConns
Otherwise it defaults to 2, and Go will close extra connections as
soon as a request is finished.
See golang/go#13801

brian-brazil added a commit to prometheus/prometheus that referenced this issue Dec 17, 2017

Set MaxIdleConnsPerHost alongside MaxIdleConns (#3592)
Otherwise it defaults to 2, and Go will close extra connections as
soon as a request is finished.
See golang/go#13801

kahing added a commit to kahing/goofys that referenced this issue Feb 7, 2018

@gopherbot gopherbot modified the milestones: Go1.11, Unplanned May 23, 2018

iferunsewe pushed a commit to EngineerBetter/credhub-cli that referenced this issue Jul 6, 2018

Enable HTTP keepalives to reduce the number of TLS handshakes
We found that when using CredHub with Concourse, there was a massive spike in
CPU usage. We profiled ATC (the Concourse component that interacts with CredHub) and
found that >90% of samples were in TLS handshake methods. We did not profile
CredHub, but presumably it was much the same.

This commit ensures HTTP response bodies are read to completion, which is
required for net/http to reuse the connection and avoid a costly TLS handshake.
It also increases the maximum number of idle HTTP connections to 100, from the
default of 2, which is required to work around limitations in net/http:

golang/go#13801

Signed-off-by: Ife Runsewe <ife.runsewe@engineerbetter.com>

martys34 added a commit to cloudfoundry-incubator/credhub-cli that referenced this issue Aug 1, 2018

Enable HTTP keepalives to reduce the number of TLS handshakes