net/http: transport closes healthy connections when pool limit is reached #42650
Labels
NeedsInvestigation
Milestone
### What version of Go are you using (`go version`)?

### Does this issue reproduce with the latest release?

Yes

### What operating system and processor architecture are you using (`go env`)?

go env Output

### What did you do?
When the http transport connection pool is full (all connections in use, none idle) and new client-side requests are attempted, new connections fail (as expected), but long-established connections also start being closed.
The following code can be used to reproduce the problem.

Server side (not really important; any HTTP server will do, as long as the responses are enough for the pool to eventually fill up):
Sample server code
Client side code, where the problem can be seen:
Example client side code
### What did you expect to see?
Note that I'm setting `numGoRoutines = 5`, matching the allowed number of idle and maximum connections per host. This is the healthy and stable case: the trace shows an initial connection being established for each allowed connection, and subsequently, as the program runs, `Reused:true` is always reported for every "Got conn" message. The output of netstat also shows that the ports in use on both the server and client side are always the same.
### What did you see instead?
Now, say that `numGoRoutines` is set to 8 (you can also reduce the pool size instead; anything that forces all connections to be in use at once works). The trace output then shows frequent reconnects.

Longer trace logs

This goes on and on.
The `context deadline exceeded` errors suggest one cause of the problem. I noticed that goroutines trying to make a request block on the select in `Transport.getConn`, indicating that, by the time a connection becomes available, there isn't enough time left for the entire operation to complete, so the context eventually expires in the write or read loops. This causes a connection that is already established to be closed and replaced, creating a cycle that rotates connections all the time.
The output of `netstat` confirms this behaviour.

Is there a way around this? Can we control the timeout spent waiting for a connection from the pool separately from the request/response timeouts? Or even avoid blocking when the pool is full? Thanks