Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

net/http: Default{Client,Transport} need better timeout support when used with connection pooling/redirects #3362

Closed
balasanjay opened this Issue Mar 21, 2012 · 50 comments

Comments

Projects
None yet
Contributor

balasanjay commented Mar 21, 2012

With the new SetDeadline style methods on net.Conns, there is no clean way to set
timeouts for http requests, while taking into account redirects and connection pooling.

See https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/-DE6-aLttM4 for
discussion.
Contributor

ianlancetaylor commented Mar 21, 2012

Comment 1:

Labels changed: added priority-later, packagechange, removed priority-triage.

Contributor

rsc commented Mar 26, 2012

Comment 2:

At least that was true of the old way too.

Comment 3:

heres one way to do it (although admittedly it is not zestfully *clean*)
https://github.com/mreiferson/go-httpclient
Contributor

rsc commented Sep 12, 2012

Comment 4:

Status changed to Accepted.

Contributor

rsc commented Sep 12, 2012

Comment 5:

Labels changed: added go1.1maybe.

stevvooe commented Jan 8, 2013

Comment 6:

Is there any activity on this issue? I'd like to volunteer my time if it's needed. I
would consider this a major stability bug for long running services that make http
requests. A solid http client with timeout support should be considered a critical part
of a "batteries included" standard library.
Contributor

remyoudompheng commented Jan 8, 2013

Comment 7:

You are of course welcome if you propose a fix that suitably addresses the issue.
Owner

bradfitz commented Jan 8, 2013

Comment 8:

There are at least three timeouts that make sense.  We may even want all three:
// HeaderTimeout is the amount of time to wait to get a response header
// after writing the request.  It does not include the response body.
// Zero means no timeout.
HeaderTimeout time.Duration
// ResponseTimeout is like HeaderTimeout, but includes the time to transmit
// the entire body, regardless of size.  If the server is still sending
// the body as ResponseTimeout is hit, the body is interrupted and a Read
// will result in ErrTimeout. 
ResponseTimeout time.Duration
// InactivityTimeout is the amount of time to get any data from the server
// before considering it timed out. Unlike HeaderTimeout and ResponseTimeout,
// this timer resets every time a byte of data is received from the server,
// so a slow trickle of data could take a large amount of time. 
// Zero means no timeout.
InactivityTimeout time.Duration
Contributor

garyburd commented Jan 8, 2013

Comment 9:

How about allowing the application to set the read deadline on the connection?  If an
application can set the read deadline, then the application can easily implement the
three timeouts suggested by Brad.
Owner

bradfitz commented Jan 8, 2013

Comment 10:

You can already do that.
I thought this bug was about making the commonly-requested timeout use cases easy.

stevvooe commented Jan 9, 2013

Comment 11:

The common use case is probably centered around what bradfitz has labeled as the
ResponseTimeout. HeaderTimeout and InactivityTimeout seem to be relatively uncommon, and
would require that a particular request has claimed a connection as its own out of a
pool until the corresponding timeout has passed or the request has completed. Its not
unrealistic to expect applications that rely on this level of detail to implement their
own RoundTripper, as connection pooling may not be as critical for APIs that have
disparate requirements around the timing of header responses and byte level responses.
The way I envisioned this was to simply have the timeout based on the total running time
of a call to Tranport.RoundTrip, including any redirects. This is relatively
straightforward and should meet the needs of most applications.
Contributor

garyburd commented Jan 9, 2013

Comment 12:

HeaderTimeout and InactivityTimeout are bounded by the lifetime of a round trip. There's
no need for a request to make a claim on the connection beyond the claim made by the
round trip.
Streaming APIs like Twitter's are a use case for HeaderTimeout and InactivityTimeout.
With these APIs, the client receives a stream of data and keep-alive messages in the
response. The header timeout detects problems connecting to the server. The inactivity
timeout detects the case where the client is not receiving data or keep-alive messages.
The read deadline can be set via the Transport Dial function, but not in ways that the
application might want.  For example, it's not possible to implement ResponseTimeout
when a connection is reused.
Contributor

rsc commented Jan 9, 2013

Comment 13:

The three values Brad suggested seem fine to me. I would s/Inactivity/Idle/.
Owner

bradfitz commented Jan 11, 2013

Comment 14:

I actually don't like the name ResponseTimeout anymore, because we name the result
parameter from Do/Get/Post (*Response, error), and I don't want people assuming that
ResponseTimeout means that Response.
BodyTimeout doesn't work well as an alternative to the name ResponseTimeout, because it
implies that it's only for a body, not the headers + an optional body.
So I'm leaning towards shortening "ResponseTimeout" to just Timeout.
So we'll have:
1) Timeout (the whole thing)
2) HeaderTimeout (just the headers)
3) IdleTimeout (what I called InactivityTImeout above)
That should be sufficient.
If we really need more control later, we could provide a method on *Response (assuming
you get a *Response after the HeaderTimeout) to plumb through a deadline to the
underlying connection, but I don't think it will be necessary and I don't want to do it
now.  I recognize that 1) and 3) would be unnecessary if we had 2) and a *Response
SetDeadline method, but I'd prefer to provide solutiosn to the common use cases, rather
than provide a bunch of mechanisms.
Any final thoughts or objections before I start writing this?

Comment 15:

LGTM
Contributor

garyburd commented Jan 11, 2013

Comment 16:

Are these timeouts per request or set once on the transport?
Another approach is to specify a minimum data transfer rate for both the request and the
response. This approach adapts automatically to the size of the request and response
bodies.
Owner

bradfitz commented Jan 11, 2013

Comment 17:

A timeout is a time.Duration.  A deadline is a time.Time.  I'm going to let you set
timeouts, which will turn into deadlines at the right places.
I'd prefer people do bandwidth calculations on their own.
Owner

bradfitz commented Jan 17, 2013

Comment 18:

More questions: does the HeaderTimeout begin once the request has been written
(including any POST body), or at the beginning of the RoundTrip?
Both seem wrong (or at least potentially surprising) for different reasons.
Let's assume you don't set the net.Conn's ReadDeadline until after the request has been
sent.  In most cases, this will be fine, but if the remote server disappears off the
network and you write a large HTTP request body such that Write blocks, the RoundTrip
will never timeout if it never gets to http.ReadRequest -> bufio.Read -> net.Conn.Read.
But if you include the time to send a request body, then some POST requests are now too
large to be sent in time.  And the name "HeaderTimeout" is now confusing.
So back to assuming HeaderTimeout is as documented above, and only for reading the
response headers: really the user would need to set both IdleTimeout (for the request
write phrase, say 5 seconds to get TCP progress) + HeaderTimeout (say, 30 seconds). 
That seems annoying to document.
Do we want to make a zero IdleTimeout mean the minimum of HeaderTimeout and Timeout?  So
if the user only specifies a 30 second HeaderTimeout, that also means that at least the
server needs to respond to TCP during the request write phase at least once every 30
seconds.  That would prevent offline servers from hanging a Go program.
None of this is hard to implement. The only hard part of this bug is naming and
documentation.

Owner changed to @bradfitz.

Comment 19:

Perhaps some of the complexity here is because of the attempt to simplify to fewer
variables than there are phases in an HTTP request.  It seems like we're asking this
proposal to satisfy two very different scenarios.
1) typical HTTP requests between services (regardless of size of request/response body)
where you want to control expected performance of both sides... sending (connect +
write)/response (headers + body)
2) streaming requests where connection/request write/response header/idle timeouts want
to be bounded but there should be no overall timeout to the request.
It seems like one answer might be to cover each of the phases with a timeout and allow
the sum of their parts to compose the behavior one would expect (each being optional):
// set on any read/write to guarantee TCP progress
IdleTimeout time.Duration
// includes connection time (may or may not be cached) + writing request (headers + body)
SendTimeout time.Duration
// includes time spent waiting for response headers after sending the request
HeaderTimeout time.Duration
// includes time spent waiting for the response body after reading response headers
BodyTimeout time.Duration
However, the counter argument is that this is too much control and all you really want
is to be able to bound the overall request time (at all) regardless of which part is
taking the bulk of the time.  
Although, for production debugging I could imagine there being a huge benefit to having
the granularity above to understand/control what phase isn't performing.  This could be
resolved independently if the library provided means to introspect the recorded time
spent in each phase of the request.
I tend to side with others comments above that this change should perhaps *not* satisfy
the needs of streaming/long lived requests (being less common and having very specific
requirements).  I feel like those may be better implemented with a custom RoundTripper,
etc.  IMO, you could drop IdleTimeout from your proposal entirely without making much
difference.
In that case, my vote would be for either of the following:
SendTimeout, HeaderTimeout, and BodyTimeout (as described above)
or
ConnectTimeout, Timeout (which includes all phases of the request)
For the record, I realize the proposed suggestion hasn't been including connect time.  I
think that the goal of this change is to make the standard library httpclient easy and
obvious to use (for this use case) and not including connect time would result in
unexpected behavior.  In particular, supporting request timeouts that don't include
connect time (and thus still require a custom Dial function to get DialTimeout) would
feel a bit odd and inconsistent.  Perhaps documentation could resolve this, though.

Comment 20:

There seems to be a serious challenge to associating timeouts to different phases of the
request when there aren't exposed hooks to the associated phases.
What if the semantic issue is avoided by bunching the concerns of the call to RoundTrip
into a single timeout?
// RoundTripTimeout is the amount of time that RoundTripper.RoundTrip will
// block before returning a timeout error. This includes, but is not limited
// to, the time that it may take to connect, send the request, any request
// body and receive any associated headers. The phase in which the request
// times out is returned as part of the timeout error.
RoundTripTimeout time.Duration
// ReadTimeout is the amount of time that calls to Read on Response.Body will
// block before returning a timeout error.
ReadTimeout time.Duration
The main advantages to this are that the abstraction is maintained, issues with down or
overloaded servers are addressed within the bounds of RoundTripTimeout, and there is a
reasonable hook to ensure that Body.Read doesn't block goroutines forever. In addition,
timeouts are defined by the boundaries of the API function calls, rather than by the
boundaries of various phases of the request, which is much easier to test, verify and
reason about. At the same time, the actual causes of the timeouts can be inspected via
the returned errors of each associated function call, be it RoundTripper.RoundTrip or
Response.Body.Read.
Owner

bradfitz commented Feb 26, 2013

Comment 21:

I've sent https://golang.org/cl/7369055 to add ResponseHeaderTimeout.  It's
purely the time after writing the full request (including optional body), until we get
response headers.
I'm focusing first on the primitives you can't do otherwise.  Dial timeouts can already
be done with the Transport.Dial hook.
Timeouts reading the response.Body can be done purely by Go code, especially once Issue
3672 is fixed to just kill the TCP connection.
The one that remains then is timeouts writing the request Body, where we might want
something like IdleTimeout.
I'm also considering whether we might want:
func (t *Transport) StopRequest(r *Request)
... so people implementing timeouts in their own programs have a reliable mechanism to
shut down a request's TCP connection, since the Transport knows which request went to
which connection.
If we had that, maybe we wouldn't need the request Body timeout.

Comment 22:

What would be the proposed way to timeout reading response body?
Unless I'm missing something it would probably involve a goroutine spawn, timer, and
some channel glue, right?
Owner

bradfitz commented Feb 26, 2013

Comment 23:

Re #22: yes.
That'd would likely be the implementation one way or another, as a net.Conn's
ReadDeadline can only be adjusted for future reads, not in-progress reads, and the http
Transport code is often blocked in a read.  A good way to unblock it is to close the
connection.

Comment 24:

Can we instead wait on the fact that there is a request sent on that conn instead of
sitting in a Peek()?  This would allow the implementation to be simplified, standard
library side, by setting the ReadDeadline on the net.Conn before waiting on a Read (and
wouldn't incur a goroutine spawn per request which is expensive at high volume).
Was sitting in Peek() used to attempt detection of connections that are closed on you? 
Isn't that sorta racey?
Owner

bradfitz commented Feb 27, 2013

Comment 25:

This issue was updated by revision 839d47a.

R=golang-dev, adg, rsc
CC=golang-dev
https://golang.org/cl/7369055
Owner

bradfitz commented Feb 27, 2013

Comment 26:

Re #24: we're only sitting in a Peek when the connection is idle, or when we're
anticipating a response.  We need to be actively reading from the connection to know
when a peer closes its keep-alive connection on us so we don't re-use it.  It works most
of the time but yes, it's inherently racy, but that's the protocol and reality's fault,
no Go's. See issue #4677.
Owner

bradfitz commented Feb 27, 2013

Comment 27:

CL out for review to add Transport.CancelRequest: https://golang.org/cl/7372054/
Owner

bradfitz commented Feb 27, 2013

Comment 28:

This issue was updated by revision 11776a3.

Update issue #3672
R=golang-dev, rsc, adg
CC=golang-dev
https://golang.org/cl/7372054
Owner

bradfitz commented Mar 9, 2013

Comment 29:

Now that we have revision 02ee8f7b9a17, revision bcecef1cb2fd, and revision d4e1ec84876c
in, I'm tempted to change this bug to not be Go1.1 anymore.  I think we now have all the
necessary low-level mechanisms to let Go programs implement their own policies, without
adding dozens of new knobs.
If anybody wants to argue for things we should have for Go 1.1 (either low-level
mechanisms or higher-level knobs), speak now.  The cut-off date is approaching.
Contributor

rsc commented Mar 12, 2013

Comment 30:

[The time for maybe has passed.]

Labels changed: removed go1.1maybe.

Contributor

rsc commented Jul 30, 2013

Comment 31:

Brad, what's left here?

Labels changed: added go1.2maybe.

Contributor

rsc commented Jul 30, 2013

Comment 32:

Labels changed: added feature.

Contributor

robpike commented Aug 30, 2013

Comment 33:

Labels changed: removed go1.2maybe.

Contributor

jbardin commented Sep 19, 2013

Comment 34:

I'd like to make a case for keeping this, since it seems that it may get lost.
The primary issue I have with the CancelRequest method, is that if it is used for a
timeout, the caller gets a "use of closed network connection" error.
I strongly feel that a timeout situation should return a net.Error with TImeout() ==
true, which would be the case when any of the other deadlines are reached. 
Another more minor detail is that this requires one to setup a timer of some sort for
each request. This would idiomatically be in another goroutine with a time.Timer, which
adds control channels and goroutine overhead for each call.
Contributor

rsc commented Nov 27, 2013

Comment 35:

Labels changed: added go1.3maybe.

Contributor

rsc commented Nov 27, 2013

Comment 36:

Labels changed: removed feature.

Contributor

rsc commented Dec 4, 2013

Comment 37:

Labels changed: added release-none, removed go1.3maybe.

Contributor

rsc commented Dec 4, 2013

Comment 38:

Labels changed: added repo-main.

Comment 39 by danny.gottlieb:

With respect to dial timeouts and the `Transport.Dial` hook, there's an edgecase not
covered when dealing with HTTPS requests. Using a custom dial method to invoke
`DialWithTimeout` works for a connect timeout, but none of the TLS handshaking has
happened when that returns. Currently none of the timeouts address a TLS handshake
timing out.
If a connection is to be used for exactly one request, a deadline can be set, but with
re-use of working connections, I don't think there's a way to update the deadline before
every HTTP request; unless I'm mistaken, the underlying `net.Conn` structures are never
exposed outside of the http package other than through the custom Dial function.
Contributor

niemeyer commented Feb 20, 2014

Comment 40:

Issue raised on golang-dev: http://goo.gl/Wqhhqy
Owner

bradfitz commented Feb 24, 2014

Comment 41:

This issue was updated by revision 49beb23.

Also set a 30 second timeout, instead of relying on the
operating system's timeout, which if often but not always 3
minutes.
LGTM=crawshaw
R=rsc, crawshaw
CC=golang-codereviews
https://golang.org/cl/68330046
Owner

bradfitz commented Feb 25, 2014

Comment 42:

This issue was updated by revision fd4b4b5.

LGTM=agl
R=agl
CC=golang-codereviews
https://golang.org/cl/68150045
Owner

bradfitz commented Feb 25, 2014

Comment 43:

I've made two changes for Go 1.3 for this issue:
1) TCP Keep-Alives are now used by default, so at least dead connections will eventually
give a read error and go away.
2) There is now a TLSHandshakeTimeout and is set by default. I picked 10 seconds for
DefaulTransport pretty arbitrarily and conservatively.
The issue has gotten to the point where it's almost too large to read anymore and it's
mostly musing and not proposals.  If somebody could summarize here what's still missing
that's not doable outside of net/http, that would be helpful.  I believe timeouts
spanning requests involving redirects are doable with the Client.CheckRedirect hook to
make a map from original request => subsequent requests, and then using
Transport.CancelRequest.  
To add to the musing, we could add:
  type Client struct {
    // ....
    // Timeout, if non-zero, gives the total time allowed for all phases of a request
    // including any of its subsequent redirect requests. This includes the connection
    // time, writing the request(s), and reading the entire response body(ies).
    // If the underlying RoundTripper implementation of the Client's Transport
    // does not support CancelRequest, this Timeout is ignored.
    Timeout time.Duration
  }
But like the comment says, this would be based on CancelRequest and tracking redirects
to cancel the right request when the timeout occurs, so it could be confusing to people
if used with a RoundTripper that doesn't support CancelRequest.  Maybe that's an error,
though, and we fail fast.  Do we then need a public interface for RequestCanceler, or
can it be unexported?  This support would also wrap the return Response Body's
io.ReadCloser to one that note the EOF/Close and stops the timer.
I don't see the need to rush that in, unless everybody agrees that it's needed,
especially since it can be implemented by hand anyway.
Unless there are specific suggestions this week, we're probably done for Go 1.3.
Please speak up soon if you want more.
Contributor

niemeyer commented Feb 25, 2014

Comment 44:

Thanks for the recent improvements, Brad.
It would be superb if we could have such a straightforward Timeout parameter as you
suggest, as I've been observing broken usage of the http package every other day. The
fact we can say "you can tune it yourself to have sane behavior" feels like a departure
from what we preach elsewhere, and people find they're missing details in the worst
possible way.
Unprompted conversation from my IRC log, *today*, as evidence:
Feb 25 11:22:40  niemeyer: (...) this is the branch I was talking about,
setting up http Client is around line 232...
Feb 25 11:25:43  pedronis: The code in there is even more problematic
Feb 25 11:25:58  pedronis: As it's timing out only if the connection fails
altogether
Feb 25 11:26:15  niemeyer: well it fails if the header are not sent in time
Feb 25 11:26:24  pedronis: Not really
Feb 25 11:26:43  niemeyer: you are telling me that the client code is
completely broken?
Feb 25 11:26:50  pedronis: Yes, kind of.. sorry
Feb 25 11:27:19  pedronis: Please bitch here: http://golang.org/issue/3362

Comment 45:

+1 on an end-to-end request timeout in the standard library
Contributor

jbardin commented Feb 25, 2014

Comment 46:

yes, Thank you Brad.
I'm still going to push for the overall Timeout setting. The timeout method used by
github.com/mreiferson/go-httpclient is now pretty slick, but it definitely not obvious
for most how that would been implemented (took a couple iterations in that library
itself). We could better promote helper libs like this, but people are always going to
miss out on them. 
I also still would like to see a actual Timeout error returned instead of the closed
connection error, but I think the proper reference implementation is still more
important.

Comment 47 by danny.gottlieb:

I'm also in favor of an end to end timeout. In an effort to be clear about some edge
cases, e.g. redirection, what's most important in my applications is when using
`keep-alive` and calling `response, err := httpClient.Get(...)` I can express a
guarantee that the calling code will eventually have control returned to it. Meaning
connection creation, TLS handshaking, redirects, POST data to send (obviously not for a
`Get`) and receiving the body from the server needs to be covered by a single, or some
combination of timeouts.

Comment 48 by samuele.pedroni:

if there is not a proper implementation of a end-to-end timeout, then some example of
how CancelRequest should be used ought to be added to net/http docs ... coming from
other languages/libraries one expects a supported/ easy way to fully bound in time http
requests
Owner

bradfitz commented Feb 28, 2014

Comment 49:

Sent https://golang.org/cl/70120045
Please review.
Owner

bradfitz commented Mar 3, 2014

Comment 50:

This issue was closed by revision 2ad72ec.

Status changed to Fixed.

@gopherbot gopherbot locked and limited conversation to collaborators Jun 24, 2016

This issue was closed.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.