New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: document errors more (*Transport, Client's wrapper errors, etc), how to check canceled, ... #9424

Open
bcmills opened this Issue Dec 22, 2014 · 30 comments

Comments

Projects
None yet
10 participants
@bcmills
Member

bcmills commented Dec 22, 2014

I've been doing some experimenting with Go HTTP clients and servers under load.
One of the curious things I've discovered is that calls to (*http.Client).Get occasionally return unusual errors.

The documentation at http://golang.org/pkg/net/http/#Client says:
"An error is returned if the Client's CheckRedirect function fails or if there was an HTTP protocol error."

I've been leaving CheckRedirect nil and testing against a local server that never serves redirects. From the comment, that would imply that I should only receive HTTP protocol errors - which would correspond to the type *http.ProtocolError.

For typical socket errors, the clients in fact return a *url.Error - but the only place in the net/http documentation where url.Error is even mentioned is for the client.CheckRedirect field, which is completely unrelated to the errors I'm testing.

But that's not all!

The *url.Error usually wraps a net.OpError, which seems reasonable enough. In fact, net.Error is what I would have expected in the first place, since I didn't know about url.Error at all before I started these experiments.

But instead, the *url.Error occasionally wraps io.EOF. I'm guessing that happens when the socket happens to close at exactly the wrong moment, and because of the poor documentation it's not at all clear to me whether that's the intended behavior of the Client methods or an outright bug in the library.

But that's not all either!

The error wrapped by the OpError, one would expect, is a net.Error describing the underlying network error for the op. But that's not the case either - it's often a syscall.Errno instead, and syscall.Errno does not implement net.Error. So for temporary conditions, such as EPIPE or ECONNRESET, the net package's preferred mechanism for indicating that the condition is temporary does not work as expected.

So I end up needing a big pile of user code to sort through the whole stack of errors, find the root error, and check for particular syscall.Errno values, and that whole big pile of code is now relying on undocumented error spaces that could theoretically change completely with the next Go release.

To summarize: error handling in the http package is a mess. Someone familiar with the intended behavior of the package should clarify the documentation at least to the point where it's sensible to say whether the more subtle behaviors (e.g. io.EOF in a url.Error) are bugs or not.

@bcmills

This comment has been minimized.

Member

bcmills commented Dec 23, 2014

On further investigation, it appears that the ECONNRESET considers itself Temporary after all - but EPIPE does not.

@davecheney

This comment has been minimized.

Contributor

davecheney commented Dec 23, 2014

@bcmills see also #9405.

As an editorial, the general policy of Go APIs is to not specify the error types they return. The philosophy is, it either worked or it didn't, trying to sniff the specifics is brittle. In this specific case, #9405 may be an improvement.

@bcmills

This comment has been minimized.

Member

bcmills commented Dec 23, 2014

Do we expect programmers to memorize every error in the standard library, then? Or to read the implementation of every package to figure out whether there are certain errors they should be on the lookout for?

For example: without knowing about url.Error, it's easy for one to see net.Error and try to use that for detecting temporary errors. But that doesn't work because url.Error doesn't implement net.Error.

In order to handle errors - and the existence of net.Error is a witness to the fact that there are useful properties of errors needed to handle them correctly! - the user must know that they exist in the first place.

@bcmills

This comment has been minimized.

Member

bcmills commented Dec 23, 2014

Or, to put it a different way: handling errors should not put an O(N^2) burden on the programmer.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Dec 23, 2014

In random order:

  • url.Error should not wrap an io.EOF. That should've been turned into an ErrUnexpectedEOF by whoever wrapped it I imagine. I don't know where that is, though.
  • I don't read the Client comments about "HTTP protocol error" as meaning ProtocolError specifically. I can update those docs.
  • We don't document the error types in general in Go (some things do, like os.PathError), but in general once we start returning a concrete type instead of an errors.New/fmt.Errorf one, then we kinda need to stick with it, in case people depend on it. We try to add tests for these so we don't regress, but not well enough.
  • http.ProtocolError is kinda old and weird. In retrospect we tend to document sentinel exported ErrFoo = errors.New("...") variables instead, and document that those are returned.
  • likewise, *net.OpError is old. It exists to implement the net.Error interface, but it need not be exposed itself probably. But we're stuck with it now.
  • net/http just upstream of lots of legacy. It's not surprising that the various experiments in error behaviors all manifest themselves through net/http's higher level interfaces.

In summary, we can improve some docs (where improve likely means make more vague), but for compatibility there's little we can do.

@bradfitz bradfitz added the Go2 label Dec 23, 2014

@davecheney

This comment has been minimized.

Contributor

davecheney commented Dec 23, 2014

I believe it is the opposite. The general principal is the error value, if not nil is opaque, the caller is not supposed to assign any meaning to it other than a failure.

If, and these are the exceptions, a method identifies that the error value returned conforms to an interface, say providing a Temporary() bool method, then that should be called out and is part of the contract for the method. Again, this is the exception, not the rule. I've only seen that pattern in the net package, and close derivatives of that package.

On 23 Dec 2014, at 12:05, Bryan C. Mills notifications@github.com wrote:

Do we expect programmers to memorize every error in the standard library, then? Or to read the implementation of every package to figure out whether there are certain errors they should be on the lookout for?

For example: without knowing about url.Error, it's easy for one to see net.Error and try to use that for detecting temporary errors. But that doesn't work because url.Error doesn't implement net.Error.

In order to handle errors - and the existence of net.Error is a witness to the fact that there are useful properties of errors needed to handle them correctly! - the user must know that they exist in the first place.


Reply to this email directly or view it on GitHub.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Dec 23, 2014

Yes, if a func returns error, all we really technically promise (and API check) is that it's either nil or it has an Error() string method.

But because we're so nice and care about compatibility, we also try pretty hard to not change the underlying type if it was exported in weekly-tagged release in 2010 and people are probably still relying on it somewhere. And sometimes we even add tests for such cases, especially if somebody asks if they can depend on it, or report that we broke them.

Stability sucks, I know.

@bcmills

This comment has been minimized.

Member

bcmills commented Dec 23, 2014

@davecheney It is not possible to use http.Client robustly without being able to check for (and retry) temporary errors, which are not part of the method's signature. Unless you are proposing a broader refactoring to move the retry logic into the Client implementation, documenting errors is about the best we can do.

I can understand not wanting to promise that http.Client returns only certain errors (e.g. only url.Error), but it should at least have some indication in that direction. Without knowing to check for that and net.Error, one cannot implement anything resembling a reasonable caller-side retry loop.

@davecheney

This comment has been minimized.

Contributor

davecheney commented Dec 23, 2014

Use a type assertion to see if the error value implements Temporary() bool
or Timeout() bool, and if it does, use that method to choose to retry or
not.
On 23 Dec 2014 13:48, "Bryan C. Mills" notifications@github.com wrote:

@davecheney https://github.com/davecheney It is not possible to use
http.Client robustly without being able to check for (and retry) temporary
errors, which are not part of the method's signature. Unless you are
proposing a broader refactoring to move the retry logic into the Client
implementation, documenting errors is about the best we can do.

I can understand not wanting to promise that http.Client returns only
certain errors (e.g. only url.Error), but it should at least have some
indication in that direction. Without knowing to check for that and
net.Error, one cannot implement anything resembling a reasonable
caller-side retry loop.


Reply to this email directly or view it on GitHub
#9424 (comment).

@minux

This comment has been minimized.

Member

minux commented Dec 23, 2014

I tend to think that the retry should be done by the the net/http package
itself,
because most clients need that capability (unless turned off explicitly)

@bradfitz

This comment has been minimized.

Member

bradfitz commented Dec 23, 2014

@minux references the good ol' Issue #4677

@bcmills, in general, if it's possible to retry safely, the logic is roughly:

     for {
          res, err := client.Do(....)
          if err == nil || res.StatusCode == 200 {
                something(res)
                break
          }
          time.Sleep(something)
     }

The exact err rarely matters if you didn't even get an HTTP response at all.

If it's an idempotent GET or HEAD (as they all should be), then just retry away. But if it's a POST, you're already kinda screwed on knowing whether it got there.

Which specific errors do you care about, ambiguities in net/http etc's return values notwithstanding?

@bcmills

This comment has been minimized.

Member

bcmills commented Dec 23, 2014

@davecheney A simple type-assertion doesn't work - while most of the errors in the "net" package implement a Temporary() method, url.Error - the one returned most of the type by (http.Client).Get - does not. In order to unpack the errors correctly, you need to know about both url.Error (for its Err field) and net.Error (for the Temporary method). And neither of those is prominently mentioned in the net/http docs.

@minux I tend to agree, but getting to that state would require being much more careful about which errors to retry. (For example: the spurious io.EOF errors would need to be fixed, and some more syscall errors - EPIPE in particular - would probably need to be marked Temporary.) Documenting and fixing the errors seems like a prerequisite for retrying them.

@bradfitz Yes, that's basically what I'm doing - but I'm not retrying most of the HTTP status codes, because most of them indicate permanent errors for typical HTTP usage. And note that there's a bit more subtlety to it if you want to handle the Retry-After header correctly.

The problem is, there is a class of net errors that are not temporary - why else would there be a Temporary method on net.Error? - and it seems wrong to retry in those cases. (For example, what if the URL I passed in was not even syntactically valid?)

@bradfitz

This comment has been minimized.

Member

bradfitz commented Dec 23, 2014

You are correct that there's basically no consistency on the use of url.Error, net.Error, http.ProtocolError, etc, or which takes precedence when multiple apply.

This is kinda one of those "Unfortunate" situations.

One could imagine a new func in net/http like func ShouldRetry(err error) (bool, time.Duration) but that just adds more complexity and maintenance. Perhaps. Or maybe is useful. Or useful but not worth it. (putting all the various error logic into ShouldRetry, like os.IsNotExist etc, but with more logic)

I don't have solid suggestions. If anything, I would file lots of separate concrete bugs and address them one at a time. Bugs like this risk getting depressing and non-actionable, which is further depressing in that I used "actionable".

@davecheney

This comment has been minimized.

Contributor

davecheney commented Dec 23, 2014

On 23 Dec 2014 15:58, "Bryan C. Mills" notifications@github.com wrote:

@davecheney A simple type-assertion doesn't work - while most of the
errors in the "net" package implement a Temporary() method, url.Error - the
one returned most of the type by (http.Client).Get - does not. In order to
unpack the errors correctly, you need to know about both url.Error (for its
Err field) and net.Error (for the Temporary method). And neither of those
is prominently mentioned in the net/http docs.

I believe that returning the *url.Error was fixed in #9405.

@minux I tend to agree, but getting to that state would require being
much more careful about which errors to retry. (For example: the spurious
io.EOF errors would need to be fixed, and some more syscall errors - EPIPE
in particular - would probably need to be marked Temporary.) Documenting
and fixing the errors seems like a prerequisite for retrying them.

I'm a bit confused where the EPIPE entered the conversation, but IMO that
is not a temporary error.

@bradfitz Yes, that's basically what I'm doing - but I'm not retrying
most of the HTTP status codes, because most of them indicate permanent
errors for typical HTTP usage. And note that there's a bit more subtlety to
it if you want to handle the Retry-After header correctly.

The problem is, there is a class of net errors that are not temporary -
why else would there be a Temporary method on net.Error? - and it seems
wrong to retry in those cases. (For example, what if the URL I passed in
was not even syntactically valid?)


Reply to this email directly or view it on GitHub.

@bcmills

This comment has been minimized.

Member

bcmills commented Dec 24, 2014

@minux After further thought, I don't think that *http.Client itself should do the retries, at least not without an explicit field to enable them. Generally you don't want lots of layers of competing retry loops - one end-to-end retry loop is sufficient, and it puts less steady-state load on intermediate nodes in the system during the cooldown between retries - only the caller knows whether they are the topmost layer.

@rsc rsc added this to the Unplanned milestone Apr 10, 2015

Dieterbe added a commit to vimeo/bosun that referenced this issue Apr 17, 2015

prevent "use of closed network connection" in graphite GET's
I would sometimes see errors like:
graphiteBand: graphite RequestError (http://....): Get failed: Get
http://... : read tcp 10.90.128.100:80: use of closed network connection

This kind of error is not something that should bubble up to the caller
of a http client library, but it does.
see also:
golang/go#8946
golang/go#9424

there's a bunch more issues about the broken state of error handling in
net/http.
So anyway the http client tries to reuse an existing connection which
has broken.  Somehow this is the caller's problem, so we address it by
not keeping any idle connection and opening a new connection each time.
This should get rid of these errors without adding much overhead.

Note that the used http transport is, other than the
MaxIdleConnsPerHost setting, the same as the default transport.
@davisford

This comment has been minimized.

davisford commented Apr 28, 2015

I have a bit of a conundrum. I am writing code that executes payment transactions against various payment processors. It has strict timeout and other requirements, and I need to re-queue failed transactions for specific types of failures. I really need to know how/why a request failed, not just that it failed, and parsing the error.String() seems like a terrible idea.

It is important for me to know that the request timed out, or even that it was a temporary failure that may benefit from an immediate re-try. Reading through this thread, and via some experimentation, I have this to determine temp/timeout condition (seems only slightly better than parsing the full error.String()

func IsTempOrTimeout(err error) (temp bool, timeout bool) {
  if urlerr, ok := err.(*url.Error); ok {
     return checkNetError(urlerr.Err)
  }
  return chenkNetError(err)
}

func checkNetError(err error) (temp bool, timeout bool) {
  if nerr, ok := err.(net.Error); ok {
    return nerr.Temporary(), nerr.Timeout()
  }
  return false, false
}

I'm looking for feedback on the best way to proceed here:

  • Is this reasonable approach to determine if an error condition was either temporary or a timeout?
  • Is it possible I could get a different set of wrapped errors that may embed an error not covered here which could also indicate a temporary failure or timeout condition?
@bradfitz

This comment has been minimized.

Member

bradfitz commented Apr 28, 2015

@davisford, you didn't mention which mechanism(s) you're using to specify timeouts. There are a number of options.

@davisford

This comment has been minimized.

davisford commented Apr 28, 2015

@bradfitz that's a good point. I'm trying to navigate through them all. I'm actually concerned with all possible timeout values I can set. I have a finite time period by which I can retry a transaction, so I'd like to set every timeout. Also all tx are run through TLS.

But here is what I'm basically doing to the client before sending:

type Config struct {
    ConnectTimeout   time.Duration
    ReadWriteTimeout time.Duration
}

func timeoutDialer(config *Config) func(network, add string) (c net.Conn, err error) {
  return func(network, addr string) (net.Conn, error) {
    conn, err := net.DialTimeout(network, addr, config.ConnectTimeout)
    if err != nil {
      return nil, err
    }
    conn.SetDeadline(time.Now().Add(config.ReadWriteTimeout))
    return conn, nil
  }
}
}

client := &http.Client{
  Transport: &http.Transport{
    Proxy: nil,
    Dial: timeoutDialer(config),
  },
}

Question: would this configuration cover TLS handshake timeout or no?

@bradfitz

This comment has been minimized.

Member

bradfitz commented Apr 28, 2015

You want Client.Timeout probably. It is the nuclear option of timeouts, if you don't want something specific. It is well documented. Keeping this bug discussion on-topic, let me know if there's something specific in the https://tip.golang.org/pkg/net/http/ documentation which is not clear.

@vanhalt

This comment has been minimized.

vanhalt commented Jun 5, 2015

Hi guys!

Just wanted to share that this issue is happening to me. I started receiving as error value "EOF".

go version go1.4.2 darwin/amd64

I copied this Google go-github Do function and made little modifications to it like:

func (c *Client) Do(req *http.Request, v interface{}) (*http.Response, error) {
  tr := &http.Transport{
   TLSClientConfig:    &tls.Config{InsecureSkipVerify: true},
   DisableCompression: true,
  }

  client := &http.Client{Transport: tr}

  resp, err := client.Do(req)
  ...

I have even followed the go-github implementation creating a http.Client and assigning it to a property in the Client struct to see if initializing client just once would make any change. But no matter what I do, I keep receiving that value specifically while checking err returned by Do.

The thing is that this code worked, I am working with an API and I test using curl and it works and somehow I just continued adding more code that worked with that Do method and it just started to return EOF (yeah, I now how stupid that sounds). I am new coding in Go and I don't know how to debug or find what is causing this issue. What's weird for me is this:

  resp, err := c.client.Do(req)
  fmt.Println(resp) // has the response: &{201 Created 201 HTTP/1.0 1 0....
  if err != nil {
    fmt.Println("failed do...")
    return nil, err
  }

The response is actually there, so can you please point me out where could be a good place to start researching about this?

Thanks and sorry for this newbie question, but I am blank right now.

@davecheney

This comment has been minimized.

Contributor

davecheney commented Jun 5, 2015

@vanhalt Please take this question to the mailing list.

@vanhalt

This comment has been minimized.

vanhalt commented Jun 5, 2015

Which one specifically?

@davecheney

This comment has been minimized.

Contributor

davecheney commented Jun 5, 2015

https://groups.google.com/forum/#!forum/golang-nuts

On Fri, Jun 5, 2015 at 12:08 PM, Rafa notifications@github.com wrote:

Which one specifically?


Reply to this email directly or view it on GitHub
#9424 (comment).

@vanhalt

This comment has been minimized.

vanhalt commented Jun 5, 2015

Thank you very much!

@raphael

This comment has been minimized.

raphael commented Jun 25, 2015

@bcmills I would be very interested in seeing the code you came up with to decode the possible errors. Is that available somewhere? (I know it was 6 months ago).

@bcmills

This comment has been minimized.

Member

bcmills commented Jun 25, 2015

I don't have it handy, but to sketch roughly:
You can walk the spine of chained errors by using (reflect.Value).FieldByName("Err"). When you get to the root, you can use a switch to check for well-known errors (such as io.EOF) and/or a type-switch to check for well-known error types (such as net.Error or syscall.Errno).

@raphael

This comment has been minimized.

raphael commented Jun 25, 2015

OK thanks, I was interested in seeing all the possible cases but that puts me on the right track. For the record I agree that having good error categorization for an HTTP client library is important. I'm working on a system that makes generic HTTP requests and it needs to understand the nature of the failure and not just "can it be retried or not?" (i.e. was it a connection error, a timeout before receiving headers, after receiving headers, a disconnection etc...). So it seems I need to go deep inside the rabbit hole...

@rsc rsc changed the title from net/http: Errors from *Client methods are poorly documented. to proposal: net/http: document Errors from *Client methods more clearly Jun 17, 2017

@gopherbot gopherbot added the Proposal label Jun 17, 2017

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jan 9, 2018

Removing the Go2 label. Documentation improvements do not have to wait for Go2. Perhaps there is a larger cleanup of the error handling in net/http that could be done in an overhaul of the overall package.

@ianlancetaylor ianlancetaylor removed the Go2 label Jan 9, 2018

@ianlancetaylor ianlancetaylor modified the milestones: Unplanned, Proposal Jan 9, 2018

@rsc

This comment has been minimized.

Contributor

rsc commented Jan 22, 2018

Maybe we can document for Go 1.11 what we want to guarantee and document what we're not guaranteeing either. There have been some CLs in the past few cycles that didn't attach to this issue, for Client and Transport.

@rsc rsc modified the milestones: Proposal, Go1.11 Jan 22, 2018

@rsc rsc changed the title from proposal: net/http: document Errors from *Client methods more clearly to net/http: document Errors from *Client methods more clearly Jan 22, 2018

@gopherbot

This comment has been minimized.

gopherbot commented Jul 24, 2018

Change https://golang.org/cl/125575 mentions this issue: net/http: document that Client methods always return *url.Error

@bradfitz bradfitz modified the milestones: Go1.11, Unplanned Jul 24, 2018

gopherbot pushed a commit that referenced this issue Jul 24, 2018

net/http: document that Client methods always return *url.Error
Updates #9424

Change-Id: If117ba3e7d031f84b30d3a721ef99fe622734de2
Reviewed-on: https://go-review.googlesource.com/125575
Reviewed-by: Ian Lance Taylor <iant@golang.org>

@bradfitz bradfitz changed the title from net/http: document Errors from *Client methods more clearly to net/http: document errors more (*Transport, Client's wrapper errors, etc) Jul 24, 2018

@bradfitz bradfitz changed the title from net/http: document errors more (*Transport, Client's wrapper errors, etc) to net/http: document errors more (*Transport, Client's wrapper errors, etc), how to check canceled, ... Dec 4, 2018

@bradfitz bradfitz added the NeedsFix label Dec 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment