Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: errors don't implement Temporary interface #35286

Open
kirillx opened this issue Oct 31, 2019 · 8 comments
Open

x/net/http2: errors don't implement Temporary interface #35286

kirillx opened this issue Oct 31, 2019 · 8 comments

Comments

@kirillx
Copy link

@kirillx kirillx commented Oct 31, 2019

What version of Go are you using (go version)?

1.13

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

Linux

What did you do?

HTTP client application issuing std POST/GET requests.

What is wrong?

Sometimes http client returns the below errors and there is no good way to verify that they are temporary and might be retried:
a) http.http2GoAwayError (e.g. 'http2: server sent GOAWAY and closed the connection; LastStreamID=8329, ErrCode=NO_ERROR, debug="max_age"'
b) http.http2StreamError (e.g. 'stream error: stream ID 309; INTERNAL_ERROR')

After googling one can even find that some detect such errors and retry on them like in cloud.google.com/go/storage/reader.go:
if strings.HasSuffix(err.Error(), "INTERNAL_ERROR") && strings.Contains(reflect.TypeOf(err).String(), "http2") {

So if there is no objections I would make these 2 errors implement Temporary() interface returning true.

@bradfitz bradfitz changed the title http2 errors don't implement Temporary interface x/net/http2: errors don't implement Temporary interface Oct 31, 2019
@bcmills bcmills added this to the Unplanned milestone Nov 5, 2019
@kirillx

This comment has been minimized.

Copy link
Author

@kirillx kirillx commented Jan 22, 2020

@bcmills what kind of help wanted?

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Jan 22, 2020

@kirillx, figure out which of the errors should be considered Temporary and why, and implement the interface accordingly for the ones that are.

(The hard part of the CL to fix it is likely to be the documentation: it will need comments explaining why the errors are, in fact, temporary.)

@kirillx

This comment has been minimized.

Copy link
Author

@kirillx kirillx commented Jan 27, 2020

@bcmills

  1. with http.http2GoAwayError I think justification is trivial, this error happens typically if server decides enough requests per connection serviced or after some timeout or graceful shutdown (https://http2.github.io/http2-spec/#GOAWAY).

  2. with INTERNAL_ERROR it is more complicated to justify as it requires analysis of all 3rd party servers and HTTP2 spec is not 100% clear when it should be used:
    INTERNAL_ERROR (0x2): The endpoint encountered an unexpected internal error.
    e.g. the very same package server may issue INTERNAL_ERROR in clientStream.writeRequestBody() if encodeTrailers() fails, which I believe is not a normal situation.
    I believe spec was meant this error to be used similar to HTTP 500 and real-life experience clearly demonstrates it is retryable (like HTTP 500) and as I mentioned even google SDK does so on the error.

You can also see lots of such reports in github:
#22235
#26338
googleapis/google-cloud-go#987

BTW, why these errors are not exported? They are in x/net/http2 package, but seems like they are made somehow private during bundling to http/h2_bundle.go ... Exporting error types would make it possible checking for it's type at least and codes. Any ideas?

kirillx added a commit to kirillx/net that referenced this issue Jan 27, 2020
... returning true.

Justification from golang/go#35286 (#35286):

1. http.http2GoAwayError (e.g. 'http2: server sent GOAWAY and closed the
connection; LastStreamID=8329, ErrCode=NO_ERROR, debug="max_age"')

This error happens typically if server decides enough requests per connection
serviced or after some timeout or graceful shutdown
(https://http2.github.io/http2-spec/#GOAWAY).

2. http.http2StreamError with INTERNAL_ERROR code (e.g. 'stream error: stream ID 309; INTERNAL_ERROR')

According to HTTP2 SPEC:
INTERNAL_ERROR (0x2): The endpoint encountered an unexpected internal error.

i.e. this error to be used similar to HTTP 500 and real-life experience clearly
demonstrates it is retryable (like HTTP 500).

Google Cloud SDK also implements retries on this error at cloud.google.com/go/storage/reader.go:
if strings.HasSuffix(err.Error(), "INTERNAL_ERROR") && strings.Contains(reflect.TypeOf(err).String(), "http2")

Signed-off-by: Kirill Korotaev <kirillx@gmail.com>
@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Jan 27, 2020

Treating INTERNAL_ERROR as “temporary” seems like too much of a stretch.

With human intervention, nearly any error can be “temporary” — the relevant question from an API perspective, I think, is whether the problem is likely to resolve on its own without intervention, and in the case of a 500 status that seems much less likely.

Retrying a 500 without knowing its cause seems dubious at best — sometimes it will help, and sometimes it will only exacerbate a Query of Death. Given that it is easier for callers to add a retry than to remove one, that doesn't seem like a good place for the library to bias in favor of retrying.

@kirillx

This comment has been minimized.

Copy link
Author

@kirillx kirillx commented Jan 27, 2020

  1. Retrying 500 is an official way of handling it in almost any cloud SDK and REST API, including google and AWS ones:

https://cloud.google.com/storage/docs/exponential-backoff

"Clients should use truncated exponential backoff for all requests to Cloud Storage that return HTTP 5xx and 429 response codes..."

https://docs.aws.amazon.com/general/latest/gr/api-retries.html

If you're not using an AWS SDK, you should retry original requests that receive server (5xx) or throttling errors.

  1. To avoid Query of Death one need to use exponential backoff.

  2. Unfortunately it is not easier for callers to add a retry - error is not exported, so it is not possible to check its error type and .Code field.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Jan 27, 2020

Retrying 500 is an official way of handling it in almost any cloud SDK and REST API

“almost any cloud SDK and REST API” is not the same thing as “the HTTP protocol”. The net/http package implements the latter, not the former.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Jan 27, 2020

Unfortunately it is not easier for callers to add a retry - error is not exported, so it is not possible to check its error type and .Code field.

It is exported from golang.org/x/net/http2, just not from net/http.

However, even that could potentially be remedied by exporting the type (or by transforming it to something exported) from within the net/http package, or by adding a new (backward-compatible) field to the http.Response struct for HTTP2 error codes.

@kirillx

This comment has been minimized.

Copy link
Author

@kirillx kirillx commented Jan 28, 2020

I don't really want to argue about retriable errors. Protocol spec doesn't define it well, besides some basics like Retry-After headers and that streams after GoAway are retriable. Yet, none of the POSIX specs define that EMFILE and ETIMEDOUT are temporary errors, which are treated by Go so.
All the timeouts (BTW including temporary tlsHandshakeTimeoutError) - maybe a sign of server being overloaded, yet they are retriable.

From my POV per HTTP2 SPEC INTERNAL_ERROR is not smth which happens normally if all the parties obey the protocol. Thus this error is kind of temporary.

It is exported from golang.org/x/net/http2, just not from net/http.

Does it still help when all the SDKs are using net/http? Am I missing smth? Would be thankful for details on how it helps with 3rd party packages.

However, even that could potentially be remedied by exporting the type (or by transforming it to something exported) from within the net/http package, or by adding a new (backward-compatible) field to the http.Response struct for HTTP2 error codes.

Should I add

type Http2StreamError = http2StreamError

to export it? I can change the patch this way if you feel it is better suited.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.