Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: Server rejects Transport with ErrCodeProtocol stream error #47635

Open
bradfitz opened this issue Aug 11, 2021 · 4 comments
Open

x/net/http2: Server rejects Transport with ErrCodeProtocol stream error #47635

bradfitz opened this issue Aug 11, 2021 · 4 comments

Comments

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Aug 11, 2021

We're running a Go 1.17rc2 httputil.ReverseProxy in front of a Go go1.17rc2 net/http.Server.

We were seeing weird ErrCodeProtocol stream errors, so we stopped using the bundled x/net/http2 and switched both to using x/net/http2 explicitly (with http2.ConfigureTransports and http2.ConfigureServer, so the bundled one isn't used).

Our x/net git version is golang/net@aaa1db6, which is today's current master.

The logs we were getting weren't initially enough to tell us where the problem was:

2021/08/10 14:07:20 http: proxy error: stream error: stream ID 1293155; PROTOCOL_ERROR
2021/08/10 14:07:20 http: proxy error: stream error: stream ID 1293157; PROTOCOL_ERROR

But after hacking it up locally to add some expvar counters on which paths in http2.Server were returning the protocol error, we found it was here:

https://github.com/golang/net/blob/aaa1db679c0d7765d2b1cb1f92cac8ebf4d94c53/http2/server.go#L1835

// http://tools.ietf.org/html/rfc7540#section-5.1.2
// [...] Endpoints MUST NOT exceed the limit set by their peer. An
// endpoint that receives a HEADERS frame that causes their
// advertised concurrent stream limit to be exceeded MUST treat
// this as a stream error (Section 5.4.2) of type PROTOCOL_ERROR
// or REFUSED_STREAM.
if sc.curClientStreams+1 > sc.advMaxStreams {
        if sc.unackedSettings == 0 {
                // They should know better.
                return streamError(id, ErrCodeProtocol)
        }

So, the Go http2 client (Transport) and/or server is getting its concurrent/max streams accounting wrong.

Perhaps interesting: the backend (the http2.Server that the ReverseProxy's Transport is hitting) has extremely long-running http.Handlers (like: days or weeks longs) alongside many short ones. That might explain why this bug hasn't been reported by more people.

I see no recent commits that suggest this is a regression. This server is new, so I also can't say whether it worked previously.

/cc @neild @bcmills @fraenkel @dmitshur @dsnet @josharian

@fraenkel
Copy link
Contributor

@fraenkel fraenkel commented Aug 11, 2021

possible? #42777

@bradfitz
Copy link
Contributor Author

@bradfitz bradfitz commented Aug 11, 2021

Ah, yes. That's probably what we're hitting.

@dmitshur
Copy link
Contributor

@dmitshur dmitshur commented Aug 13, 2021

CC @neild.

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 3, 2021

Change https://golang.org/cl/347033 mentions this issue: http2: make Transport not reuse conns after a stream protocol error

gopherbot pushed a commit to golang/net that referenced this issue Sep 3, 2021
If a server sends a stream error of type "protocol error" to a client,
that's the server saying "you're speaking http2 wrong". At that point,
regardless of whether we're in the right or not (that is, regardless of
whether the Transport is bug-free), clearly there's some confusion and
one of the two parties is either wrong or confused. There's no point
pushing on and trying to use the connection and potentially exacerbating
the confusion (as we saw in golang/go#47635).

Instead, make the client "poison" the connection by setting a new "do
not reuse" bit on it. Existing streams can finish up but new requests
won't pick that connection.

Also, make those requests as retryable without the caller getting an
error.

Given that golang/go#42777 existed, there are HTTP/2 servers in the
wild that incorrectly set RST_STREAM PROTOCOL_ERROR codes. But even
once those go away, this is still a reasonable fix for preventing
a broken connection from being stuck in the connection pool that fails
all future requests if a similar bug happens in another HTTP/2 server.

Updates golang/go#47635

Change-Id: I3f89ecd1d3710e49f7219ccb846e016eb269515b
Reviewed-on: https://go-review.googlesource.com/c/net/+/347033
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants