Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http: PROTOCOL_ERROR with HTTP2 #29125

Open
siscia opened this issue Dec 6, 2018 · 17 comments
Open

x/net/http: PROTOCOL_ERROR with HTTP2 #29125

siscia opened this issue Dec 6, 2018 · 17 comments
Milestone

Comments

@siscia
Copy link

@siscia siscia commented Dec 6, 2018

What version of Go are you using (go version)?

$ go version
go version go1.11.2 linux/amd64

Does this issue reproduce with the latest release?

Yes, which I believe is the one I am running, hence 1.11.2

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/simo/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/simo/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build210197689=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I am downloading tar files from the docker hub, uncompress them using the standard gzip, and passing the result to another software using STDIN.

If is necessary I will try to reproduce this using something simpler like sha256 sum in linux.

What did you expect to see?

I would not expect a PROTOCOL_ERROR

What did you see instead?

A PROTOCOL_ERROR

Below the relevant log using GODEBUG=http2debug=2


INFO[0177] Deleting temporary directory                  action=ingesting target=/tmp/tempCatalog140887158
INFO[0177] Created subcatalog in directory               directory=.layers/bf
INFO[0177] [cvmfs_server ingest --catalog -t - -b .layers/bf/bfb904e99f247ef4b354eab1285ec78bc5bb929dbf134ff69c156b90711216f6/layerfs portals.cern.ch]  action=executing
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=21 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=21 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=23 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=23 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=19 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=19 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=27 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=27 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=13 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=13 len=4 ErrCode=PROTOCOL_ERROR
INFO[0180] Process alive
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192

   ... this line repeat for exactly another 450 times, I cut it...

2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=18886
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16393
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=17397
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=12782
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=14630
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8197
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8201
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
ERRO[0180] Error in copying the input into STDIN.        error="stream error: stream ID 13; PROTOCOL_ERROR"
INFO[0210] Process alive
2018/12/06 18:54:36 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=17 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:54:36 http2: Transport received RST_STREAM stream=17 len=4 ErrCode=PROTOCOL_ERROR
INFO[0240] Process alive
INFO[0270] Process alive
INFO[0300] Process alive
2018/12/06 18:56:06 http2: Transport closing idle conn 0xc00047ddc0 (forSingleUse=false, maxStream=27)
2018/12/06 18:56:06 http2: Transport readFrame error on conn 0xc00047ddc0: (*net.OpError) read tcp 128.141.207.159:35866->104.18.123.25:443: use of closed network connection

Maybe is related to #26321 but I don't know how to tell.

@siscia

This comment has been minimized.

Copy link
Author

@siscia siscia commented Dec 6, 2018

Let me add that the problem arises seldom, so some downloads go just fine, while other present this problem.

The overall workflow is to get the tar.gz from the docker hub, pass the byte stream to "compress/gzip" (so something like gzip.NewReader(resp.Body) ) and finally pass the reader created by gzip to the STDIN of another software (using "os.exec" package).

Moreover, if instead of passing the result of the uncompression to the STDIN of another software I just save it on file I cannot reproduce the problem.

@agnivade agnivade changed the title PROTOCOL_ERROR with HTT2 x/net/http: PROTOCOL_ERROR with HTTP2 Dec 7, 2018
@gopherbot gopherbot added this to the Unreleased milestone Dec 7, 2018
@agnivade

This comment has been minimized.

Copy link
Contributor

@agnivade agnivade commented Dec 7, 2018

Hi @siscia - can you post an independent code sample that we can use to reproduce this issue ?

@siscia

This comment has been minimized.

Copy link
Author

@siscia siscia commented Dec 7, 2018

Hi,

I was not able to re-produce without using cvmfs_server, the custom software that read from STDIN.

Maybe is a problem from our side but I don't know what it could be...

Should I make a small code sample that use cvmfs_server and reproduce the problem or I should just close the issue?

@agnivade

This comment has been minimized.

Copy link
Contributor

@agnivade agnivade commented Dec 7, 2018

/cc @bradfitz

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Dec 13, 2018

@fraenkel, so you see anything wrong here?

@siscia, which HTTP/2 server is cvmfs_server using?

@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented Dec 13, 2018

So I am trying to find all the places we just report a stream error without any cause. There aren't many.
I did notice when we receive a WindowUpdateFrame and the increment is 0.
The other is parseHeadersFrame where we go negative.

Adding a cause to all places that return an error would at least identify where its coming from.

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Dec 13, 2018

@fraenkel I thought this was some mystery server sending us a PROTOCOL_ERROR because it didn't like our Transport's HTTP/2 behavior.

@siscia

This comment has been minimized.

Copy link
Author

@siscia siscia commented Dec 13, 2018

@bradfitz I see them against the docker registries, in particular I can see them against the standard registries hosted by docker hub itself.

I believe they are hosted on AWS.

Just to clarify, the Docker Registries servers are sending data, a go program is piping those data into the STDIN of another program, the cvmfs_server.

@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented Dec 14, 2018

@bradfitz Sorry I missed the RST_STREAM at the top. I find it odd that all we see are just WINDOW_UPDATE. The fact that "all" or many streams hit this at the same time could mean that we exceeded the connection window? I am guessing only because we are missing information.
What might be useful is to compare a successful run vs an unsuccessful run and see where the messages differ. It might just be in the exchange of WINDOW_UPDATEs but its just a guess.

@siscia

This comment has been minimized.

Copy link
Author

@siscia siscia commented Dec 22, 2018

Do you guys need any more info from my side?

@harshavardhana

This comment has been minimized.

Copy link
Contributor

@harshavardhana harshavardhana commented May 6, 2019

It may be same as #31534 ?

@siscia

This comment has been minimized.

Copy link
Author

@siscia siscia commented May 7, 2019

It may it may not, I worked around the issue and I can't recal the problem!

Feel free to close though!

@harshavardhana

This comment has been minimized.

Copy link
Contributor

@harshavardhana harshavardhana commented May 7, 2019

How did your workaround, if I may ask @siscia ?

@siscia

This comment has been minimized.

Copy link
Author

@siscia siscia commented May 7, 2019

I don't remember honestly, two options:

  1. Just download the file/layers using directly docker instead of an HTTP calls
  2. Swallow the error and just re-trying

I don't remember which option, but nothing technically interesting I am afraid :(

mvo5 added a commit to mvo5/snappy that referenced this issue Aug 12, 2019
We see some test failures (and probably real-world failures as
well) in the downloads with something like:

```
+ snap install test-snapd-content-circular1
error: cannot perform the following tasks:
- Download snap "test-snapd-content-circular2" (2) from channel "stable" (stream error: stream ID 1; PROTOCOL_ERROR)
```

Looking at the logs it seems that we get the protocol error from
fastly and that it is sent to us from the remote side:
```
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Transport failed to get client conn for fastly.cdn.snapcraft.io:443: http2: no cached connection was available
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Transport creating client conn 0xc4200baa80 to 151.101.250.217:443
...
c4204fbce0: wrote SETTINGS len=18, settings: ENABLE_PUSH=0, INITIAL_WINDOW_SIZE=4194304, MAX_HEADER_LIST_SIZE=10485760
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: wrote WINDOW_UPDATE len=4 (conn) incr=1073741824
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: wrote HEADERS flags=END_STREAM|END_HEADERS stream=1 len=869
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read SETTINGS len=6, settings: MAX_CONCURRENT_STREAMS=100
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: wrote SETTINGS flags=ACK len=0
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read WINDOW_UPDATE len=4 (conn) incr=16711681
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read SETTINGS flags=ACK len=0
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read HEADERS flags=END_HEADERS stream=1 len=378
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read RST_STREAM stream=1 len=4 ErrCode=PROTOCOL_ERROR
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: wrote RST_STREAM stream=1 len=4 ErrCode=CANCEL
...
```
There is a bugreport upstream that looks a bit like our issue:
golang/go#29125

and there retrying seemed to have solved the issue. We should
dig a bit deeper and see if we can build a small reproducer
using fastly so that we can figure out if go2 is doing something
that fastly is not expecting or if fastly is the issue. But in
the meantime this hopefully unblocks us.
@stan-sack

This comment has been minimized.

Copy link

@stan-sack stan-sack commented Dec 9, 2019

I am seeing something similar to this. On the client side I see stream error: stream ID XXXXXX; PROTOCOL_ERROR and on the server I see stream error: stream ID XXXXXX; NO_ERROR when I try to read the request body with bodyBytes, err := ioutil.ReadAll(r.Body).

I've been able to get around this by using httpClient.Transport.RoundTrip(req) rather than httpClient.Do(req). This seems to reduce occurrences of the issue to basically zero but it also breaks all the timeouts I have set on the client and transport.

FYI I'm using a non tls http2 client:

	httpClient := &http.Client{
		Transport: &http2.Transport{
			AllowHTTP: true,
			DialTLS: func(netw, addr string, cfg *tls.Config) (net.Conn, error) {
				return dialer.Dial(netw, addr)
			},
		},
		Timeout: time.Duration(time.Second * 5),
	}

I'm unable to provide http2 logs because I only see the issue at scale.

@philippgille

This comment has been minimized.

Copy link

@philippgille philippgille commented Jan 18, 2020

Just in case anyone else finds this GitHub issue after making the same mistake as me:

Using http.NewRequest("get", url, nil) instead of http.NewRequest("GET", url, nil) leads to the exact same error message:

stream error: stream ID 1; PROTOCOL_ERROR

I would love to have amore descriptive error message. Shouldn't it be possible to check against a list of valid (according to specification) HTTP verbs and then say the passed one is not one of them?

Anyway, I hope this helps someone.

@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented Jan 19, 2020

Using http.NewRequest("get", url, nil) instead of http.NewRequest("GET", url, nil) leads to the exact same error message:

There is http.MethodGet so you don't have to think about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants
You can’t perform that action at this time.