Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http: Re-connect with upgraded HTTP2 connection fails to send Request.body #32441

Open
brianfife opened this issue Jun 5, 2019 · 2 comments

Comments

Projects
None yet
3 participants
@brianfife
Copy link

commented Jun 5, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12.5 darwin/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/bfife/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/bfife/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/x7/yphtc2sj4ml2lrflpwdblvrxsmcq9d/T/go-build801671760=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

This is a follow up of issues #25009 #28206

We are sending requests as follows:

proto.transport = &http.Transport{}
proto.client = &http.Client{Transport: proto.transport}
req, err := http.NewRequest(request.Method, url, bytes.NewBuffer(request.Body))
resp, err := proto.client.Do(req)

What did you expect to see?

Good behaviour

With the above client configuration, I was able to determine the following code was sending out the request using the RoundTrip function noted here:

https://github.com/golang/go/blob/release-branch.go1.12/src/net/http/transport.go#L427

	if t.useRegisteredProtocol(req) {
		altProto, _ := t.altProto.Load().(map[string]RoundTripper)
		if altRT := altProto[scheme]; altRT != nil {
			if resp, err := altRT.RoundTrip(req); err != ErrSkipAltProtocol {
				return resp, err
			}
		}
	}

In our case, for the first and following requests t.useRegisteredProtocol(req) returns true, so altRT.RoundTrip(req) tries to send via h2_bundle's RoundTrip. As no connection exists at this point in time, this will fail due to error "http2: no cached connection was available", bubbling up ErrSkipAltProtocol "net/http: skip alternate protocol", and the code continues on. n.b. subsequent requests will be sent via altRT.RoundTrip(req) if there are cached connections available, however a GoAway received from the server will trigger the same behaviour of advancing to the next section of code.

https://github.com/golang/go/blob/release-branch.go1.12/src/net/http/transport.go#L447

	for {
		select {
		case <-ctx.Done():
			req.closeBody()
			return nil, ctx.Err()
		default:
		}

		// treq gets modified by roundTrip, so we need to recreate for each retry.
		treq := &transportRequest{Request: req, trace: trace}
		cm, err := t.connectMethodForRequest(treq)
		if err != nil {
			req.closeBody()
			return nil, err
		}

		// Get the cached or newly-created connection to either the
		// host (for http or https), the http proxy, or the http proxy
		// pre-CONNECTed to https server. In any case, we'll be ready
		// to send it requests.
		pconn, err := t.getConn(treq, cm)
		if err != nil {
			t.setReqCanceler(req, nil)
			req.closeBody()
			return nil, err
		}

		var resp *Response
		if pconn.alt != nil {
			// HTTP/2 path.
			t.decHostConnCount(cm.key()) // don't count cached http2 conns toward conns per host
			t.setReqCanceler(req, nil)   // not cancelable with CancelRequest
			resp, err = pconn.alt.RoundTrip(req)
		} else {
			resp, err = pconn.roundTrip(treq)
		}
		if err == nil {
			return resp, nil
		}
...

In the above for loop, for the first request without any connections available, a new transport connection is created, and pconn.alt.RoundTrip(req) or pconn.RoundTrip(req) is then used to submit and fulfill the Request, and this likely completes successfully. So far, so good.

As noted above, subsequent requests from our client continue to be sent from altRT.RoundTrip(req), at which point they are returned, without reaching the for loop.

Bad behaviour

After some time, our load balancer may respond with a GoAway response after a full request is transmitted, including data from the Request.Body, as seen with the following debug:

2019/06/04 02:22:26.750150 http2: Framer 0xc000470a80: wrote HEADERS flags=END_HEADERS stream=17 len=12
2019/06/04 02:22:26.750321 http2: Framer 0xc000470a80: wrote DATA stream=17 len=125 data="STRINGOFDATA"
2019/06/04 02:22:26.750386 http2: Framer 0xc000470a80: wrote DATA flags=END_STREAM stream=17 len=0 data=""
2019/06/04 02:22:26.753144 http2: Framer 0xc000470a80: read GOAWAY len=8 LastStreamID=15 ErrCode=NO_ERROR Debug=""
2019/06/04 02:22:26.753217 http2: Transport readFrame error on conn 0xc0001a3e00: (*errors.errorString) EOF
2019/06/04 02:22:26.753291 http2: Transport failed to get client conn for server:443: http2: no cached connection was available

This results in another transport being created in the for loop shown above, however the Request.Body is not being reset before the request is sent on the newly created transport connection.

The rewind code at https://github.com/golang/go/blob/release-branch.go1.12/src/net/http/transport.go#L497 doesn't get executed if pconn.alt.RoundTrip(req) or pconn.RoundTrip(req) are successful in sending a request after a connection is re-established. From my limited testing, I think the only time the rewind is reached is when TLSNextProto is manually set as it is in TestTransportBodyAltRewind, because this skips setting up the altProto map in transport.go/onceSetNextProtoDefaults(), as altRT is nil.

It looks like moving the rewind code from L497 to L455 would satisfy the case where an existing connection receives a GoAway frame and the Body had been sent to the server (as described above), however it would also execute in the case of the initial request where a transport does not exist and the request hasn't been sent (which makes that execution not necessary).

I believe this test case demonstrates the issue described above:

func TestTransportBodyGoAwayReceived(t *testing.T) {
        var err error
        var resp *Response
        req, _ := NewRequest("POST", "https://example.org/", bytes.NewBufferString("request"))
        tr := &Transport{DisableKeepAlives: true }
        tr.RegisterProtocol("https", httpsProto{})
        c := &Client{Transport: tr}
        resp, err = c.Do(req)
        if err != nil {
                t.Error(err)
        }
        if resp != nil {
                defer resp.Body.Close()
                ioutil.ReadAll(resp.Body)
        }
}

type httpsProto struct{}

func (httpsProto) RoundTrip(req *Request) (*Response, error) {
        defer req.Body.Close()
        ioutil.ReadAll(req.Body)
        return nil, ErrSkipAltProtocol
}

output

../bin/go test -v -run=TestTransportBodyGoAwayReceived net/http
=== RUN   TestTransportBodyGoAway
2019/06/04 21:36:24 Error enabling Transport HTTP/2 support: protocol https already registered
--- FAIL: TestTransportBodyGoAway (0.42s)
    transport_internal_test.go:274: Post https://example.org/: http: ContentLength=7 with Body length 0
FAIL
FAIL	net/http	0.495s

Moving the rewind code to the top of the for loop as described above results in success:

./bin/go test -v -run=TestTransportBodyGoAwayReceived net/http
=== RUN   TestTransportBodyGoAway
2019/06/04 21:35:54 Error enabling Transport HTTP/2 support: protocol https already registered
--- PASS: TestTransportBodyGoAway (0.56s)
PASS
ok  	net/http	0.642s

@gopherbot gopherbot added this to the Unreleased milestone Jun 5, 2019

@brianfife

This comment has been minimized.

Copy link
Author

commented Jun 10, 2019

As I am not in a position to sign the CLA and submit a fix, it would be good to get some maintainers to have a look at this issue.

@dmitshur

This comment has been minimized.

Copy link
Member

commented Jun 10, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.