Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http, x/net/http2: client loses the request body when retrying after HTTP2 server close connection gracefully(GOAWAY) #39086

Closed
answer1991 opened this issue May 15, 2020 · 19 comments

Comments

@answer1991
Copy link

@answer1991 answer1991 commented May 15, 2020

What version of Go are you using (go version)?

$ go version
go version go1.13.7 darwin/amd64

Does this issue reproduce with the latest release?

YES, I run the reproduce code in golang docker container with latest image(golang:1.14.2-buster), still occur this bug.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/Joe/Library/Caches/go-build"
GOENV="/Users/Joe/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/Joe/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/Joe/go/src/golang.org/x/net/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/vk/bngnkbgj09v1cn56dsbrvs540000gn/T/go-build201607743=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

  1. Set up a HTTP2 server with a handler which will send GOAWAY frame
  2. Use a HTTP client to request HTTP2 server in multi goroutine in concurrency
  3. HTTP client may got a 500 HTTP response code with error request declared a Content-Length of N but only wrote 0 bytes

Example codes in Go Playground: https://play.golang.org/p/8eiVRYqRyc6

Already known after I debug

  1. HTTP2 RoundTrip received the request which body has no content, after I re-set req.body = req.GetBody() in HTTP2 RoundTrip, the bug will not occur again. codes like:
// RoundTripOpt is like RoundTrip, but takes options.
func (t *Transport) RoundTripOpt(req *http.Request, opt RoundTripOpt) (*http.Response, error) {
	if !(req.URL.Scheme == "https" || (req.URL.Scheme == "http" && t.AllowHTTP)) {
		return nil, errors.New("http2: unsupported scheme")
	}

	if req.GetBody != nil {
		var ierr error
		req.Body, ierr = req.GetBody()
		if ierr != nil {
			return nil, ierr
		}
	}
  1. Based 1., the higher RoundTrip(golang net/http RoundTrip) lost request body when retry, BUT I found the the RoundTrip has re-wind body when re-try:
    // Rewind the body if we're able to.
    if req.GetBody != nil {
    newReq := *req
    var err error
    newReq.Body, err = req.GetBody()
    if err != nil {
    return nil, err
    }
    req = &newReq
    }
    }

What did you expect to see?

HTTP client should not got error after received GOAWAY when re-try the request

What did you see instead?

@answer1991
Copy link
Author

@answer1991 answer1991 commented May 15, 2020

cc @bradfitz

Need your help, I had already debug with many days but did not found the root cause.

@wojtek-t
Copy link

@wojtek-t wojtek-t commented May 15, 2020

/cc

@lavalamp
Copy link

@lavalamp lavalamp commented May 15, 2020

I ran the reproducer locally and reproduced it.

@fraenkel
Copy link
Contributor

@fraenkel fraenkel commented May 16, 2020

I haven't found where the issue is but I do know we are reusing a req.Body on the http2 transport side. What I have determined is that we read the data once and then we reuse the body which has no more data.

@answer1991
Copy link
Author

@answer1991 answer1991 commented May 16, 2020

we are reusing a req.Body on the http2 transport side

I found http2 transport entrypoint get the request which has no data, and base that I think we are reusing a req.Body on http transport but not http2 transport. I test with adding the following codes in http2 transport entrypoint and got panic:

// RoundTripOpt is like RoundTrip, but takes options.
func (t *Transport) RoundTripOpt(req *http.Request, opt RoundTripOpt) (*http.Response, error) {
	if !(req.URL.Scheme == "https" || (req.URL.Scheme == "http" && t.AllowHTTP)) {
		return nil, errors.New("http2: unsupported scheme")
	}

	// check request body has data, panic if it's no data
	if req.Body != nil && req.GetBody != nil {
		body, err := ioutil.ReadAll(req.Body)
		if err != nil {
			return nil, err
		}

		if len(body) == 0 {
			panic("http2 roundtrip get a requst which body has no data")
		}

		var ierr error
		req.Body, ierr = req.GetBody()
		if ierr != nil {
			return nil, ierr
		}
	}
@fraenkel
Copy link
Contributor

@fraenkel fraenkel commented May 17, 2020

I found the issue and I have a solution but I need to better understand the flow.
The failure occurs in the http transport when the altRT.RoundTrip is taken. Later on a second RoundTrip occurs, but a new request & body wasn't created.

@fraenkel
Copy link
Contributor

@fraenkel fraenkel commented May 17, 2020

Took a bit of time but I figured out the problem.
An http2 request is attempted via altRT.RoundTrip. The first attempt writes the body but fails because the connection gets a GOAWAY. The http2 side retries but gets a no cached connection error which bubbles up as a ErrSkipAltProtocol.
The http side takes over and tries again but this time the body is already read.

Given we never reset the body in the existing request, I am not certain how to solve this other than no returning the ErrSkipAltProtocol if we have already written the body.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented May 17, 2020

(There was work in this area recently in ea1437a for #34978 and #35113 but I haven't read this bug or those ones enough to know whether they're related.)

@fraenkel
Copy link
Contributor

@fraenkel fraenkel commented May 17, 2020

@bradfitz It doesn't look like it. Its a corner case that has always been broken.
Retrying on the http2 side and then returning an ErrSkipAltProtocol after a failed attempt seems odd.
I just don't know how to fix the implicit constraint that a req's body is only read once given we are allowing the http side to use a request that has already been processed.
One answer would be to always make a new request if the altRT path is taken since we can't tell what if anything has been touched.

@odeke-em odeke-em changed the title HTTP client lost request body when retry the request after HTTP2 server close connection gracefully(GOAWAY) net/http, x/net/http2: client loses the request body when retrying after HTTP2 server close connection gracefully(GOAWAY) May 17, 2020
@fraenkel
Copy link
Contributor

@fraenkel fraenkel commented May 18, 2020

I do have a fix that returns Get "https://127.0.0.1:45927/put/0": http2: Transport received Server's graceful shutdown GOAWAY rather than the status code: 500, body: request declared a Content-Length of 5 but only wrote 0 bytes. Normally one might see http2: no cached connection was available but that would normally be replaced by the ErrSkipAltProtocol.
The other option would be to always create a new request to allow this request to be reused later if necessary.

@answer1991
Copy link
Author

@answer1991 answer1991 commented May 18, 2020

I misunderstood that http2 request will never take altRT.RoundTrip before.

What about we fix http roundtrip altRT.RoundTrip error process like:

	if altRT := t.alternateRoundTripper(req); altRT != nil {
		if resp, err := altRT.RoundTrip(req); err != ErrSkipAltProtocol {
			return resp, err
		}

		if req.GetBody != nil {
			newReq := *req
			var ierr error
			newReq.Body, ierr = req.GetBody()
			if ierr != nil {
				return nil, ierr
			}
			req = &newReq
		}
	}
@answer1991
Copy link
Author

@answer1991 answer1991 commented May 18, 2020

I do have a fix that returns Get "https://127.0.0.1:45927/put/0": http2: Transport received Server's graceful shutdown GOAWAY rather than the status code: 500, body: request declared a Content-Length of 5 but only wrote 0 bytes.

I don't think return an error is acceptable.

@gopherbot
Copy link

@gopherbot gopherbot commented May 18, 2020

Change https://golang.org/cl/234358 mentions this issue: net/http: rewind the request body if alternate RoundTrip failed.

@answer1991
Copy link
Author

@answer1991 answer1991 commented May 18, 2020

I think I had figured out what happening:

  1. Alternate RoundTrip is configured "https" with golang.org/x/net/http2 by us.
  2. http2 RoundTrip("https" Alternate RoundTrip) has sent request body but received a HTTP2 GOAWAY frame, and then failed re-try to get connection from connection pool. The error is wrapper as ErrSkipAltProtocol.
  3. Then http RoundTrip take over request, and we do not re-set request body before that:

if altRT := t.alternateRoundTripper(req); altRT != nil {
if resp, err := altRT.RoundTrip(req); err != ErrSkipAltProtocol {
return resp, err
}
}

And I think #39086 (comment) would be a good solution.

@cagedmantis
Copy link
Contributor

@cagedmantis cagedmantis commented May 19, 2020

/cc @bcmills

@bcmills
Copy link
Member

@bcmills bcmills commented May 19, 2020

Is this a duplicate of #32441?

@fraenkel
Copy link
Contributor

@fraenkel fraenkel commented May 19, 2020

@bcmills yes it is.

@bcmills
Copy link
Member

@bcmills bcmills commented May 19, 2020

Duplicate of #32441

@bcmills bcmills closed this May 19, 2020
@answer1991
Copy link
Author

@answer1991 answer1991 commented May 20, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants
You can’t perform that action at this time.