Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: Transport connection closes sometimes after Request.Cancel, against nginx? #16514

Closed
rayvbr opened this issue Jul 27, 2016 · 18 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@rayvbr
Copy link

rayvbr commented Jul 27, 2016

  1. What version of Go are you using (go version)?
  2. What operating system and processor architecture are you using (go env)?
    go1.6.2 darwin/amd64
    GIT Head (27-07-2016) of github.com/golang/net/http2
  3. What did you do?
transport := &http2.Transport{
        DisableCompression: true,
        AllowHTTP:          true,
        DialTLS: func(network, addr string, cfg *tls.Config) (net.Conn, error) {
            return net.Dial(network, addr)
        },
    }
httpClient := http.Client{Transport: transport}

httpRequest, _ := http.NewRequest("GET", url, nil)
cancel := make(chan struct{})
httpRequest.Cancel = cancel
response, httpError := httpClient.Do(httpRequest)

Then, in another goroutine, before Do(httpRequest) has returned:

cancel.closeChannel()

  1. What did you expect to see?
    httpError returning 'net/http: request canceled', but TCP session remaining open
  2. What did you see instead?
    httpError returning 'net/http: request canceled', but client closing TCP connection
@bradfitz
Copy link
Contributor

Did you hit submit before writing the bug report?

@bradfitz bradfitz added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jul 27, 2016
@rayvbr
Copy link
Author

rayvbr commented Jul 27, 2016

Yes, accidentally pressed ENTER while the textbox wasn't selected. Should be fixed now.

Anyway, I did some research and it seems this has to do with a errRequestCanceled error being returned from the HTTP/2 transport.go, which HTTPClient.Do interprets as a RoundTrip Failure, which makes it close the TCP connection. This is especially problematic when sending a significant number (e.g. >20) of requests at roughly the same time, which will result in response headers for some of them to take some to return (e.g. >100ms)

@bradfitz bradfitz self-assigned this Jul 27, 2016
@bradfitz bradfitz added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jul 27, 2016
@bradfitz
Copy link
Contributor

Thanks. I'll take a look.

@rayvbr
Copy link
Author

rayvbr commented Jul 27, 2016

In Nginx, this is reported as:

2016/07/27 14:01:24 [info] 5#5: *3 client canceled stream 7 while sending response to client, client: 139.63.xxx.xxx, server: localhost, request: "GET /requested_file.example HTTP/2.0", host: "REMOVED"
2016/07/27 14:01:24 [info] 5#5: *3 client prematurely closed connection while sending response to client, client: 139.63.xxx.xxx, server: localhost, request: "GET /requested_file.example HTTP/2.0", host: "REMOVED"

@bradfitz bradfitz added this to the Go1.7Maybe milestone Jul 28, 2016
@bradfitz
Copy link
Contributor

I can't reproduce. Do you have a stand-alone test or at least repro instructions? I don't want to configure nginx, unless you have a Docker container or something I can use pre-configured.

But I also just noticed you set "AllowHTTP: true" and your DialTLS isn't dialing TLS, so it seems you're doing something non-standard and less supported. You're at least doing something which doesn't affect the standard library, which doesn't use either of those things.

@bradfitz bradfitz modified the milestones: Unreleased, Go1.7Maybe Jul 28, 2016
@bradfitz bradfitz added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jul 28, 2016
@rayvbr
Copy link
Author

rayvbr commented Jul 28, 2016

I use the standard nginx docker container and run it as follows:

docker run -v [FOLDER_WITH_WEB_ROOT]:/usr/share/nginx/html -v default.conf:/etc/nginx/conf.d/default.conf -p 80:80 nginx

With default.conf being the following:

error_log /dev/stderr debug;
server {
   listen       80 http2;
   server_name  localhost;
   http2_max_concurrent_streams 512;

   add_header Accept-Ranges bytes;

   location / {
      root   /usr/share/nginx/html;
      index  index.html index.htm;
   }

Then I have 20 goroutines each firing a request via the same HTTPClient (see method above). Immediately after sending the Do, I cancel some of them in another goroutine (before Do has returned). Note that in order to have this work, the RTT to the server should be sufficiently long (so that Do hasn't returned yet). If you want, I can provide an AWS hosted Nginx instance for you to experiment with.

But I also just noticed you set "AllowHTTP: true" and your DialTLS isn't dialing TLS, so it seems you're doing something non-standard and less supported. You're at least doing something which doesn't affect the standard library, which doesn't use either of those things.

I'm not sure I understand. I use AllowHTTP and DialTLS as indicated in the docs at https://godoc.org/golang.org/x/net/http2. The reason I configured them as they are is to disable TLS and use TCP. Are you saying this shouldn't work?

I did some more debugging myself and ended up in 'func (t _Transport) RoundTripOpt(req *http.Request, opt RoundTripOpt) (_http.Response, error)' in transport.go, specifically line 314-316. Given that cc.RoundTrip(req) returns errRequestCanceled because of the request being cancelled, this is interpreted as a Roundtrip failure and seems to stop the TCP connection. Note that if the request is cancelled after Do has returned, this is not the case and everything works fine.

@bradfitz
Copy link
Contributor

I use AllowHTTP and DialTLS as indicated in the docs at https://godoc.org/golang.org/x/net/http2. The reason I configured them as they are is to disable TLS and use TCP. Are you saying this shouldn't work?

They're just lesser-used and thus lesser-tested. It's new code. Those code paths aren't used by the standard library in Go (via net/http) for instance. But it should work. I would just prioritize fixing it lower than something affecting the imminent Go 1.7 release, since x/net can be fixed after, but things included in Go 1.7 cannot easily.

returns errRequestCanceled because of the request being cancelled, this is interpreted as a Roundtrip failure and seems to stop the TCP connection.

Yes, that's what RoundTrip shuld return. It is a failure to RoundTrip when there's a timeout.

The question is why it's closing the TCP connection. I suspect something's returning a ConnectionError(ErrCodeProtocol) or something. Can you run with GODEBUG=http2debug=2 (or http2debug=1 is probably sufficient) and see if the Go side tells you more?

@bradfitz bradfitz added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jul 30, 2016
@rayvbr
Copy link
Author

rayvbr commented Aug 3, 2016

Can you run with GODEBUG=http2debug=2 (or http2debug=1 is probably sufficient) and see if the Go side tells you more?

The following is the result of http2debug=2

2016/08/03 11:22:25 http2: Transport encoding header ":authority" = "test.tnomedialab.io"
2016/08/03 11:22:25 http2: Transport encoding header ":method" = "GET"
2016/08/03 11:22:25 http2: Transport encoding header ":path" = "tile.pasf"
2016/08/03 11:22:25 http2: Transport encoding header ":scheme" = "https"
2016/08/03 11:22:25 http2: Transport encoding header "range" = "bytes=197184-232501"
2016/08/03 11:22:25 http2: Transport encoding header "user-agent" = "Go-http-client/2.0"
2016/08/03 11:22:25 http2: Framer 0xc820a269c0: wrote HEADERS flags=END_STREAM|END_HEADERS stream=5 len=59
2016/08/03 11:22:25 http2: Framer 0xc820a269c0: wrote RST_STREAM stream=5 len=4 ErrCode=CANCEL
2016/08/03 11:22:25 RoundTrip failure: net/http: request canceled

The RoundTrip failure correlates with the TCP session being closed. If I cancel the request after Do() has returned, the TCP connection isn't closed and I don't see the RoundTrip failure.

I also checked with Wireshark and it is definitely the Go client that setting the RST flag on the TCP connection, not the server. I tested with both Nginx as well as h2oserver.

gopherbot pushed a commit to golang/net that referenced this issue Aug 3, 2016
For debugging golang/go#16514

Change-Id: I8aa5706eef4e9b4104cab391172b8919601ebf3a
Reviewed-on: https://go-review.googlesource.com/25440
Reviewed-by: Chris Broadfoot <cbro@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@bradfitz
Copy link
Contributor

bradfitz commented Aug 3, 2016

@rayvbr, please go get -u golang.org/x/net/http2 and try again with GODEBUG=http2debug=2. I added more logging about why connections close.

@rayvbr
Copy link
Author

rayvbr commented Aug 4, 2016

@bradfitz, thanks!

Unfortunately, it didn't result in any other debugging lines than I had before.

I created the following stand-alone test script that you can use to reproduce the issue. If you want, I can also provide an Amazon instance with nginx installed that you can test against. Alternatively, you can use the standard Nginx docker container with the conf file I placed a few comments above.

package main

import (
    "crypto/tls"
    "io/ioutil"
    "log"
    "net"
    "net/http"
    "time"

    "github.com/golang/net/http2"
)

const cHTTPRequestURL = "http://DOMAIN/test.example"
const cNumberOfRequests = 20

func main() {
    http2.VerboseLogs = true
    transport := &http2.Transport{
        DisableCompression: true,
        AllowHTTP:          true,
        DialTLS: func(network, addr string, cfg *tls.Config) (net.Conn, error) {
            return net.Dial(network, addr)
        },
    }
    httpClient := http.Client{Transport: transport}

    count := 0

    requests := make([]chan struct{}, cNumberOfRequests)
    for i := 0; i < cNumberOfRequests; i++ {
        httpRequest, _ := http.NewRequest("GET", cHTTPRequestURL, nil)
        requests[i] = make(chan struct{})
        httpRequest.Cancel = requests[i]
        go httpDo(httpClient, httpRequest, count)
        count++
    }
    for i := 0; i < cNumberOfRequests; i++ {
        close(requests[i])
    }
    time.Sleep(time.Duration(10) * time.Second)
}

func httpDo(client http.Client, httpRequest *http.Request, count int) {
    log.Printf("Requesting...%d\n", count)
    response, httpError := client.Do(httpRequest)
    if httpError != nil {
        log.Printf("HTTP Error %d: %s\n", count, httpError)
        return
    }
    defer response.Body.Close()

    var ioErr error
    _, ioErr = ioutil.ReadAll(response.Body)

    if ioErr != nil {
        log.Printf("Body read error %d: %s\n", count, ioErr)
        return
    }
    log.Printf("Done %d\n", count)
    return
}

@bradfitz
Copy link
Contributor

bradfitz commented Aug 4, 2016

I ran the nginx docker container with your config and tried your repro program from different machines at differing network latencies away from my server.

I cannot reproduce.

Are you sure you're running the latest code? Please show me all the logs.

I note also you're importing from github, not golang.org/x/net/http2. That's a red flag. Please use the golang.org/x/net/http2 code. Maybe you updated the wrong one and are using the old code?

@bradfitz bradfitz added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Aug 4, 2016
@bradfitz bradfitz changed the title x/net/http2: RST_STREAM before receiving response headers result in closing tcp connection x/net/http2: Transport connection closes sometimes after Request.Cancel, against nginx? Aug 4, 2016
@rayvbr
Copy link
Author

rayvbr commented Aug 5, 2016

I did some new tests with golang.org/x/net/http2, but this doesn't seem to change anything.

Log of the client side: http://pastebin.com/WeEQs37L
Log of the server side: http://pastebin.com/7qbpz7Lm

Edit: updated logs with higher debugging level

@rayvbr
Copy link
Author

rayvbr commented Aug 14, 2016

The problem doesn't seem to be caused by not using TLS. I've just re-tested with the following code, and an 'https://' URL, but keep having the same issues.

package main

import (
    "io/ioutil"
    "log"
    "net/http"
    "time"

    "golang.org/x/net/http2"
)

const cHTTPRequestURL = "https://EXAMPLE.com/test.pasf"
const cNumberOfRequests = 20

func main() {
    http2.VerboseLogs = true
    transport := &http2.Transport{}
    httpClient := http.Client{Transport: transport}

    count := 0

    requests := make([]chan struct{}, cNumberOfRequests)
    for i := 0; i < cNumberOfRequests; i++ {
        httpRequest, _ := http.NewRequest("GET", cHTTPRequestURL, nil)
        requests[i] = make(chan struct{})
        httpRequest.Cancel = requests[i]
        go httpDo(httpClient, httpRequest, count)
        count++
    }
    for i := 0; i < cNumberOfRequests; i++ {
        close(requests[i])
    }
    time.Sleep(time.Duration(10) * time.Second)
}

func httpDo(client http.Client, httpRequest *http.Request, count int) {
    log.Printf("Requesting...%d\n", count)
    response, httpError := client.Do(httpRequest)
    if httpError != nil {
        log.Printf("HTTP Error %d: %s\n", count, httpError)
        return
    }
    defer response.Body.Close()

    var ioErr error
    _, ioErr = ioutil.ReadAll(response.Body)

    if ioErr != nil {
        log.Printf("Body read error %d: %s\n", count, ioErr)
        return
    }
    log.Printf("Done %d\n", count)
    return
}

Client log: http://pastebin.com/tJSGLfMp
Server log: http://pastebin.com/7ivBC5TM

Let me know if I can help by providing a server for you to test against.

@rayvbr
Copy link
Author

rayvbr commented Aug 14, 2016

Also note that I just tested with go1.7rc6 and the issue is still there

@bradfitz
Copy link
Contributor

Can you try this again at tip? (of either Go or x/net/http2)

If this is around cancelation, it's quite likely this was fixed by the fix to #16974

@matipan
Copy link

matipan commented Apr 22, 2017

Hi! @rayvbr have you tried this again??

@rayvbr
Copy link
Author

rayvbr commented Apr 24, 2017

Sorry, forgot to respond. The issues seems to be resolved for the regular H2 case.

That said, I'm currently experimenting with HTTP/2 Push client support, and there I'm sometimes running into the same problem again (or a different problem with the same symptoms).

c3mb0 pushed a commit to c3mb0/net that referenced this issue Apr 2, 2018
For debugging golang/go#16514

Change-Id: I8aa5706eef4e9b4104cab391172b8919601ebf3a
Reviewed-on: https://go-review.googlesource.com/25440
Reviewed-by: Chris Broadfoot <cbro@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@bradfitz
Copy link
Contributor

Timeout. Open a new bug if there are new problems.

@golang golang locked and limited conversation to collaborators Apr 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

4 participants