Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: TestTransportMaxConnsPerHost flakes #31784

Closed
bradfitz opened this issue May 1, 2019 · 14 comments
Closed

net/http: TestTransportMaxConnsPerHost flakes #31784

bradfitz opened this issue May 1, 2019 · 14 comments
Labels
Milestone

Comments

@bradfitz
Copy link
Member

@bradfitz bradfitz commented May 1, 2019

I've seen this a few times now. @fraenkel, could you take a look?

https://storage.googleapis.com/go-build-log/b6c795d5/android-amd64-emu_95c3cb1f.log

--- FAIL: TestTransportMaxConnsPerHost (0.05s)
    transport_test.go:638: request failed: Get https://127.0.0.1:34180: write tcp 127.0.0.1:37680->127.0.0.1:34180: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45068: read tcp 127.0.0.1:36221->127.0.0.1:45068: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45070: read tcp 127.0.0.1:36221->127.0.0.1:45070: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45074: read tcp 127.0.0.1:36221->127.0.0.1:45074: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45066: read tcp 127.0.0.1:36221->127.0.0.1:45066: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45072: read tcp 127.0.0.1:36221->127.0.0.1:45072: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45069: read tcp 127.0.0.1:36221->127.0.0.1:45069: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45071: read tcp 127.0.0.1:36221->127.0.0.1:45071: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45073: read tcp 127.0.0.1:36221->127.0.0.1:45073: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45067: write tcp 127.0.0.1:36221->127.0.0.1:45067: use of closed network connection
2019/05/01 16:13:25 http: TLS handshake error from 127.0.0.1:45080: write tcp 127.0.0.1:36221->127.0.0.1:45080: use of closed network connection
FAIL
exitcode=1go_android_exec: adb exec-out rm -rf /data/local/tmp/go_android_exec/http.test-20892
@bradfitz bradfitz added this to the Go1.13 milestone May 1, 2019
@bradfitz

This comment has been minimized.

Copy link
Member Author

@bradfitz bradfitz commented May 1, 2019

Oh, and ignore the http: TLS handshake error from spam. That's an unrelated issue that needs fixing. (Tests shouldn't spam)

The concerning part here is only:

--- FAIL: TestTransportMaxConnsPerHost (0.05s)
    transport_test.go:638: request failed: Get https://127.0.0.1:34180: write tcp 127.0.0.1:37680->127.0.0.1:34180: use of closed network connection
@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented May 2, 2019

on the plus side, I can reproduce it

$ ~/workspace/go/bin/go test -run TestTransportMaxConnsPerHost -count=100
--- FAIL: TestTransportMaxConnsPerHost (0.01s)
    transport_test.go:638: request failed: Get https://127.0.0.1:39561: write tcp 127.0.0.1:36518->127.0.0.1:39561: use of closed network connection
@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented May 2, 2019

another oddity that randomly appears when the other paases

$ ~/workspace/go/bin/go test -run TestTransportMaxConnsPerHost -count=100 -race
--- FAIL: TestTransportMaxConnsPerHost (0.03s)
    transport_test.go:662: Too many get connections (http2): 2
    transport_test.go:676: Too many get connections (http2): 3
--- FAIL: TestTransportMaxConnsPerHost (0.03s)
    transport_test.go:662: Too many get connections (http2): 2
    transport_test.go:676: Too many get connections (http2): 3
FAIL
@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented May 2, 2019

Change https://golang.org/cl/175097 mentions this issue: net/http: skip flaky TestTransportMaxConnsPerHost for now

gopherbot pushed a commit that referenced this issue May 2, 2019
Updates #31784

Change-Id: Iee056c850c03939606b227a12715c76b0339d268
Reviewed-on: https://go-review.googlesource.com/c/go/+/175097
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented May 3, 2019

@bradfitz I know how to make the test pass, but it is just papering over a potential issue.
The failing client is the 11th one. We have created a bit of a race between the current connection being closed and what connection the next client gets. It is quite a lot of code before MarkDead() is called in http2, and that window between Close() and MarkDead() allows any client to retrieve a connection that is closed.
The test case is simulating a network failure. I decided to try and make it cleaner by adding transport.CloseIdleConnections after the conn.Close(). Much to my surprise we end up in a hang at transport.go: line 1030.

@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented May 3, 2019

I don't see how to fix the CloseIdleConnections which would have fixed the testcase.
The transport tracks host connection counts. When we close idle connections, the counts are not corrected for http2. We don't know the connect method key, nor do we know which connections are being closed on the http2 side.

@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented May 3, 2019

I spoke too soon. We have the connection key and we can remove the host connection count. However, we really have no idea if the http2 connection is really closed or not. But we have marked the persistConn has broken regardless.

@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented May 3, 2019

I have a fix coming. The definition of MaxConnPerHost once you CloseIdleConnections for http/2 is not always exact but it does attempt to do the right thing.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented May 3, 2019

Change https://golang.org/cl/174950 mentions this issue: net/http: TestTransportMaxConnsPerHost flakes

@gopherbot gopherbot closed this in 0a4d352 May 3, 2019
@cuonglm

This comment has been minimized.

@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented May 11, 2019

@cuonglm This is a different issue. Its the one I saw here #31784 (comment)

@cuonglm

This comment has been minimized.

Copy link
Contributor

@cuonglm cuonglm commented May 11, 2019

@fraenkel do we have an issue to track it already?

@fraenkel

This comment has been minimized.

Copy link
Contributor

@fraenkel fraenkel commented May 11, 2019

Not that I know of.

@cuonglm

This comment has been minimized.

Copy link
Contributor

@cuonglm cuonglm commented May 11, 2019

@fraenkel I created one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.