Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: new TestDontCacheBrokenHTTP2Conn fails #35113

Open
cuonglm opened this issue Oct 23, 2019 · 14 comments
Assignees

Comments

@cuonglm
Copy link
Contributor

@cuonglm cuonglm commented Oct 23, 2019

Build log: https://storage.googleapis.com/go-build-log/76c387a0/linux-amd64-race_356de3ff.log

--- FAIL: TestDontCacheBrokenHTTP2Conn (0.03s)
    transport_test.go:5885: for iteration 5, doBreak=false; unexpected error Get "https://127.0.0.1:35315": some write error
    transport_test.go:5889: GotConn calls = 0; want 1
FAIL
FAIL	net/http	8.568s
FAIL
2019/10/23 15:34:13 Failed: exit status 1

@cuonglm

This comment has been minimized.

Copy link
Contributor Author

@cuonglm cuonglm commented Oct 23, 2019

This happens when I test https://go-review.googlesource.com/c/go/+/202797.

Is that a known flaky one cc @bradfitz @bcmills

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Oct 23, 2019

Sigh

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Oct 23, 2019

This was just introduced in https://go-review.googlesource.com/c/go/+/202078 for #34978 (and backported to Go 1.13, sadly too early apparently).

Looking.

@bradfitz bradfitz self-assigned this Oct 23, 2019
@bradfitz bradfitz changed the title TestDontCacheBrokenHTTP2Conn fails net/http: new TestDontCacheBrokenHTTP2Conn fails Oct 23, 2019
@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Oct 23, 2019

I can repro:

bradfitzlaptop http$ go test -v -run=TestDontCacheBrokenHTTP2Conn -race -count=50
=== RUN   TestDontCacheBrokenHTTP2Conn
--- PASS: TestDontCacheBrokenHTTP2Conn (0.04s)
=== RUN   TestDontCacheBrokenHTTP2Conn
--- PASS: TestDontCacheBrokenHTTP2Conn (0.04s)
=== RUN   TestDontCacheBrokenHTTP2Conn
--- PASS: TestDontCacheBrokenHTTP2Conn (0.05s)
=== RUN   TestDontCacheBrokenHTTP2Conn
--- PASS: TestDontCacheBrokenHTTP2Conn (0.04s)
=== RUN   TestDontCacheBrokenHTTP2Conn
    TestDontCacheBrokenHTTP2Conn: transport_test.go:5885: for iteration 5, doBreak=false; unexpected error Get "https://127.0.0.1:58689": some write error
    TestDontCacheBrokenHTTP2Conn: transport_test.go:5889: GotConn calls = 0; want 1
--- FAIL: TestDontCacheBrokenHTTP2Conn (0.03s)
panic: Log in goroutine after TestDontCacheBrokenHTTP2Conn has completed

goroutine 205 [running]:
testing.(*common).logDepth(0xc000192100, 0xc00002a460, 0x4c, 0x3)
	/Users/bradfitz/go/src/testing/testing.go:679 +0x67c
testing.(*common).log(...)
	/Users/bradfitz/go/src/testing/testing.go:661
testing.(*common).Errorf(0xc000192100, 0x1822436, 0x19, 0xc0001a5678, 0x1, 0x1)
	/Users/bradfitz/go/src/testing/testing.go:710 +0x90
net/http_test.TestDontCacheBrokenHTTP2Conn.func2(0x1813c27, 0x3, 0xc00047b1d0, 0xf, 0xc000238360, 0x1041a70, 0x1d25470, 0x1d25470)
	/Users/bradfitz/go/src/net/http/transport_test.go:5849 +0x123
net/http.(*Transport).dial(0xc000384000, 0x1917980, 0xc00046b860, 0x1813c27, 0x3, 0xc00047b1d0, 0xf, 0x0, 0x205, 0x0, ...)
	/Users/bradfitz/go/src/net/http/transport.go:1035 +0xf8
net/http.(*Transport).dialConn(0xc000384000, 0x1917980, 0xc00046b860, 0x0, 0xc0003b8500, 0x5, 0xc00047b1d0, 0xf, 0x0, 0x0, ...)
	/Users/bradfitz/go/src/net/http/transport.go:1457 +0x2b1c
net/http.(*Transport).dialConnFor(0xc000384000, 0xc0003746e0)
	/Users/bradfitz/go/src/net/http/transport.go:1300 +0x150
created by net/http.(*Transport).queueForDial
	/Users/bradfitz/go/src/net/http/transport.go:1269 +0x68b
exit status 2
FAIL	net/http	0.386s
@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 23, 2019

and backported to Go 1.13, sadly too early apparently

Note that #35087 never actually got the CherryPickApproved label. I was kind of intentionally holding off on pinging it to give the CL a bit of soak time, but I guess I should have been more explicit about that.

@bradfitz

This comment has been minimized.

Copy link
Member

@bradfitz bradfitz commented Oct 23, 2019

Soaking before cherry-picking is new to me. That's a thing we do now? I mean, not a terrible idea, but first I'd heard.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 23, 2019

It's a thing I've been doing informally, at least. (#31887) Not a formal policy, as far as I'm aware.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 23, 2019

At any rate: for the race condition!

Would it make sense to add synchronization between the Dial hook and the loop, so that we know that all of the Conns returned by Dial have been Closed before we move to the next iteration? I suspect that the problem here is the aliasing on brokenState.

@cuonglm

This comment has been minimized.

Copy link
Contributor Author

@cuonglm cuonglm commented Oct 24, 2019

@bradfitz not sure it's related, but if I apply this patch, the test won't fail even with -count=500:

diff --git a/src/net/http/transport_test.go b/src/net/http/transport_test.go
index f76530b8fa..b60c164630 100644
--- a/src/net/http/transport_test.go
+++ b/src/net/http/transport_test.go
@@ -5871,9 +5871,7 @@ func TestDontCacheBrokenHTTP2Conn(t *testing.T) {
                        TLSHandshakeDone: func(cfg tls.ConnectionState, err error) {
                                brokenState.Lock()
                                defer brokenState.Unlock()
-                               if doBreak {
-                                       brokenState.broken = true
-                               }
+                               brokenState.broken = doBreak
                        },
                })
                req, err := NewRequestWithContext(ctx, "GET", cst.ts.URL, nil)
@cuonglm

This comment has been minimized.

Copy link
Contributor Author

@cuonglm cuonglm commented Oct 25, 2019

@bradfitz not sure it's related, but if I apply this patch, the test won't fail even with -count=500:

diff --git a/src/net/http/transport_test.go b/src/net/http/transport_test.go
index f76530b8fa..b60c164630 100644
--- a/src/net/http/transport_test.go
+++ b/src/net/http/transport_test.go
@@ -5871,9 +5871,7 @@ func TestDontCacheBrokenHTTP2Conn(t *testing.T) {
                        TLSHandshakeDone: func(cfg tls.ConnectionState, err error) {
                                brokenState.Lock()
                                defer brokenState.Unlock()
-                               if doBreak {
-                                       brokenState.broken = true
-                               }
+                               brokenState.broken = doBreak
                        },
                })
                req, err := NewRequestWithContext(ctx, "GET", cst.ts.URL, nil)

The test still fails at -count=1000

@cuonglm

This comment has been minimized.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Oct 31, 2019

linux-386-clang: https://build.golang.org/log/9c05e68c70e7bf8388daf62daf8e777c4ccec143

--- FAIL: TestDontCacheBrokenHTTP2Conn (0.02s)
    transport_test.go:5889: for iteration 5, doBreak=false; unexpected error Get "https://127.0.0.1:45761": some write error
    transport_test.go:5893: GotConn calls = 0; want 1
    transport_test.go:5853: unexpected Dial error: dial tcp 127.0.0.1:45761: connect: connection reset by peer
@bcmills bcmills closed this Nov 12, 2019
@bcmills bcmills reopened this Nov 12, 2019
@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Nov 12, 2019

Change https://golang.org/cl/206819 mentions this issue: net/http: add some debugging to TestDontCacheBrokenHTTP2Conn

gopherbot pushed a commit that referenced this issue Nov 13, 2019
Not a fix, but will give us more info when it flakes again.

Updates #35113

Change-Id: I2f90c24530c1bea81dd9d8c7a59f4b0640dfa4c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/206819
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.