net: TestDialerDualStack fails with "i/o timeout" or "got 5.81849453s; want <= 95ms" #13324

Open
mdempsky opened this Issue Nov 19, 2015 · 14 comments

10 participants
@mdempsky
Member

mdempsky commented Nov 19, 2015

TestDialerDualStack is commonly flaking on OpenBSD and Windows, albeit with different error messages:

https://www.google.com/search?q=TestDialerDualStack+openbsd+site%3Abuild.golang.org

--- FAIL: TestDialerDualStack (14.78s)
    dial_test.go:633: got 5.81849453s; want <= 95ms

https://www.google.com/search?q=TestDialerDualStack+windows+site%3Abuild.golang.org

--- FAIL: TestDialerDualStack (5.30s)
    dial_test.go:664: dial tcp 127.0.0.1:2651: i/o timeout
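
The two messages above come from different kinds of check: an elapsed-time bound and a dial error that reports a timeout. The following is a minimal sketch of both, not the code in dial_test.go. The helper names `checkElapsed`, `isTimeout`, and `sampleTimeout` are hypothetical and exist only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// checkElapsed is a hypothetical helper. It returns an error in the
// "got X; want <= Y" shape when got exceeds limit, and nil otherwise.
func checkElapsed(got, limit time.Duration) error {
	if got > limit {
		return fmt.Errorf("got %v; want <= %v", got, limit)
	}
	return nil
}

// isTimeout reports whether err, or an error it wraps, is a net.Error
// whose Timeout method returns true.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// sampleTimeout builds an illustrative dial error that wraps the standard
// deadline-exceeded error. It is not taken from the failing test.
func sampleTimeout() error {
	return &net.OpError{Op: "dial", Net: "tcp", Err: os.ErrDeadlineExceeded}
}

func main() {
	fmt.Println(checkElapsed(5818494530*time.Nanosecond, 95*time.Millisecond))
	fmt.Println(isTimeout(sampleTimeout()))
}
```

Keeping the two checks separate makes it easier to tell a slow but successful dial apart from a dial that hit its deadline.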

@mdempsky mdempsky added this to the Go1.7 milestone Nov 19, 2015

@mikioh mikioh added the Testing label May 11, 2016

@mikioh mikioh changed the title from net: TestDialerDualStack flaky on OpenBSD and Windows to net: TestDialerDualStack fails with "i/o timeout" or "got 5.81849453s; want <= 95ms" May 19, 2016

@mikioh

Contributor

mikioh commented May 19, 2016

This looks different from #15316 and #15574. It happened on the active-open side, and a simple dial call took 5s, which is strange.

@mikioh mikioh modified the milestones: Go1.7Maybe, Go1.7 May 19, 2016

@mikioh

Contributor

mikioh commented May 19, 2016

Funny, I got "PASS: TestDialerDualStack (5.99s)" when I ran TestDialerDualStack on openbsd59-vm like the following:

go test -run=TestDialerDualStack$ -v -count=100 net
=== RUN   TestDialerDualStack
--- PASS: TestDialerDualStack (0.00s)
(snip)
=== RUN   TestDialerDualStack
--- PASS: TestDialerDualStack (5.99s)
=== RUN   TestDialerDualStack
--- PASS: TestDialerDualStack (0.00s)
=== RUN   TestDialerDualStack
PASS
Socket statistical information:
(inet4, stream, default): opened=900 connected=200 listened=300 accepted=0 closed=900 openfailed=0 connectfailed=400 listenfailed=0 acceptfailed=0 closefailed=0
(inet6, stream, default): opened=400 connected=200 listened=200 accepted=50 closed=450 openfailed=0 connectfailed=0 listenfailed=0 acceptfailed=100 closefailed=0

This is not related to the runtime or GC, because I got the same result with GOGC=off.
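
When repeating a test with -count, a passing run that takes several seconds is easy to miss in the output. A minimal sketch of a filter for `go test -v` output that lists runs slower than a threshold, assuming result lines in the `--- PASS: Name (1.23s)` shape shown above. The function name `slowRuns` is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
	"time"
)

// resultLine matches lines such as "--- PASS: TestDialerDualStack (5.99s)".
var resultLine = regexp.MustCompile(`^\s*--- (PASS|FAIL): (\S+) \(([0-9.]+)s\)$`)

// slowRuns returns the duration of every PASS or FAIL result line in
// output whose reported time is greater than threshold.
func slowRuns(output string, threshold time.Duration) []time.Duration {
	var slow []time.Duration
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		m := resultLine.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // not a result line
		}
		d, err := time.ParseDuration(m[3] + "s")
		if err != nil {
			continue // malformed duration, ignore the line
		}
		if d > threshold {
			slow = append(slow, d)
		}
	}
	return slow
}

func main() {
	out := "=== RUN   TestDialerDualStack\n" +
		"--- PASS: TestDialerDualStack (0.00s)\n" +
		"=== RUN   TestDialerDualStack\n" +
		"--- PASS: TestDialerDualStack (5.99s)\n"
	fmt.Println(slowRuns(out, time.Second))
}
```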

@mikioh

Contributor

mikioh commented May 19, 2016

Hm,

openbsd-amd64:~/$ go test -run=DualStack$ -count=50 net
ok      net     6.042s
openbsd-amd64:~/$ go test -run=DualStack$ -count=100 net
ok      net     18.048s

dragonfly-amd64:~/$ go test -run=DualStack$ -count=50 net
ok      net     0.075s
dragonfly-amd64:~/$ go test -run=DualStack$ -count=100 net
ok      net     1.136s

@adg adg assigned mikioh and unassigned mikioh Jun 27, 2016

@adg

Contributor

adg commented Jun 27, 2016

cc @aclements, would you please run your failure analysis tool against this?

@bradfitz

Member

bradfitz commented Jul 16, 2016

Sent https://go-review.googlesource.com/24985 to demote this to a flaky test for Go 1.7. Kicking this bug down the road to Go 1.8.

It would be great if somebody (@mikioh?) could investigate this.

@bradfitz bradfitz modified the milestones: Go1.8, Go1.7Maybe Jul 16, 2016

@gopherbot

gopherbot commented Jul 17, 2016

CL https://golang.org/cl/24985 mentions this issue.

gopherbot pushed a commit that referenced this issue Jul 17, 2016

net: demote TestDialerDualStack to a flaky test
Only run TestDialerDualStack on the builders, as to not annoy or
otherwise distract users when it's not their fault.

Even though the intention is to only run this on the builders, very
few of the builders have IPv6 support. Oh well. We'll get some
coverage.

Updates #13324

Change-Id: I13e7e3bca77ac990d290cabec88984cc3d24fb67
Reviewed-on: https://go-review.googlesource.com/24985
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
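
The gating idea in the commit message above can be sketched as follows. This is not the code from the CL: the helper `shouldRunFlaky` is hypothetical, and the use of a `GO_BUILDER_NAME` environment variable to identify a builder is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// shouldRunFlaky is a hypothetical gate. It allows a known-flaky test to
// run only when a builder name is set or the caller explicitly opted in.
func shouldRunFlaky(builderName string, optIn bool) bool {
	return builderName != "" || optIn
}

func main() {
	// Assumption: build machines set GO_BUILDER_NAME; ordinary users do not.
	builder := os.Getenv("GO_BUILDER_NAME")
	if !shouldRunFlaky(builder, false) {
		fmt.Println("skipping known-flaky test; see issue 13324")
		return
	}
	fmt.Println("running known-flaky test on builder", builder)
}
```

In a real test function the skip branch would call t.Skipf instead of printing, so the result shows up as SKIP rather than PASS.
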
@mikioh

Contributor

mikioh commented Jul 17, 2016

Not sure if I have spare time for this issue. FWIW, the root cause of this bug probably lies in the protocol stack inside the kernel (IP protocol control block bookkeeping and I/O event notifiers), the runtime package (are we missing something important in our epoll/kevent usage?), or the net package (are we testing it in the wrong direction?). The first step of the analysis would be to dissect the timed-out TCP connection by extracting its TCP information from the kernel.

@rsc rsc modified the milestones: Go1.9, Go1.8 Nov 11, 2016

@bradfitz

Member

bradfitz commented Mar 23, 2017

And Darwin.

I'm going to disable the test. It's not proving its worth.

@gopherbot

gopherbot commented Mar 23, 2017

CL https://golang.org/cl/38459 mentions this issue.

gopherbot pushed a commit that referenced this issue Mar 23, 2017

net: mark TestDialerDualStack as flaky
It was already marked flaky for everything but the dashboard.
Remove that restriction. It's just flaky overall.

It's doing more harm than good.

Updates #13324

Change-Id: I36feff32a1b8681e77700f74b9c70cb4073268eb
Reviewed-on: https://go-review.googlesource.com/38459
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

@broady broady modified the milestones: Go1.9Maybe, Go1.9 Jul 17, 2017

@bradfitz bradfitz modified the milestones: Go1.9Maybe, Go1.10 Jul 20, 2017

@gopherbot

gopherbot commented Dec 8, 2017

Change https://golang.org/cl/82916 mentions this issue: net: increase timeout for TestDialerDualStackFDLeak

@ianlancetaylor

Contributor

ianlancetaylor commented Dec 8, 2017

Sent https://golang.org/cl/82916 for the TestDialerDualStackFDLeak failures. Dropping the releaseblocker label and moving to unplanned.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.10, Unplanned Dec 8, 2017

gopherbot pushed a commit that referenced this issue Dec 8, 2017

net: increase timeout for TestDialerDualStackFDLeak
This test has been getting occasional timeouts on the race builder.
The point of the test is whether a file descriptor leaks, not whether
the connection occurs in a certain amount of time. So use a very large
timeout. The connection is normally fast and the timeout doesn't matter.

Updates #13324

Change-Id: Ie1051c4a0be1fca4e63b1277101770be0cdae512
Reviewed-on: https://go-review.googlesource.com/82916
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
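
The reasoning in the commit message above is that a test about resource cleanup should not also fail on timing. A minimal sketch of that choice follows. It is not the code from the CL: `dialAndClose` and `runOnce` are hypothetical helpers, and counting opened versus closed connections is only a simplified stand-in for a real file descriptor leak check.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialAndClose dials addr n times with a deliberately generous timeout and
// closes every connection. It returns how many were opened and closed.
func dialAndClose(addr string, n int, timeout time.Duration) (opened, closed int, err error) {
	d := net.Dialer{Timeout: timeout}
	for i := 0; i < n; i++ {
		c, err := d.Dial("tcp", addr)
		if err != nil {
			return opened, closed, err
		}
		opened++
		if err := c.Close(); err != nil {
			return opened, closed, err
		}
		closed++
	}
	return opened, closed, nil
}

// runOnce starts a loopback listener that accepts and immediately closes
// connections, then dials it ten times with a one-minute timeout.
func runOnce() (opened, closed int, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, err
	}
	defer ln.Close()
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			c.Close()
		}
	}()
	return dialAndClose(ln.Addr().String(), 10, time.Minute)
}

func main() {
	opened, closed, err := runOnce()
	fmt.Println(opened, closed, err)
}
```

The timeout only bounds how long a hung dial can block the test. On a healthy machine the loopback dial returns almost immediately, so the large value costs nothing.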