net: TestDialerDualStack fails with "i/o timeout" or "got 5.81849453s; want <= 95ms" #13324

Open
mdempsky opened this issue Nov 19, 2015 · 14 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-OpenBSD OS-Windows Testing An issue that has been verified to require only test changes, not just a test failure.

@mdempsky
Member

TestDialerDualStack is commonly flaking on OpenBSD and Windows, albeit with different error messages:

https://www.google.com/search?q=TestDialerDualStack+openbsd+site%3Abuild.golang.org

--- FAIL: TestDialerDualStack (14.78s)
    dial_test.go:633: got 5.81849453s; want <= 95ms

https://www.google.com/search?q=TestDialerDualStack+windows+site%3Abuild.golang.org

--- FAIL: TestDialerDualStack (5.30s)
    dial_test.go:664: dial tcp 127.0.0.1:2651: i/o timeout
@mdempsky mdempsky added this to the Go1.7 milestone Nov 19, 2015
@mikioh mikioh added the Testing An issue that has been verified to require only test changes, not just a test failure. label May 11, 2016
@mikioh mikioh changed the title from 'net: TestDialerDualStack flaky on OpenBSD and Windows' to 'net: TestDialerDualStack fails with "i/o timeout" or "got 5.81849453s; want <= 95ms"' May 19, 2016
@mikioh
Contributor

mikioh commented May 19, 2016

This looks different from #15316 and #15574. It happened on the active-open side, and a simple dial call took 5s, which is strange.

@mikioh mikioh modified the milestones: Go1.7Maybe, Go1.7 May 19, 2016
@mikioh
Contributor

mikioh commented May 19, 2016

Funny, I got "PASS: TestDialerDualStack (5.99s)" when I ran TestDialerDualStack on openbsd59-vm as follows:

go test -run=TestDialerDualStack$ -v -count=100 net
=== RUN   TestDialerDualStack
--- PASS: TestDialerDualStack (0.00s)
(snip)
=== RUN   TestDialerDualStack
--- PASS: TestDialerDualStack (5.99s)
=== RUN   TestDialerDualStack
--- PASS: TestDialerDualStack (0.00s)
=== RUN   TestDialerDualStack
PASS
Socket statistical information:
(inet4, stream, default): opened=900 connected=200 listened=300 accepted=0 closed=900 openfailed=0 connectfailed=400 listenfailed=0 acceptfailed=0 closefailed=0
(inet6, stream, default): opened=400 connected=200 listened=200 accepted=50 closed=450 openfailed=0 connectfailed=0 listenfailed=0 acceptfailed=100 closefailed=0

This is not related to the runtime or GC, because I got the same result with GOGC=off.

@mikioh
Contributor

mikioh commented May 19, 2016

Hm,

openbsd-amd64:~/$ go test -run=DualStack$ -count=50 net
ok      net     6.042s
openbsd-amd64:~/$ go test -run=DualStack$ -count=100 net
ok      net     18.048s

dragonfly-amd64:~/$ go test -run=DualStack$ -count=50 net
ok      net 0.075s
dragonfly-amd64:~/$ go test -run=DualStack$ -count=100 net
ok      net 1.136s

@bradfitz
Contributor

@adg adg assigned mikioh and unassigned mikioh Jun 27, 2016
@adg
Contributor

adg commented Jun 27, 2016

cc @aclements, would you please run your failure analysis tool against this?

@bradfitz
Contributor

Sent https://go-review.googlesource.com/24985 to demote this to a flaky test for Go 1.7. Kicking this bug down the road to Go 1.8.

It would be great if somebody (@mikioh?) could investigate this.

@bradfitz bradfitz modified the milestones: Go1.8, Go1.7Maybe Jul 16, 2016
@gopherbot

CL https://golang.org/cl/24985 mentions this issue.

gopherbot pushed a commit that referenced this issue Jul 17, 2016
Only run TestDialerDualStack on the builders, as to not annoy or
otherwise distract users when it's not their fault.

Even though the intention is to only run this on the builders, very
few of the builders have IPv6 support. Oh well. We'll get some
coverage.

Updates #13324

Change-Id: I13e7e3bca77ac990d290cabec88984cc3d24fb67
Reviewed-on: https://go-review.googlesource.com/24985
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
@mikioh
Contributor

mikioh commented Jul 17, 2016

I am not sure I have spare time for this issue. FWIW, the root cause of this bug probably lies in the protocol stack inside the kernel (IP protocol control block bookkeeping and I/O event notifiers), the runtime package (are we missing something important in our epoll/kevent usage?), or the net package (are we testing it in the wrong direction?). A first analysis would dissect the timed-out TCP connection by scraping the TCP information out of the kernel.

@quentinmit quentinmit added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 7, 2016
@rsc rsc modified the milestones: Go1.9, Go1.8 Nov 11, 2016
@bradfitz
Contributor

And Darwin.

I'm going to disable the test. It's not proving its worth.

@gopherbot

CL https://golang.org/cl/38459 mentions this issue.

gopherbot pushed a commit that referenced this issue Mar 23, 2017
It was already marked flaky for everything but the dashboard.
Remove that restriction. It's just flaky overall.

It's doing more harm than good.

Updates #13324

Change-Id: I36feff32a1b8681e77700f74b9c70cb4073268eb
Reviewed-on: https://go-review.googlesource.com/38459
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
@broady broady modified the milestones: Go1.9Maybe, Go1.9 Jul 17, 2017
@bradfitz bradfitz modified the milestones: Go1.9Maybe, Go1.10 Jul 20, 2017
@gopherbot

Change https://golang.org/cl/82916 mentions this issue: net: increase timeout for TestDialerDualStackFDLeak

@ianlancetaylor
Contributor

Sent https://golang.org/cl/82916 for the TestDialerDualStackFDLeak failures. Dropping the releaseblocker label and moving to unplanned.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.10, Unplanned Dec 8, 2017
gopherbot pushed a commit that referenced this issue Dec 8, 2017
This test has been getting occasional timeouts on the race builder.
The point of the test is whether a file descriptor leaks, not whether
the connection occurs in a certain amount of time. So use a very large
timeout. The connection is normally fast and the timeout doesn't matter.

Updates #13324

Change-Id: Ie1051c4a0be1fca4e63b1277101770be0cdae512
Reviewed-on: https://go-review.googlesource.com/82916
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
10 participants