Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: TestDialParallel is flaky on windows-amd64-longtest #35616

Closed
bcmills opened this issue Nov 15, 2019 · 2 comments

Comments

@bcmills
Copy link
Member

@bcmills bcmills commented Nov 15, 2019

net.TestDialParallel is flaky on the windows-amd64-longtest builder.

As far as I can tell, the problem is just unexpected timing jitter:

--- FAIL: TestDialParallel (7.08s)
    dial_test.go:323: #5: got 1.0000027s; want >= 1.0124168s
    dial_test.go:323: #7: got 1.2002253s; want >= 1.2124168s
    dial_test.go:323: #10: got 1.0009746s; want >= 1.0124168s
FAIL
FAIL	net	59.274s

2019-11-15T02:31:58-498eaee/windows-amd64-longtest
2019-11-13T15:52:21-bf49905/windows-amd64-longtest
2019-11-12T01:09:40-c2edcf4/windows-amd64-longtest
2019-11-04T18:51:13-bf7e55b/windows-amd64-longtest

CC @bradfitz @mikioh @ianlancetaylor @zx2c4 @alexbrainman

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Nov 16, 2019

Each failure is only for some or all of the tests that use closedPortDelay. That is measured before running the tests by actually calling Dial to connect to a closed port. The test gives an error if the test produces a result that is more than 95 milliseconds faster than the measured delay. Case 7 above adds an additional 200 millisecond delay which I believe is meant to correspond to the 300 milliseconds used in (*Dialer).fallbackDelay.

In the case above the measured closedPortDelay was evidently about 1.1 seconds. The results of the test were then about 1 second, and the test failed because the difference was more than 95 milliseconds.

Looks like the use 95 milliseconds was added in patchset 8 of https://golang.org/cl/8768, along with the comment

The issue on Windows is that connecting to a closed port always takes 1 second to fail, instead of being instantaneous like Linux. This is not specific to Go, and there's no obvious way around it.

So I've updated TestDialParallel to measure the delay, and expect Windows to have different timing behavior.

For the four failures listed above, the largest closedPortDelay is 1.1125066s. The shortest time for a failed connection during the actual test is 1.0000027s. The difference between those is 112.5ms, which is more than the expected maximum difference of 95ms.

I can't see any special meaning to 95ms, so I'll send a CL to push up the value on Windows.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Nov 16, 2019

Change https://golang.org/cl/207466 mentions this issue: net: add more timing slop for TestDialParallel on Windows

@gopherbot gopherbot closed this in c20b71e Nov 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.