net: TestDialTimeout flake #11872
I've been running run.bash in a loop on my linux/amd64 workstation for the past few days and out of ~500 runs, I've had three failures of TestDialTimeout with:
My tree has my GC changes from CL 12674 applied, but I don't think those are related. The latest commit from master is ae1ea2a. This shows up occasionally on the dashboard as well:
This may be related to #11474, though the error sounds different.
It's probably because both test cases, TestDialTimeout and TestDialTimeoutFDLeak, depend on a wrong assumption about the runtime scheduler. The flakiness in net/http looks unrelated, though.
Thanks for the confirmation. The root cause is the current corner-cutting implementation of the socktest package. For testing, it tracks socket calls using the socket descriptor number as the key and, for simplicity, does not account for quick recycling of socket descriptor numbers. It may therefore confuse socket descriptors in a situation like the following:
This may happen in selfConnect, in the case of TCP simultaneous open; Linux is one of the platforms where TCP simultaneous open can happen easily.