os/signal: TestSignalTrace failures with "timeout after 100ms waiting for hangup" #46736
Marking as release-blocker for Go 1.17 (CC @golang/release) because this test is new as of 1.17, and we shouldn't be shipping new tests that are known to be flaky. If we can confirm that the test failures are not due to an actual regression in Go 1.17, we can add a call to testenv.SkipFlaky for this test.
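For reference, a minimal sketch of what such a skip could look like (the placement inside TestSignalTrace is illustrative, not an actual change, and internal/testenv is only importable from within the standard library tree):

```go
package signal_test

import (
	"internal/testenv" // only available inside the standard library
	"testing"
)

// Sketch: mark the known-flaky test so builders skip it while the root
// cause is investigated; the issue number points back to this report.
func TestSignalTrace(t *testing.T) {
	testenv.SkipFlaky(t, 46736)
	// ... existing test body would follow ...
}
```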
The #45773 issue referred exclusively to Linux failures. The code I was testing when I added this test case (for #44193) was Linux-specific, so they seemed relevant; I just hadn't seen #45773 until being cc:d on this one. I recall being concerned while developing the fix for #44193 that I wasn't breaking the thing it tests with my change. It didn't occur to me that the pre-existing code might not be working. Further, given that the failures in this present bug are on non-Linux platforms (darwin and illumos), I'm fairly sure they must have a different root cause. Have we seen this issue on the 1.16 branch? Or do we believe this is a 1.17 regression?
The new test verifies the fix for the bug found in #44193, which was similar to #43149, which was present in 1.16. So my best guess is that the underlying cause was either similar or even worse in 1.16, and the test failures indicate either an incomplete fix or a bug in the test itself.
I'm a bit confused, so here are some details that seem relevant to me. Please correct any of them, or add some more data points if known:
So it feels like an important data point to seek is: do we have any crash logs like this from 1.16 (after CL 316869)? If not, are we convinced that this present bug isn't purely a 1.17 regression in the code paths exercised by this new test?
I don't see any failures on the dashboard for the 1.16 branch, but given how few test runs occur on that branch, that doesn't tell us much. (The rate of failures at head is high enough to rule out a hardware flake, but the failures are still relatively infrequent overall.)
Given the build failure logs for https://go-review.googlesource.com/c/go/+/315049/ (submitted May 4), is it reasonable to discount the linux-ppc64 failures?
Yes, I think it's reasonable to focus on the darwin and illumos failures; I don't know why the linux-ppc64 builders timed out. (But, really, tests in the standard library should be written so that they don't assume fast hardware, because Go users in general should be able to run the tests on whatever hardware they have.)
To be quite honest, when I added this test, I was just reusing the pre-existing signal-waiting helpers in the test file. Given these occasional timeouts, I'd be tempted to replace all the complex code in those helpers with something simpler.
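As a rough illustration of the simpler shape that could take (a sketch only; the helper name, the 30-second bound, and the standalone test are made up here and are not the existing os/signal test code):

```go
package signal_test

import (
	"os"
	"os/signal"
	"syscall"
	"testing"
	"time"
)

// waitForHangup is a hypothetical, simplified wait: block on the channel
// with one generous deadline instead of polling with a short settle time.
func waitForHangup(t *testing.T, c <-chan os.Signal) {
	t.Helper()
	select {
	case s := <-c:
		if s != syscall.SIGHUP {
			t.Fatalf("got signal %v, want SIGHUP", s)
		}
	case <-time.After(30 * time.Second): // generous bound for slow builders
		t.Fatal("timed out waiting for SIGHUP")
	}
}

func TestSimpleHangup(t *testing.T) {
	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGHUP)
	defer signal.Stop(c)

	// Unix-only: deliver SIGHUP to ourselves and wait for it to arrive.
	syscall.Kill(syscall.Getpid(), syscall.SIGHUP)
	waitForHangup(t, c)
}
```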
We currently use a 100ms settle time when waiting for the signal in these tests.
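For context, one hedged sketch of how a short settle time could be stretched without slowing down the common case: keep the 100ms initial wait but double it up to an overall deadline. The helper and constants here are illustrative assumptions, not what signal_test.go actually does:

```go
package signal_test

import (
	"errors"
	"os"
	"time"
)

// settleTime mirrors the 100ms wait seen in the failure messages.
const settleTime = 100 * time.Millisecond

// waitWithBackoff is a hypothetical pattern: start with the short settle
// time but keep doubling it until an overall deadline, so slow builders
// get more slack while fast machines still return quickly.
func waitWithBackoff(c <-chan os.Signal, max time.Duration) (os.Signal, error) {
	wait := settleTime
	deadline := time.Now().Add(max)
	for {
		select {
		case s := <-c:
			return s, nil
		case <-time.After(wait):
			if time.Now().After(deadline) {
				return nil, errors.New("timed out waiting for signal")
			}
			wait *= 2
		}
	}
}
```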
Change https://golang.org/cl/329502 mentions this issue.
2021-06-14T07:12:37-326ea43/illumos-amd64
2021-06-13T08:17:17-24cff0f/darwin-amd64-11_0
2021-05-25T23:41:42-74242ba/illumos-amd64
2021-05-04T00:03:39-496d7c6/linux-ppc64le-buildlet
2021-05-03T16:25:05-d75fbac/linux-ppc64-buildlet
2021-04-30T20:00:36-8e91458/linux-ppc64-buildlet
2021-04-30T19:41:02-0bbfc5c/illumos-amd64
See previously #45773.
(CC @AndrewGMorgan @prattmic @ianlancetaylor)