os/signal: timeout in TestAllThreadsSyscallSignals on linux-ppc64-buildlet #44193
Comments
Tentatively milestoned to Go 1.16 because the function under test is new (https://tip.golang.org/doc/go1.16#syscall). |
@laboger FYI |
Please assign this to me. I have no idea yet what might have happened here, but I'll try to figure it out. Looking at the build.golang.org status page, is this a one-off failure? |
This does seem to have occurred only once. |
Looking quickly at the code in https://tip.golang.org/src/os/signal/signal_linux_test.go, it looks like the test was still evaluating whether or not it should be skipped when the 3-minute timeout hit. Capturing the following just in case it otherwise gets lost at some point.
|
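For readers following along, the skip evaluation mentioned above can be sketched roughly as follows. This is a minimal, Linux-only sketch and not the test's actual source: the helper name `allThreadsSupported` and the constant name `prGetKeepCaps` are chosen here for illustration, and the choice of `prctl(PR_GET_KEEPCAPS)` as a harmless probe call is an assumption. What it relies on is that `syscall.AllThreadsSyscall` (new in Go 1.16) reports `ENOTSUP` when the binary is built with cgo.

```go
package main

import (
	"fmt"
	"syscall"
)

// prGetKeepCaps is PR_GET_KEEPCAPS from <linux/prctl.h>. It is used here
// only as a side-effect-free probe (an assumption of this sketch).
const prGetKeepCaps = 7

// allThreadsSupported is a hypothetical helper. It issues one harmless
// prctl on every thread via syscall.AllThreadsSyscall and reports whether
// the mechanism is usable. ENOTSUP means the binary was built with cgo,
// which is the case in which a test would skip itself.
func allThreadsSupported() (bool, error) {
	_, _, errno := syscall.AllThreadsSyscall(syscall.SYS_PRCTL, prGetKeepCaps, 0, 0)
	switch errno {
	case 0:
		return true, nil
	case syscall.ENOTSUP:
		return false, nil
	default:
		// Any other errno is unexpected for this probe; surface it.
		return false, errno
	}
}

func main() {
	ok, err := allThreadsSupported()
	fmt.Println("AllThreadsSyscall supported:", ok, "err:", err)
}
```

Even this probe has to run the syscall on every thread, so a stall in that path would show up before the body of a test that starts with such a check has done any work.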
#42178 (comment), concerning the ppc64 build, supports the detail that this code is built with CGO_ENABLED=0. My recollection from working on that issue is that this architecture was noticeably slower overall than the systems I typically work with, which makes a timeout somewhat more likely here. However, the test should be nowhere near 3 minutes of runtime. That said, the code in this crash trace does not appear to have timed out while running the loop inside the test. It timed out before that loop even starts, in the code that runs after the syscall has completed on all the threads, just as the code is trying to restart the world (and re-enable GC). I've not yet reproduced the failure, and the build servers appear not to have failed again, but I'm still investigating. |
@AndrewGMorgan, another theory to consider: perhaps the actual slowdown was in some other |
It's not clear to me whether this is a deadlock, a livelock, or just a slow test, but the similarity to #43149 is concerning (CC @AndrewGMorgan @ianlancetaylor).
2021-02-09T18:40:13-e9c9683/linux-ppc64-buildlet