os: hang waiting for dead process on macOS 10.12 #18540
Comments
I get it to hang when I hit Ctrl+C during
@rfjakob, your trace is stuck reading from the pipe holding the subprocess output, not waiting for the subprocess itself. On the Mac, the subprocess had exited and yet wait did not return. On your system it looks like the process may still have been running, since the pipe had not been closed yet. Next time it happens, before the ^C, please run 'ps axwwuf', look for the hung go command, and see whether it still has any children. Thanks.
I think a similar thing is happening on the macOS hosted android/arm builder. I can reproduce it manually by enough runs of
The hang is in builderRunTest, in the section that makes sure a wedged test binary is killed:
The interesting goroutine is 10, which is hanging on
in
Even though the process was killed with
What's interesting (to me) is that cmd.Wait() has completed cmd.Process.Wait() and is blocked on
Which leads us to goroutine 52:
So why didn't cmd.Process.Kill() also close the process's file descriptors, unblocking goroutine 52? I'm stuck here, but I can reproduce the hang fairly consistently, so any debugging hints would be much appreciated.
I think this is an instance of cmd/go hitting a version of #18874, which means that cmd/go should probably change to not depend on cmd.Wait().
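To make the failure mode concrete, here is a minimal sketch of one way to avoid blocking on the output pipe. It is not the change made in cmd/go; the helper name `runKillAfter` and its `timeout` and `grace` parameters are made up for illustration, and it assumes a Unix-like system. The idea is that when `cmd.Stdout` is an `*os.File`, os/exec passes the descriptor straight to the child and starts no copying goroutine, so `cmd.Wait()` returns as soon as the direct child exits, even if a grandchild still holds the write end of the pipe open. The caller then drains the pipe itself and gives up after a bounded grace period.

```go
package main

import (
	"bytes"
	"io"
	"os"
	"os/exec"
	"time"
)

// runKillAfter (hypothetical helper) runs name with args, kills the direct
// child if it is still running after timeout, and returns the output that
// was collected. It never waits longer than grace for the pipe to reach EOF.
func runKillAfter(name string, args []string, timeout, grace time.Duration) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	cmd := exec.Command(name, args...)
	// An *os.File is handed to the child directly, so cmd.Wait does not
	// have to wait for a copying goroutine to see EOF.
	cmd.Stdout = w
	cmd.Stderr = w
	if err := cmd.Start(); err != nil {
		w.Close()
		return "", err
	}
	// Close the parent's copy of the write end. The child, and any
	// grandchildren that inherited it, keep their own copies.
	w.Close()

	// Drain the pipe in our own goroutine.
	var buf bytes.Buffer
	copied := make(chan struct{})
	go func() {
		io.Copy(&buf, r)
		close(copied)
	}()

	// Kill only the direct child. Killing it does not close descriptors
	// held by other processes that inherited the pipe.
	timer := time.AfterFunc(timeout, func() { cmd.Process.Kill() })
	waitErr := cmd.Wait()
	timer.Stop()

	// Wait a bounded time for EOF, then close the read end to unblock
	// the copy if some other process still holds the pipe open.
	select {
	case <-copied:
	case <-time.After(grace):
		r.Close()
		<-copied
	}
	return buf.String(), waitErr
}
```

For example, calling `runKillAfter("sh", []string{"-c", "sleep 3 & echo started; sleep 3"}, time.Second, 200*time.Millisecond)` should return shortly after the kill rather than waiting for the background `sleep` to exit. The trade-off is that output written by a lingering grandchild after the grace period is dropped.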
CL https://golang.org/cl/42270 mentions this issue.
The original report on this issue is not an example of #18874. In that stack trace we can see that the program is sleeping in
The original report was filed on January 6. On February 28 we stopped using
I'm going to close this issue as unreproducible, possibly fixed, with the other cases being reported as dups of #18874.
Running the net/http tests with -v -short in a loop. Got one that hung for 400 minutes. The actual test binary failed after 10 minutes, but somehow the parent go command was wedged. It ate the ^C (because it knew it was waiting for the subprocess). Eventually kill -ABRT (63 minutes later) dumped the following: