When buildlet times out, some processes remain in the background and need to be killed manually.
It seems that buildlet only kills its direct children and not all of its descendants.
It occurs on the aix/ppc64 builder, but I think it's a more general issue.
Here are some processes remaining after builds had failed on aix/ppc64. cmd/buildlet seems to only kill "go tool dist test ... "
Looking at the code, killProcessTree is called when the builder is killed by the coordinator (right?).
However, as I understand it, it only kills one process and not the whole process tree (cf. https://github.com/golang/build/blob/master/cmd/buildlet/buildlet.go#L1590).
I think it should call syscall.Kill with -p.Pid in order to kill the whole process group.
I always forget how process groups & stuff work. So syscall.Kill is POSIX kill and POSIX kill with negative pid is:
If pid is negative, but not -1, sig shall be sent to all processes (excluding an unspecified set of system processes) whose process group ID is equal to the absolute value of pid, and for which the process has permission to send a signal.
Is this the behavior everywhere Go runs? Or do we need per-GOOS behavior?
Does the buildlet currently run child processes in their own process group? It probably should if not?
That is how syscall.Kill works on all Unix systems. No idea about Windows.
I don't know what the buildlet does today but it can start processes in their own process group by setting cmd.SysProcAttr.Setpgid. That should work on all Unix systems but not Windows. The pgid will then be cmd.Process.Pid, and you will want to pass the negative of that value to syscall.Kill.
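A minimal sketch of that approach, for Unix systems only. The helper names startInGroup and killGroup and the sh command line are made up for illustration; this is not the buildlet's actual code.

```go
// Unix-only sketch: syscall.Kill and SysProcAttr.Setpgid are not available on Windows.
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// startInGroup (hypothetical helper) starts the command in a new process
// group. With Setpgid set and Pgid left at zero, the new group's ID is the
// child's own PID.
func startInGroup(name string, args ...string) (*exec.Cmd, error) {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// killGroup (hypothetical helper) sends SIGKILL to every process whose
// process group ID is cmd.Process.Pid, by passing the negated PID to kill.
func killGroup(cmd *exec.Cmd) error {
	return syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
}

func main() {
	// The shell spawns background children that inherit its process group.
	cmd, err := startInGroup("sh", "-c", "sleep 60 & sleep 60 & wait")
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	if err := killGroup(cmd); err != nil {
		fmt.Println("kill failed:", err)
		return
	}
	// Wait reaps the direct child; it reports an error because the
	// child was terminated by a signal.
	fmt.Println("wait returned:", cmd.Wait())
}
```

As noted in the next comment, a descendant that changes its own pgid would escape this kind of kill.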
(There are a few tests that themselves fiddle with the pgid of their own child processes, but it seems reasonable not to worry about those. We could handle those on GNU/Linux by setting up a PID namespace, but it hardly seems worth doing.)
and it looks like it manually finds all children of the main process and kills them all.
It is not a perfect solution (there is a race there), but I think it is good enough. It would be nice to use Windows Job Objects (https://godoc.org/github.com/alexbrainman/ps) here, but that would affect the tests that buildlet runs. We do not want that.
Maybe it's a duplicate of #15778.