Description
Waiting on a subprocess in Go (through Process.Wait or syscall.Wait4) does not fit well with multi-threaded, goroutine-based Go programs. At least on Linux, these calls end up in the waitpid syscall (or a related one such as waitid), which is designed so that a process can wait on a subprocess only once. While writing my init project in Go, I realized that this limitation makes it hard to keep the whole program modular. You cannot have several different parts of a program observe the status of subprocesses: for example, one part logging what is happening with all subprocesses, another spawning and waiting for programs, and a third reaping unknown zombies that got reparented to the init process. Finally, if your init uses ptrace to inspect those unknown processes, you also hit bug #60321, and in combination with a reaping loop there is the further problem that the reaping loop's wait may be the one that gets woken up after you attach to the subprocess.
In short, having just one process.Wait() and one syscall.Wait4(-1, ...) in the same Go program currently creates an inherently racy execution, which makes it hard to write a modular program. For example, installing the go-reaper package breaks any process.Wait: sometimes, randomly, and in a hard-to-debug manner.
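The interference described above can be reproduced deterministically by forcing the "reaper wins" ordering. The following is a minimal sketch for Unix-like systems, not taken from the original report: the helper name `reapThenWait` is hypothetical, and it assumes `sh` is available on `PATH`. A generic `Wait4(-1, ...)` collects the child first, so the later `cmd.Wait()` has nothing left to collect and is expected to return an error.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// reapThenWait (hypothetical helper) starts a short-lived child, reaps it with
// Wait4(-1) the way a generic zombie reaper would, and only then calls
// cmd.Wait the way ordinary spawning code would.
func reapThenWait() (reapedPid, childPid int, waitErr error) {
	cmd := exec.Command("sh", "-c", "exit 0") // assumes sh is on PATH
	if err := cmd.Start(); err != nil {
		return 0, 0, err
	}
	childPid = cmd.Process.Pid

	// The "reaper": wait for any child and collect its status.
	var ws syscall.WaitStatus
	for {
		pid, err := syscall.Wait4(-1, &ws, 0, nil)
		if err == syscall.EINTR {
			continue // interrupted by a signal, retry
		}
		if err != nil {
			return 0, childPid, err
		}
		reapedPid = pid
		break
	}

	// The "spawner": the status is already gone, so this call cannot
	// obtain it anymore.
	waitErr = cmd.Wait()
	return reapedPid, childPid, waitErr
}

func main() {
	reaped, child, err := reapThenWait()
	fmt.Println("child pid:", child)
	fmt.Println("reaped by Wait4(-1):", reaped)
	fmt.Println("cmd.Wait error:", err)
}
```

In a real program the two waits run in different goroutines, so which one wins depends on scheduling; the sketch only fixes the order to make the failure visible.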
Because of this, I propose introducing a new API that hides the complexities of waiting on subprocesses, similar to how the os/signal package does for signal handling. Signal handling has similar limitations at a low level, but because a Go abstraction maps signals onto channels, any part of a program can hook into signals without interfering with the rest of the program. I propose the following API:
// WaitCode is an enumeration of possible process states.
type WaitCode int

const (
	// Process has terminated normally.
	ProcessExited WaitCode = iota
	// Process was terminated by a signal.
	ProcessSignaled
	// Process was stopped by delivery of a signal, but not trapped.
	ProcessStopped
	// Process has stopped and trapped.
	ProcessTrapped
	// Child continued.
	ProcessContinued
	// Process was killed by a signal and dumped core.
	ProcessDumped
)

// WaitInfo provides information about the state of the process after waiting.
type WaitInfo struct {
	Id     int // Pid or pidfd
	Status int
	Code   WaitCode
}

// Exited returns true if the process has terminated normally.
func (i WaitInfo) Exited() bool { return i.Code == ProcessExited }

// Signaled returns true if the process was terminated by a signal.
func (i WaitInfo) Signaled() bool { return i.Code == ProcessSignaled }

// Stopped returns true if the process was stopped by delivery of a signal, but not trapped.
func (i WaitInfo) Stopped() bool { return i.Code == ProcessStopped }

// Trapped returns true if the process has stopped and trapped.
func (i WaitInfo) Trapped() bool { return i.Code == ProcessTrapped }

// Continued returns true if the process has continued.
func (i WaitInfo) Continued() bool { return i.Code == ProcessContinued }

// CoreDump returns true if the process was killed by a signal and dumped core.
func (i WaitInfo) CoreDump() bool { return i.Code == ProcessDumped }

// ExitStatus returns the exit status, or -1 if the process has not exited normally.
func (i WaitInfo) ExitStatus() int {
	if !i.Exited() {
		return -1
	}
	return i.Status
}

// Signal returns the terminating signal, or -1 if the process was not terminated by a signal.
func (i WaitInfo) Signal() syscall.Signal {
	if !i.Signaled() {
		return -1
	}
	return syscall.Signal(i.Status)
}

// StopSignal returns the signal that stopped the process, or -1 if the process is not stopped.
func (i WaitInfo) StopSignal() syscall.Signal {
	if !i.Stopped() {
		return -1
	}
	return syscall.Signal(i.Status)
}

// TrapCause returns the trap cause, or -1 if the process is not trapped.
func (i WaitInfo) TrapCause() int {
	if !i.Trapped() {
		return -1
	}
	return i.Status
}

// Wait waits for the process with the given pid to make one or more transitions to the listed states.
// Once a transition to one of those states happens, a WaitInfo is sent to the provided channel.
// Wait can be called multiple times on the same process with an arbitrary combination
// of codes to wait on. When the context is canceled, waiting gets canceled as well.
func Wait(ctx context.Context, pid int, c chan<- WaitInfo, code ...WaitCode) {
}

// WaitPidfd is similar to Wait, except that it operates on process file descriptors (pidfds) available on Linux.
func WaitPidfd(pidfd int, c chan<- WaitInfo, code ...WaitCode) {
}

// WaitAny waits for any process to transition to the listed states. Once a transition to one of
// those states happens, a WaitInfo is sent to the provided channel. WaitAny can be called multiple
// times with an arbitrary combination of codes to wait on. When the context is canceled, waiting
// gets canceled as well.
func WaitAny(ctx context.Context, c chan<- WaitInfo, code ...WaitCode) {
}

WaitPidfd is Linux specific. Maybe it could be extended to Windows process handles as well?
The current list of possible states/codes is inspired by Unix. Not all of them will be relevant on other platforms. Maybe other platforms have additional states we should add, but I am not familiar with them. I think the possible states should be a union of what is available on the supported platforms, because the API is cross-platform.
I use a context instead of an explicit Stop like the os/signal package has, as I find it more modern and suitable for a new API. We could provide both, or just Stop, if others feel differently.
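For comparison, this is roughly what the os/signal pattern referred to in this proposal looks like: independent parts of a program register their own channels for the same signal, each receives its own copy, and each unregisters with signal.Stop. It is a small Unix-only sketch; the function name `bothNotified` and the channel names are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// bothNotified (hypothetical helper) registers two independent channels for
// SIGUSR1, sends the signal to the current process, and reports whether both
// channels received it.
func bothNotified() bool {
	logger := make(chan os.Signal, 1) // e.g. a logging component
	reaper := make(chan os.Signal, 1) // e.g. an unrelated component

	signal.Notify(logger, syscall.SIGUSR1)
	signal.Notify(reaper, syscall.SIGUSR1)
	// The explicit Stop that the proposal replaces with a context.
	defer signal.Stop(logger)
	defer signal.Stop(reaper)

	if err := syscall.Kill(os.Getpid(), syscall.SIGUSR1); err != nil {
		return false
	}

	timeout := time.After(2 * time.Second)
	for _, c := range []chan os.Signal{logger, reaper} {
		select {
		case <-c:
		case <-timeout:
			return false
		}
	}
	return true
}

func main() {
	fmt.Println("both observers notified:", bothNotified())
}
```

Neither observer needs to know about the other, which is the property the proposed Wait API is meant to provide for process state changes.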
WaitInfo is inspired by WaitStatus, and its API is similar on purpose.
This has to be in the core Go library, as os/exec and os.Process should use it internally (and so should anything else in the stdlib that waits on processes); otherwise the main issue is not fixed.