syscall: misleading documentation for linux SysProcAttr.Pdeathsig #27505

virtuald · 2018-09-05T03:58:49Z

Currently, the documentation says:

// Signal that the process will get when its parent dies (Linux only)

However, according to the prctl man page:

Warning: the "parent" in this case is considered to be the thread that created this process. In other words, the signal will be sent when that thread terminates (via, for example, pthread_exit(3)), rather than after all of the threads in the parent process terminate.

I got bit by this in a python program -- started a new program on one thread, and tried to wait for it on another thread, and the child process kept dying and it took awhile to figure out what was going on. While I haven't ran into it in go yet, because in go threads and goroutines aren't one to one, I imagine if one ran into this sort of bug it would only occur intermittently.

Thinking about it, it seems like a user might want to call runtime.LockOSThread when using this?

It's not clear to me whether the docs should have a larger warning in them -- but I think at the minimum the documentation should be updated to say 'parent thread' or 'parent goroutine' instead of just 'parent'.

dominikh · 2018-09-05T04:27:27Z

If I am not mistaken, the current implementation of Go never kills threads, which is why you wouldn't be able to run into this bug in Go currently. At the same time, if Go ever does start killing threads, this might hit a lot of people at once.

virtuald · 2018-09-05T16:48:03Z

Looked around a bit, looks like go 1.10+ will kill a thread if you lock it and the goroutine exits without unlocking it: #20395

Here's an example that shows that:

package main

import (
	"flag"
	"fmt"
	"os/exec"
	"runtime"
	"syscall"
)

func main() {
	kill := false
	flag.BoolVar(&kill, "death", false, "Set deathsig")
	flag.Parse()

	cmd := exec.Command("sleep", "1000")
	if kill == true {
		fmt.Println("Death is coming")
		cmd.SysProcAttr = &syscall.SysProcAttr{
			Setpgid:   true,
			Pdeathsig: syscall.SIGKILL,
		}
	}

	// force other goroutines to be spawned on a new thread
	runtime.LockOSThread()

	done := make(chan struct{})

	go func() {
		runtime.LockOSThread()
		err := cmd.Start()
		if err != nil {
			fmt.Println(err)
		}
		close(done)
		fmt.Println("Exit goroutine")
	}()

	<-done

	fmt.Println("Waiting")
	err := cmd.Wait()
	fmt.Println("Done", err)
}

In that example, if -death is passed as an argument the program will immediately exit. Obviously, you have to go out of your way to trigger this behavior, but presumably someone who is messing with platform-specific behavior is going to want to know about platform-specific behaviors... thus why the docs should at least hint that this is possible.

virtuald · 2020-10-21T16:39:41Z

Circling back to this, I did eventually figure out the right way to use Pdeathsig, and I think the documentation should include some examples or discussion about this.

Here's an example of what people usually want to do when using Pdeathsig -- ensure the child dies when the parent dies. This example ensures that golang won't accidentally kill the OS thread associated with the child process.

package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"syscall"
)

func main() {

	done := make(chan struct{})

	cmd := exec.Command("sleep", "1000")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Setpgid:   true,
		Pdeathsig: syscall.SIGKILL,
	}

	go func() {
		// On Linux, pdeathsig will kill the child process when the thread dies,
		// not when the process dies. runtime.LockOSThread ensures that as long
		// as this function is executing that OS thread will still be around
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()

		err := cmd.Start()
		if err != nil {
			fmt.Println(err)
		}

		fmt.Println("Child is PID", cmd.Process.Pid)
		err = cmd.Wait()
		close(done)
		fmt.Println("Child exited:", err)
	}()

	fmt.Println("Waiting for child to exit")
	<-done
	fmt.Println("Parent exiting")
}

gopherbot · 2022-06-14T08:14:22Z

Change https://go.dev/cl/412114 mentions this issue: syscall: clarify Pdeathsig documentation on Linux

cespare · 2022-06-14T23:25:49Z

@ianlancetaylor On the CL you wrote

and it's not clear to me why the workaround works in the general case

(The workaround being to use LockOSThread in the goroutine that creates the child process to ensure that that thread can't be killed.)

Can you say more about why it might not work?

ianlancetaylor · 2022-06-14T23:49:04Z

The CL says " A workaround is to call LockOSThread just before starting the new process and UnlockOSThread after it finishes." But the thread can then later, before the child process exits, pick up another goroutine, that goroutine can call LockOSThread, and then that goroutine can exit. That will cause the thread to exit, and cause a signal to be sent to the child process, although the parent process overall is still running. Unless I misunderstand something.

cespare · 2022-06-15T01:33:44Z

But the thread can then later, before the child process exits, pick up another goroutine,

Isn't the point of LockOSThread that this cannot happen?

Tasssadar · 2022-06-15T09:26:06Z

The idea is that the goroutine locks to thread, and then just waits and does nothing until the process exits. That way, the thread cannot exit until the exec'd process exits

go func() {
	runtime.LockOSThread()

	err := cmd.Start()
	if err != nil {
		...
	}

	// some channels to communicate with this goroutine or whatever are here...
        // <-finished


	err = cmd.Wait()

	runtime.UnlockOSThread()
}()

Updates golang#27505

ianlancetaylor · 2022-06-15T20:59:34Z

Thanks, I understand the comment now. I was reading "after it finishes" as meaning "after StartProcess completes," but it really means "after Wait completes.`"

This is a rather large footgun, so let's mention that it sends the signal on thread termination and not process termination in the documentation. Updates #27505 Change-Id: I489cf7136e34a1a7896067ae24187b0d523d987e GitHub-Last-Rev: c8722b2 GitHub-Pull-Request: #53365 Reviewed-on: https://go-review.googlesource.com/c/go/+/412114 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>

See golang/go#27505 for context. Pdeathsig isn't safe to set without locking to the current OS thread, because otherwise thread termination will send the signal, which isn't the desired behavior. I discovered this while troubleshooting a problem that turned out to be unrelated, but I think it's necessary for correctness. Signed-off-by: Aaron Lehmann <alehmann@netflix.com>

On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal will be sent to the process when the OS thread on which cmd.Start() was executed dies. The runtime terminates an OS thread when a goroutine exits after being wired to the thread with runtime.LockOSThread(). If other goroutines are allowed to be scheduled onto a thread which called cmd.Start(), an unrelated goroutine could cause the thread to be terminated and prematurely signal the command. See golang/go#27505 for more information. Prevent started subprocesses with Pdeathsig from getting signaled prematurely by wiring the starting goroutine to the OS thread until the subprocess has exited. No other goroutines can be scheduled onto a locked thread so it will remain alive until unlocked or the daemon process exits. Signed-off-by: Cory Snider <csnider@mirantis.com>

…eebsd When baur executes a task and the baur process gets killed, the task subprocess continues to run. This was reproduced on Linux, on other OSes it was not tested but they are probably also affected. Prevent that this can happen by setting Pdeathsig for the executed process. If the parent thread is killed, the specified signal (SIGKILL) will be sent to the child. Pdeathsig is sent when then parent thread dies, to prevent that thread on which the go-routine ran that started the process dies, runtime.LockOSThread is called[^1]. This fixes the issue only on Linux and FreeBSD. Windows & Darwin do not have Pdeathsig in their SysProcAttrs. To achieve the same on Windows support for job objects in Golang might be needed[^2]. [^1]: golang/go#27505 (comment) [^2]: golang/go#17608

When baur executes a task and the baur process gets killed, the task subprocess continues to run. This was reproduced on Linux, on other OSes it was not tested but they are probably also affected. Prevent that this can happen by setting Pdeathsig for the executed process. If the parent thread is killed, the specified signal (SIGKILL) will be sent to the child. Pdeathsig is sent when then parent thread dies, to prevent that thread on which the go-routine ran that started the process dies, runtime.LockOSThread is called[^1]. This fixes the issue only on Linux and FreeBSD. Windows & Darwin do not have Pdeathsig in their SysProcAttrs. To achieve the same on Windows support for job objects in Golang might be needed[^2]. [^1]: golang/go#27505 (comment) [^2]: golang/go#17608

When Pdeathsig is used, the child process will receive the signal whenever the parent *thread* is killed. In go, an os thread can be killed if it is locked to a goroutine and left locked when the goroutine finishes execution. The go runtime will schedule goroutines to OS threads as it sees fit, so this can result in random unexpected signals being sent to child processes with pdeathsig set. There's no currently *known* case where the shim code leaves goroutines locked to a thread, but given this could happen now or in the future in our code or in any dependency code, it's best to safeguard against this. The fix is to keep the goroutine that spawns and waits on the child process locked to its os thread. That way, we can be sure it will never be killed by the go runtime. See also: golang/go#27505 Signed-off-by: Erik Sipsma <erik@dagger.io>

Also don't check os.ProcessState.Exited() when reporting child deaths, I had previously misunderstood what it was for. In linux child processes are killed via pdeathsig if the *thread* that called fork() is ended, not the process. So this change makes sure that we do not allow go to kill thread. However apparently in current versions of go the threads are never killed anyway so this wouldn't have happened. This code is more formally correct though. See: golang/go#27505

gopherbot added the Documentation label Sep 5, 2018

dominikh added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 6, 2018

dominikh changed the title ~~Update documentation for linux SysProcAttr.Pdeathsig~~ syscall: misleading documentation for linux SysProcAttr.Pdeathsig Sep 6, 2018

virtuald mentioned this issue Oct 21, 2020

Support new pid namespace or ability to kill all children rootless-containers/rootlesskit#65

Closed

ianlancetaylor added the help wanted label Oct 21, 2020

ianlancetaylor added this to the Backlog milestone Oct 21, 2020

AkihiroSuda mentioned this issue Oct 22, 2020

Pdeathsig rootless-containers/rootlesskit#182

Open

rafiss mentioned this issue Apr 9, 2021

testserver: run cockroach with Pdeathsig on linux cockroachdb/cockroach-go#101

Open

codefromthecrypt mentioned this issue Apr 11, 2021

Incorrect use of Pdeathsig for killing child on linux. tetratelabs/func-e#173

Closed

elchenberg mentioned this issue Feb 15, 2022

stop 'kubectl port-forward' when the parent process (d8s) exits damoon/d8s#32

Closed

Tasssadar mentioned this issue Jun 14, 2022

syscall: clarify Pdeathsig documentation on Linux #53365

Closed

Tasssadar added a commit to Tasssadar/go that referenced this issue Jun 15, 2022

syscall: clarify Pdeathsig function on Linux

c8722b2

Updates golang#27505

gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022

aaronlehmann mentioned this issue Jul 20, 2022

Lock to OS thread when Pdeathsig is set containerd/go-runc#86

Merged

corhere mentioned this issue Sep 28, 2022

Stop subprocesses from getting unexpectedly killed moby/moby#44215

Merged

corhere mentioned this issue Oct 3, 2022

proposal: syscall: add Umask option to SysProcAttr for Unix-like platforms #56016

Open

fho mentioned this issue Dec 8, 2022

run: executed cmd is not terminated when baur process dies simplesurance/baur#403

Open

3 tasks

corhere mentioned this issue Mar 28, 2023

git: set umask without reexec moby/buildkit#3742

Merged

sipsma mentioned this issue Apr 3, 2023

Safeguard pdeathsig usage in shim w/ os thread lock. dagger/dagger#4886

Merged

samuelkarp mentioned this issue Oct 19, 2023

integration: init release upgrade testing containerd/containerd#9262

Merged

bigkevmcd mentioned this issue Nov 21, 2023

Termination grace period is not respected when the pod is killed. flux-iac/tofu-controller#481

Open

chriso mentioned this issue Jun 3, 2024

Review use of Pdeathsig dispatchrun/dispatch#68

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

syscall: misleading documentation for linux SysProcAttr.Pdeathsig #27505

syscall: misleading documentation for linux SysProcAttr.Pdeathsig #27505

virtuald commented Sep 5, 2018

dominikh commented Sep 5, 2018

virtuald commented Sep 5, 2018

virtuald commented Oct 21, 2020

gopherbot commented Jun 14, 2022

cespare commented Jun 14, 2022

ianlancetaylor commented Jun 14, 2022

cespare commented Jun 15, 2022

Tasssadar commented Jun 15, 2022 •

edited

Loading

ianlancetaylor commented Jun 15, 2022

syscall: misleading documentation for linux SysProcAttr.Pdeathsig #27505

syscall: misleading documentation for linux SysProcAttr.Pdeathsig #27505

Comments

virtuald commented Sep 5, 2018

dominikh commented Sep 5, 2018

virtuald commented Sep 5, 2018

virtuald commented Oct 21, 2020

gopherbot commented Jun 14, 2022

cespare commented Jun 14, 2022

ianlancetaylor commented Jun 14, 2022

cespare commented Jun 15, 2022

Tasssadar commented Jun 15, 2022 • edited Loading

ianlancetaylor commented Jun 15, 2022

Tasssadar commented Jun 15, 2022 •

edited

Loading