New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: signal handling: sighandler crash due to race with signal.Ignore #20748

Open
bcmills opened this Issue Jun 21, 2017 · 6 comments

Comments

Projects
None yet
4 participants
@bcmills
Member

bcmills commented Jun 21, 2017

In reviewing @ianlancetaylor's CL 46003 (addressing #14571), I noticed what I believe to be a rare race-condition that could result in unexpected termination of a Go program.

I attempted to write a reproducer for the race in question, which would cause the program to erroneously exit if a _SigKill or _SigThrow signal arrives in between a call to signal.Notify and signal.Ignore for that signal. The symptom I was looking for is a termination with the same signal number (for _SigKill) or a goroutine dump (for _SigThrow).

Unfortunately, I am unable to reproduce the race in question because the program instead crashes with a SIGSEGV in runtime.sigfwd.

sigrace/sigrace.go:

package main

import (
	"os"
	"os/signal"
	"runtime"
	"syscall"
)

func main() {
	signal.Ignore(syscall.SIGHUP)

	go func() {
		runtime.LockOSThread()
		pid := os.Getpid()
		tid := syscall.Gettid()
		for {
			syscall.Tgkill(pid, tid, syscall.SIGHUP)
		}
	}()

	c := make(chan os.Signal, 1)
	for {
		signal.Notify(c, syscall.SIGHUP)
		signal.Ignore(syscall.SIGHUP)
	}
}
$ go version
go version devel +c4e0e81653 Wed Jun 21 15:54:38 2017 +0000 linux/amd64
$ go build sigrace && ./sigrace
Segmentation fault (core dumped)

Investigation with GDB gives the following stack trace:

#0  0x0000000000000001 in ?? ()
#1  0x0000000000452a6d in runtime.sigfwd ()
    at /home/bcmills/go/src/runtime/sys_linux_amd64.s:245
#2  0x000000c420053b18 in ?? ()
#3  0x000000000043b665 in runtime.sigfwdgo (sig=1, info=0xc420053d70, ctx=0xc420053c40, ~r3=false)
    at /home/bcmills/go/src/runtime/signal_unix.go:606
#4  0x000000000043a83b in runtime.sigtrampgo (sig=1, info=0xc420053d70, ctx=0xc420053c40)
    at /home/bcmills/go/src/runtime/signal_unix.go:272
#5  0x0000000000452ac3 in runtime.sigtramp ()
    at /home/bcmills/go/src/runtime/sys_linux_amd64.s:265
#6  0x0000000000452bb0 in ?? ()
    at /home/bcmills/go/src/runtime/sys_linux_amd64.s:349
#7  0x0000000000000000 in ?? ()

The faulting instruction in runtime.sigfwd is the CALL instruction that invokes fwdSig[sig]

(CC @aclements)

@bcmills

This comment has been minimized.

Member

bcmills commented Jun 21, 2017

If we can fix the SIGSEGV, where I'm going with this reproducer is that I think we'll want to either combine the runtime.sig.ignored and runtime.sig.wanted arrays, or remove the ignored array entirely and use a single bit per signal to indicate "handled or ignored".

@aclements

This comment has been minimized.

Member

aclements commented Nov 8, 2017

@bcmills

This comment has been minimized.

Member

bcmills commented Nov 9, 2017

I don't think I'm going to be able to get to this for 1.10.

@aclements

This comment has been minimized.

Member

aclements commented Nov 9, 2017

Thanks for the update. Bumping to 1.11.

@odeke-em

This comment has been minimized.

Member

odeke-em commented Jun 11, 2018

How's it going @bcmills? Just a kind ping as we had moved this to Go1.11, perhaps a Go1.12 bump too?

@bcmills

This comment has been minimized.

Member

bcmills commented Jun 11, 2018

Sadly, yes. I think at this point in the cycle any attempt to mess with signal handling will introduce more risk than it eliminates.

@bcmills bcmills modified the milestones: Go1.11, Go1.12 Jun 11, 2018

@bcmills bcmills self-assigned this Jun 11, 2018

@bcmills bcmills modified the milestones: Go1.12, Go1.13 Nov 28, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment