Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: signal handling changed by signal.Notify #36420

Open
stevenh opened this issue Jan 7, 2020 · 17 comments
Open

runtime: signal handling changed by signal.Notify #36420

stevenh opened this issue Jan 7, 2020 · 17 comments

Comments

@stevenh
Copy link
Contributor

@stevenh stevenh commented Jan 7, 2020

What version of Go are you using (go version)?

go version go1.13.5 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/steveh/.cache/go-build"
GOENV="/home/steveh/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/steveh/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/steveh/code/go.d.plugin/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build394267353=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Use signal.Notify to handle os.Interrupt

What did you expect to see?

All other signal handling to be unchanged

What did you see instead?

SEGV, BUS and FPE handling from external process kill(2) don't work if signal.Notify(...) is used.

Example

package main

import (
        "fmt"
        "os"
        "os/signal"
        "time"
)

func main() {
        c := make(chan os.Signal, 1)
        signal.Notify(c, os.Interrupt)
        go func() {
                s := <-c
                fmt.Println("Got signal:", s)
                signal.Stop(c)
        }()
        if signal.Ignored(os.Interrupt) {
                fmt.Println("ignored")
        }
        for i := 0; i < 1000; i++ {
                fmt.Println("sleep...")
                time.Sleep(time.Second)
        }
}

This seems related the definition in the runtime of these signals containing _SigPanic as those which are defined as _SigThrow continue to still function correctly; confirmed by changing SEGV from _SigPanic to _SigThrow which then continues to work even once signal.Notify(...) has been called.

@stevenh

This comment has been minimized.

Copy link
Contributor Author

@stevenh stevenh commented Jan 7, 2020

Seems the without signal.Notify(...) gp.throwsplit is always true which means that _SigPanic is remapped to _SigThrow.

Given this and the docs in os/signal it seems that the behaviour without signal.Notify might actually be the unexpected behaviour even-though from a unix perspective it's unexpected.

Thoughts?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 7, 2020

What exactly do you mean when you say "SEGV, BUS and FPE handling from external process kill(2) don't work"? What precisely are you doing with your example program, and what precisely happens?

@stevenh

This comment has been minimized.

Copy link
Contributor Author

@stevenh stevenh commented Jan 7, 2020

Without the call signal.Notify all of those signals cause the app to output a stack trace and exit with it called the app continues to run ignoring the external generated signals.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 7, 2020

I do not ever see this program output a stack trace when sent an external signal, even if I remove the call to signal.Notify, or remove the import of the os/signal package entirely. Can you show precisely what you are doing to cause a stack trace? By this I mean show us the exact commands that you run and the exact output that you see. Don't describe what you are showing, show us exactly what you are doing. Thanks.

@stevenh

This comment has been minimized.

Copy link
Contributor Author

@stevenh stevenh commented Jan 7, 2020

my test app is called go-signal and I simply run the following to send a SIGSEGV

kill -SEGV `ps -ef |grep go-signal |grep -v grep | awk '{print $2}'`

With signal.Notify

./go-signal
sleep...
sleep...
sleep...
sleep...
sleep...
sleep...
sleep...
sleep...
^Cprocess: interrupt
Got signal: interrupt
^C

Without signal.Notify

./go-signal
sleep...
sleep...
sleep...
sleep...
SIGSEGV: segmentation violation
PC=0x4578a3 m=0 sigcode=0

goroutine 5 [syscall]:
runtime.notetsleepg(0x567820, 0x3b9ab614, 0x1415)
        /usr/local/go/src/runtime/lock_futex.go:227 +0x34 fp=0xc000044760 sp=0xc000044730 pc=0x409d64
runtime.timerproc(0x567800)
        /usr/local/go/src/runtime/time.go:311 +0x2f1 fp=0xc0000447d8 sp=0xc000044760 pc=0x446d11
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc0000447e0 sp=0xc0000447d8 pc=0x455991
created by runtime.(*timersBucket).addtimerLocked
        /usr/local/go/src/runtime/time.go:169 +0x10e

goroutine 1 [sleep]:
runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:310
time.Sleep(0x3b9aca00)
        /usr/local/go/src/runtime/time.go:105 +0x157
main.main()
        /home/steveh/code/go-signal/main.go:32 +0xb6

goroutine 18 [syscall]:
os/signal.signal_recv(0x0)
        /usr/local/go/src/runtime/sigqueue.go:147 +0x9c
os/signal.loop()
        /usr/local/go/src/os/signal/signal_unix.go:23 +0x22
created by os/signal.init.0
        /usr/local/go/src/os/signal/signal_unix.go:29 +0x41

rax    0xfffffffffffffffc
rbx    0x3b9ab614
rcx    0xc0000446d8
rdx    0x0
rdi    0x567820
rsi    0x80
rbp    0xc0000446e8
rsp    0xc0000446a0
r8     0x0
r9     0x0
r10    0xc0000446d8
r11    0xc0000446d8
r12    0xff
r13    0xffffffffffffff
r14    0x4dc522
r15    0x38
rip    0x4578a3
rflags 0x206
cs     0x33
fs     0x53
gs     0x2b

To clarify in both cases package os/signal is loaded in the second there's just no call to signal.Notify

@stevenh

This comment has been minimized.

Copy link
Contributor Author

@stevenh stevenh commented Jan 7, 2020

For reference my test is running under WSL

uname -a
Linux vader 4.4.0-18362-Microsoft #476-Microsoft Fri Nov 01 16:53:00 PST 2019 x86_64 x86_64 x86_64 GNU/Linux

However also had my colleague tested on native Ubuntu and he sees the same too.

@danmrichards

This comment has been minimized.

Copy link

@danmrichards danmrichards commented Jan 7, 2020

I am also seeing the same behaviour as mentioned by @stevenh

For reference I am running Go version go version go1.13.5 linux/amd64 on Linux (Linux unity-XPS-15-9570 4.18.20-041820-generic #201812030624 SMP Mon Dec 3 11:25:55 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux)

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 7, 2020

Interesting. When I try the same thing, the program simply continues running.

Ah, I see. It works the way you describe for Go 1.13, but not for Go tip.

The difference is that in Go 1.13 the signal is being received by a G for which throwsplit is set. On tip it is not. When throwsplit is set SigPanic is changed to SigThrow.

When the program does not call signal.Notify, then sending an external SIGSEGV crashes the program with Go 1.11, Go 1.12, and Go 1.13. With earlier versions of Go, and with tip, the externally generated SIGSEGV is ignored.

Oddly, I think the change in tip is due to the new timers code. With the old timers code the only real thread in this program would sleep in notetsleepg, which calls entersyscallblock, which sets throwsplit. With the new timers code the only real thread sleeps in netpoll which does not call entersyscallblock. So on tip throwsplit is not set and the signal does not crash the program.

When the program does call signal.Notify, then we start a separate thread just to receive signals. That thread does not have throwsplit set, so the signal does not crash the program.

I think the question now is: what do we want to happen? Clearly whether the program calls signal.Notify for a signal other than SIGSEGV should not affect the behavior for an externally generated SIGSEGV. For Go versions through 1.10, we ignored externally generated SIGSEGV signals. For 1.11 through 1.13, for certain kinds of programs, we crashed on an externally generated SIGSEGV. On tip, we do not crash. But for Go 1.11 and later, including tip, whether we crash or not is actually slightly random, since a signal might randomly be delivered to a G with throwsplit set.

So we need to fix that. Whether throwsplit is set must not change the behavior for an externally generated SIGSEGV.

But still: what do we want to happen? Historically we have ignored externally generated SIGSEGV, SIGBUS, and SIGFPE signals. Should we continue to ignore them, or should we crash with a stack dump, or should we crash without a stack dump? We haven't documented any behavior for this, and I don't see any principle to follow. For example, currently for an externally generated SIGSYS we crash with a stack dump, for an externally generated SIGTERM we crash without a stack dump, for an externally generated SIGALRM we ignore the signal. I don't know which of those categories is right for an externally generated SIGSEGV.

@ianlancetaylor ianlancetaylor added this to the Go1.15 milestone Jan 7, 2020
@ianlancetaylor ianlancetaylor changed the title runtime: signal handling broken by signal.Notify runtime: signal handling changed by signal.Notify Jan 7, 2020
@stevenh

This comment has been minimized.

Copy link
Contributor Author

@stevenh stevenh commented Jan 7, 2020

In our case we're looking for feature parity apps built in other languages e.g. C & C++
SIGTERM:

sleep...
Terminated

SEGV:

sleep...
Segmentation fault (core dumped)

SIGBUS:

sleep...
Bus error (core dumped)

SIGFPE:

sleep...
Floating point exception (core dumped)

SIGALRM:

sleep...
Alarm clock

SIGSYS:

sleep...
Bad system call (core dumped)

In summary using the OS native behaviour if we don't handle it.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 7, 2020

Go has already deviated from OS native behavior for signals like SIGALRM. That is already documented at https://golang.org/pkg/os/signal, and I don't think we can reasonably change it now. Also, even if if OS native behavior tells us whether the crash the program or not, it doesn't tell us whether to print a stack dump or not. So while OS native behavior is a guide I don't think it determines the answer.

@stevenh

This comment has been minimized.

Copy link
Contributor Author

@stevenh stevenh commented Jan 7, 2020

Agreed, using it as a guide is the most appropriate, as I'm not aware of other languages which have a standard convention to print a stack dump.

Another reason is stdout / err generally aren't monitored so the standard way to gain information about a "crash" is the generation of a core dump; there are of course workflows which are but cores always have the ability to work.

@cherrymui

This comment has been minimized.

Copy link
Contributor

@cherrymui cherrymui commented Jan 7, 2020

I remember the old behavior, and didn't realize the behavior has changed (probably accidentally).

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Jan 7, 2020

I would argue that an external SIGSEGV should crash the program.

(In fact, I think this may be more-or-less a duplicate of #19465.)

@stevenh

This comment has been minimized.

Copy link
Contributor Author

@stevenh stevenh commented Jan 7, 2020

I came across that when investigating this issue, and agree it's very similar.

The notable difference is maintaining compatibility with previously installed C handlers, which is arguably a super-set of the functionality required for an external SIGSEGV etc to cause the program to crash.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 7, 2020

One way to approach this is to ask what the use case is. There are fairly well-defined use cases for several externally generated signals such as SIGHUP, SIGINT, SIGTERM, SIGQUIT, and the Go behavior tries to take those use cases into account.

What is the use case for an externally generated SIGSEGV, SIGBUS, SIGFPE signal?

@stevenh

This comment has been minimized.

Copy link
Contributor Author

@stevenh stevenh commented Jan 7, 2020

For SIGSEGV our use case is to force a core dump.

This is also the common definition for SIGQUIT but this is often used by apps more like a SIGTERM and proves unreliable.

For SIGBUS SIGFPE I've not come across an kill(2) based use, I just mentioned them here as they where both effected by the same change in behaviour under signal.Notify(..) on go 1.13.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jan 7, 2020

Note that Go programs don't dump core by default on receipt of any signal, SIGSEGV or otherwise. To get a Go program to dump core you need to set GOTRACEBACK=crash in the environment, or call runtime/debug.SetTraceback("crash"). If you do that then the program will dump core on receipt of a number of signals such as SIGQUIT, SIGILL, SIGABRT, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.