Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: signal handling: sigtrampgo stack check fails in deferred TSAN handler call #18255

Closed
bcmills opened this issue Dec 8, 2016 · 5 comments

Comments

Projects
None yet
3 participants
@bcmills
Copy link
Member

commented Dec 8, 2016

The fix for #17753 fixed the runtime's registration with the TSAN signal interceptor. When an asynchronous signal (such as SIGPROF) arrives while a C thread is running, it appears that TSAN defers the signal handler call until the next libc call after the signal occurs.

In a Go program, the libc call typically occurs on the g0 stack, not gsignal, so the sanity-check in sigtrampgo (to verify that we're on gsignal) fails.

I'm not entirely sure what the appropriate fix is. (@ianlancetaylor, any ideas?)

A repro case follows.

tsanprof.go:

package main

/*
#cgo CFLAGS: -g -std=c99 -fsanitize=thread
#cgo LDFLAGS: -g -fsanitize=thread

#include <stdlib.h>

void spin() {
	for (size_t n = 0; n < 1<<20; n++) {
		free(malloc(n));
	}
}
*/
import "C"

import (
	"io/ioutil"
	"runtime/pprof"
)

func goSpin() {
	for n := 0; n < 1<<20; n++ {
		_ = make([]byte, n)
	}
}
func main() {
	pprof.StartCPUProfile(ioutil.Discard)
	go C.spin()
	goSpin()
	pprof.StopCPUProfile()
}
$ go version
go version devel +94a4485 Tue Dec 6 22:33:48 2016 +0000 linux/amd64
$ $(go env CC) --version
clang version 3.8.0-2ubuntu3~trusty4 (tags/RELEASE_380/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ go build tsanprof && while ./tsanprof; do true; done
signal 27 received but handler not on signal stack
fatal error: non-Go code set up signal handler without SA_ONSTACK flag

runtime stack:
runtime.throw(0x5c2293, 0x39)
        /home/bcmills/src/go.googlesource.com/go/src/runtime/panic.go:596 +0x95
runtime.sigNotOnStack(0x1b)
        /home/bcmills/src/go.googlesource.com/go/src/runtime/signal_unix.go:442 +0x94
runtime.sigtrampgo(0x1b, 0x7f92d357f208, 0x7f92d357f288)
        /home/bcmills/src/go.googlesource.com/go/src/runtime/signal_unix.go:229 +0xfc
runtime.sigtramp(0x7, 0x7ffd07a001a0, 0x7ffd07a001b0, 0x4032, 0x7f92d36a1d88, 0x7f92d35892d0, 0x7ffd07a00310, 0x4f2d70, 0x7f92d36597c0, 0x64b479f1000, ...)
        /home/bcmills/src/go.googlesource.com/go/src/runtime/sys_linux_amd64.s:252 +0x25

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
        /home/bcmills/src/go.googlesource.com/go/src/runtime/asm_amd64.s:2184 +0x1 fp=0xc420044fe8 sp=0xc420044fe0

goroutine 1 [running]:
        goroutine running on other thread; stack unavailable

goroutine 35 [syscall]:
runtime.CPUProfile(0x5a9280, 0xc42001e150, 0x16477e0)
        /home/bcmills/src/go.googlesource.com/go/src/runtime/cpuprof.go:448 +0x2d
runtime/pprof.profileWriter(0x85d500, 0x1661cc8)
        /home/bcmills/src/go.googlesource.com/go/src/runtime/pprof/pprof.go:691 +0x63
created by runtime/pprof.StartCPUProfile
        /home/bcmills/src/go.googlesource.com/go/src/runtime/pprof/pprof.go:678 +0x122
$
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Dec 8, 2016

It shouldn't matter whether the signal occurs while the goroutine is executing on the g0 stack or not. If the signal handler is not running on the gsignal stack, that implies that the signal handler was not run on the alternate signal stack. That suggests that the runtime prints: non-Go code set up signal handler without SA_ONSTACK flag. I haven't looked at the program yet, though.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Dec 9, 2016

I see the problem. The TSAN support library collects signals and their signal handlers, and then delivers the signals by calling the signal handlers directly. That is, the TSAN library ignores the sigaltstack call, and does not deliver invoke the signal handler on the alternate signal stack.

CC @dvyukov

Absent a fix to TSAN, perhaps we could have the Go signal handler decide that all is well if the stack pointer is on the g0 stack.

@bcmills

This comment has been minimized.

Copy link
Member Author

commented Dec 9, 2016

Absent a fix to TSAN, perhaps we could have the Go signal handler decide that all is well if the stack pointer is on the g0 stack.

I don't think there's really anything to fix in TSAN: it is handling the signal proper on an alternate stack, so there's no risk of blowing out the goroutine stack. It just happens to be waiting until the next libc call to deliver the signal (at which point it has good reason to believe that the stack is reasonably large and in a valid state).

I was thinking about making sigtrampgo just decide everything is fine if we're on g0, but at the moment it ends with a setg(g.m.gsignal) and setg(g) and I wasn't sure what to do about those.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Dec 9, 2016

What do you think of https://golang.org/cl/34239 ?

@gopherbot

This comment has been minimized.

Copy link

commented Dec 9, 2016

CL https://golang.org/cl/34239 mentions this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.