Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: signal handler interferes with pthread_cancel/join on linux #6997

Closed
gopherbot opened this issue Dec 21, 2013 · 9 comments
Closed

Comments

@gopherbot
Copy link
Contributor

by sqweek:

What steps will reproduce the problem?

1. Make cgo calls to a C library that uses pthreads
2. Make the library call pthread_cancel on one of its threads
I created a small test case, available at https://github.com/sqweek/signal32-bug

What is the expected output?

The thread is cancelled and the program continues running.

What do you see instead?

The program crashes with message:
runtime: signal received on thread not created by Go: signal 32

Which compiler are you using (5g, 6g, 8g, gccgo)?

cgo/6g

Which operating system are you using?

Linux 3.12.0-1

Which version are you using?  (run 'go version')

Reproduced on both:
go version go1.1.2 linux/amd64
go version devel +f4d1cb8d9a91 Thu Sep 19 22:34:33 2013 +1000 linux/amd64

Please provide any additional information below.

According to http://man7.org/linux/man-pages/man3/pthread_cancel.3.html#NOTES
pthread_cancel() is implemented via signals on linux.

I tried changing signal 32's flags to 0 in pkg/runtime/signals_linux.h which fixes the
crash, but the program then hangs in pthread_join(). gdb reports the cancelled thread
still exists and is blocked (ie, "finish" never got anywhere) in
__libc_disable_asynccancel():

Thread 5 (Thread 0x7ffff6010700 (LWP 12137)):
#0  0x00007ffff7905aff in __libc_disable_asynccancel () from /usr/lib/libc.so.6
#1  0x00007ffff78cab49 in nanosleep () from /usr/lib/libc.so.6
#2  0x00007ffff78ca9d4 in sleep () from /usr/lib/libc.so.6
#3  0x000000000040100e in threadfunc (dummy=<optimized out>)
    at /d/code/go/src/github.com/sqweek/signal32-bug/thread.c:10
#4  0x00007ffff7bc70a2 in start_thread () from /usr/lib/libpthread.so.0
#5  0x00007ffff78f949d in clone () from /usr/lib/libc.so.6
@gopherbot
Copy link
Contributor Author

Comment 1 by sqweek:

Ah sorry, I must have not rebuilt the runtime package correctly. The binary that blocks
in _disable_asynccancel() is still calling rt_sigaction() to register go's signal
handler for SIGRTMIN. So I don't understand the cause of the different behaviour (strace
reports the thread just keeps trying to call futex() which fails with EAGAIN).
I have now properly rebuilt/installed runtime after clearing signal 32's flags, and
pthread_cancel works as expected.
Similar to issue #3871, but this signal doesn't need to be propagated to all threads at
least. NPTL pthreads only mentions the first two realtime signals as reserved so
hopefully that's the end of it!

@davecheney
Copy link
Contributor

Comment 2:

I believe this is the correct status, please comment on this issue if this is incorrect.

Status changed to Retracted.

@gopherbot
Copy link
Contributor Author

Comment 3 by sqweek:

Sorry if I was unclear - there is still an issue. Using cgo to call into a library that
ends up calling pthread_cancel makes the go runtime exit immediately.
The fix, however, should be as simple as clearing signal 32's flags in
pkg/runtime/signals_linux.h, which causes the runtime to leave pthreads' signal 32
handler in place.

@davecheney
Copy link
Contributor

Comment 4:

reopened, do you feel like proposing a fix ?

Status changed to New.

@gopherbot
Copy link
Contributor Author

Comment 5 by sqweek:

Thanks - is a diff sufficient?

Attachments:

  1. 6997.diff (541 bytes)

@davecheney
Copy link
Contributor

Comment 6:

I'm sorry, we cannot accept diffs via issues. Please review the contributoin guidelines
here http://golang.org/doc/contribute.html.

@gopherbot
Copy link
Contributor Author

Comment 7 by sqweek:

Thanks for the link, and apologies for my ignorance. I'll be back once I've been over
the guidelines.

@ianlancetaylor
Copy link
Contributor

Comment 8:

This issue was closed by revision c4770b9.

Status changed to Fixed.

@gopherbot
Copy link
Contributor Author

Comment 9:

CL https://golang.org/cl/170850043 mentions this issue.

aclements added a commit that referenced this issue Dec 23, 2014
These signals are used by glibc to broadcast setuid/setgid to all
threads and to send pthread cancellations.  Unlike other signals, the
Go runtime does not intercept these because they must invoke the libc
handlers (see issues #3871 and #6997).  However, because 1) these
signals may be issued asynchronously by a thread running C code to
another thread running Go code and 2) glibc does not set SA_ONSTACK
for its handlers, glibc's signal handler may be run on a Go stack.
Signal frames range from 1.5K on amd64 to many kilobytes on ppc64, so
this may overflow the Go stack and corrupt heap (or other stack) data.

Fix this by ensuring that these signal handlers have the SA_ONSTACK
flag (but not otherwise taking over the handler).

This has been a problem since Go 1.1, but it's likely that people
haven't encountered it because it only affects setuid/setgid and
pthread_cancel.

Fixes #9600.

Change-Id: I6cf5f5c2d3aa48998d632f61f1ddc2778dcfd300
Reviewed-on: https://go-review.googlesource.com/1887
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@golang golang locked and limited conversation to collaborators Jun 25, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants