-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
expose pthread_sigmask(), pthread_kill(), sigpending() and sigwait() in the signal module #52654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Linux offers the signalfd syscall since 2.6.22 (with minor changes afterwards). This call allows signals to be handled as bytes read out of a file descriptor, rather than as interruptions to the flow of a program. Quite usefully, this file descriptor can be select()'d on (or poll()'d, epoll()'d, etc) alongside other "normal" file descriptors. In order to effectively use signalfd(), the signals in question must be blocked, though. So it makes sense to expose sigprocmask(2) at the same time, in order to allow this blocking to be set up. |
I've started on an implementation of this in /branches/signalfd-issue8407. |
Notice that 2.7 has seen its first beta release, so no new features are allowed for it. I think it's better to target this feature for 3.2. |
I'm with Martin, better to target 3.2 IMO. |
There's a lot more information than the signal number available as well. The signalfd_siginfo struct has 16 fields in it now and may have more in the future. Another advantage is that this approach allows all asynchronous preemption to be disabled. This side-steps the possibility of hitting any signal bugs in CPython itself. Of course, any such bugs that are found should be fixed, but fixing them is often quite challenging and involves lots of complex domain-specific knowledge. In comparison, the signalfd and sigprocmask extensions are quite straight-forward and self-contained. It's also possible to have several signalfds, each with a different signal mask. set_wakeup_fd is limited to a single fd per-process. sigprocmask has other uses all by itself, of course, like delaying the delivery of signals while some sensitive code runs. |
One open question regarding interaction with threading. sigprocmask's behavior in a multithreaded program is unspecified. pthread_sigmask should be used instead. I could either expose both of these and let the caller choose, or I could make signal.sigprocmask use pthread_sigmask if it's available, and fall back to sigprocmask. I don't see any disadvantages of doing the latter, and the former seems like it would create an attractive nuisance. However, I don't have much experience with sigprocmask; I'm only interested in it because of its use in combination with signalfd. |
Or perhaps you could disable the feature if pthread_sigmask isn't available. Apparently, FreeBSD and Mac OS X have it, as well as Linux. |
Trying pthread_sigmask first, and falling back, seems like the right strategy to me. |
I think this is ready for a first review. See <http://codereview.appspot.com/1132041\>. If everyone agrees this is inappropriate for 2.7, then I'll port the changes to 3.x. I don't expect there to be much difference in the 3.x version. |
I think the decision is up to Benjamin. |
Let's leave it for 3.2. |
Any hopes of getting this into Python 3.3? |
sigprocmask or (better) pthread_sigmask is required to fix bpo-11859 bug. --- Python has a test for "broken pthread_sigmask". Extract of configure.in: AC_MSG_CHECKING(if PTHREAD_SCOPE_SYSTEM is supported)
AC_CACHE_VAL(ac_cv_pthread_system_supported,
[AC_RUN_IFELSE([AC_LANG_SOURCE([[#include <pthread.h>
void *foo(void *parm) {
return NULL;
}
main() {
pthread_attr_t attr;
pthread_t id;
if (pthread_attr_init(&attr)) exit(-1);
if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM)) exit(-1);
if (pthread_create(&id, &attr, foo, NULL)) exit(-1);
exit(0);
}]])],
[ac_cv_pthread_system_supported=yes],
[ac_cv_pthread_system_supported=no],
[ac_cv_pthread_system_supported=no])
])
AC_MSG_RESULT($ac_cv_pthread_system_supported)
if test "$ac_cv_pthread_system_supported" = "yes"; then
AC_DEFINE(PTHREAD_SYSTEM_SCHED_SUPPORTED, 1, [Defined if PTHREAD_SCOPE_SYSTEM supported.])
fi
AC_CHECK_FUNCS(pthread_sigmask,
[case $ac_sys_system in
CYGWIN*)
AC_DEFINE(HAVE_BROKEN_PTHREAD_SIGMASK, 1,
[Define if pthread_sigmask() does not work on your system.])
;;
esac]) Extract of Python/thread_pthread.h: /* On platforms that don't use standard POSIX threads pthread_sigmask()
* isn't present. DEC threads uses sigprocmask() instead as do most
* other UNIX International compliant systems that don't have the full
* pthread implementation.
*/
#if defined(HAVE_PTHREAD_SIGMASK) && !defined(HAVE_BROKEN_PTHREAD_SIGMASK)
# define SET_THREAD_SIGMASK pthread_sigmask
#else
# define SET_THREAD_SIGMASK sigprocmask
#endif Because today more and more programs rely on threads, it is maybe not a good idea to provide a binding of sigprocmask(). I would prefer to only add pthread_sigmask() which has a determistic behaviour with threads. So only compile signal.pthread_sigmask() if pthread API is present and pthread_sigmask "is not broken". --- About the patch: the doc should explain that the signal masks are inherited for child processes (fork() and execve()). I don't know if this behaviour is specific to Linux or not. If we only use pthread_sigmask(), the doc is wrong: "Set the signal mask for the process." It's not for the process but only for the current thread. How does it work if I change the signal mask of the main thread and then I create a new thread: the signal mask is inherited, or a default mask is used instead? --- The new faulthandler uses a thread to implement a timeout: the thread uses pthread_sigmask() or sigprocmask() to ignore all signals. If I don't set the signal mask, some tests fail: check that a system call (like reading from a pipe) can be interrupted by signal. The problem is that signal may be send to the faulthandler thread, instead of the main thread. Hum, while I'm writing this, I realize that I should maybe not fallback to sigprocmask() because it may mask signals for the whole process (all threads)! |
signal_pthread_sigmask.patch:
The code is based on the last version of python-signalfd: Changes between python-signalfd and my patch:
I will work on a similar patch for signalfd() after the pthread_sigmask() patch is accepted. |
Oh, I forget to read again http://codereview.appspot.com/1132041. Here is an updated patch reusing some tests and with a better documentation. |
New changeset 28b9702a83d1 by Victor Stinner in branch 'default': |
signalfd.patch: Add signal.signalfd(), signal.SFD_CLOEXEC and signal.SFD_NONLOCK. The patch is based on http://codereview.appspot.com/1132041 and the last version (bzr) of python-signalfd. The patch uses also PyModule_AddIntMacro() for the 3 constants added in my last (pthread_sigmask), and changes pthread_sigmask() to raise an OSError instead of a RuntimeError. Note: python-signalfd has a bug: it doesn't pass the fd argument to signalfd(), it always pass -1. |
test_signal.PthreadSigmaskTests fails on Mac OS X. http://www.python.org/dev/buildbot/all/builders/PPC Leopard 3.x/builds/1785/steps/test/logs/stdio http://www.python.org/dev/buildbot/all/builders/x86 Tiger 3.x/builds/2429/steps/test/logs/stdio |
Update signalfd patch. |
|
Fixed: updated patch (version 3). |
The problem is that sometimes SIG_UNBLOCK does not immediatly call the pending signal, but it calls it "later". The problem is that I don't know exactly when. I tried to wait the pending signal using signal.pause(), but I got a bus error!? Example of the problem: pthread_sigmask(SIG_BLOCK, [SIGUSR1])
os.kill(os.getpid(), SIGUSR1)
pthread_sigmask(SIG_UNBLOCK, [SIGUSR1]) |
New changeset d003ce770ba1 by Victor Stinner in branch 'default': |
New changeset c9207c6ce24a by Victor Stinner in branch 'default': |
Since the commit c9207c6ce24a, test_signal fails on OpenIndiana: http://www.python.org/dev/buildbot/all/builders/x86%20OpenIndiana%203.x/builds/1179 Traceback (most recent call last):
File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_signal.py", line 539, in test_block_unlock
self.assertEqual(set(old_mask) ^ set(blocked), {signum})
File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_signal.py", line 501, in handler
1/0
ZeroDivisionError: division by zero |
New changeset f8c49a930015 by Victor Stinner in branch 'default': |
New changeset e3cb2c99a5a9 by Victor Stinner in branch 'default': |
Update the signalfd patch (version 4) against the default branch. Specify the minimum Linux version in signalfd() doc. The patch still lacks a structure to parse the bytes written into the file (see msg135438 for a ctypes example): a struct sequence should be fine. |
Wakeup test using two pending signals fails on FreeBSD 6.4 buildbot: ====================================================================== Traceback (most recent call last):
File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py", line 266, in test_signum
self.check_signum(signal.SIGUSR1, signal.SIGALRM)
File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py", line 232, in check_signum
self.assertSequenceEqual(raised, signals)
AssertionError: Sequences differ: (14,) != (30, 14) Wakeup file only contains SIGALRM, not (SIGUSR1, SIGALRM), whereas SIGUSR1 is raised before SIGALRM. Code of the test:
def test_signum(self):
old_handler = signal.signal(signal.SIGUSR1, lambda x,y:None)
self.addCleanup(signal.signal, signal.SIGUSR1, old_handler)
os.kill(os.getpid(), signal.SIGUSR1)
os.kill(os.getpid(), signal.SIGALRM)
self.check_signum(signal.SIGUSR1, signal.SIGALRM) |
There are warnings on the FreeBSD and OSX buildbots, where pthread_t |
New changeset 2e0d3092249b by Victor Stinner in branch 'default': |
There's a race. |
New changeset 234021dcad93 by Victor Stinner in branch 'default': |
Oh, nice catch. The "bug" is not new, Python behaves like that since Python 3.1. But in Python < 3.3, it doesn't mater because I don't think that wakeup was used to watch more than one signal. One trigger "something happened" was enough. The wakeup fd now contains the number of each signal, and so the behaviour has to change. I applied your patch and I added a test. |
Oooooh, sigwait() doesn't accept a timeout! I would be nice to have also sigwaitinfo().. and maybe also its friend, sigwaitinfo() (if we implement the former, it's trivial to implement the latter). Python 3.3 adds optional timeout to subprocess.wait(), subprocess.communicate(), threading.Lock.acquire(), etc. And I love timeout! It was really useful to implement the thread of faulthandler.dump_tracebacks_later(). |
Interesting. I suspected this would have an impact on the test_signal """ Traceback (most recent call last):
File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py",
line 272, in test_signum
self.check_signum(signal.SIGUSR1, signal.SIGALRM)
File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py",
line 238, in check_signum
self.assertEqual(raised, signals)
AssertionError: Tuples differ: (14, 30) != (30, 14) First differing element 0:
This means that the signals are not delivered in order. |
New changeset 29e08a98281d by Victor Stinner in branch 'default': |
Cool, "x86 FreeBSD 6.4 3.x" is green for the first time since a long time, thanks to my commit 29e08a98281d (test_signal doesn't check signal delivery order). |
Oops, I mean sigtimedwait() and sigwaitinfo(). |
I just noticed something "funny": signal_sigwait doesn't release the |
This whole thread is becoming quite confusing. |
You are complelty right, sorry to pollute this issue. Changes related to this issue:
I created issue bpo-12303 for sigwaitinfo() and sigtimedwait(), and issue bpo-12304 for signalfd. |
New changeset a5c8b6ebe895 by Victor Stinner in branch 'default': |
sigwait() is not useless: the signal is not necessary send by another thread. The signal can be send by alarm(), from a child process, by an user, by the kernel. Releasing the GIL is a new feature. Because it is cheap and pause() does also release the GIL, I commited your patch.
test_sigwait() test pass on all 3.x buildbots (some don't have sigwait(), the test is skipped). Sometimes, test_sigwait() is executed with 2 threads, the main thread and Tkinter event loop thread, and the signal is not blocked in any thread. On Linux, it works well with more than one thread. I added a test using a thread, we will see if it works on buildbots. The signal is raised by a thread (whereas the signal is not blocked in any thread). I wrote a can_test_blocked_signals() function in test_signal which has hardcoded tests to check for some known C threads: the faulthandler timeout thread and for the Tkinter event loop thread. can_test_blocked_signals() returns True if pthread_kill() is available. I don't know how it works if a thread uses pthread_kill() to raise a signal into the main thread (waiting in sigwait()), whereas the signal is not blocked in any thread. If it is not possible to get the list of all C/Python threads and/or block a signal in all threads, we can use a subprocess without threads (with only the main thread). Would you like to work on a patch to avoid the undefined behaviour? |
The test hangs on FreeBSD 8.2: [235/356] test_signal (The test has not run yet on FreeBSD 6.4 buildbot) |
See also bpo-12310 which may be related. |
If it doesn't release the GIL, it is useless.
It's mere luck: def test_sigwait(self):
old_handler = signal.signal(signal.SIGALRM, self.handler)
self.addCleanup(signal.signal, signal.SIGALRM, old_handler)
signal.alarm(1)
self.assertEqual(signal.sigwait([signal.SIGALRM]), signal.SIGALRM) Comment out the first two lines that change the signal handler, and
If we really wanted to test this properly, the way to go would be to |
New changeset a17710e27ea2 by Victor Stinner in branch 'default': |
neologix> Patch attached. I wrote a different patch based on your patch. The main change is that I chose to keep signal.alarm(1) instead of sending the signal from the parent process to the child process, because I don't think that time.sleep(0.5) is a reliable synchronization code to wait until the child process is waiting in sigwait(). test_sigwait_thread() uses time.sleep(1) to wait until the main thread is waiting in sigwait(), but it is not mandatory because the signal is already blocked in the thread. I wrote test_sigwait_thread() to ensure that sigwait() releases the GIL, not to check that if a thread raises a signal while sigwait() is waiting, sigwait() does correctly receive it. We may use time.sleep() in test_sigwait() if the signal is blocked in the parent process before the fork() (and unblocked in the parent process after the fork, but before sending the signal to the child process), but I think that signal.alarm() is more reliable and simple. -- test_sigwait(_thread) was the last known bug of this issue, so let's close it. Reopen it if you see other bugs in pthread_sigmask(), pthread_kill(), sigpending() and sigwait() functions, or maybe better: open new issues because this issue has a too long history! See issue bpo-12303 for sigwaitinfo() and sigtimedwait(), and issue bpo-12304 for signalfd. |
New changeset 37a87b709403 by Victor Stinner in branch 'default': |
New changeset 60b1ab4d0cd4 by Victor Stinner in branch 'default': |
The documentation has been left in a confusing state by this patch -- at least confusing to me. I've created bpo-14456 with further details. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: