-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
interruption of locks by signals not guaranteed when locks are implemented using POSIX condition variables; add sys.thread_info #55432
Comments
The test test_lock_acquire_interruption in test_threadsignals.py will lock forever on AIX 5.3. truss shows it is stuck in thread_tsleep: |
We have the same issue on our FreeBSD 6.4 buildbot: Can you tell me what the values of _POSIX_SEMAPHORES and HAVE_SEM_TIMEDWAIT are (or whether they are undefined)? |
$ grep SEM pyconfig.h
#define HAVE_BROKEN_POSIX_SEMAPHORES 1
/* #undef HAVE_BROKEN_SEM_GETVALUE */
#define HAVE_SEM_GETVALUE 1
#define HAVE_SEM_OPEN 1
#define HAVE_SEM_TIMEDWAIT 1
#define HAVE_SEM_UNLINK 1
/* #undef POSIX_SEMAPHORES_NOT_ENABLED */
#ifndef _POSIX_PTHREAD_SEMANTICS
# define _POSIX_PTHREAD_SEMANTICS 1
|
Ok, the problem is when the condition variable-based implementation is chosen in Python/thread_pthread.h. “If a signal is delivered to a thread waiting for a condition variable, upon return from the signal handler the thread resumes waiting for the condition variable as if it was not interrupted, or it shall return zero due to spurious wakeup.” So, on those OSes where the semaphore API can't be used, the test can succeed or not depending on how pthread_cond_wait() behaves in the face of signals. It would be better if it didn't hang, though. (This is not a regression, by the way. Locks were totally uninterruptible under POSIX until 3.2, and the tests weren't there. The new tests were introduced in r87292 (see bpo-8844)) |
Here is a patch that makes the test fail rather than hang. Sébastien, can you try it out? |
Thanks Antoine, the patch solved the problem. I have other tests locking semi-randomly on AIX 5.3 and/or AIX 6.1, like test_socket and test_io. That may be related. |
Which test cases exactly? |
Well I don't know precisely:
Ran 158 tests in 16.867s FAILED (failures=1, skipped=4) I will try with pdb or something. |
You can run the whole test suite in verbose mode. This will pollute your Try something like "./python -E -bb -m test.regrtest -uall -v" |
You can also try to attach gdb to the running process: with Or if you don't have gdb7, you may try my faulthandler module: you will import faulthandler, signal; faulthandler.register(signal.SIGUSR1) Then you can display the current Python backtrace by sending a SIGUSR1 |
test_socket didnot lock this time (I will retry next). But test_subprocess.ProcessTestCasePOSIXPurePython.test_leaking_fds_on_error did. Looping forever on the following syscall: pipe(0x2FF1FD88) = 0 |
Here is what faulthandler reports when I trigger it as Python is locked in test_socket: """
[108/349] test_socket
Traceback (most recent call first):
File "/san_cis/home/cis/.buildbot/python-aix6/3.x.phenix.xlc/build/Lib/test/fork_wait.py", line 30 in f
""" (sic) It is coherent with what truss tells: And we can find this comment in the file: |
faulthandler is great by the way! I will use that in my application. |
faulthandler doesn't print the source code line (yet?). Here is the code: class ForkWait(unittest.TestCase):
...
def f(self, id):
while not self.stop:
self.alive[id] = os.getpid()
try:
time.sleep(SHORTSLEEP) <~~~ here
except IOError:
pass ForkWait.f() is used by ForkWait.test_wait() which creates 4 threads. You may use all_threads=True option of faulthandler to get the backtrace of all threads. Because ForkWait.test_wait() uses fork, you may also need to dump the backtrace of two processes. Debug threads+multiple processes is something horrible :-) |
At the time of the lock, there is only one python process running. So I guess the other processes have already left but this is not correctly detected. I will try again with all_threads=True. |
I kinda feel this is related to bpo-11185 which I reported earlier. |
Could you open another issue? (or post in bpo-11185 above) |
New changeset 5d0d488cbca8 by Antoine Pitrou in branch '3.2': |
Confirmed on Linux (kernel 2.6.37, eglibc 2.11.2) using strace: I think that we should just skip the test if Python uses pthread without semaphore and pthread_cond_timedwait() restarts if it is interrupted by a signal. Attached C program checks if pthread_cond_timedwait() does restart or not. On Linux, it should be compiled using: I don't know how to integrate pthread_cond_timedwait_signal.c into configure.in. I don't know if any OS does NOT restart pthread_cond_timedwait() after an interruption, so it's maybe simpler to always skip test_lock_acquire_interruption() and test_rlock_acquire_interruption() of test_threadsignals if Python uses pthread without semaphore. |
The problem is that I don't know how to check in test_threadsignals which thread implementation is used in Python. It would be nice to have some functions providing such information in the sys or sysconfig module, maybe something like sys.float_info. For example, sys._thread_info can be a dict like {'name': 'pthread', 'use_semaphore': True}. Keys:
thread_pthread.h contains many #ifdef which may be interesting to have in thread_info:
Well, the most important informations for this issue is the name of the thread implementation (pthread) and know if pthread semaphore are used or not. |
threading_info.patch:
thread._info() creates a new dict at each call. |
Most of these names will be removed in 3.3: |
Yeah, I know. My patch should be updated if bpo-11863 is fixed before this issue. Updated patch:
|
Example on my Debian Sid: >>> threading._info()
{'pthread_version': 'NPTL 2.11.2', 'use_semaphores': True, 'name': 'pthread'} |
The path which sets "pthread_version" should be inside "#ifdef Also, some comments about the doc:
|
Right, fixed.
Hum, a boolean was not a good idea. I replaced this key by:
I also renamed 'name' key to 'thread', so 'thread' is the name of the For example, test_threadsignals now uses: USING_PTHREAD_COND = (info['thread'] == 'pthread'
and info['lock'] == 'mutex+cond')
...
@unittest.skipIf(USING_PTHREAD_COND,
'POSIX condition variables cannot be interrupted') I think that the test is more clear than: (the message was wrong, the interrupt issue is on the condition, not on
Oh, I didn't realize that they were already unsupported as a compilation The new patch (version 3) marks also PyThread_Info() as private (rename |
Not that I want to bikeshed, but I think 'name' was ok (since you get Also, the 'pthread_version' documentation should state that it is |
New changeset 383a7301616b by Victor Stinner in branch 'default': |
Let's wait 6 hours for "x86 FreeBSD 6.4 3.x": The first build including my fix (383a7301616b) should be: |
New changeset bc1c6bd7eeb0 by Victor Stinner in branch 'default': |
"Builder x86 FreeBSD 6.4 3.x Build bpo-1394. Results: Build successful" Great! It should be the first success on this buildbot since... 4 months (r87292, issue bpo-8844))? Let's close this issue. |
New changeset 64008d17fb54 by Victor Stinner in branch 'default': |
sys_thread_info.patch:
Example: >>> sys.thread_info
sys.thread_info(name='pthread', lock='semaphore', version='NPTL 2.11.2') |
I don't think we want that API to be public, so it should be
What's the point? Sounds like pointless complication. |
Why should it be private in C, but not in Python? Why should it be private whereas PyLong_GetInfo() and PyFloat_GetInfo()
Complication? It does *simplify* configure.in. I don't want to create a new file just for a small function. |
Le mardi 26 avril 2011 à 01:34 +0000, STINNER Victor a écrit :
Well, if it's called _info() in Python, it's private too!
What I mean is that the "#ifdef WITH_THREAD" could be done in Also, when Python is compiled without threads, I don't think thread_info Another small thing: your doc says "name" is optional, but it shouldn't. |
My patch replaces thread._info() by sys.thread_info: it becomes public.
PyThread_GetInfo() requires some informations that are only available at the end of thread.c: USE_SEMAPHORES define from thread_pthread.h and PYTHREAD_NAME from thread.c. It is easier to define PyThread_GetInfo() here, instead of giving access to these defines outside thread.c (and these defines should remaing private).
By optional I mean that its value is None if Python is compiled without threads.
Terry Reedy proposed an empty string for name if Python is compiled without threads. Antoine suggested None instead of an empty string for lock and version fields, so I chose to use also None for None. But yes, I like the idea of sys.thread_info being None. -- Updated patch (sys_thread_info-2.patch):
|
New changeset 2b21fcf3d9a9 by Victor Stinner in branch 'default': |
New changeset 07655b3dee4f by Victor Stinner in branch '3.2': |
New changeset 3f18a03a2a1e by Victor Stinner in branch 'default': |
New changeset e5183f16c49d by Victor Stinner in branch '3.2': New changeset 54fb77e0762c by Victor Stinner in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: