Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
assignee = None
closed_at = <Date 2021-08-30.14:04:32.443>
created_at = <Date 2021-06-18.10:27:10.715>
labels = ['3.10', 'library', '3.9', 'type-crash']
title = 'Segfault in _PyTrash_begin when faulthandler tries to dump thread stacks'
updated_at = <Date 2021-08-30.14:04:32.442>
user = 'https://github.com/dgrisby'
I am using Python 3.9.4 on CentOS 7. faulthandler is registered with SIGUSR1:
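The exact registration call is not shown here. As a minimal sketch only, assuming the values visible in the backtrace (all_threads=1 and fd=2, which is stderr), such a registration could look like the following. The helper name register_stack_dump is hypothetical and is not taken from the report.

```python
import faulthandler
import signal
import sys


def register_stack_dump(file=sys.stderr):
    """Dump the stacks of all threads to `file` whenever SIGUSR1 arrives.

    Hypothetical helper: the arguments mirror what the backtrace suggests
    (all threads, stderr), not the reporter's actual code.
    """
    # faulthandler installs a C-level handler, so the dump is written
    # directly from the signal handler without running Python code.
    faulthandler.register(signal.SIGUSR1, file=file, all_threads=True)
```

Calling register_stack_dump() once at startup and then running `kill -USR1 <pid>` from a shell should produce the thread dump on stderr.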
Sending SIGUSR1 usually dumps the thread stacks correctly, but occasionally the process segfaults in the main thread instead:
Thread 1 (Thread 0x7efe15e69740 (LWP 15201)):
#0 _PyTrash_begin (tstate=tstate@entry=0x0, op=op@entry=0x757ece0) at Objects/object.c:2125
#1 0x00007efe156f05e5 in frame_dealloc (f=0x757ece0) at Objects/frameobject.c:578
#2 0x00007efe15898f88 in _Py_DECREF (op=0x757ece0) at Include/object.h:430
#3 dump_traceback (write_header=0, tstate=0x757e1a0, fd=2) at Python/traceback.c:821
#4 _Py_DumpTracebackThreads (fd=fd@entry=2, interp=<optimized out>, interp@entry=0x0, current_tstate=0xbe6a70) at Python/traceback.c:921
#5 0x00007efe1590be7d in faulthandler_dump_traceback (interp=<optimized out>, all_threads=1, fd=2) at Modules/faulthandler.c:243
#6 faulthandler_user (signum=10) at Modules/faulthandler.c:839
#7 <signal handler called>
#8 0x00007efe15243d2f in do_futex_wait () from /lib64/libpthread.so.0
#9 0x00007efe15243e07 in __new_sem_wait_slow () from /lib64/libpthread.so.0
#10 0x00007efe15243ea5 in sem_timedwait () from /lib64/libpthread.so.0
#11 0x00007efe15896d11 in PyThread_acquire_lock_timed (lock=lock@entry=0x7ea7080, microseconds=microseconds@entry=5000000, intr_flag=intr_flag@entry=1) at Python/thread_pthread.h:457
#12 0x00007efe158f35a4 in acquire_timed (timeout=5000000000, lock=0x7ea7080) at Modules/_threadmodule.c:63
#13 lock_PyThread_acquire_lock (self=0x7efdf4518750, args=<optimized out>, kwds=<optimized out>) at Modules/_threadmodule.c:146
#14 0x00007efe15749916 in method_vectorcall_VARARGS_KEYWORDS (func=0x7efe15e1b310, args=0x186d208, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/descrobject.c:346
The crash occurs because tstate is NULL. tstate comes from Py_TRASHCAN_BEGIN_CONDITION, which calls PyThreadState_GET() and assumes it returns a valid pointer, but the comment on the _PyThreadState_GET macro says:
Efficient macro reading directly the 'gilstate.tstate_current' atomic
variable. The macro is unsafe: it does not check for error and it can
return NULL.
The only place I can see that tstate_current would be set to NULL is in _PyThreadState_DeleteCurrent(). I suspect that there has been a race with a thread exit.
I'm not sure quite what to do about this. Perhaps faulthandler should check if tstate_current is NULL and set it suitably if so?