We've occasionally seen Django processes hang around after web server restarts. Yesterday we attached gdb to one of these processes to figure out why, and saw the following stack traces:
(gdb) thread 6
[Switching to thread 6 (Thread 0x7f0772ecd700 (LWP 11709))]
#0 0x00007f078dd3dfd0 in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0 0x00007f078dd3dfd0 in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x0000000000516e37 in tstate_delete_common.41819 ()
#2 0x00000000004f7ff1 in t_bootstrap.49012 ()
#3 0x00007f078dd37e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4 0x00007f078cb113fd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f078e162700 (LWP 3603))]
#0 0x00007f078dd3dfd0 in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0 0x00007f078dd3dfd0 in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x000000000051654f in PyEval_RestoreThread ()
#2 0x00007f07892880ac in _mysql_ConnectionObject_close (self=0x7f0774089460, args=) at _mysql.c:704
#3 0x00007f078928812a in _mysql_ConnectionObject_dealloc (self=0x7f0774089460) at _mysql.c:2022
#4 0x00000000005020b8 in subtype_dealloc.25364 ()
#5 0x000000000042618d in dict_dealloc.18153 ()
#6 0x000000000057a5b5 in PyDict_DelItem ()
#7 0x000000000057a987 in _localdummy_destroyed.49047 ()
#8 0x00000000004d91b6 in PyObject_Call ()
#9 0x00000000004d9e1b in PyObject_CallFunctionObjArgs ()
#10 0x00000000004f3057 in handle_callback ()
#11 0x00000000004f3212 in PyObject_ClearWeakRefs ()
#12 0x00000000004f3fe0 in localdummy_dealloc ()
#13 0x000000000042618d in dict_dealloc.18153 ()
#14 0x00000000005185c3 in PyThreadState_Clear ()
#15 0x0000000000518993 in PyInterpreterState_Clear ()
#16 0x00000000004f6318 in Py_Finalize ()
#17 0x00000000004c708a in Py_Main ()
#18 0x00007f078ca3e76d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#19 0x000000000041ba41 in _start ()
(gdb) quit
It looks like Thread 1 held the GIL when it ran PyInterpreterState_Clear, which also locks head_mutex. While cleaning up the thread's MySQL connection objects, it calls into _mysql_ConnectionObject_close, which releases the GIL around the mysql_close call. That lets Thread 6 acquire the GIL and run. Thread 6 is exiting, so it then tries to acquire head_mutex in tstate_delete_common. The result is a deadlock: Thread 6 holds the GIL and waits for head_mutex, while Thread 1 holds head_mutex and waits for the GIL.
I see no documentation on python.org saying that releasing the GIL in a dealloc function is a bad idea, but it seems to have caused this issue. It should be OK not to give up the GIL in the MySQL connection close function, and that would seem to prevent this from happening. Thoughts?
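To make the lock ordering easier to reason about, here is a minimal Python sketch of the situation as I understand it from the traces. It is a model, not CPython's actual code: two plain `threading.Lock` objects stand in for the GIL and head_mutex, the function name `finalize_deadlocks` and its parameters are made up for illustration, and timeouts are used so the script reports the mutual wait instead of hanging.

```python
import threading


def finalize_deadlocks(release_gil_in_dealloc, timeout=0.2):
    """Return True if both modelled threads end up waiting on each other.

    Hypothetical model only: `gil` and `head_mutex` are ordinary locks
    standing in for the interpreter's real ones.
    """
    gil = threading.Lock()
    head_mutex = threading.Lock()
    worker_has_gil = threading.Event()
    worker_finished = []

    def exiting_thread():
        # Models Thread 6: takes the GIL, then needs head_mutex to
        # remove its own thread state.
        if not gil.acquire(timeout=10 * timeout):
            return
        worker_has_gil.set()
        # Wait longer than the main thread does, so that a mutual wait
        # shows up as a failure on both sides.
        if head_mutex.acquire(timeout=3 * timeout):
            worker_finished.append(True)
            head_mutex.release()
        gil.release()

    # Models Thread 1: holds the GIL, then takes head_mutex while
    # clearing thread states during finalization.
    gil.acquire()
    head_mutex.acquire()
    worker = threading.Thread(target=exiting_thread)
    worker.start()

    main_has_gil = True
    if release_gil_in_dealloc:
        # The dealloc path drops the GIL around the blocking close call
        # and then tries to take it back while still holding head_mutex.
        gil.release()
        worker_has_gil.wait(10 * timeout)
        main_has_gil = gil.acquire(timeout=timeout)

    if main_has_gil:
        # Finalization proceeds; the exiting thread can then finish.
        head_mutex.release()
        gil.release()
        worker.join()
    else:
        # Keep head_mutex until the other thread has given up as well.
        worker.join()
        head_mutex.release()

    return not main_has_gil and not worker_finished


if __name__ == "__main__":
    print("GIL released in dealloc:", finalize_deadlocks(True))
    print("GIL kept in dealloc:", finalize_deadlocks(False))
```

In this model the mutual wait only appears when the finalizing thread gives up the first lock while still holding the second, which is the same shape as the proposal to keep the GIL held in the close function.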
methane added a commit to PyMySQL/mysqlclient that referenced this issue on Oct 24, 2014