CPython 3.12 embedded in WeeChat causes segfault on subsequent calls to Py_EndInterpreter #116510

Open
trygveaa opened this issue Mar 8, 2024 · 5 comments
Labels
topic-subinterpreters type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@trygveaa

trygveaa commented Mar 8, 2024

Crash report

What happened?

WeeChat embeds CPython to run Python scripts. It can load multiple scripts, and each one gets its own interpreter: Py_NewInterpreter is called when a script is loaded, and Py_EndInterpreter is called when it is unloaded.

With CPython 3.12, loading two scripts and then unloading them in the same order causes a segmentation fault. Interestingly, the segmentation fault doesn't happen if the script that was loaded last is unloaded first.

I bisected this and found that it was introduced in commit de64e75. I also noticed that the crash doesn't occur on the main branch, so I did another bisect and found that it was fixed in commit 7a7bce5.
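A bisect like this can be automated with `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad. The report does not say how the bisect was actually driven, so the following is only a minimal sketch; `classify` and `run_and_classify` are hypothetical helper names, and the mapping of "killed by a signal" to "bad" is an assumption about what counts as the crash.

```python
import signal
import subprocess

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def classify(returncode: int) -> int:
    """Map a child process return code to a `git bisect run` exit code."""
    if returncode == 0:
        return GOOD
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N,
        # e.g. -signal.SIGSEGV for a segmentation fault.
        return BAD
    # Any other failure (build problem, bad arguments): cannot tell.
    return SKIP


def run_and_classify(cmd: list[str]) -> int:
    """Run the reproduction command and classify its result."""
    return classify(subprocess.run(cmd).returncode)
```

A wrapper script that rebuilds CPython and WeeChat, calls `run_and_classify` with the reproduction command, and exits with the returned value could be handed to `git bisect run`. For the second bisect, which looks for the fixing commit, the good and bad results would have to be swapped.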

This issue seems similar to the one reported in #115649, which was introduced by the same commit, but that one still crashes on the main branch (commit 735fc2c).

Unfortunately, I haven't been able to reproduce this outside of WeeChat, but here is a backtrace of the crash in WeeChat, using commit de64e75 of CPython and commit ec56a1103f47b15a641ff93528fd6f50025dd524 of WeeChat:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000074b5e5fd2700 in ?? ()
[Current thread is 1 (Thread 0x74b5e7bec940 (LWP 1881931))]
(gdb) bt
#0  0x000074b5e5fd2700 in ?? ()
#1  <signal handler called>
#2  0x000074b5e64f9ddd in _PyGCHead_SET_PREV (prev=<optimized out>, gc=<optimized out>) at ./Include/internal/pycore_gc.h:74
#3  _PyObject_GC_UNTRACK (op=0x74b5dfb746d0) at ./Include/internal/pycore_object.h:228
#4  PyObject_GC_UnTrack (op_raw=op_raw@entry=0x74b5dfb746d0) at Modules/gcmodule.c:2241
#5  0x000074b5e63c430c in module_dealloc (m=0x74b5dfb746d0) at Objects/moduleobject.c:672
#6  0x000074b5e63c393d in Py_DECREF (op=<optimized out>) at ./Include/object.h:681
#7  Py_XDECREF (op=<optimized out>) at ./Include/object.h:777
#8  meth_dealloc (m=0x74b5dfb81210) at Objects/methodobject.c:170
#9  0x000074b5e63b5d00 in Py_DECREF (op=0x74b5dfb81210) at ./Include/object.h:681
#10 Py_XDECREF (op=0x74b5dfb81210) at ./Include/object.h:777
#11 insertdict (interp=0x74b5e0e42010, mp=mp@entry=0x74b5df1d3d40, key=0x74b5dfb7db30, hash=<optimized out>, value=value@entry=0x74b5e6750240 <_Py_NoneStruct>) at Objects/dictobject.c:1304
#12 0x000074b5e63b6107 in _PyDict_SetItem_Take2 (value=0x74b5e6750240 <_Py_NoneStruct>, key=<optimized out>, mp=0x74b5df1d3d40) at Objects/dictobject.c:1854
#13 0x000074b5e63c5684 in _PyModule_ClearDict (d=0x74b5df1d3d40) at Objects/moduleobject.c:619
#14 0x000074b5e63c5a6e in _PyModule_Clear (m=m@entry=0x74b5df1e12b0) at Objects/moduleobject.c:567
#15 0x000074b5e64c884e in finalize_modules_clear_weaklist (verbose=0, weaklist=0x74b5dfbed080, interp=0x74b5e0e42010) at Python/pylifecycle.c:1491
#16 finalize_modules (tstate=tstate@entry=0x74b5e0ea0400) at Python/pylifecycle.c:1574
#17 0x000074b5e64cc476 in Py_EndInterpreter (tstate=0x74b5e0ea0400) at Python/pylifecycle.c:2137
#18 0x000074b5e6a5bce6 in weechat_python_unload (script=0x5d9a7b80ee60) at /home/trygve/dev/weechat/src/plugins/python/weechat-python.c:947
#19 0x000074b5e6a5bea6 in weechat_python_unload_all () at /home/trygve/dev/weechat/src/plugins/python/weechat-python.c:996
#20 0x000074b5e7b81cea in plugin_script_end (weechat_plugin=0x5d9a7b29ad50, plugin_data=0x74b5e6a9e640 <python_data>) at /home/trygve/dev/weechat/src/plugins/plugin-script.c:1841
#21 0x000074b5e6a5d8c4 in weechat_plugin_end (plugin=0x5d9a7b29ad50) at /home/trygve/dev/weechat/src/plugins/python/weechat-python.c:1634
#22 0x00005d9a78ed4c38 in plugin_unload (plugin=0x5d9a7b29ad50) at /home/trygve/dev/weechat/src/plugins/plugin.c:1261
#23 0x00005d9a78ed4d9f in plugin_unload_all () at /home/trygve/dev/weechat/src/plugins/plugin.c:1313
#24 0x00005d9a78ed50f8 in plugin_end () at /home/trygve/dev/weechat/src/plugins/plugin.c:1433
#25 0x00005d9a78e04340 in weechat_end (gui_end_cb=0x5d9a78ecb8ae <gui_main_end>) at /home/trygve/dev/weechat/src/core/weechat.c:709
#26 0x00005d9a78e03163 in main (argc=4, argv=0x7ffdb171c178) at /home/trygve/dev/weechat/src/gui/curses/normal/main.c:45

This was produced by creating these two Python scripts:

dummy1.py:

import weechat

if weechat.register("dummy1", "trygveaa", "0.1", "MIT", "Dummy script 1", "", ""):
    weechat.prnt("", "Loaded dummy script 1")

dummy2.py:

import weechat

if weechat.register("dummy2", "trygveaa", "0.1", "MIT", "Dummy script 2", "", ""):
    weechat.prnt("", "Loaded dummy script 2")

And then running weechat -t -r '/script load dummy1.py; /script load dummy2.py; /quit'.
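The reproduction steps above can be scripted. The sketch below generates the two dummy scripts and builds the same WeeChat command line; `write_dummy_scripts` and `build_weechat_command` are hypothetical helper names, and the report does not state which directory WeeChat loads the scripts from, so the target directory is left to the caller.

```python
from pathlib import Path

# Same content as dummy1.py / dummy2.py in the report, parameterised by number.
SCRIPT_TEMPLATE = '''import weechat

if weechat.register("{name}", "trygveaa", "0.1", "MIT", "Dummy script {n}", "", ""):
    weechat.prnt("", "Loaded dummy script {n}")
'''


def write_dummy_scripts(directory, count=2):
    """Write dummy1.py .. dummyN.py into `directory` and return their paths."""
    paths = []
    for n in range(1, count + 1):
        name = f"dummy{n}"
        path = Path(directory) / f"{name}.py"
        path.write_text(SCRIPT_TEMPLATE.format(name=name, n=n))
        paths.append(path)
    return paths


def build_weechat_command(script_paths):
    """Build the argv that loads the scripts in order and then quits."""
    loads = "; ".join(f"/script load {p.name}" for p in script_paths)
    return ["weechat", "-t", "-r", f"{loads}; /quit"]
```

The returned list can be passed to `subprocess.run`; quitting unloads the scripts, which is the step where the crash was observed.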

Also, here is the issue report for WeeChat: weechat/weechat#2046

Since it's fixed in main, it seems there won't be a problem with 3.13, but I wonder if the fix can be backported to 3.12?

CPython versions tested on:

3.12, CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.12.0a7+ (tags/v3.12.0a7-340-gde64e75616:de64e75616, Mar 8 2024, 19:43:39) [GCC 13.2.1 20230801]

@ericsnowcurrently
Member

As for backporting 7a7bce5 (gh-113412): at the time, it wasn't obvious that it was worth backporting, given the complexity of the change. Ultimately, that's a call for the 3.12 release manager, @Yhg1s, to make.

CC @nascheme

@neo1973

neo1973 commented May 1, 2024

The same thing happens in Kodi in various situations (e.g. xbmc/xbmc#24440 and reports on https://forum.kodi.tv/). The stack trace is basically the same:

ASAN output
==20353==ERROR: AddressSanitizer: SEGV on unknown address 0x7408bf1a18c0 (pc 0x7408e2f72d11 bp 0x7408bf19f120 sp 0x74089a1feec8 T54)
==20353==The signal is caused by a READ memory access.
    #0 0x7408e2f72d11 in PyObject_GC_UnTrack (/usr/lib/libpython3.12.so.1.0+0x172d11) (BuildId: 89181a30ef36f4bb519b2474a78e3798ad3c2f9a)
    #1 0x7408e3074f7a  (/usr/lib/libpython3.12.so.1.0+0x274f7a) (BuildId: 89181a30ef36f4bb519b2474a78e3798ad3c2f9a)
    #2 0x7408e2f88436  (/usr/lib/libpython3.12.so.1.0+0x188436) (BuildId: 89181a30ef36f4bb519b2474a78e3798ad3c2f9a)
    #3 0x7408e2f774e0  (/usr/lib/libpython3.12.so.1.0+0x1774e0) (BuildId: 89181a30ef36f4bb519b2474a78e3798ad3c2f9a)
    #4 0x7408e2ffaf00 in _PyModule_ClearDict (/usr/lib/libpython3.12.so.1.0+0x1faf00) (BuildId: 89181a30ef36f4bb519b2474a78e3798ad3c2f9a)
    #5 0x7408e3074b4b  (/usr/lib/libpython3.12.so.1.0+0x274b4b) (BuildId: 89181a30ef36f4bb519b2474a78e3798ad3c2f9a)
    #6 0x7408e30803fd in Py_EndInterpreter (/usr/lib/libpython3.12.so.1.0+0x2803fd) (BuildId: 89181a30ef36f4bb519b2474a78e3798ad3c2f9a)
    #7 0x57c9f075f66d in CPythonInvoker::onExecutionDone() xbmc/interfaces/python/PythonInvoker.cpp:574:5
    #8 0x57c9f74c4303 in CLanguageInvokerThread::OnExit() xbmc/interfaces/generic/LanguageInvokerThread.cpp:122:14
    #9 0x57c9f74c4578 in non-virtual thunk to CLanguageInvokerThread::OnExit() xbmc/interfaces/generic/LanguageInvokerThread.cpp
    #10 0x57c9f3064a43 in CThread::Action() xbmc/threads/Thread.cpp:292:5
    #11 0x57c9f3066fb0 in CThread::Create(bool)::$_0::operator()(CThread*, std::promise<bool>) const xbmc/threads/Thread.cpp:152:18
    #12 0x57c9f3065c36 in void std::__invoke_impl<void, CThread::Create(bool)::$_0, CThread*, std::promise<bool>>(std::__invoke_other, CThread::Create(bool)::$_0&&, CThread*&&, std::promise<bool>&&) /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/invoke.h:61:14
    #13 0x57c9f3065866 in std::__invoke_result<CThread::Create(bool)::$_0, CThread*, std::promise<bool>>::type std::__invoke<CThread::Create(bool)::$_0, CThread*, std::promise<bool>>(CThread::Create(bool)::$_0&&, CThread*&&, std::promise<bool>&&) /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/invoke.h:96:14
    #14 0x57c9f306579f in void std::thread::_Invoker<std::tuple<CThread::Create(bool)::$_0, CThread*, std::promise<bool>>>::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/std_thread.h:292:13
    #15 0x57c9f3065618 in std::thread::_Invoker<std::tuple<CThread::Create(bool)::$_0, CThread*, std::promise<bool>>>::operator()() /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/std_thread.h:299:11
    #16 0x57c9f30651e8 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<CThread::Create(bool)::$_0, CThread*, std::promise<bool>>>>::_M_run() /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/std_thread.h:244:13
    #17 0x7408e0adcb62 in execute_native_thread_routine /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104:18
    #18 0x57c9efcddc56 in asan_thread_start(void*) (/home/mark/Coding/Repos/kodi-git/build_clang_debug_sanitizer/kodi.bin+0xa349c56) (BuildId: 32d6194589667529dde04169b4d13246ec286fba)
    #19 0x7408e08a9559  (/usr/lib/libc.so.6+0x8b559) (BuildId: 6542915cee3354fbcf2b3ac5542201faec43b5c9)
    #20 0x7408e0926a5b  (/usr/lib/libc.so.6+0x108a5b) (BuildId: 6542915cee3354fbcf2b3ac5542201faec43b5c9)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/lib/libpython3.12.so.1.0+0x172d11) (BuildId: 89181a30ef36f4bb519b2474a78e3798ad3c2f9a) in PyObject_GC_UnTrack
Thread T54 created by T0 here:
    #0 0x57c9efd9d7c8 in pthread_create (/home/mark/Coding/Repos/kodi-git/build_clang_debug_sanitizer/kodi.bin+0xa4097c8) (BuildId: 32d6194589667529dde04169b4d13246ec286fba)
    #1 0x7408e0adcc49 in __gthread_create /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:663:35
    #2 0x7408e0adcc49 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State>>, void (*)()) /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:172:37
    #3 0x57c9f3061bc2 in CThread::Create(bool) xbmc/threads/Thread.cpp:118:20
    #4 0x57c9f74c1d45 in CLanguageInvokerThread::execute(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&) xbmc/interfaces/generic/LanguageInvokerThread.cpp:59:5
    #5 0x57c9f74bf0a9 in ILanguageInvoker::Execute(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&) xbmc/interfaces/generic/ILanguageInvoker.cpp:27:10
    #6 0x57c9f74d0de4 in CScriptInvocationManager::ExecuteAsync(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::shared_ptr<ILanguageInvoker> const&, std::shared_ptr<ADDON::IAddon> const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, bool, int) xbmc/interfaces/generic/ScriptInvocationManager.cpp:288:18
    #7 0x57c9f74ce2f2 in CScriptInvocationManager::ExecuteAsync(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::shared_ptr<ADDON::IAddon> const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, bool, int) xbmc/interfaces/generic/ScriptInvocationManager.cpp:237:10
    #8 0x57c9f4b644f0 in ADDON::CServiceAddonManager::Start(std::shared_ptr<ADDON::IAddon> const&) xbmc/addons/Service.cpp:90:59
    #9 0x57c9f4b631c7 in ADDON::CServiceAddonManager::Start() xbmc/addons/Service.cpp:64:7
    #10 0x57c9f40976e1 in CApplication::Initialize() xbmc/application/Application.cpp:761:40
    #11 0x57c9f320aaea in XBMC_Run xbmc/platform/xbmc.cpp:43:22
    #12 0x57c9efdf173f in main xbmc/platform/posix/main.cpp:70:16
    #13 0x7408e0843ccf  (/usr/lib/libc.so.6+0x25ccf) (BuildId: 6542915cee3354fbcf2b3ac5542201faec43b5c9)

==20353==ABORTING

GDB (with debug symbols)
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007408e08ab393 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007408e085a6c8 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007408e08424b8 in __GI_abort () at abort.c:79
#4  0x000057c9efdd1921 in __sanitizer::Abort() ()
#5  0x000057c9efdcf492 in __sanitizer::Die() ()
#6  0x000057c9efdafcf1 in __asan::ScopedInErrorReport::~ScopedInErrorReport() ()
#7  0x000057c9efdac351 in __asan::ReportDeadlySignal(__sanitizer::SignalContext const&) ()
#8  0x000057c9efdab199 in __asan::AsanOnDeadlySignal(int, void*, void*) ()
#9  0x00007408e085a770 in <signal handler called> () at /usr/lib/libc.so.6
#10 0x00007408e2f72d11 in _PyGCHead_SET_PREV (prev=<optimized out>, gc=<optimized out>) at ./Include/internal/pycore_gc.h:74
#11 _PyObject_GC_UNTRACK (op=0x7408b8c59170) at ./Include/internal/pycore_object.h:247
#12 PyObject_GC_UnTrack (op_raw=0x7408b8c59170) at Modules/gcmodule.c:2242
#13 0x00007408e3074f7b in module_dealloc (m=0x7408b8c59170) at Objects/moduleobject.c:709
#14 0x00007408e2f88437 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2625
#15 Py_DECREF (op=<optimized out>) at ./Include/object.h:705
#16 Py_XDECREF (op=<optimized out>) at ./Include/object.h:798
#17 Py_XDECREF (op=<optimized out>) at ./Include/object.h:795
#18 meth_dealloc (m=0x7408b8c59ad0) at Objects/methodobject.c:170
#19 0x00007408e2f774e1 in _Py_Dealloc (op=0x7408b8c59ad0) at Objects/object.c:2625
#20 Py_DECREF (op=0x7408b8c59ad0) at ./Include/object.h:705
#21 Py_XDECREF (op=0x7408b8c59ad0) at ./Include/object.h:798
#22 insertdict (interp=0x7408bf141800, mp=mp@entry=0x7408b17ed0c0, key=key@entry=0x7408b8c53470, hash=hash@entry=-9076975000305021121, value=value@entry=0x7408e33a9de0 <_Py_NoneStruct>) at Objects/dictobject.c:1319
#23 0x00007408e2ffaf01 in _PyDict_SetItem_Take2 (value=0x7408e33a9de0 <_Py_NoneStruct>, key=<optimized out>, mp=<optimized out>) at Objects/dictobject.c:1865
#24 PyDict_SetItem (value=0x7408e33a9de0 <_Py_NoneStruct>, key=<optimized out>, op=<optimized out>) at Objects/dictobject.c:1883
#25 _PyModule_ClearDict (d=0x7408b17ed0c0) at Objects/moduleobject.c:656
#26 0x00007408e3074b4c in finalize_modules_clear_weaklist (verbose=0, weaklist=0x7408b9b6e1c0, interp=0x7408bf141800) at Python/pylifecycle.c:1526
#27 finalize_modules (tstate=tstate@entry=0x7408bf19f120) at Python/pylifecycle.c:1609
#28 0x00007408e30803fe in Py_EndInterpreter (tstate=0x7408bf19f120) at Python/pylifecycle.c:2199
#29 0x000057c9f075f66e in CPythonInvoker::onExecutionDone (this=0x51300026fc80) at xbmc/interfaces/python/PythonInvoker.cpp:574
#30 0x000057c9f74c4304 in CLanguageInvokerThread::OnExit (this=0x517000197210) at xbmc/interfaces/generic/LanguageInvokerThread.cpp:122
#31 0x000057c9f74c4579 in non-virtual thunk to CLanguageInvokerThread::OnExit() () at xbmc/interfaces/generic/LanguageInvokerThread.cpp:124
#32 0x000057c9f3064a44 in CThread::Action (this=0x517000197238) at xbmc/threads/Thread.cpp:292
#33 0x000057c9f3066fb1 in CThread::Create(bool)::$_0::operator()(CThread*, std::promise<bool>) const (this=0x504000434ad8, pThread=0x517000197238, promise=...) at xbmc/threads/Thread.cpp:152
#34 0x000057c9f3065c37 in std::__invoke_impl<void, CThread::Create(bool)::$_0, CThread*, std::promise<bool> >(std::__invoke_other, CThread::Create(bool)::$_0&&, CThread*&&, std::promise<bool>&&) (__f=..., __args=@0x504000434af0: 0x517000197238, __args=...)
    at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/invoke.h:61
#35 0x000057c9f3065867 in std::__invoke<CThread::Create(bool)::$_0, CThread*, std::promise<bool> >(CThread::Create(bool)::$_0&&, CThread*&&, std::promise<bool>&&) (__fn=..., __args=@0x504000434af0: 0x517000197238, __args=...)
    at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/invoke.h:96
#36 0x000057c9f30657a0 in std::thread::_Invoker<std::tuple<CThread::Create(bool)::$_0, CThread*, std::promise<bool> > >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) (this=0x504000434ad8)
    at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/std_thread.h:292
#37 0x000057c9f3065619 in std::thread::_Invoker<std::tuple<CThread::Create(bool)::$_0, CThread*, std::promise<bool> > >::operator()() (this=0x504000434ad8) at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/std_thread.h:299
#38 0x000057c9f30651e9 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<CThread::Create(bool)::$_0, CThread*, std::promise<bool> > > >::_M_run() (this=0x504000434ad0)
    at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/std_thread.h:244
#39 0x00007408e0adcb63 in std::execute_native_thread_routine (__p=0x504000434ad0) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104
#40 0x000057c9efcddc57 in asan_thread_start(void*) ()
#41 0x00007408e08a955a in start_thread (arg=<optimized out>) at pthread_create.c:447
#42 0x00007408e0926a5c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
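To check the "basically the same" observation mechanically, the function names can be pulled out of two gdb backtraces and intersected. This is only a rough sketch: `frame_functions` and `shared_frames` are hypothetical helper names, and the regular expression handles plain C function names only, so C++ frames, `??` frames and the signal handler marker are skipped.

```python
import re

# Matches gdb frame lines such as
#   "#5  0x000074b5e63c430c in module_dealloc (m=...) at Objects/moduleobject.c:672"
#   "#3  _PyObject_GC_UNTRACK (op=...) at ./Include/internal/pycore_object.h:228"
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?([A-Za-z_]\w*) \(")


def frame_functions(backtrace: str) -> list[str]:
    """Return the C function names of a gdb backtrace, innermost frame first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def shared_frames(first: str, second: str) -> list[str]:
    """Return functions from `first` that also appear in `second`, in order."""
    in_second = set(frame_functions(second))
    return [name for name in frame_functions(first) if name in in_second]
```

Applied to the WeeChat and Kodi backtraces in this thread, the shared names should include the path from Py_EndInterpreter through module_dealloc to _PyGCHead_SET_PREV.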

Are there plans to address this in the 3.12 release cycle?

@mooninite

I tried backporting 7a7bce5 but was unsuccessful, as the threading tests failed. If someone could attach a 3.12 backport patch, I could test it with Kodi.

@nascheme
Member

nascheme commented May 1, 2024

I have a backport of the patch mostly done. I can finish it and then you can test. I think backporting this change would be a good idea given that it seems to work without issue in 3.13 and would solve a few problems for users of Python 3.12.

@ericsnowcurrently
Member

ping @Yhg1s
