Failure to run module destructor (as weakref callback or as atexit handler) leads to segfault when object is later destroyed too late in CPython's shutdown sequence #1493

anntzer · 2018-08-16T11:34:05Z

Issue description

Consider the following example:

#include <cstdio>

#include <pybind11/pybind11.h>

namespace py = pybind11;

// C-level global that keeps a Python object alive.
py::object o;

PYBIND11_MODULE(python_example, m) {
  o = py::module::import("matplotlib.mathtext").attr("MathTextParser")("foo");

  // Module destructor, method 1: weakref callback on the module object.
  py::cpp_function cleanup = [](py::handle weakref) {
    std::printf("weakref cleanup\n");
    o = py::none{};
    weakref.dec_ref();
  };
  (void) py::weakref(m, cleanup).release();

  // Module destructor, method 2: atexit handler.
  py::module::import("atexit").attr("register")(
    py::cpp_function{
      [&]() -> void {
        std::printf("atexit cleanup\n");
        o = py::none{};
      }
    }
  );
}

Compile e.g. using the python_example template repository, Python 3.6, pybind11 2.2.3; run with Matplotlib 3.0rc1 (for example). Note that the matplotlib.mathtext.MathTextParser class appears to be fairly innocuous -- it directly inherits from object, its __init__ is just

def __init__(self, output):
    self._output = output.lower()

and it has no __del__. Still, for reasons I don't understand, it is the only class with which I have been able to reproduce the bug described below.

The code above sets a C-level global variable to an instance of MathTextParser, then tries to make sure that the instance is destroyed before CPython shuts down, using both "module destructor" methods currently documented by pybind11 (using atexit is documented in master, not in 2.2.3).

On Windows only, open a Python console and import the python_example module, then close the terminal without first exiting Python. This triggers a "Python has stopped working" error ("A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available."). Attaching Visual Studio to the process reveals that this is actually a segfault in subtype_dealloc (typeobject.c), with PyThreadState_GET() returning NULL; the type being deallocated is MathTextParser.

My interpretation of what is happening is that CPython has already torn down some critical utilities at that point (hence PyThreadState_GET returning NULL), but the module destructors have not been called. At some later point, we try to decref the global instance of MathTextParser that was still alive, ultimately leading to the deallocation of the MathTextParser type and the above segfault.
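To make the ordering problem concrete, here is a toy model in plain C++. It does not use CPython or pybind11: ToyRuntime and ToyHandle are hypothetical stand-ins for the interpreter and for an owning handle such as py::object, and the log strings are invented for illustration. It only shows why the same release is harmless before finalization and fatal after it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the interpreter: NOT CPython, just a flag plus an event log.
struct ToyRuntime {
    bool thread_state_alive = true;  // models PyThreadState_GET() != NULL
    std::vector<std::string> log;

    void finalize() {  // models interpreter shutdown
        log.push_back("finalize");
        thread_state_alive = false;
    }
};

// Toy stand-in for an owning handle such as py::object.
class ToyHandle {
public:
    explicit ToyHandle(ToyRuntime &rt) : rt_(&rt), owns_(true) {}
    ToyHandle(const ToyHandle &) = delete;
    ToyHandle &operator=(const ToyHandle &) = delete;
    ~ToyHandle() { reset(); }

    // Drop the reference. Only safe while the runtime is still alive;
    // a real deallocation after finalization is where the crash would occur.
    void reset() {
        if (!owns_) return;
        owns_ = false;
        rt_->log.push_back(rt_->thread_state_alive ? "dealloc-ok"
                                                   : "dealloc-after-finalize");
    }

private:
    ToyRuntime *rt_;
    bool owns_;
};

// Scenario from the report: no cleanup hook runs, the handle outlives the runtime.
std::vector<std::string> run_without_cleanup() {
    ToyRuntime rt;
    {
        ToyHandle h(rt);
        rt.finalize();
    }  // h is destroyed here, after finalize()
    return rt.log;
}

// Intended scenario: the cleanup hook releases the handle before shutdown.
std::vector<std::string> run_with_cleanup() {
    ToyRuntime rt;
    {
        ToyHandle h(rt);
        h.reset();  // models the weakref or atexit callback actually running
        rt.finalize();
    }
    return rt.log;
}
```

In this model, run_without_cleanup() records the release after "finalize", which corresponds to the late decref described above, while run_with_cleanup() records it before.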

See item #3 of matplotlib/mplcairo#6 for the place where this issue was first reported.

Reproducible example code

See above.

Referenced in pytorch/pytorch#27251 (closes pytorch/pytorch#27182):

Explicitly clean up py::objects to avoid segmentation faults when py::objects are destroyed late, at program exit. See the similar issues pybind/pybind11#1598 and pybind/pybind11#1493. Our local tests also hit these segmentation faults when py::objects were cleaned up at program exit.

The explanation: CPython tears down most of its critical utilities before the PythonRpcHandler singleton is destroyed, so when the singleton releases its py::objects and calls dec_ref(), the process crashes.

The solution is to clean up the py::objects earlier, when the RPC agent's join() runs. Note that the py::objects cannot be cleaned up in the RPC agent's destructor either: the RPC agent is a global variable, so it would hit the same problem as PythonRpcHandler.

Test Plan: unit tests on Python 3.6 and Python 3.5

Differential Revision: D17727362
ghstack-source-id: 92035069
fbshipit-source-id: c254023f6a85acce35528ba756a4efabba9a519f
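The "clean up at join() instead of in a destructor" idea from the commit message above can be sketched as follows. This is not the PyTorch code: ToyRpcHandler, ToyRpcAgent, ToyPyObject and the object names are hypothetical, and plain heap objects stand in for py::objects so that the sketch compiles without Python.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a Python-owned resource (e.g. a py::object).
struct ToyPyObject {
    std::string name;
};

// Sketch of a handler singleton that owns Python-side objects.
class ToyRpcHandler {
public:
    static ToyRpcHandler &instance() {
        static ToyRpcHandler handler;  // destroyed during static teardown
        return handler;
    }

    // Acquire the objects the handler needs (names are made up).
    void load() {
        objects_.push_back(std::make_unique<ToyPyObject>(ToyPyObject{"run_function"}));
        objects_.push_back(std::make_unique<ToyPyObject>(ToyPyObject{"load_result"}));
    }

    // Release everything now, while the interpreter would still be usable.
    // Idempotent, so the destructor later finds nothing left to release.
    void cleanup() { objects_.clear(); }

    std::size_t live_objects() const { return objects_.size(); }

private:
    ToyRpcHandler() = default;
    std::vector<std::unique_ptr<ToyPyObject>> objects_;
};

// Sketch of an agent whose join() is the explicit, early cleanup point.
class ToyRpcAgent {
public:
    void join() {
        // ... waiting for outstanding work is omitted in this sketch ...
        ToyRpcHandler::instance().cleanup();
        joined_ = true;
    }
    bool joined() const { return joined_; }

private:
    bool joined_ = false;
};
```

The design point is that join() is called by user code while the program is still running normally, whereas the destructors of the singleton and of a global agent only run during static teardown, when the order relative to interpreter shutdown is no longer under the caller's control.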

UtmostK16 · 2021-09-29T11:21:37Z

I also hit a failure on exit when using atexit in a static singleton object.
Environment:
Python 3.8
Windows 7 (Windows 10 works fine)

UtmostK16 · 2021-09-29T11:35:19Z

Addendum:
The singleton object creates a worker thread and uses std::condition_variable::wait/notify_all. My program hangs after notify_all is invoked in the singleton object's destructor.
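For reference, a minimal sketch of the setup described here, with one possible mitigation: give the worker an explicit stop() that is called before process exit begins, so that the static destructor has nothing left to wake or join. ToyWorker is a hypothetical class, and whether this avoids the Windows 7 hang reported above is an assumption that has not been verified.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker that sleeps on a condition variable until told to stop.
class ToyWorker {
public:
    ToyWorker() : thread_([this] { run(); }) {}
    ToyWorker(const ToyWorker &) = delete;
    ToyWorker &operator=(const ToyWorker &) = delete;

    // Fallback only: the intent is that stop() has already been called.
    ~ToyWorker() { stop(); }

    // Wake the worker and wait for it to finish. Calling it again is a no-op.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) return;
            stopping_ = true;
        }
        cv_.notify_all();
        if (thread_.joinable()) thread_.join();
    }

    bool stopped() const { return !thread_.joinable(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate guards against spurious wakeups and missed notifications.
        cv_.wait(lock, [this] { return stopping_; });
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the members above exist before run() starts
};
```

With this shape, the code that owns the singleton would call stop() at a well-defined point (for example from the same place that releases any py::objects) instead of relying on the destructor running at exit.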

anntzer mentioned this issue Nov 8, 2018

static py::object in C++ function may cause Segfault on exit #1598

Closed

zhaojuanmao mentioned this issue Oct 14, 2019

fix python rpc handler exit crash pytorch/pytorch#27251

Closed

rwgk mentioned this issue Feb 9, 2023

FWD pybind11 google/pybind11k#1493

Closed
