Failure to run module destructor (as weakref callback or as atexit handler) leads to segfault when object is later destroyed too late in CPython's shutdown sequence #1493

Open
anntzer opened this issue Aug 16, 2018 · 2 comments

anntzer commented Aug 16, 2018

Issue description

Consider the following example:

#include <cstdio>  // printf
#include <pybind11/pybind11.h>

namespace py = pybind11;

// C-level global that holds the Python object.
py::object o;

PYBIND11_MODULE(python_example, m) {
  o = py::module::import("matplotlib.mathtext").attr("MathTextParser")("foo");

  // Method 1: weakref callback on the module object.
  py::cpp_function cleanup = [](py::handle weakref) {
    printf("weakref cleanup\n");
    o = py::none{};
    weakref.dec_ref();
  };
  (void) py::weakref(m, cleanup).release();

  // Method 2: atexit handler.
  py::module::import("atexit").attr("register")(
    py::cpp_function{
      [&]() -> void {
        printf("atexit cleanup\n");
        o = py::none{};
      }
    }
  );
}

Compile e.g. using the python_example template repository, Python 3.6, pybind11 2.2.3; run with Matplotlib 3.0rc1 (for example). Note that the matplotlib.mathtext.MathTextParser class appears to be fairly innocuous -- it directly inherits from object, its __init__ is just

def __init__(self, output):
    self._output = output.lower()

and it has no __del__. Still, for reasons I don't understand, it is the only class with which I have been able to reproduce the bug described below.

The code above sets a C-level global variable to an instance of MathTextParser, then tries to make sure that the instance is destroyed before CPython shuts down, using both "module destructor" methods currently documented by pybind11 (using atexit is documented in master, not in 2.2.3).

On Windows only, open a Python console and import the python_example module, then close the terminal without first exiting Python. This triggers a "Python has stopped working" error ("A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available."). Attaching Visual Studio to the process reveals that this is actually a segfault in subtype_dealloc (typeobject.c), with PyThreadState_GET() returning NULL; the type being deallocated is MathTextParser.

My interpretation of what is happening is that CPython has already torn down some critical utilities at that point (hence PyThreadState_GET returning NULL), but the module destructors have not been called. At some later point, we try to decref the global instance of MathTextParser that was still alive, ultimately leading to the deallocation of the MathTextParser type and the above segfault.
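The ordering problem described above can be illustrated with a small standalone model. This is only a sketch and uses no pybind11 or CPython code: `ToyRuntime` and `ToyRef` are hypothetical stand-ins for the interpreter and for an owning reference such as `py::object`, and the "UNSAFE" log entry marks the point where the real program would crash.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the interpreter: tracks whether it is still usable.
struct ToyRuntime {
    bool alive = true;
    std::vector<std::string> log;
};

// Hypothetical stand-in for an owning reference such as py::object.
class ToyRef {
public:
    ToyRef(ToyRuntime& rt, std::string name)
        : rt_(&rt), name_(std::move(name)), held_(true) {}
    ToyRef(const ToyRef&) = delete;
    ToyRef& operator=(const ToyRef&) = delete;
    ~ToyRef() { reset(); }

    // Drop the reference once; record whether the runtime was still alive.
    void reset() {
        if (!held_) return;
        held_ = false;
        rt_->log.push_back(std::string(rt_->alive ? "ok: " : "UNSAFE: ") + name_);
    }

private:
    ToyRuntime* rt_;
    std::string name_;
    bool held_;
};

// Scenario A: no cleanup hook ran, so the reference is dropped after finalization.
std::vector<std::string> release_after_finalize() {
    ToyRuntime rt;
    {
        ToyRef o(rt, "MathTextParser instance");
        rt.alive = false;  // the runtime is torn down first
    }                      // o is destroyed here, too late
    return rt.log;
}

// Scenario B: the cleanup hook ran while the runtime was still alive.
std::vector<std::string> release_before_finalize() {
    ToyRuntime rt;
    {
        ToyRef o(rt, "MathTextParser instance");
        o.reset();         // what the weakref or atexit callback is meant to do
        rt.alive = false;
    }                      // destructor is now a no-op
    return rt.log;
}
```

In this model, scenario A corresponds to the reported crash (the callbacks never run), and scenario B to the intended behavior.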

See item #3 of matplotlib/mplcairo#6 for the place where this issue was first reported.

Reproducible example code

See above.

zhaojuanmao added a commit to pytorch/pytorch that referenced this issue Oct 14, 2019
[test all] 

Explicitly clean up py::objects to avoid segmentation faults when py::objects are destroyed late, at program exit, after CPython has begun shutting down.

See the similar issues reported in pybind/pybind11#1598
and pybind/pybind11#1493.

Our local tests also caught these segmentation faults when py::objects are cleaned
up at program exit. The explanation is: CPython cleans up most critical
utilities before the PythonRpcHandler singleton is cleaned up, so when the
PythonRpcHandler singleton cleans up its py::objects and calls dec_ref(), it
crashes.

The solution is to clean up the py::objects earlier, when the RPC agent calls join().
Note that the py::objects cannot be cleaned up when the RPC agent is destroyed
either, because the RPC agent is a global variable and would have the same issue as
PythonRpcHandler.

close #27182

Differential Revision: [D17727362](https://our.internmc.facebook.com/intern/diff/D17727362/)

[ghstack-poisoned]
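The approach described in this commit message (release references from an explicit join() step instead of from a global's destructor) can be sketched with a standalone model. This is not PyTorch code: `FakeInterp`, `FakeHandler`, `FakeAgent`, and `simulate_exit` are hypothetical names invented for illustration, and the counters only record when a release would have happened.

```cpp
#include <string>
#include <vector>

// Hypothetical interpreter model: counts releases before and after finalization.
struct FakeInterp {
    bool finalized = false;
    int safe_releases = 0;
    int unsafe_releases = 0;
};

// Hypothetical singleton-like handler that owns references into the interpreter.
class FakeHandler {
public:
    explicit FakeHandler(FakeInterp& interp) : interp_(interp) {}
    ~FakeHandler() { cleanup(); }

    void hold(const std::string& name) { held_.push_back(name); }

    // Idempotent: releases everything held and records whether it was safe.
    void cleanup() {
        int n = static_cast<int>(held_.size());
        held_.clear();
        if (interp_.finalized) {
            interp_.unsafe_releases += n;
        } else {
            interp_.safe_releases += n;
        }
    }

private:
    FakeInterp& interp_;
    std::vector<std::string> held_;
};

// Hypothetical agent whose join() is the explicit early cleanup point.
struct FakeAgent {
    FakeHandler& handler;
    void join() { handler.cleanup(); }
};

// Models process exit: the handler outlives interpreter finalization.
FakeInterp simulate_exit(bool call_join) {
    FakeInterp interp;
    {
        FakeHandler handler(interp);
        handler.hold("a");
        handler.hold("b");
        FakeAgent agent{handler};
        if (call_join) agent.join();  // release while the interpreter is alive
        interp.finalized = true;      // interpreter shuts down
    }                                 // handler destructor runs afterwards
    return interp;
}
```

With `call_join` set, the destructor finds nothing left to release; without it, both releases land after finalization, which is the situation the commit aims to avoid.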
facebook-github-bot pushed a commit to pytorch/pytorch that referenced this issue Oct 16, 2019
Summary:
Pull Request resolved: #27251

Explicitly clean up py::objects to avoid segmentation faults when py::objects are destroyed late, at program exit, after CPython has begun shutting down.

See the similar issues reported in pybind/pybind11#1598
and pybind/pybind11#1493.

Our local tests also caught these segmentation faults when py::objects are cleaned
up at program exit. The explanation is: CPython cleans up most critical
utilities before the PythonRpcHandler singleton is cleaned up, so when the
PythonRpcHandler singleton cleans up its py::objects and calls dec_ref(), it
crashes.

The solution is to clean up the py::objects earlier, when the RPC agent calls join().
Note that the py::objects cannot be cleaned up when the RPC agent is destroyed
either, because the RPC agent is a global variable and would have the same issue as
PythonRpcHandler.

close #27182
ghstack-source-id: 92035069

Test Plan: unit tests on python 3.6 and python 3.5

Differential Revision: D17727362

fbshipit-source-id: c254023f6a85acce35528ba756a4efabba9a519f
@UtmostK16

I also hit a failure at exit when using atexit with a static singleton object.
Environment:
Python 3.8
Windows 7 (Windows 10 is OK)

@UtmostK16

Additional information:
The singleton object creates a worker thread and uses std::condition_variable::wait/notify_all. My program blocks after notify_all is invoked in the singleton object's destructor.
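One possible way to avoid depending on the destructor of a static singleton is to give the worker an explicit, idempotent stop() that the program calls before exit. The sketch below assumes the hang is related to the destructor running during process teardown, which the comment above does not confirm; `Worker` and its members are hypothetical names, not taken from the reporter's code.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker that waits on a condition variable until told to stop.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { stop(); }  // fallback only; prefer calling stop() explicitly

    // Safe to call more than once: sets the flag, wakes the thread, joins it.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        if (thread_.joinable()) thread_.join();
    }

    bool stopped() const { return !thread_.joinable(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate guards against spurious wakeups and a missed notify.
        cv_.wait(lock, [this] { return stopping_; });
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the members above exist before it starts
};
```

Calling stop() from normal program flow (for example, right before returning from main) means the join happens while the process is still fully running, and the destructor then has nothing left to do.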
