Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

py::keep_alive: nurse destructor can run before patient's #856

Closed
bmerry opened this issue May 16, 2017 · 6 comments
Closed

py::keep_alive: nurse destructor can run before patient's #856

bmerry opened this issue May 16, 2017 · 6 comments

Comments

@bmerry
Copy link
Contributor

bmerry commented May 16, 2017

Issue description

The py::keep_alive weakref trick doesn't always work. In some cases (at least in Python 2.7, haven't tried 3), the weakref callback can run before the referenced object is cleared. This happens in the cyclic garbage collector: first all weakref callbacks are handled (handle_weakrefs function), then the garbage is cleared. If the nurse is being destroyed on this path, then it can lead to the patient destructor running before the nurse's (leading to undefined behaviour if the destructor of the nurse references the patient). Note that the nurse doesn't itself have to be part of a reference cycle, it just has to be reachable from one. However, it needs to have Py_TPFLAGS_HAVE_GC. In the example code I've ensured that by specifying py::dynamic_attr, but I think when I originally encountered the problem it was set due to having a declared base.

I've had similar issues with Boost.Python before. I suspect the weakref approach is fundamentally untenable; it may be necessary to store a list of handles (with strong references) in the PyObject and decref them when the object is cleared.

Reproducible example code

keepalive.cpp:

#include <pybind11/pybind11.h>

namespace py = pybind11;

class B;

class A {
private:
    B &b;
public:
    A(B &b) : b(b) {}
    ~A() { py::print("In A::~A()"); }
};

class B {
public:
    ~B() { py::print("In B::~B()"); }
};

PYBIND11_PLUGIN(keepalive) {
    py::module m("keepalive");

    py::class_<A>(m, "A", py::dynamic_attr())
        .def(py::init<B &>(), py::keep_alive<1, 2>());

    py::class_<B>(m, "B").def(py::init<>());
    return m.ptr();
}

test.py:

from keepalive import A, B

lst = [A(B())]
lst.append(lst)
del lst

Output is

In B::~B()
In A::~A()

Expected output is the other way around.

@bmerry
Copy link
Contributor Author

bmerry commented May 16, 2017

Or maybe it would work for the callback function to call the C++ destructor if it hasn't already been called, and deal with that gracefully when the object is cleared - but I don't know nearly enough about the internals to know if that's sane or not.

@jagerman
Copy link
Member

I suspect what's happening here is that both A and B are being destroyed at the same time ("same" being within a single garbage collection). Essentially keep_alive is a bit weaker than you're expecting: it really only guarantees that A won't be admitted for garbage collection before B, but doesn't guarantee that they can't be admitted together (and as far as I know Python doesn't provide any guarantees on the order of destruction of objects within a garbage collection pass). If they are admitted together, you can get out-of-order destruction, as you're seeing.

I'm not sure that there's a simple workaround that could be done to deal with this on the pybind11 side (especially since neither object actually has to be pybind-managed).

You ought to be able to work around it by storing by storing a shared_ptr<B> instead of a B & (and making B's holder shared_ptr<B>); that should let the C++ object live beyond the destruction of the wrapping Python object.

@bmerry
Copy link
Contributor Author

bmerry commented May 27, 2017

it really only guarantees that A won't be admitted for garbage collection before B, but doesn't guarantee that they can't be admitted together

Not exactly. B isn't identified as garbage in the garbage collection pass, because its reference is held by pybind11. It's really about the order of clearing A vs calling the weakref callbacks on A's weakref, which isn't guaranteed. If A is cleared first, everything goes fine. If the weakref callback is called first, then it drops the last reference to B, immediately causing B to be cleared.

I'm not sure that there's a simple workaround that could be done to deal with this on the pybind11 side (especially since neither object actually has to be pybind-managed).

I hadn't considered that the objects aren't necessarily pybind-managed, but on the other hand if the nurse isn't a C++ object then I don't think this is an issue. I have some ideas for fixing it if it is pybind-managed, which I can experiment with. Is there a way to determine whether a given py::handle is pybind-managed?

You ought to be able to work around it by storing by storing a shared_ptr instead of a B & (and making B's holder shared_ptr); that should let the C++ object live beyond the destruction of the wrapping Python object.

I've already done exactly that, but I'm in the fortunate position of being able to modify the C++ API I'm wrapping.

@jagerman
Copy link
Member

B isn't identified as garbage in the garbage collection pass, because its reference is held by pybind11.

B won't be as long as A is referenced elsewhere, but once A is identified for gc, the weakref is cleared which admits both the weakref itself and A, so the list, A, B, and the weakref all get deallocated in the same gc pass.

from keepalive import A, B
import gc

gc.collect()
gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_COLLECTABLE)

lst = [A(B())]
lst.append(lst)
del lst

gc.collect()
gc.set_debug(0)

output:

gc: collecting generation 2...
gc: objects in each generation: 5 0 6532
gc: collectable <keepalive.A 0x7f6a4085a8b8>
gc: collectable <list 0x7f6a4099abc8>
In B::~B()
In A::~A()
gc: done, 3 unreachable, 0 uncollectable, 0.0006s elapsed

A fix would be nice, but I still suspect it will end up getting rather complicated.

Is there a way to determine whether a given py::handle is pybind-managed?

pybind11::detail::get_type_info(Py_TYPE(h.ptr())) returns a detail::type_info * if h is an instance of a pybind-registered type, nullptr if it isn't.

@bmerry
Copy link
Contributor Author

bmerry commented May 27, 2017

B won't be as long as A is referenced elsewhere, but once A is identified for gc, the weakref is cleared which admits both the weakref itself and A, so the list, A, B, and the weakref all get deallocated in the same gc pass.

Yes. I think we're in agreement and just thinking about the same thing in different ways.

You ought to be able to work around it by storing by storing a shared_ptr instead of a B & (and making B's holder shared_ptr); that should let the C++ object live beyond the destruction of the wrapping Python object.

Thanks, I'll see what I can cook up and how hairy it gets. Failing that, we should at least put a warning in the user manual.

@jagerman
Copy link
Member

Failing that, we should at least put a warning in the user manual.

Agreed. I'll wait for you to see if you can come up with something else, but if not, a note in the docs is definitely worthwhile.

bmerry added a commit to bmerry/pybind11 that referenced this issue May 28, 2017
This is a first attempt to address pybind#856. There is still a bit of work to
do (e.g., some tests, and double-checking that I'm not leaking
references). At this point I'm looking for feedback on whether the idea
is sound and would be accepted.

Instead of the weakref trick, every pybind-managed object holds a
(Python) list of references, which gets cleared in `clear_instance`. The
list is only constructed the first time it is needed.

The one downside I can see is that every pybind object will get bigger
by `sizeof(PyObject *)`, even if no keep_alive is used. On the other
hand, it should significantly reduce the overhead of using keep_alive,
since there is no need to construct a weakref object and a callback
object.

One thing I'm not sure about is whether the call to
`Py_CLEAR(instance->patients);` belongs inside or outside the
`has_value` test. At the moment it's inside, but that's cargo culting
rather than from an understanding of the conditions under which an
instance might not have a value.
bmerry added a commit to bmerry/pybind11 that referenced this issue May 28, 2017
This is a first attempt to address pybind#856. There is still a bit of work to
do (e.g., some tests, and double-checking that I'm not leaking
references). At this point I'm looking for feedback on whether the idea
is sound and would be accepted.

Instead of the weakref trick, every pybind-managed object holds a
(Python) list of references, which gets cleared in `clear_instance`. The
list is only constructed the first time it is needed.

The one downside I can see is that every pybind object will get bigger
by `sizeof(PyObject *)`, even if no keep_alive is used. On the other
hand, it should significantly reduce the overhead of using keep_alive,
since there is no need to construct a weakref object and a callback
object.

One thing I'm not sure about is whether the call to
`Py_CLEAR(instance->patients);` belongs inside or outside the
`has_value` test. At the moment it's inside, but that's cargo culting
rather than from an understanding of the conditions under which an
instance might not have a value.
bmerry added a commit to bmerry/pybind11 that referenced this issue Jun 24, 2017
This fixes pybind#856. Instead of the weakref trick, the internals structure
holds an unordered_map from PyObject* to a vector of references. To
avoid the cost of the unordered_map lookup for objects that don't have
any keep_alive patients, a flag is added to each instance to indicate
whether there is anything to do.
bmerry added a commit to bmerry/pybind11 that referenced this issue Jun 24, 2017
This fixes pybind#856. Instead of the weakref trick, the internals structure
holds an unordered_map from PyObject* to a vector of references. To
avoid the cost of the unordered_map lookup for objects that don't have
any keep_alive patients, a flag is added to each instance to indicate
whether there is anything to do.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants