Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cython's deep C-stacks upon deallocation #13901

Open
nbruin opened this issue Jan 3, 2013 · 11 comments
Open

Fix cython's deep C-stacks upon deallocation #13901

nbruin opened this issue Jan 3, 2013 · 11 comments

Comments

@nbruin
Copy link
Contributor

nbruin commented Jan 3, 2013

Once you know about python's trashcan that is used in its dealloc methods and that cython does not make use of it, it's not hard to cause a crash in deallocation of a cython class:

cython("""
cdef class A(object):
    cdef object ref
    def __init__(self,ref):
        self.ref = ref

""")

A long linked list of these will quickly exhaust the C-stack (set a ulimit on the stack if you have a lot of memory):

print "allocating a"
a=A(None)
for i in range(10**6):
    a=A(a)

print "deleting a"
del a
print "done deleting a"

Once you interleave with a python container type (tuple, for instance), the trashcan starts to kick in. The following runs without problem.

b=A(None)
print "allocating b"
for i in range(10**6):
    b=A((b,))

print "deleting b"
del b
print "done deleting b"

This issue came up as a side issue on #13896. The trashcan is a rather complicated thing to get working right. In particular, cython must take precautions to ensure that once deallocation is started on an object, it isn't put into the trashcan in deallocs of base types.

This is now upstream http://trac.cython.org/cython_trac/ticket/797

Upstream: Reported upstream. Developers acknowledge bug.

CC: @simon-king-jena

Component: memleak

Issue created by migration from https://trac.sagemath.org/ticket/13901

@nbruin nbruin added this to the sage-5.11 milestone Jan 3, 2013
@nbruin
Copy link
Contributor Author

nbruin commented Jan 3, 2013

Attachment: test.sage.gz

script to cause a C stack overflow.

@nbruin
Copy link
Contributor Author

nbruin commented Jan 3, 2013

comment:1

Example output:

sh-4.2$ ulimit -s 8000
sh-4.2$ sage test.sage
allocating b
deleting b
done deleting b
allocating a
deleting a
/usr/local/sage/5.0/spkg/bin/sage: line 468:  4393 Segmentation fault      (core dumped) python "$@"

@jpflori
Copy link

jpflori commented Jan 3, 2013

comment:3

In fact, I think the precautions taken are not enough for general cython classes. With the little

++_PyTrash_delete_nesting;
Py_TRASHCAN_SAFE_BEGIN(self);
--_PyTrash_delete_nesting;
...
++_PyTrash_delete_nesting;
Py_TRASHCAN_SAFE_END(self);
--_PyTrash_delete_nesting;

dance they are making sure there is room for one extra trashcan nesting provided that that call doesn't >use the same trick. However, a cython class could have a whole inheritance hierarchy going here (that >would all use this trick too!), so I'm pretty sure that the exact scenario they describe could still >happen. You'd need to know the depth of the inheritance line (for deallocs, multiple inheritance can't >happen, right?) and ensure there's enough room for all those.

About what you pointed out in #13896 and I've reproduced here where it now belongs,
I agree we can not use such a trick.
But I don't feel really confortable with this trick anyway.

It seems to me the trashcan macros where designed to avoid the situation you produced here, that is objects pointing to objects pointing to... which could eventually blow up the stack.

But then it also includes the fact that a given type can extend another one and so they had to use that trick to fool the trashcan mecanism.
This seems indeed necessary because it could surely lead to strange situation if the type specific action where happening between the trashcan macro and the super class action outside of it, especially if this super class doesn't use the trashcan macros.
And hopefully the trick is enough because I guess that at runtime through the Python interpreter all you create use the subtype_dealloc function which is skipping through all the inheritance till it reaches some class with another allocator which will hopefully be a base type not extending anything else or not using the trashcan macros.

The problem to deal with here is that each layer of Cython type will have its own dealloc and without any knowledge on the depth of the type tower we cannot so easily trick the trashcan macros indeed...
But shouldn't we just ignore such extensions? i.e. if we detect we are in the Cython world and giving hand to another Cython deallocator, we should not trick the trashcan magic by reincrementing _Py_Trash_delete_nesting until we hit a non Cython type.

@robertwb
Copy link
Contributor

robertwb commented Jan 3, 2013

comment:4

This is now http://trac.cython.org/cython_trac/ticket/797

@nbruin
Copy link
Contributor Author

nbruin commented Jan 3, 2013

comment:5

Replying to @jpflori:

But shouldn't we just ignore such extensions? i.e. if we detect we are in the Cython world and giving hand to another Cython deallocator, we should not trick the trashcan magic by reincrementing _Py_Trash_delete_nesting until we hit a non Cython type.

The problem there is that if the type gets trashcanned halfway up the tower, the pointer would be added to the trashcan we'd be returning all the way up to the leaf class dealloc instance. That would decref and free the memory block. Once the trashcan gets emptied, the pointer (now masquerading as an instance of one of the supers) gets processed again and again decreffed and freed. That's the double-free we're trying to avoid. For cython, the appropriate solution is probably something along the lines of

Py_TRASHCAN_SAFE_BEGIN(self); 
...
Clear_Weakrefs()
PY_CLEAR(slots)
...
--_PyTrash_delete_nesting;
super.dealloc(self) 
++_PyTrash_delete_nesting;
...
Py_TRASHCAN_SAFE_END(self);

i.e., PY_CLEAR etc. can lead to trashcanning, but not the calling of the dealloc of the super class. The python code figured that left a loophole open, but I think in cython that's not a loophole but a necessity: If you nest your inheritance deeper than the C-stack, you're asking for it.

@jpflori
Copy link

jpflori commented Jan 4, 2013

comment:6

Replying to @nbruin:

Replying to @jpflori:

But shouldn't we just ignore such extensions? i.e. if we detect we are in the Cython world and giving hand to another Cython deallocator, we should not trick the trashcan magic by reincrementing _Py_Trash_delete_nesting until we hit a non Cython type.

The problem there is that if the type gets trashcanned halfway up the tower, the pointer would be added to the trashcan we'd be returning all the way up to the leaf class dealloc instance. That would decref and free the memory block. Once the trashcan gets emptied, the pointer (now masquerading as an

I think I meant the other way around, that is emulating for Cython classes what subtype_dealloc does to skip the subtype themselves using subtype_dealloc.
This would ensure that a tower of Cython classes gets deallocated at once, even though the objects pointed to (which typically could lead to stack blow up) can be trashcan, just as is the case with a tower of CPython types using subtype_dealloc.

Of course implementing that will be more complicated than in the subtype_dealloc, because we have to know we are at the top and the kind of types above us.
And at each level we have some specific action to take (what is not the case with subtype_dealloc).

Not sure in fact now that the esaiest way is not to first check the depth of the type tower and use exactly the same trick as CPython but with that depth instead of 1 when decr/incrementing artificially the trashcan depth counter...

@jpflori

This comment has been minimized.

@jpflori
Copy link

jpflori commented Jan 4, 2013

Changed upstream from Reported upstream. No feedback yet. to Reported upstream. Developers acknowledge bug.

@jdemeyer jdemeyer modified the milestones: sage-5.11, sage-5.12 Aug 13, 2013
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.1, sage-6.2 Jan 30, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.2, sage-6.3 May 6, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.3, sage-6.4 Aug 10, 2014
@mkoeppe mkoeppe removed this from the sage-6.4 milestone Dec 29, 2022
@nbruin
Copy link
Contributor Author

nbruin commented Feb 21, 2023

According to cython/cython#2842 there is now a cython directive that makes the trashcan available in cython. Some testing to see that it meets the requirements in sage required

@mkoeppe
Copy link
Member

mkoeppe commented Feb 21, 2023

That cython PR is for their elusive 3.0 release. We still carry a backport patch for our cython 0.29.x: build/pkgs/cython/patches/trashcan.patch. It is currently only used in our patched version of cypari2.

@nbruin
Copy link
Contributor Author

nbruin commented Feb 22, 2023

Oh, OK. I saw a "backport" mention at the bottom of the closed ticket, but now I see it's on an unrelated project. One has to be careful with reading github mentions carefully, apparently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants