Memleak in UniqueRepresentation, @cached_method #12215

Closed
vbraun opened this issue Dec 21, 2011 · 190 comments

@vbraun

vbraun commented Dec 21, 2011

The documentation says that UniqueRepresentation uses weak references, but the implementation was switched over to the @cached_method decorator. The latter currently uses strong references, so unused unique parents stay in memory forever:

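import sage.structure.unique_representation
# Size of the cache before creating any new parents
len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)

# Each iteration rebinds the two names, so the previous parents become unreachable
for i in range(2,1000):
    ring = ZZ.quotient(ZZ(i))
    vectorspace = ring^2

import gc
gc.collect()
# The cache has grown anyway, because it holds strong references
len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)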

Related tickets:

Further notes:

  • Not everything in Python can be weakly referenced; None, for example, cannot.
  • Some results that are expensive to compute should not be cached by a weak reference only. Perhaps there is a place for a permanent cache, or for some minimal age before an entry may be garbage collected.
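
To make the difference concrete, here is a minimal plain-Python sketch (no Sage needed). The Parent class is a hypothetical stand-in for a unique parent, and the counts in the comments assume CPython's reference counting:

```python
import gc
import weakref

class Parent(object):
    """Hypothetical stand-in for a Sage parent; plain instances can be weakly referenced."""
    def __init__(self, key):
        self.key = key

strong_cache = {}                            # keeps every value alive forever
weak_cache = weakref.WeakValueDictionary()   # an entry vanishes together with its value

for i in range(2, 1000):
    strong_cache[i] = Parent(i)
    weak_cache[i] = Parent(i)    # no other reference survives this statement

gc.collect()
strong_count = len(strong_cache)   # 998: nothing was ever released
weak_count = len(weak_cache)       # 0 on CPython: every value was collected

# None is one of the objects that cannot be weakly referenced.
try:
    weakref.ref(None)
    none_is_weakrefable = True
except TypeError:
    none_is_weakrefable = False
```

The second note above is the flip side: with a purely weak cache, an expensive result is recomputed as soon as the last strong reference to it goes away.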

Apply

CC: @simon-king-jena @jdemeyer @mwhansen @vbraun @jpflori

Component: memleak

Keywords: UniqueRepresentation cached_method caching

Author: Simon King

Reviewer: Nils Bruin

Merged: sage-5.7.beta1

Issue created by migration from https://trac.sagemath.org/ticket/12215

@vbraun vbraun added this to the sage-5.0 milestone Dec 21, 2011

@simon-king-jena

comment:3

See my comment at #5970: It seems that a weak version of cached_function (which is used to decorate UniqueRepresentation.__classcall__) is the missing bit (in addition to #11521, #715 and a two-line change in the polynomial ring constructor) for fixing the issues at #5970.

I think this should be done on top of #11115, which rewrites cached methods and already has a positive review.

@simon-king-jena

Dependencies: #11115

@simon-king-jena

comment:5

Here is a patch. It isn't tested yet.

@simon-king-jena

comment:6

... and I immediately updated the patch: Join categories were not using unique representation but cached_function (by #11900). So, that had to change.

@simon-king-jena

Changed dependencies from #11115 to #11115 #11900

@simon-king-jena

comment:7

Sorry, it was impossible to use weak_cached_function on the join function in sage.categories.category, since it may return a list, which is not weakly referenceable. Hence, I had to work around that. With the attached patch (applied on top of #11900 and its dependencies), Sage at least starts...
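
For illustration, the limitation with lists can be reproduced in plain Python. The decorator below, weak_cached_sketch, is a hypothetical simplification and not the weak_cached_function from the patch; Category is likewise a stand-in class:

```python
import functools
import weakref

def weak_cached_sketch(func):
    """Hypothetical sketch: cache results by weak reference only."""
    cache = weakref.WeakValueDictionary()

    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            result = func(*args)
            cache[args] = result   # raises TypeError if result cannot be weakly referenced
            return result

    wrapper.cache = cache
    return wrapper

class Category(object):
    """Stand-in class; its instances can be weakly referenced."""
    def __init__(self, name):
        self.name = name

@weak_cached_sketch
def make_category(name):
    return Category(name)

@weak_cached_sketch
def make_list(name):
    return [name]

c1 = make_category("Rings")
c2 = make_category("Rings")
same_object = c1 is c2       # the cache hit returns the identical object

del c1, c2
cache_size_after_release = len(make_category.cache)   # 0 on CPython

# A list result cannot be stored in the weak cache at all.
try:
    make_list("Rings")
    list_ok = True
except TypeError:
    list_ok = False
```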

@simon-king-jena

comment:8

It turns out that all the patches can still not fix the problem. We also have to deal with sage.structure.factory.UniqueFactory.

I suggest adding an option to UniqueFactory that decides whether a strong or a weak cache is used. And I suggest doing this here, because I don't want to create yet another ticket.

UniqueFactory should mainly be applied in cases where weak references work. Therefore I suggest using the weak cache by default - I am curious how many doc tests will fail...

Coercion sucks.

@simon-king-jena

comment:9

It turns out that UniqueFactory already was somehow using weak references, but in an improper way. The new patch version replaces that by WeakValueDictionary.

It doesn't solve the problem, though.
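
As a rough sketch of the idea (not the actual UniqueFactory code): a factory whose cache is either a WeakValueDictionary or a plain dict, selected by an option. FactorySketch, its weak argument and the Ring class are all hypothetical names for this illustration:

```python
import gc
import weakref

class FactorySketch(object):
    """Hypothetical stand-in for a unique factory with a selectable cache."""
    def __init__(self, constructor, weak=True):
        self._constructor = constructor
        # Weak cache by default; a plain dict gives the old strong behaviour.
        self._cache = weakref.WeakValueDictionary() if weak else {}

    def __call__(self, *key):
        obj = self._cache.get(key)
        if obj is None:
            obj = self._constructor(*key)
            self._cache[key] = obj
        return obj

    def cache_size(self):
        return len(self._cache)

class Ring(object):
    """Stand-in for a parent created by the factory."""
    def __init__(self, modulus):
        self.modulus = modulus

weak_factory = FactorySketch(Ring, weak=True)
strong_factory = FactorySketch(Ring, weak=False)

for n in range(2, 50):
    weak_factory(n)      # result dropped immediately
    strong_factory(n)

gc.collect()
weak_size = weak_factory.cache_size()       # 0 on CPython
strong_size = strong_factory.cache_size()   # 48

# Uniqueness still holds for as long as somebody keeps the object alive.
r = weak_factory(7)
still_unique = weak_factory(7) is r
```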

@simon-king-jena

comment:10

I have slightly updated my patch, so that there is no conflict with #11935.

@simon-king-jena

comment:11

There is yet another location where it makes sense to use @weak_cached_function: For the cache of dynamic classes!

Namely, dynamic classes are frequently used in the category framework, they have a strong cache, and the parent/element classes keep a pointer to the category they belong to. So, that's preventing categories from being garbage collected.

I think that my patches from here, #715, and #11935 (which reduces the number of dynamic classes created) might actually be enough to fix the problem. When I run

sage: for p in primes(2,1000000):
....:     R = GF(p)['x','y','z']
....:     print get_memory_usage()

then one initially still sees an increased memory usage. But after a while it seems to stabilise.
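
The chain "class cache -> dynamic class -> category" can be sketched in plain Python. Everything here is hypothetical (Category, parent_class_for and the caches keyed by id); it only shows that a strong class cache pins the category, while a weak one does not:

```python
import gc
import weakref

class Category(object):
    """Stand-in for a category."""

def parent_class_for(category, cache):
    """Create a class that keeps a pointer back to its category and cache it."""
    cls = type("ParentClass", (object,), {"_category": category})
    cache[id(category)] = cls
    return cls

# Strong cache: the cached class keeps the category alive.
strong_class_cache = {}
cat = Category()
probe = weakref.ref(cat)
parent_class_for(cat, strong_class_cache)
del cat
gc.collect()
alive_with_strong_cache = probe() is not None   # True

strong_class_cache.clear()
gc.collect()   # classes sit in reference cycles, so a full collection is needed
alive_after_clearing = probe() is not None      # False

# Weak cache: class and category can both go away.
weak_class_cache = weakref.WeakValueDictionary()
cat2 = Category()
probe2 = weakref.ref(cat2)
parent_class_for(cat2, weak_class_cache)
del cat2
gc.collect()
alive_with_weak_cache = probe2() is not None    # False
```

Because classes are only freed by the cyclic collector, memory is released in batches rather than immediately, which may be part of why usage grows at first and then levels off.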

@simon-king-jena

comment:12

I have updated the patch. It documents the changes, and at least the tests in sage/misc/cachefunc.pyx, in sage/categories/..., in sage/rings/... and in sage/structure/unique_representation.py pass.

Hence, needs review!

@simon-king-jena

Author: Simon King

@simon-king-jena

Work Issues: segfaults for elliptic curves

@simon-king-jena

comment:15

While the tests in sage/categories, sage/rings and sage/structure/unique_representation.py pass, I get some segfaults for the elliptic curve tests. Thus, needs work.

@simon-king-jena

comment:16

I ran sage -t --verbose "devel/sage-main/sage/schemes/elliptic_curves/ell_point.py". No individual test segfaulted; instead, the test process itself crashed after all tests had passed:

830 tests in 54 items.
830 passed and 0 failed.
Test passed.
The doctested process was killed by signal 11
         [23.8 s]
 
----------------------------------------------------------------------
The following tests failed:


        sage -t --verbose "devel/sage-main/sage/schemes/elliptic_curves/ell_point.py" # Killed/crashed

Strange.

@simon-king-jena

comment:17

I think I found the problem.

Some doctest of the form

sage: K.residue_field()
<expected answer>

segfaults. But when the result is assigned to a variable, like this

sage: RF = K.residue_field(); RF
<expected answer>

then everything works.

Is it perhaps the case that garbage collection of the residue field (that was enabled by my patch) happens between the creation and the computation of the string representation of the object?

But that is strange. There are variables _ and __, which are supposed to provide strong references to the last two results - hence, there should be no garbage collection.
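
The lifetime difference between the two doctest forms can be imitated in plain Python. This is only a sketch of the hypothesis, not a reproduction of the segfault; ResidueField and residue_field are hypothetical stand-ins, and the behaviour relies on CPython's reference counting:

```python
import weakref

class ResidueField(object):
    """Stand-in for the object returned by K.residue_field()."""
    def __repr__(self):
        return "Residue field"

cache = weakref.WeakValueDictionary()

def residue_field():
    """Return the cached object, creating it if the weak cache lost it."""
    obj = cache.get("k")
    if obj is None:
        obj = ResidueField()
        cache["k"] = obj
    return obj

# Form 1: the result is not bound to a name, so it dies immediately.
residue_field()
collected_without_ref = "k" not in cache   # True on CPython

# Form 2: binding the result keeps it (and the cache entry) alive.
RF = residue_field()
kept_with_ref = "k" in cache               # True
```

In an interactive session the variable _ would play the role of RF, which is why a collection at that point looks surprising.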

@simon-king-jena

comment:18

sage.structure.factory.UniqueFactory did use weak references before. But it did so - I think - improperly, namely without using weakref.WeakValueDictionary. The new patch version changes that.

It isn't ready for review yet, because of the segfaults.

@simon-king-jena

comment:19

Some old code is not using the cache: There was some coerce map created in sage/rings/residue_field.pyx, whose parent was not created by Hom(domain,codomain), but directly by RingHomset(domain,codomain).

Changing it fixed at least one segfault. I wish all segfaults would go away so easily...

@simon-king-jena

comment:20

Fortunately, I now have a short example that triggers a memory access error when leaving Sage:

sage: E = EllipticCurve('15a1')
sage: K.<t>=NumberField(x^2+2*x+10)
sage: EK=E.base_extend(K)
sage: EK.torsion_subgroup()
Torsion Subgroup isomorphic to Z/4 + Z/4 associated to the Elliptic Curve defined by y^2 + x*y + y = x^3 + x^2 + (-10)*x + (-10) over Number Field in t with defining polynomial x^2 + 2*x + 10
sage: quit
Exiting Sage (CPU time 0m1.98s, Wall time 0m52.03s).
local/bin/sage-sage: line 303: 30045 Segmentation fault  sage-ipython "$@" -i

However, I wonder how I can trigger the error without leaving Sage, and how I can trace what is going on.

@simon-king-jena

comment:148

Replying to @jdemeyer:

With #12215+#13378 but without #12313:

sage -t  -force_lib devel/sage/sage/schemes/elliptic_curves/heegner.py

Ouch. Well, I hope I can reproduce it in Sage-5.6.rc0 debug version.

@simon-king-jena

comment:149

Fortunately I can confirm it (at least with MALLOC_CHECK_=3). I'm running it now under gdb.

@simon-king-jena

comment:150

What I get is:

Program received signal SIGSEGV, Segmentation fault.
0x00007fffedeb9f4c in __pyx_pf_4sage_9structure_11coerce_dict_16TripleDictEraser_2__call__ (__pyx_v_self=0x73982d0, __pyx_v_r=0x7fffea2aaf00) at sage/structure/coerce_dict.c:1107
1107      __pyx_t_10 = PyList_GET_ITEM(__pyx_t_1, (__pyx_v_h % PyList_GET_SIZE(__pyx_t_4)));
(gdb) bt
#0  0x00007fffedeb9f4c in __pyx_pf_4sage_9structure_11coerce_dict_16TripleDictEraser_2__call__ (__pyx_v_self=0x73982d0, __pyx_v_r=0x7fffea2aaf00) at sage/structure/coerce_dict.c:1107
#1  0x00007fffedeb9592 in __pyx_pw_4sage_9structure_11coerce_dict_16TripleDictEraser_3__call__ (__pyx_v_self=0x73982d0, __pyx_args=0x75bfd10, __pyx_kwds=0x0) at sage/structure/coerce_dict.c:966
#2  0x00007ffff79be33e in PyObject_Call (func=0x73982d0, arg=0x75bfd10, kw=0x0) at Objects/abstract.c:2529
#3  0x00007ffff79bf059 in PyObject_CallFunctionObjArgs (callable=0x73982d0) at Objects/abstract.c:2760
#4  0x00007ffff7a64194 in handle_callback (ref=0x7fffea2aaf00, callback=0x73982d0) at Objects/weakrefobject.c:881
#5  0x00007ffff7a645e9 in PyObject_ClearWeakRefs (object=0x90c07b0) at Objects/weakrefobject.c:965
#6  0x00007fffee53af5b in __pyx_tp_dealloc_4sage_9structure_15category_object_CategoryObject (o=0x90c07b0) at sage/structure/category_object.c:8990
#7  0x00007fffee7e6fe0 in __pyx_tp_dealloc_4sage_9structure_6parent_Parent (o=0x90c07b0) at sage/structure/parent.c:21519
#8  0x00007fffeea2aa7f in __pyx_tp_dealloc_4sage_9structure_10parent_old_Parent (o=0x90c07b0) at sage/structure/parent_old.c:7261
#9  0x00007fffeec3bca8 in __pyx_tp_dealloc_4sage_9structure_11parent_base_ParentWithBase (o=0x90c07b0) at sage/structure/parent_base.c:1876
#10 0x00007fffead58cbc in __pyx_tp_dealloc_4sage_9structure_11parent_gens_ParentWithGens (o=0x90c07b0) at sage/structure/parent_gens.c:5865
#11 0x00007ffff7a4ce4c in subtype_dealloc (self=0x90c07b0) at Objects/typeobject.c:1014
#12 0x00007ffff7a27be4 in _Py_Dealloc (op=0x90c07b0) at Objects/object.c:2243
#13 0x00007ffff7a480b0 in tupledealloc (op=0x845ab50) at Objects/tupleobject.c:220
#14 0x00007ffff7a27be4 in _Py_Dealloc (op=0x845ab50) at Objects/object.c:2243
#15 0x00007ffff7a480b0 in tupledealloc (op=0x7fffea2a0760) at Objects/tupleobject.c:220
#16 0x00007ffff7a27be4 in _Py_Dealloc (op=0x7fffea2a0760) at Objects/object.c:2243
#17 0x00007ffff7a18e8e in dict_dealloc (mp=0x9052b70) at Objects/dictobject.c:985
#18 0x00007ffff7a27be4 in _Py_Dealloc (op=0x9052b70) at Objects/object.c:2243
#19 0x00007ffff7a4cd73 in subtype_dealloc (self=0x85045a0) at Objects/typeobject.c:999
#20 0x00007ffff7a27be4 in _Py_Dealloc (op=0x85045a0) at Objects/object.c:2243
#21 0x00007fffed102313 in __pyx_tp_dealloc_4sage_10categories_7functor_Functor (o=0x7985950) at sage/categories/functor.c:3209
#22 0x00007fffecee64a1 in __pyx_tp_dealloc_4sage_10categories_6action_Action (o=0x7985950) at sage/categories/action.c:6461
#23 0x00007fffbcd614ea in __pyx_tp_dealloc_4sage_6matrix_6action_MatrixMulAction (o=0x7985950) at sage/matrix/action.c:4724
#24 0x00007ffff7a27be4 in _Py_Dealloc (op=0x7985950) at Objects/object.c:2243
#25 0x00007ffff7a02c8e in list_dealloc (op=0x759fa38) at Objects/listobject.c:309
#26 0x00007ffff7a27be4 in _Py_Dealloc (op=0x759fa38) at Objects/object.c:2243
#27 0x00007ffff7a02c8e in list_dealloc (op=0x7964858) at Objects/listobject.c:309
#28 0x00007ffff7a27be4 in _Py_Dealloc (op=0x7964858) at Objects/object.c:2243
#29 0x00007fffedecec13 in __pyx_tp_clear_4sage_9structure_11coerce_dict_TripleDict (o=0x3cbd8d0) at sage/structure/coerce_dict.c:5921
#30 0x00007ffff7b1378b in delete_garbage (collectable=0x7fffffff30f0, old=0x7ffff7dc1540 <generations+96>) at Modules/gcmodule.c:769
#31 0x00007ffff7b13d04 in collect (generation=2) at Modules/gcmodule.c:930
#32 0x00007ffff7b13f06 in collect_generations () at Modules/gcmodule.c:996
#33 0x00007ffff7b14bcc in _PyObject_GC_Malloc (basicsize=264) at Modules/gcmodule.c:1457
#34 0x00007ffff7b14c04 in _PyObject_GC_New (tp=0x7ffff7d9c5a0 <PyDict_Type>) at Modules/gcmodule.c:1467
#35 0x00007ffff7a16cc7 in PyDict_New () at Objects/dictobject.c:277
#36 0x00007fffeaa83860 in __pyx_f_4sage_4libs_4pari_3gen_12PariInstance_new_ref (__pyx_v_self=0xcceae0, __pyx_v_g=0xa968360, __pyx_v_parent=0xa8d8748) at sage/libs/pari/gen.c:49228
#37 0x00007fffea9fb417 in __pyx_pf_4sage_4libs_4pari_3gen_3gen_80__getitem__ (__pyx_v_self=0xa8d8748, __pyx_v_n=0x61f7f0) at sage/libs/pari/gen.c:8638
#38 0x00007fffea9f63b7 in __pyx_pw_4sage_4libs_4pari_3gen_3gen_81__getitem__ (__pyx_v_self=0xa8d8748, __pyx_v_n=0x61f7f0) at sage/libs/pari/gen.c:7643
#39 0x00007fffeaa9a688 in __pyx_sq_item_4sage_4libs_4pari_3gen_gen (o=0xa8d8748, i=1) at sage/libs/pari/gen.c:55757
#40 0x00007ffff79bcdd7 in PySequence_GetItem (s=0xa8d8748, i=1) at Objects/abstract.c:1989
#41 0x00007ffff7a01934 in iter_iternext (iterator=0xa99c300) at Objects/iterobject.c:58
#42 0x00007ffff7a04abe in listextend (self=0xa7db060, b=0xa8d8748) at Objects/listobject.c:872
#43 0x00007ffff7a08ad9 in list_init (self=0xa7db060, args=0xa7a3920, kw=0x0) at Objects/listobject.c:2458
#44 0x00007ffff7a4c1ad in type_call (type=0x7ffff7d9a3c0 <PyList_Type>, args=0xa7a3920, kwds=0x0) at Objects/typeobject.c:737
#45 0x00007ffff79be33e in PyObject_Call (func=0x7ffff7d9a3c0 <PyList_Type>, arg=0xa7a3920, kw=0x0) at Objects/abstract.c:2529
#46 0x00007fffea9eb6f1 in __pyx_pf_4sage_4libs_4pari_3gen_3gen_12list (__pyx_v_self=0xa8d87d0) at sage/libs/pari/gen.c:4507
#47 0x00007fffea9eb4a0 in __pyx_pw_4sage_4libs_4pari_3gen_3gen_13list (__pyx_v_self=0xa8d87d0, unused=0x0) at sage/libs/pari/gen.c:4455
#48 0x00007ffff7a21156 in PyCFunction_Call (func=0xa85b858, arg=0x7ffff7f90060, kw=0x0) at Objects/methodobject.c:90
#49 0x00007ffff79be33e in PyObject_Call (func=0xa85b858, arg=0x7ffff7f90060, kw=0x0) at Objects/abstract.c:2529
#50 0x00007fffe14ff0aa in __pyx_pf_4sage_5rings_10polynomial_25polynomial_rational_flint_25Polynomial_rational_flint_6__init__ (__pyx_v_self=0xa781258, __pyx_v_parent=0x135f730, __pyx_v_x=0xa8d87d0, 
    __pyx_v_check=0x7ffff7d89ec0 <_Py_TrueStruct>, __pyx_v_is_gen=0x7ffff7d89e80 <_Py_ZeroStruct>, __pyx_v_construct=0x7ffff7d89e80 <_Py_ZeroStruct>)
    at sage/rings/polynomial/polynomial_rational_flint.cpp:5760
#51 0x00007fffe14fc966 in __pyx_pw_4sage_5rings_10polynomial_25polynomial_rational_flint_25Polynomial_rational_flint_7__init__ (__pyx_v_self=0xa781258, __pyx_args=0xa6f97d0, __pyx_kwds=0xa83c050)
    at sage/rings/polynomial/polynomial_rational_flint.cpp:5165
#52 0x00007ffff7a4c1ad in type_call (type=0x7fffe174bc20 <__pyx_type_4sage_5rings_10polynomial_25polynomial_rational_flint_Polynomial_rational_flint>, args=0xa6f97d0, kwds=0xa83c050)
    at Objects/typeobject.c:737
#53 0x00007ffff79be33e in PyObject_Call (func=0x7fffe174bc20 <__pyx_type_4sage_5rings_10polynomial_25polynomial_rational_flint_Polynomial_rational_flint>, arg=0xa6f97d0, kw=0xa83c050)
    at Objects/abstract.c:2529
#54 0x00007ffff7ac8282 in ext_do_call (func=0x7fffe174bc20 <__pyx_type_4sage_5rings_10polynomial_25polynomial_rational_flint_Polynomial_rational_flint>, pp_stack=0x7fffffff3a98, flags=2, na=4, nk=1)
    at Python/ceval.c:4334
#55 0x00007ffff7ac1a8b in PyEval_EvalFrameEx (f=0xa663520, throwflag=0) at Python/ceval.c:2705
#56 0x00007ffff7ac420b in PyEval_EvalCodeEx (co=0x7fffe303bb40, globals=0x10a78b0, locals=0x0, args=0xa8acf10, argcount=2, kws=0x0, kwcount=0, defs=0x7fffe17556e8, defcount=4, closure=0x0)
    at Python/ceval.c:3253
#57 0x00007ffff79fd447 in function_call (func=0x7fffdfc76ae0, arg=0xa8acee8, kw=0x0) at Objects/funcobject.c:526
#58 0x00007ffff79be33e in PyObject_Call (func=0x7fffdfc76ae0, arg=0xa8acee8, kw=0x0) at Objects/abstract.c:2529
#59 0x00007ffff79da359 in instancemethod_call (func=0x7fffdfc76ae0, arg=0xa8acee8, kw=0x0) at Objects/classobject.c:2578
#60 0x00007ffff79be33e in PyObject_Call (func=0x7ffff0cdf060, arg=0x7fffea521990, kw=0x0) at Objects/abstract.c:2529
#61 0x00007fffe6888d3b in __pyx_f_4sage_9structure_11coerce_maps_24DefaultConvertMap_unique__call_ (__pyx_v_self=0x7fffbcf7f3f0, __pyx_v_x=0xa8d87d0, __pyx_skip_dispatch=0)
    at sage/structure/coerce_maps.c:3485
#62 0x00007fffee7acda1 in __pyx_pf_4sage_9structure_6parent_6Parent_28__call__ (__pyx_v_self=0x135f730, __pyx_v_x=0xa8d87d0, __pyx_v_args=0x7ffff7f90060, __pyx_v_kwds=0xa751480)
    at sage/structure/parent.c:7415
#63 0x00007fffee7ac0a4 in __pyx_pw_4sage_9structure_6parent_6Parent_29__call__ (__pyx_v_self=0x135f730, __pyx_args=0xa80fc30, __pyx_kwds=0x0) at sage/structure/parent.c:7096
#64 0x00007ffff79be33e in PyObject_Call (func=0x135f730, arg=0xa80fc30, kw=0x0) at Objects/abstract.c:2529
#65 0x00007fffddd720a3 in __pyx_pf_4sage_5rings_12number_field_20number_field_element_18NumberFieldElement_2__init__ (__pyx_v_self=0xa860400, __pyx_v_parent=0x331b730, __pyx_v_f=0xa8d87d0)
    at sage/rings/number_field/number_field_element.cpp:6090
#66 0x00007fffddd6d545 in __pyx_pw_4sage_5rings_12number_field_20number_field_element_18NumberFieldElement_3__init__ (__pyx_v_self=0xa860400, __pyx_args=0xa85eab0, __pyx_kwds=0x0)
    at sage/rings/number_field/number_field_element.cpp:5340
#67 0x00007ffff7a595f6 in wrap_init (self=0xa860400, args=0xa85eab0, 
    wrapped=0x7fffddd6d316 <__pyx_pw_4sage_5rings_12number_field_20number_field_element_18NumberFieldElement_3__init__(PyObject*, PyObject*, PyObject*)>, kwds=0x0) at Objects/typeobject.c:4719
#68 0x00007ffff79e2145 in wrapper_call (wp=0xa99c220, args=0xa85eab0, kwds=0x0) at Objects/descrobject.c:998
#69 0x00007ffff79be33e in PyObject_Call (func=0xa99c220, arg=0xa85eab0, kw=0x0) at Objects/abstract.c:2529
#70 0x00007ffff7ac6404 in PyEval_CallObjectWithKeywords (func=0xa99c220, arg=0xa85eab0, kw=0x0) at Python/ceval.c:3890
#71 0x00007ffff79e1194 in wrapperdescr_call (descr=0x7fffde6d9ae0, args=0xa85eab0, kwds=0x0) at Objects/descrobject.c:306
#72 0x00007ffff79be33e in PyObject_Call (func=0x7fffde6d9ae0, arg=0xa881360, kw=0x0) at Objects/abstract.c:2529
#73 0x00007fffddaddacf in __pyx_pf_4sage_5rings_12number_field_30number_field_element_quadratic_28NumberFieldElement_quadratic___init__ (__pyx_v_self=0xa860400, __pyx_v_parent=0x331b730, 
    __pyx_v_f=0xa8d87d0) at sage/rings/number_field/number_field_element_quadratic.cpp:3893
#74 0x00007fffddadbbdb in __pyx_pw_4sage_5rings_12number_field_30number_field_element_quadratic_28NumberFieldElement_quadratic_1__init__ (__pyx_v_self=0xa860400, __pyx_args=0xa85ca38, __pyx_kwds=0x0)
    at sage/rings/number_field/number_field_element_quadratic.cpp:3386
#75 0x00007ffff7a4c1ad in type_call (type=0x7fffddd15020 <__pyx_type_4sage_5rings_12number_field_30number_field_element_quadratic_NumberFieldElement_quadratic>, args=0xa85ca38, kwds=0x0)
    at Objects/typeobject.c:737
#76 0x00007ffff79be33e in PyObject_Call (func=0x7fffddd15020 <__pyx_type_4sage_5rings_12number_field_30number_field_element_quadratic_NumberFieldElement_quadratic>, arg=0xa85ca38, kw=0x0)
    at Objects/abstract.c:2529
#77 0x00007ffff7ac7bc3 in do_call (func=0x7fffddd15020 <__pyx_type_4sage_5rings_12number_field_30number_field_element_quadratic_NumberFieldElement_quadratic>, pp_stack=0x7fffffff4a10, na=2, nk=0)
    at Python/ceval.c:4239
#78 0x00007ffff7ac6efc in call_function (pp_stack=0x7fffffff4a10, oparg=2) at Python/ceval.c:4044
#79 0x00007ffff7ac17f9 in PyEval_EvalFrameEx (f=0x33b1e00, throwflag=0) at Python/ceval.c:2666
#80 0x00007ffff7ac71f4 in fast_function (func=0x7fffddd47450, pp_stack=0x7fffffff4d90, n=2, na=2, nk=0) at Python/ceval.c:4107
#81 0x00007ffff7ac6ee0 in call_function (pp_stack=0x7fffffff4d90, oparg=1) at Python/ceval.c:4042
#82 0x00007ffff7ac17f9 in PyEval_EvalFrameEx (f=0x37a3270, throwflag=0) at Python/ceval.c:2666
#83 0x00007ffff7ac420b in PyEval_EvalCodeEx (co=0x7fffde989ca0, globals=0x1108b20, locals=0x0, args=0xa850f10, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#84 0x00007ffff79fd447 in function_call (func=0x7fffddd43108, arg=0xa850ee8, kw=0x0) at Objects/funcobject.c:526
#85 0x00007ffff79be33e in PyObject_Call (func=0x7fffddd43108, arg=0xa850ee8, kw=0x0) at Objects/abstract.c:2529
#86 0x00007ffff79da359 in instancemethod_call (func=0x7fffddd43108, arg=0xa850ee8, kw=0x0) at Objects/classobject.c:2578
#87 0x00007ffff79be33e in PyObject_Call (func=0x7fffc2ecc360, arg=0xa70d1b0, kw=0x0) at Objects/abstract.c:2529
#88 0x00007fffe6888d3b in __pyx_f_4sage_9structure_11coerce_maps_24DefaultConvertMap_unique__call_ (__pyx_v_self=0x7fffbcd4b780, __pyx_v_x=0xa8d87d0, __pyx_skip_dispatch=0)
    at sage/structure/coerce_maps.c:3485
#89 0x00007fffee7acda1 in __pyx_pf_4sage_9structure_6parent_6Parent_28__call__ (__pyx_v_self=0x331b730, __pyx_v_x=0xa8d87d0, __pyx_v_args=0x7ffff7f90060, __pyx_v_kwds=0xa9d32c0)
    at sage/structure/parent.c:7415
#90 0x00007fffee7ac0a4 in __pyx_pw_4sage_9structure_6parent_6Parent_29__call__ (__pyx_v_self=0x331b730, __pyx_args=0xa7a3a70, __pyx_kwds=0x0) at sage/structure/parent.c:7096
#91 0x00007ffff79be33e in PyObject_Call (func=0x331b730, arg=0xa7a3a70, kw=0x0) at Objects/abstract.c:2529
#92 0x00007ffff7ac6404 in PyEval_CallObjectWithKeywords (func=0x331b730, arg=0xa7a3a70, kw=0x0) at Python/ceval.c:3890
#93 0x00007ffff7ab2344 in builtin_map (self=0x0, args=0xa8dde70) at Python/bltinmodule.c:1038
#94 0x00007ffff7a210f2 in PyCFunction_Call (func=0x7ffff7f52060, arg=0xa8dde70, kw=0x0) at Objects/methodobject.c:81
#95 0x00007ffff7ac6cdc in call_function (pp_stack=0x7fffffff5900, oparg=2) at Python/ceval.c:4021
#96 0x00007ffff7ac17f9 in PyEval_EvalFrameEx (f=0x3944d10, throwflag=0) at Python/ceval.c:2666
...

I see a couple of familiar names in the backtrace...

@simon-king-jena

Crash log

@simon-king-jena

comment:151

Attachment: sage_crash_WgD9iG.log

Thanks to Volker's enhanced backtraces, running the test in verbose mode and without gdb yields this backtrace, and the crash occurs here (line 6467 of heegner.py)

        sage: E = EllipticCurve('681b')
        sage: I = E.heegner_index(-8); I

Unfortunately, running this in an interactive session works just fine.

@simon-king-jena

comment:152

Got it, I think.

The crash happens in the last line of the following snippet

  __pyx_t_1 = __pyx_v_self->D->buckets;
  __Pyx_INCREF(__pyx_t_1);
  __pyx_t_4 = __pyx_v_self->D->buckets;
  __Pyx_INCREF(__pyx_t_4);
  __pyx_t_10 = PyList_GET_ITEM(__pyx_t_1, (__pyx_v_h % PyList_GET_SIZE(__pyx_t_4)));

and according to the crash log, we have

        __pyx_t_1 = 0x7f3db87dcb00 <_Py_NoneStruct>

Hence, again, we have the problem that some attributes have already become invalid. I think this was fixed by the second patch from #12313.

Suggestion: In order to keep things modular, the part of the second #12313 patch that applies to TripleDict should be moved here, so that #12215 remains independent of #12313. The second patch of #12313 should then be replaced by something that only takes care of the new MonoDict.

Rationale: #12313 has a problem with a time regression, while #12215 should (hopefully) be fine after installing the fix.
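
The kind of guard meant here can be sketched in plain Python. DictSketch and EraserSketch are hypothetical, heavily simplified stand-ins for TripleDict and TripleDictEraser, and setting buckets to None merely simulates a dictionary whose attributes have already been cleared:

```python
import weakref

class Key(object):
    """Weakly referenceable stand-in for a parent used as a key."""

class EraserSketch(object):
    """Weak reference callback that removes dead keys from the dictionary."""
    def __init__(self, D):
        self.D = D          # strong reference to the dictionary
        self.skipped = 0

    def __call__(self, ref):
        buckets = self.D.buckets
        if buckets is None:
            # The dictionary is being torn down; nothing is left to erase.
            self.skipped += 1
            return
        h = ref.key
        bucket = buckets[h % len(buckets)]
        bucket[:] = [item for item in bucket if item[0] != h]

class DictSketch(object):
    def __init__(self, size=7):
        self.buckets = [[] for _ in range(size)]
        self.eraser = EraserSketch(self)

    def set(self, key, value):
        h = id(key)
        ref = weakref.KeyedRef(key, self.eraser, h)   # callback fires when key dies
        self.buckets[h % len(self.buckets)].append((h, ref, value))

    def __len__(self):
        return sum(len(b) for b in self.buckets)

# Normal operation: the entry disappears together with its key.
D = DictSketch()
k = Key()
D.set(k, "value")
size_before = len(D)   # 1
del k
size_after = len(D)    # 0

# Simulated teardown: buckets already cleared when the callback runs.
k2 = Key()
D.set(k2, "other")
saved = D.buckets      # keep the weak references alive so the callback still fires
D.buckets = None
del k2
skipped = D.eraser.skipped   # 1: the guard returned early
```

In pure Python, a missing guard would only produce an ignored exception inside the callback. In the compiled code above, PyList_GET_ITEM is applied to None without any check, which is consistent with the segfault in the backtrace.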

@simon-king-jena

Safer callback in TripleDictEraser

@simon-king-jena

comment:153

Attachment: trac12215_safe_callback.patch.gz

The patch is up, and it fixes the crash in heegner.py (tested in sage-5.6.rc0 debug version with MALLOC_CHECK_=3).

Apply trac12215_weak_cached_function_combined.patch trac12215_safe_callback.patch

@nbruin

nbruin commented Jan 22, 2013

comment:154

Yes, this solves the problem here as well, so positive review.

It looks like the analysis on #12313:226 and the patch that followed from it was based on this ticket. I probably pulled the non-raw patches for #12313 when I tested ... Should we factor out a utility from the Patchbot to pull and apply patches given a ticket number?

Happy to see this work did find some use after all. Again, I believe that in the future, when TripleDictEraser holds a weakref to its dictionary, this won't be necessary anymore, because the weakref will be broken before attributes on the dictionary get erased.
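
A sketch of that future design, again with hypothetical stand-in classes: the eraser holds only a weak reference to its dictionary, so once the dictionary is gone the callback sees None and returns without touching any attribute.

```python
import weakref

class Key(object):
    """Weakly referenceable stand-in for a parent used as a key."""

class WeakEraserSketch(object):
    """Callback that refers to its dictionary only weakly."""
    def __init__(self, D):
        self.D = weakref.ref(D)
        self.orphaned = 0

    def __call__(self, ref):
        D = self.D()
        if D is None:
            # The dictionary no longer exists, so there is nothing to erase.
            self.orphaned += 1
            return
        h = ref.key
        bucket = D.buckets[h % len(D.buckets)]
        bucket[:] = [item for item in bucket if item[0] != h]

class DictSketch(object):
    def __init__(self, size=7):
        self.buckets = [[] for _ in range(size)]
        self.eraser = WeakEraserSketch(self)

    def set(self, key, value):
        h = id(key)
        ref = weakref.KeyedRef(key, self.eraser, h)
        self.buckets[h % len(self.buckets)].append((h, ref, value))

D = DictSketch()
eraser = D.eraser
key = Key()
D.set(key, "value")

# Keep the stored weak references alive beyond the dictionary's lifetime.
items = [item for bucket in D.buckets for item in bucket]

del D
dict_gone = eraser.D() is None   # True on CPython: no cycle keeps D alive

del key                          # the callback fires and finds the dead weakref
orphaned = eraser.orphaned       # 1
```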

That enhanced traceback (including Cython code!) is extremely cool. A big thanks to Volker for making that happen. With that traceback, you only have to stare at the traceback to diagnose this problem.

@simon-king-jena

comment:155

Just for the record: All tests pass on my openSUSE laptop in the debug version of sage-5.6.rc0+#13878+#13378+#12215, with MALLOC_CHECK_=3.

@jdemeyer

Merged: sage-5.7.beta1
