
Parents probably not reclaimed due to too much caching #715

Closed
robertwb opened this issue Sep 20, 2007 · 422 comments

Comments

@robertwb
Contributor

Here is a small example illustrating the issue.

The memory footprint of the following piece of code grows indefinitely.

sage: K = GF(1<<55,'t') 
sage: a = K.random_element() 
sage: while 1: 
....:     E = EllipticCurve(j=a); P = E.random_point(); 2*P; del E, P;

E and P get deleted, but when 2*P is computed, the action of the integers on A, the abelian group of rational points of the elliptic curve, gets cached in the coercion model.

A key-value pair is left in the coercion_model._action_maps dict:

(ZZ,A,*) : IntegerMulAction

Moreover, there is at least one more reference to A in the IntegerMulAction, and one in ZZ._action_hash.

So weak references should be used in all these places, provided this does not slow things down too much.
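
For illustration, one can watch the cache grow (a sketch: _action_maps is a private attribute and its container type has changed over time, so the len() calls may need adjusting):

sage: cm = sage.structure.element.get_coercion_model()
sage: n = len(cm._action_maps)
sage: E = EllipticCurve(j=a); P = E.random_point(); _ = 2*P
sage: del E, P
sage: len(cm._action_maps) - n   # at least the (ZZ, A, *) entry survives the del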

To be merged with #11521. Apply:

and then the patches from #11521.

Depends on #13145
Depends on #13741
Depends on #13746
Depends on #11521

Dependencies: #13145, #13741, #13746, to be merged with #11521

CC: @jpflori @zimmermann6 @vbraun @robertwb @nbruin @malb @orlitzky

Component: coercion

Keywords: weak cache coercion Cernay2012

Author: Simon King, Jean-Pierre Flori

Reviewer: Jean-Pierre Flori, Simon King, Nils Bruin

Merged: sage-5.5.beta0

Issue created by migration from https://trac.sagemath.org/ticket/715

@sagetrac-mabshoff sagetrac-mabshoff mannequin added the t: feature label Sep 23, 2007
@sagetrac-mabshoff sagetrac-mabshoff mannequin added this to the sage-2.10.2 milestone Feb 3, 2008
@sagetrac-mabshoff sagetrac-mabshoff mannequin removed the t: feature label Feb 3, 2008
@mwhansen
Contributor

comment:3

I think this is a bit too vague for a ticket. Robert, could you be more specific or close this?

@robertwb
Contributor Author

comment:4

The coercion model needs to use weakrefs so that parents are not kept alive by stale references after they have been discarded. It is nontrivial to see where the weakrefs need to go, and how to introduce them without slowing the code down.

The ticket is still valid.


@jpflori

jpflori commented Jul 4, 2011

comment:7

With the piece of code in the description, there is only one reference to these objects in that ZZ._hash_actions dictionary: to build it we test whether A1 == A2 rather than A1 is A2 (as in coercion_model._action_maps), and because of the current implementation of elliptic curves (see http://groups.google.com/group/sage-nt/browse_thread/thread/ec8d0ad14a819082 and #11474) and because the above code uses only one j-invariant, only one entry finally gets stored.

However, with random curves, I guess all of them would be kept.

About the weakrefs, the idea would rather be to build something like a WeakKeyDictionary, provided it does not slow down coercion too much...
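
As a plain-Python illustration of that idea (a minimal sketch; the class P below is just a hypothetical stand-in for a parent):

sage: import weakref, gc
sage: class P(object): pass
sage: cache = weakref.WeakKeyDictionary()
sage: A = P()
sage: cache[A] = "IntegerMulAction"
sage: len(cache)
1
sage: del A; _ = gc.collect()
sage: len(cache)   # the entry disappeared together with its key
0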

@nbruin
Contributor

nbruin commented Jul 4, 2011

comment:8

The following example also exhibits a suspicious, steady growth in memory use. The only reason I can think of why that would happen is that references to the created finite field remain lying around somewhere, preventing deallocation:

sage: L=prime_range(10^8)
sage: for p in L: k=GF(p)

If you change it to the creation of a polynomial ring the memory use rises much faster:

sage: L=prime_range(10^8)
sage: for p in L: k=GF(p)['t']

Are "unique" parents simply never deallocated?

@jpflori

jpflori commented Jul 4, 2011

comment:9

Be aware that polynomial rings are also cached because of the uniqueness of parents, which partly explains your second observation; see #5970 for example.

For finite fields I did not check.
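
The uniqueness caching itself is easy to observe for polynomial rings: constructing the same ring twice returns the identical object.

sage: R1 = GF(5)['t']
sage: R2 = GF(5)['t']
sage: R1 is R2
True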

@jpflori

jpflori commented Oct 24, 2011

comment:11

See #11521 for some concrete instances of this problem and some advice to investigate it.

@simon-king-jena
Member

comment:12

In my code for the computation of Ext algebras of basic algebras, I use letterplace algebras (see #7797), and they involve the creation of many polynomial rings. Only one of them is used at a time, so the others could be garbage collected. But they aren't, and I suspect this is because of the strong references in the coercion cache.

See the following example (using #7797)

sage: F.<a,b,c> = FreeAlgebra(GF(4,'z'), implementation='letterplace')
sage: import gc
sage: len(gc.get_objects())
170947
sage: a*b*c*b*c*a*b*c
a*b*c*b*c*a*b*c
sage: len(gc.get_objects())
171556
sage: del F,a,b,c
sage: gc.collect()
81
sage: len(gc.get_objects())
171448
sage: cm = sage.structure.element.get_coercion_model()
sage: cm.reset_cache()
sage: gc.collect()
273
sage: len(gc.get_objects())
171108

That is certainly not a proof of my claim, but it indicates that it might be worthwhile to investigate.

In order to facilitate work, I am providing some other tickets that may be related to this:

I guess that one should use a cache model similar to what I did in #11521: the key for the cache should not just be (domain, codomain), because we want garbage collection of a cache item to be allowed as soon as just one of domain or codomain is collectable.

@simon-king-jena
Member

comment:13

I am trying to wrap my mind around weak references. I found that when creating a weak reference, one can also provide a callback that is invoked when the weak reference becomes invalid. I propose to use such a callback to erase entries involving the deleted object from the cache, regardless of whether it appears as domain or codomain.

Here is a proof of concept:

sage: import weakref
sage: ref = weakref.ref
sage: D = {}
sage: def remove(x):
....:     for a,b,c in D.keys():
....:         if a is x or b is x or c is x:
....:             D.__delitem__((a,b,c))
....:             
sage: class A:
....:     def __init__(self,x):
....:         self.x = x
....:     def __repr__(self):
....:         return str(self.x)
....:     def __del__(self):
....:         print "deleting",self.x
....:         
sage: a = A(5)
sage: b = A(6)
sage: r = ref(a,remove)
sage: s = ref(b,remove)
sage: D[r,r,s] = 1
sage: D[s,r,s] = 2
sage: D[s,s,s] = 3
sage: D[s,s,1] = 4
sage: D[r,s,1] = 5
sage: D.values()
[5, 3, 1, 4, 2]
sage: del a
deleting 5
sage: D.values()
[4, 3]
sage: del b
deleting 6
sage: D.values()
[]

@simon-king-jena
Member

comment:14

It turns out that using weak references in the coercion cache will not be enough. Apparently there are other direct references that have to be dealt with.

@simon-king-jena
Member

comment:15

I wonder whether the problem has already been solved. I just tested the example from the ticket description, and get (at least with #11900, #11521 and #11115):

sage: K = GF(1<<55,'t')
sage: a = K.random_element()
sage: m0 = get_memory_usage()
sage: for i in range(1000):
....:     E = EllipticCurve(j=a); P = E.random_point(); PP = 2*P
....:     
sage: get_memory_usage() - m0
15.22265625

I think that this is not particularly scary. I'll repeat the test with vanilla sage-4.8.alpha3, but this will take a while to rebuild.

@simon-king-jena
Member

comment:16

No, even in vanilla sage-4.8.alpha3 I don't find a scary memory leak in this example.

Do we have a better example? One could, of course, argue that one should use weak references for caching even if we do not find an apparent memory leak. I am preparing a patch for it now.

@simon-king-jena
Member

comment:17

Here is an experimental patch.

A new test shows that the weak caching actually works.

Note that the patch also introduces a weak cache for polynomial rings, which might be better to put into #5970. Well, we can sort things out later...

@simon-king-jena
Member

comment:18

It needs work, though. Some tests in sage/structure fail, partially because of pickling, and partially because some tests do not follow the new specification of TripleDict (namely that the first two parts of each key triple and the associated value must be weak-referenceable).

@simon-king-jena
Member

comment:19

Now I wonder: should I try to use weak references, but make the cache also accept keys that do not allow for weak references?

In the intended applications, weak references are possible. But in some tests and in the pickle jar, the "wrong" types of keys (namely strings and ints) are used.

@simon-king-jena
Member

comment:20

The only place where the weak references are created is in the set(...) method of TripleDict. I suggest simply catching the error that may occur when creating a weak reference, and then using a different way of storing the key. I am now running tests, and I hope that this ticket will be "needs review" in a few hours.
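
In plain Python, the fallback could look like this (a minimal sketch of the strategy just described, not the actual TripleDict code):

sage: import weakref
sage: def wrap(k):
....:     try:
....:         return weakref.ref(k)   # weakly reference this key part if possible
....:     except TypeError:
....:         return k                # e.g. str or int: store the key itself

TripleDict.set(...) would apply such a wrapper to the first two parts of each key triple, falling back to a strong reference whenever weakref.ref raises a TypeError.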

@simon-king-jena
Member

comment:21

With the attached patch, all tests pass for me, and the new features are doctested. Needs review!

@simon-king-jena
Member

Author: Simon King

@simon-king-jena
Member

Changed keywords from none to weak cache coercion

@simon-king-jena
Member

Dependencies: #11900

@vbraun
Member

vbraun commented Dec 26, 2012

comment:360

I noticed that the SAGE_DEBUG documentation doesn't quite match what we are doing with it. So I proposed to change it at #13865.

@jpflori

jpflori commented Dec 27, 2012

comment:361

Hopefully the updated Cython 0.17.3 at #13832 will fix the last bugs we encountered.
It indeed includes a fix concerning deallocation of weak-referenceable cdef'ed classes; see
https://groups.google.com/d/topic/cython-users/4es75DeacRA/discussion for the release announcement and
https://groups.google.com/d/topic/cython-users/K5EFvq22UNI/discussion for a previous bug report.
So the end of the story is that the intensive use of weakrefs made here merely revealed bugs already present in Sage, which by some chance had never produced segfaults before.

See also some comments on testing Sage with a pydebug-enabled Python at #13864, and the long thread at
https://groups.google.com/d/topic/sage-devel/Wt7uxbDkh_A/discussion

@jdemeyer

comment:362

Please note #13870.

@jdemeyer

jdemeyer commented Jan 7, 2013

comment:363

More bad news: #715 + #11521 cause a significant slowdown for the command

sage: time p = polar_plot(lambda t: (100/(100+(t-pi/2)^8))*(2-sin(7*t)-cos(30*t)/2), -pi/4, 3*pi/2, color="red",plot_points=1000)

from 22 to 33 seconds. See https://groups.google.com/forum/?fromgroups#!topic/sage-devel/EzFPIG6EFMI

@simon-king-jena
Member

comment:364

Without #715 and friends,

sage: %prun p = polar_plot(lambda t: (100/(100+(t-pi/2)^8))*(2-sin(7*t)-cos(30*t)/2), -pi/4, 3*pi/2, color="red",plot_points=1000) 

yields

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    88368   12.267    0.000   20.873    0.000 arith.py:1439(gcd)
     9263    9.309    0.001   32.102    0.003 <string>:1(<lambda>)
79788/39894    2.004    0.000    2.865    0.000 lazy_attribute.py:506(__get__)
    39894    1.599    0.000    4.681    0.000 homset.py:296(__init__)
    97631    1.145    0.000    1.737    0.000 arith.py:1611(lcm)
    19950    0.961    0.000    6.910    0.000 homset.py:40(Hom)
   185999    0.879    0.000    0.880    0.000 {method 'canonical_coercion' of 'sage.structure.coerce.CoercionModel_cache_maps' objects}
 8263/999    0.824    0.000   29.602    0.030 plot.py:2307(adaptive_refinement)
    39895    0.373    0.000    1.783    0.000 {hasattr}
   159576    0.328    0.000    0.328    0.000 {getattr}
    14601    0.309    0.000    0.510    0.000 quotient_fields.py:55(gcd)
    39890    0.304    0.000    5.033    0.000 homset.py:573(__init__)
   116811    0.259    0.000    0.259    0.000 weakref.py:55(__getitem__)

With #715, it becomes

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    89840   43.019    0.000   68.180    0.001 arith.py:1489(gcd)
     9415   24.524    0.003   97.043    0.010 <string>:1(<lambda>)
82004/41002    5.752    0.000    7.564    0.000 lazy_attribute.py:506(__get__)
    41002    4.583    0.000   12.597    0.000 homset.py:353(__init__)
    20504    4.108    0.000   19.894    0.001 homset.py:80(Hom)
   189095    2.924    0.000    2.925    0.000 {method 'canonical_coercion' of 'sage.structure.coerce.CoercionModel_cache_maps' objects}
    99255    2.392    0.000    3.942    0.000 arith.py:1661(lcm)
 8415/999    1.517    0.000   88.121    0.088 plot.py:2316(adaptive_refinement)
   205132    1.118    0.000    1.699    0.000 weakref.py:223(__new__)
   164064    1.088    0.000    1.088    0.000 weakref.py:228(__init__)
    41003    0.979    0.000    5.099    0.000 {hasattr}
   205132    0.581    0.000    0.581    0.000 {built-in method __new__ of type object at 0x7f9b33e874a0}
   164008    0.578    0.000    0.578    0.000 {getattr}
    40998    0.546    0.000   13.200    0.000 homset.py:630(__init__)
   119635    0.545    0.000    0.545    0.000 weakref.py:55(__getitem__)
    14954    0.532    0.000    0.813    0.000 quotient_fields.py:55(gcd)
    20499    0.424    0.000    8.366    0.000 rings.py:635(__new__)
   133072    0.394    0.000    0.394    0.000 rational_field.py:217(__hash__)
   114209    0.370    0.000    0.370    0.000 {method 'lcm' of 'sage.structure.element.PrincipalIdealDomainElement' objects}
    40998    0.332    0.000   13.532    0.000 homset.py:30(__init__)
    20499    0.330    0.000    0.330    0.000 dynamic_class.py:122(dynamic_class)
    20499    0.304    0.000    7.942    0.000 homset.py:23(RingHomset)
   119748    0.297    0.000    0.297    0.000 {method 'gcd' of 'sage.rings.integer.Integer' objects}
   189095    0.262    0.000    0.262    0.000 {sage.structure.element.get_coercion_model}
    61551    0.213    0.000    0.213    0.000 {isinstance}
    41002    0.204    0.000    5.303    0.000 sets_cat.py:255(_element_constructor_)
        1    0.201    0.201   98.842   98.842 plot.py:2401(generate_plot_points)

So, it seems to me that the slowdown is in the creation of homsets.

First question: Why are so many homsets needed in this example?

Second question: What can we do to make the creation of a homset more efficient?

@simon-king-jena
Member

comment:365

Replying to @simon-king-jena:

First question: Why are so many homsets needed in this example?

By this, I mean "Why so many, even with a strong cache?"

Note that #715 only involves a mild increase in the number of homsets created. But the time for creation increases dramatically.

@JohnCremona
Member

comment:366

To me it looks as if most of the extra time is in the symbolic gcd calls. But seriously, why on earth does plotting a simple trig function involve anything as sophisticated as creating homsets? And why are any gcds being computed at all? Is it that, for each of the values t being iterated over, which are rational multiples of pi, the evaluation of sin(7t) and cos(30t) is being much too clever, when all that is needed is a low-precision numerical value?

@simon-king-jena
Member

comment:367

Replying to @JohnCremona:

To me it looks as if most of the extra time is in the symbolic gcd calls. But seriously, why on earth does plotting a simple trig function involve anything as sophisticated as creating homsets?

And in particular: why is Set of Homomorphisms from Integer Ring to Real Interval Field with 64 bits of precision created a couple of thousand times, even with a strong cache?

And why are any gcds being computed at all? Is it that, for each of the values t being iterated over, which are rational multiples of pi, the evaluation of sin(7t) and cos(30t) is being much too clever, when all that is needed is a low-precision numerical value?

I don't know. But in any case, there is a regression in the time for creating a homset. I have opened #13922 for this problem.

@simon-king-jena
Member

comment:368

Replying to @simon-king-jena:

And in particular: why is Set of Homomorphisms from Integer Ring to Real Interval Field with 64 bits of precision created a couple of thousand times, even with a strong cache?

PS: The category for this homset is always the same, namely the category of euclidean domains. It should definitely not happen that this homset is created more than once, even with a weak cache.
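
A quick check of the caching behaviour (a sketch; with a working cache the identity test should succeed as long as the homset is still referenced elsewhere):

sage: H1 = Hom(ZZ, RealIntervalField(64))
sage: H2 = Hom(ZZ, RealIntervalField(64))
sage: H1 is H2   # should be True if the homset is cached and still alive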
