Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqrt memory leaks #9129

Closed
zimmermann6 opened this issue Jun 3, 2010 · 54 comments
Closed

sqrt memory leaks #9129

zimmermann6 opened this issue Jun 3, 2010 · 54 comments

Comments

@zimmermann6
Copy link

cf http://groups.google.com/group/sage-support/browse_thread/thread/8c18b2b91004c35a#

m = get_memory_usage()
i=0
while True:
    i+=1
    a = 2.sqrt()
    if i%1000==0:
        print get_memory_usage(m)

Depends on #14780

Upstream: Fixed upstream, in a later stable release.

CC: @robertwb @malb @craigcitro @koffie @williamstein @burcin @jpflori

Component: basic arithmetic

Keywords: sd35.5

Author: Volker Braun

Branch/Commit: 42a4f7f

Reviewer: Marc Mezzarobba

Issue created by migration from https://trac.sagemath.org/ticket/9129

@rlmill
Copy link
Mannequin

rlmill mannequin commented Jul 8, 2010

comment:2

I've run valgrind with a clean startup and quit:

http://sage.math.washington.edu/home/rlmill/sage-clean.mem.log

and with an execution of the following loop:

for i in range(1000):
    if i%50 == 0:
        print i
    a = ZZ(randint(2^400,2^800)).sqrt()
    b = Mod(2^32+1,3).sqrt()

http://sage.math.washington.edu/home/rlmill/sage-sqrt.mem.log

Of particular importance are lines 18757, 16971, 16915, 16831, etc. (Just search through for "definitely")

@zimmermann6
Copy link
Author

comment:3

thanks Robert, however your file is not accessible (neither from the web nor from sage.math).
Paul

@rlmill
Copy link
Mannequin

rlmill mannequin commented Jul 8, 2010

comment:4

Fixed.

@zimmermann6
Copy link
Author

comment:5

with the following code in Sage 4.4.4:

for i in range(10^4):
   a=Mod(2^32+1,3).sqrt()

valgrind says:

==6861== 5,312,296 bytes in 9,911 blocks are possibly lost in loss record 6,212\
 of 6,212
==6861==    at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==6861==    by 0x4D1E1B8: _PyObject_GC_Malloc (gcmodule.c:1351)
==6861==    by 0x4D1E2AD: _PyObject_GC_NewVar (gcmodule.c:1383)
==6861==    by 0x4C78F80: PyFrame_New (frameobject.c:642)
==6861==    by 0x4CF0324: PyEval_EvalCodeEx (ceval.c:2755)
==6861==    by 0x4C7997A: function_call (funcobject.c:524)
==6861==    by 0x4C4EA72: PyObject_Call (abstract.c:2492)
==6861==    by 0x4C5F0DE: instancemethod_call (classobject.c:2579)
==6861==    by 0x4C4EA72: PyObject_Call (abstract.c:2492)
==6861==    by 0x4CE9022: PyEval_CallObjectWithKeywords (ceval.c:3575)
==6861==    by 0x1E356FA6: __pyx_pf_4sage_9structure_7factory_13UniqueFactory__\
_call__ (factory.c:877)
==6861==    by 0x4C4EA72: PyObject_Call (abstract.c:2492)
==6861==    by 0x4CAD893: slot_tp_call (typeobject.c:5378)
==6861==    by 0x4C4EA72: PyObject_Call (abstract.c:2492)
==6861==    by 0x1BCC2AC6: __pyx_pf_4sage_5rings_12finite_rings_11integer_mod_1\
9IntegerMod_abstract_sqrt (integer_mod.c:6959)
==6861==    by 0x4C4EA72: PyObject_Call (abstract.c:2492)
==6861==    by 0x4CE9022: PyEval_CallObjectWithKeywords (ceval.c:3575)
==6861==    by 0x4C6926B: methoddescr_call (descrobject.c:246)
==6861==    by 0x4C4EA72: PyObject_Call (abstract.c:2492)
==6861==    by 0x4CE9022: PyEval_CallObjectWithKeywords (ceval.c:3575)
==6861==    by 0x1BCA21A8: __pyx_pf_4sage_5rings_12finite_rings_11integer_mod_1\
4IntegerMod_int_sqrt (integer_mod.c:18128)
==6861==    by 0x4CEEEDE: PyEval_EvalFrameEx (ceval.c:3706)
==6861==    by 0x4CF0B44: PyEval_EvalCodeEx (ceval.c:2968)
==6861==    by 0x4CF0C11: PyEval_EvalCode (ceval.c:522)
==6861==    by 0x4CF0287: PyEval_EvalFrameEx (ceval.c:4401)

Does it say something to somebody fluent in Pyrex?

@rlmill
Copy link
Mannequin

rlmill mannequin commented Jul 8, 2010

comment:6

Possibly lost means that the garbage collector is being lazy and is by Python design. You need to look at the line numbers I highlighted in the above files, which are "definitely lost."

@zimmermann6
Copy link
Author

comment:7

Replying to @rlmill:

Fixed.

not quite:

-rwx--x--x 1 rlmill rlmill 1219074 2010-07-08 04:58 sage-sqrt.mem.log

@rlmill
Copy link
Mannequin

rlmill mannequin commented Jul 8, 2010

comment:8

sorry... jetlag

@zimmermann6
Copy link
Author

comment:9

Replying to @rlmill:

Of particular importance are lines 18757, 16971, 16915, 16831, etc. (Just search through for "definitely")

those lines seem to indicate the problem lies in the Singular and/or GINAC interface. Any specialist of those interfaces out there?

@rlmill
Copy link
Mannequin

rlmill mannequin commented Aug 1, 2010

comment:10

Replying to @zimmermann6:

those lines seem to indicate the problem lies in the Singular and/or GINAC interface.

I'm not so sure about this. Let's pick on this particular leak:

==25238== 5,320 (1,680 direct, 3,640 indirect) bytes in 35 blocks are definitely lost in loss record 17,325 of 18,340
==25238==    at 0x4C22FEB: malloc (vg_replace_malloc.c:207)
==25238==    by 0x13642E4C: __pyx_f_4sage_5rings_7integer_fast_tp_new (integer.c:29882)
==25238==    by 0x4EC2232: type_call (typeobject.c:731)
==25238==    by 0x4E6CC77: PyObject_Call (abstract.c:2492)
==25238==    by 0x218F6E2B: __pyx_f_4sage_4libs_8singular_8singular_si2sa_ZZ(snumber*, sip_sring*) (singular.cpp:3084)
==25238==    by 0x21902F6D: __pyx_f_4sage_4libs_8singular_8singular_si2sa(snumber*, sip_sring*, _object*) (singular.cpp:5483)
==25238==    by 0x2084D51E: __pyx_pf_4sage_5rings_10polynomial_28multi_polynomial_libsingular_23MPolynomial_libsingular_coefficients(_object*, _object*) (multi_polynomial_libsingular.cpp:27385)
==25238==    by 0x4E6CC77: PyObject_Call (abstract.c:2492)
==25238==    by 0x20858DB0: __pyx_pf_4sage_5rings_10polynomial_28multi_polynomial_libsingular_23MPolynomial_libsingular_gcd(_object*, _object*, _object*) (multi_polynomial_libsingular.cpp:24344)
==25238==    by 0x4E6CC77: PyObject_Call (abstract.c:2492)
==25238==    by 0x4F01D15: PyEval_CallObjectWithKeywords (ceval.c:3575)
==25238==    by 0x4E87C25: methoddescr_call (descrobject.c:246)
==25238==    by 0x4E6CC77: PyObject_Call (abstract.c:2492)
==25238==    by 0x4F01D15: PyEval_CallObjectWithKeywords (ceval.c:3575)
==25238==    by 0xF9E2BB9: __Pyx_PyEval_CallObjectWithKeywords (element.c:26384)
==25238==    by 0xF9D7B3C: __pyx_pf_4sage_9structure_7element_16NamedBinopMethod___call__ (element.c:19673)
==25238==    by 0x4E6CC77: PyObject_Call (abstract.c:2492)
==25238==    by 0x15AF691C: __pyx_f_4sage_5rings_22fraction_field_element_20FractionFieldElement__add_ (fraction_field_element.c:5090)
==25238==    by 0xF9BE0A1: __pyx_pf_4sage_9structure_7element_11RingElement___add__ (element.c:10804)
==25238==    by 0x4E6CF6D: binary_op1 (abstract.c:917)
==25238==    by 0x4E6D41F: PyNumber_Add (abstract.c:1157)
==25238==    by 0x4F0611A: PyEval_EvalFrameEx (ceval.c:1189)
==25238==    by 0x4F0A1C0: PyEval_EvalCodeEx (ceval.c:2968)
==25238==    by 0x4F0A291: PyEval_EvalCode (ceval.c:522)
==25238==    by 0x4F1E371: PyImport_ExecCodeModuleEx (import.c:675)

On line 3842 of multi_polynomial_libsingular.pyx, we have:

        if _ring.ringtype != 0:
            if _ring.ringtype == 4:
                P = self._parent.change_ring(RationalField())
                res = P(self).gcd(P(right))
                coef = sage.rings.integer.GCD_list(self.coefficients() + right.coefficients())   <--------------------
                return self._parent(coef*res)

The calls to .coefficients() are creating the integers which are not freed. Here is the definition of that function:

        cdef poly *p
        cdef ring *r
        r = (<MPolynomialRing_libsingular>self._parent)._ring
        if r!=currRing: rChangeCurrRing(r)
        base = (<MPolynomialRing_libsingular>self._parent)._base
        p = self._poly
        coeffs = list()
        while p:
            coeffs.append(si2sa(p_GetCoeff(p, r), r, base))
            p = pNext(p)
        return coeffs

Looks innocent enough... si2sa ends up calling:

cdef Integer si2sa_ZZ(number *n, ring *_ring):
    ...
    cdef Integer z
    z = Integer()
    z.set_from_mpz(<__mpz_struct*>n)
    return z

I really don't see where any of this could be going wrong. I think it has to do with the fast integer creation functions. Sage has a pool of allocated Integer objects. The integer_pool_count seems to go up and down randomly, staying in the low range. From one loop to the next, in the original poster's first example, it goes 9, 8, 11, 10, ...

I think that the experts for this memory pool need to step up to the plate...

@koffie
Copy link

koffie commented Nov 2, 2010

comment:11

I just tried the second leak in 4.6 on OS X 10.6.4 and the results are:

for i in range(10^6):
   t=Mod(2^32+1,3).sqrt()
   if i % 10000 == 0:
      print i, get_memory_usage()
0 243.6796875
10000 243.6796875
20000 243.6796875
30000 243.6796875
40000 243.6796875
50000 243.6796875
60000 243.6796875
70000 243.6796875
80000 243.6796875
90000 243.6796875
100000 243.6796875
110000 243.6796875
120000 243.6796875
130000 243.6796875
140000 243.6796875
150000 243.6796875
160000 243.6796875
170000 243.6796875

After which I interupted the loop since I concluded the leak was no longer there. Can the person who reported this check it on his own machine?

@koffie
Copy link

koffie commented Nov 2, 2010

comment:12

The first reported memory leak is still there, but note that you don't need the extemely large random integers i the example to expose the leak. All you need is a non square integer.

Doing the example only with squares in the interval 2!^400 till 2!^800:

m = get_memory_usage()
i=0
while True:
    i+=1
    a = ZZ(randint(2^200,2^400)^2).sqrt()
    if i%1000==0:
        print get_memory_usage(m)
0.0
0.0
0.0
0.0

The example using 2 as my favorite non square integer:

m = get_memory_usage()
i=0
while True:
    i+=1
    a = 2.sqrt()
    if i%1000==0:
        print get_memory_usage(m)
0.76953125
1.26953125
2.01953125
2.51953125
3.01953125
3.76953125
4.26953125
5.01953125
6.51953125

@koffie
Copy link

koffie commented Nov 2, 2010

comment:13

I dived a bit deeper into the source code to see what is actually going on and I found that in the end the symbolic ring sqrt function is the one where things go wrong. It's not the general symbolic ring framework since other symbolic ring functions dont misbehave.

functions=['arccos', 'arccosh',
'arcsin', 'arcsinh', 'arctan', 'arctanh', 'cos', 'cosh', 'exp', 'log', 'sin', 'sinh', 'sqrt', 'tan', 'tanh']
for function in functions:
    print function
    a=SR(2)
    m = get_memory_usage()
    for i in xrange(10000):
        b=a.__getattribute__(function)()
    print get_memory_usage(m)

arccos
0.0
arccosh
0.0
arcsin
0.0
arcsinh
0.0
arctan
0.0
arctanh
0.0
cos
0.0
cosh
0.0
exp
0.0
log
0.0
sin
0.0
sinh
0.0
sqrt
7.03125
tan
0.0
tanh
0.0

I'm not able to figure out which code get's called from the symbolic ring part on, since the internal working of the symbolic ring is a bit to complex for me.
Ie. the source code of SR(2).sqrt is

return new_Expression_from_GEx(self._parent,
                g_hold2_wrapper(g_power_construct, self._gobj, g_ex1_2, hold))

And new_Expression_from_GEx doesn't have any documentation or whatsoever.

@zimmermann6
Copy link
Author

comment:14

I confirm the second leak seems to be fixed now (tried with Sage 4.6 on Fedora and 4.5.2 on
Ubuntu).

The first one is still present in Sage 4.6.

Paul

@zimmermann6
Copy link
Author

comment:15

the problem seems to be in the power function. With Sage 4.6:

sage: a=SR(2)
sage: m = get_memory_usage()
sage: for i in xrange(10000):
....:     b=a.__pow__(1/3)
....:     
sage: print get_memory_usage(m)
11.953125

Paul

@zimmermann6
Copy link
Author

comment:16

William, Burcin, the memory leaks seems to be in the g_pow call at line 2458 of
symbolic/expression.pyx (in sage 4.6).

I guess the g_pow function is a wrapper to Ginac (from file libs/ginac/decl.pxi)
but I could see no occurrence of pow in c_lib/include/ginac_wrap.h.

Please can you help?

Paul

@burcin
Copy link

burcin commented Nov 3, 2010

comment:17

I don't have time to check now, but the pow() method of numeric objects should be called in this case. You can see the code here:

http://pynac.sagemath.org/hg/file/b233d9dadcfa/ginac/numeric.cpp#l297

There is nothing that immediately catches my eye there.

To experiment, you'll need a pynac development environment:

http://wiki.sagemath.org/pynac/start

@burcin
Copy link

burcin commented Nov 3, 2010

comment:18

Sorry for the spam, but is the problem still there with the patch at #8659 applied?

@zimmermann6
Copy link
Author

comment:19

Replying to @burcin:

Sorry for the spam, but is the problem still there with the patch at #8659 applied?

yes, but the leak seems to be smaller:

----------------------------------------------------------------------
| Sage Version 4.6, Release Date: 2010-10-30                         |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
Loading Sage library. Current Mercurial branch is: 8659
sage: a=SR(2)
sage: m = get_memory_usage()
sage: for i in xrange(10000):
....:     b=a.__pow__(1/3)
sage: print get_memory_usage(m)
4.4140625

instead of 11.703125 without the #8659 patch.

Paul

@zimmermann6
Copy link
Author

comment:20

I added some print-statements in pynac-0.2.3.p0/src/ginac/numeric.cpp as follows:

  Number_T pow(const Number_T& base, const Number_T& exp) {
    std::cerr << "enter pow, base=" << base << " exp=" << exp << "\n";
    verbose("pow");
    if (base.t != exp.t) {
      Number_T a, b;
      std::cerr << "coerce\n";
      coerce(a, b, base, exp);
      return pow(a,b);
    }
    switch (base.t) {
    case DOUBLE:
      std::cerr << "double\n";
      return std::pow(base.v._double, exp.v._double);
    case LONG:
      // TODO: change to use GMP!                                               
      std::cerr << "long\n";
      return std::pow((double)base.v._long, (double)exp.v._long);
    case PYOBJECT:
      std::cerr << "PYOBJECT\n";
      if PyInt_Check(base.v._pyobject) {
          PyObject* o = Integer(PyInt_AsLong(base.v._pyobject));
          PyObject* r = PyNumber_Power(o, exp.v._pyobject, Py_None);
          std::cerr << "PyInt_Check\n";
          Py_DECREF(o);
          return r;
      }
      return PyNumber_Power(base.v._pyobject, exp.v._pyobject, Py_None);
    default:
      stub("invalid type: pow Number_T");
    }
  }

Then it seems that for inexact powers that function is called twice for SR input:

sage: SR(2).__pow__(1/2)
enter pow, base=2 exp=1/2
PYOBJECT
enter pow, base=2 exp=1/2
PYOBJECT
sqrt(2)

but for exact powers only once:

sage: SR(4).__pow__(1/2)
enter pow, base=4 exp=1/2
PYOBJECT
2

For Integer input we get:

sage: Integer(2).__pow__(1/2)
enter pow, base=2 exp=1/2
PYOBJECT
sqrt(2)
sage: Integer(4).__pow__(1/2)
2

For ZZ input we get:

sage: ZZ(2).__pow__(1/2)
enter pow, base=2 exp=1/2
PYOBJECT
sqrt(2)
sage: ZZ(4).__pow__(1/2)
2

In all cases there is no memory leak for exact powers, but there is for inexact powers.
Thus I strongly suspect some special code that detects exact powers, but where is it?

Paul

@zimmermann6
Copy link
Author

Changed keywords from none to sd35.5

@burcin
Copy link

burcin commented Feb 15, 2012

comment:23

Replying to @zimmermann6:

In all cases there is no memory leak for exact powers, but there is for inexact powers.
Thus I strongly suspect some special code that detects exact powers, but where is it?

The __pow__ method of rational numbers. :) When computing 2^(1/2), Integer.__pow__ delegates the operation to Rational.__pow__. If the result is not exact, Rational.__pow__ returns a symbolic expression.

This should really be tested with the patch at #8659, which eliminates the need to call Number_T::pow() twice. Even with that patch, the leak is still there however.

@zimmermann6
Copy link
Author

comment:24

I updated the description.

Paul

PS: it seems the "stopgaps" were deleted, but I don't know which value it was...

@zimmermann6

This comment has been minimized.

@jdemeyer jdemeyer modified the milestones: sage-5.11, sage-5.12 Aug 13, 2013
@vbraun
Copy link
Member

vbraun commented Mar 20, 2014

comment:39

Feel free to file the upstream bug report faster next time ;-)

@zimmermann6
Copy link
Author

comment:40

Feel free to file the upstream bug report faster next time ;-)

the problem is that I had no idea in which upstream component the leak was (or in Sage), and I had no idea how to search where the leak was.

If you can give some information how we can do this in similar cases, it would be very helpful.

Paul

@vbraun
Copy link
Member

vbraun commented Mar 20, 2014

Author: Volker Braun

@vbraun
Copy link
Member

vbraun commented Mar 20, 2014

comment:41

Will be fixed by the pynac update at #14780

@vbraun vbraun removed this from the sage-6.2 milestone Mar 20, 2014
@mezzarobba
Copy link
Member

comment:42

Replying to @zimmermann6:

If you can give some information how we can do this in similar cases, it would be very helpful.

Fwiw, I used the objgraph module to see what kind of objects were being leaked. It turned out to be triples similar to those returned by pynac.pyx:py_rational_power_parts—the hardest part was probably to discover that function, but the observation that the leak occurred specifically for non-square integers helped.

@mezzarobba
Copy link
Member

Dependencies: #14780

@mezzarobba
Copy link
Member

Changed upstream from Reported upstream. No feedback yet. to Fixed upstream, in a later stable release.

@mezzarobba
Copy link
Member

Reviewer: Marc Mezzarobba

@zimmermann6
Copy link
Author

comment:45

could we add a small doctest checking that the leak is gone?

Paul

@vbraun
Copy link
Member

vbraun commented Mar 22, 2014

Branch: u/vbraun/sqrt_memory_leaks

@vbraun
Copy link
Member

vbraun commented Mar 22, 2014

comment:47

Please review, then.


New commits:

2795ef5test for memory leak

@vbraun
Copy link
Member

vbraun commented Mar 22, 2014

Commit: 2795ef5

@vbraun vbraun added this to the sage-6.2 milestone Mar 22, 2014
@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 22, 2014

Changed commit from 2795ef5 to 42a4f7f

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 22, 2014

Branch pushed to git repo; I updated commit sha1. New commits:

42a4f7fdocument output

@mezzarobba
Copy link
Member

comment:49

lgtm. Paul, please complain if you disagree, since you were the one who asked for a regression test!

@zimmermann6
Copy link
Author

comment:50

please complain if you disagree

I agree, thank you to everybody!

@vbraun
Copy link
Member

vbraun commented Apr 13, 2014

Changed branch from u/vbraun/sqrt_memory_leaks to 42a4f7f

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants