
[MRG] Avoid reference cycles in Tree #2790

Merged
merged 3 commits into scikit-learn:master on Jan 25, 2014

Conversation

jnothman
Member

Fixes #2787.

This is a WIP because we need to check whether wrapping value with an ndarray at predict time is too expensive. If so, we can implement our own take using memcpy (which I'd rather do than revert to the version where predict duplicates apply), but for the moment I'm having trouble getting that to work...

This also needs to be tested for any further memory leaks.
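For context, the failure mode being fixed can be sketched in pure Python. This is a hypothetical stand-in (`Owner` is not scikit-learn's `Tree`; the dict simulates an ndarray whose `base` points back at its owner): once an object's wrapper holds a reference back to the object, reference counting alone can never free the pair, and memory is reclaimed only when the cyclic garbage collector happens to run.

```python
import gc
import weakref

class Owner:
    """Hypothetical stand-in for an object exposing its buffer via a wrapper."""
    def __init__(self):
        self.buf = bytearray(80)
        # Simulate ndarray.base pointing back at the owner -> reference cycle.
        self.view = {"base": self}

gc.disable()                  # rule out automatic cycle collection
o = Owner()
ref = weakref.ref(o)
del o
assert ref() is not None      # refcounting alone cannot free the cycle
gc.collect()                  # only the cyclic collector reclaims it
assert ref() is None
gc.enable()
```

Avoiding the cycle altogether means the buffers are released promptly by reference counting, without depending on the collector.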

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 86fbebd on jnothman:tree_without_cycles into f8e7cf1 on scikit-learn:master.

@glouppe
Contributor

glouppe commented Jan 24, 2014

This is great! Thanks for figuring this out @jnothman

@ogrisel Could you have a look to confirm that the leak is gone?

@glouppe
Contributor

glouppe commented Jan 24, 2014

On my box, using @ogrisel's script, I cannot reproduce the leak.

@glouppe
Contributor

glouppe commented Jan 24, 2014

Regarding predict, I did a quick benchmark using GradientBoostingRegressor with n_estimators=1000 and I cannot observe any significant performance decrease of predict execution time.

So overall, I am +1 for this. Any second opinion?

Thanks for the patch!

@jnothman
Member Author

GradientBoostingRegressor reimplements predict, so I don't think that's a useful assessment! RandomForest might be worth trying.

1000 trees fit on 50 features:

Predicting 1e5 samples, best of 3
At master: 9.2 s
At 86fbebd: 9.48 s

Similar 2-3% time increase for smaller samples. I assume this is small, but not insignificant? We can in any case accept this fix and speed up predict in another patch.
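The suspected source of that 2-3% is re-wrapping the raw value buffer as an ndarray on every predict call. A rough, hypothetical micro-benchmark of just that wrapping cost (plain numpy, not scikit-learn's code; `raw` stands in for the Tree's malloc'd C buffer):

```python
import timeit
import numpy as np

raw = bytearray(8 * 1000)                      # stands in for the Tree's C buffer
cached = np.frombuffer(raw, dtype=np.float64)  # wrap once, reuse

def take_rewrap(idx):
    # Re-wrap the buffer on every call, as a predict() round-trip would
    return np.frombuffer(raw, dtype=np.float64).take(idx)

def take_cached(idx):
    return cached.take(idx)

idx = np.arange(100)
t_rewrap = timeit.timeit(lambda: take_rewrap(idx), number=10_000)
t_cached = timeit.timeit(lambda: take_cached(idx), number=10_000)
print("rewrap: %.4fs  cached: %.4fs" % (t_rewrap, t_cached))
```

Both paths return identical values; only the per-call wrapping overhead differs, which is consistent with a small constant cost per predict call.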

@glouppe
Contributor

glouppe commented Jan 24, 2014

GradientBoostingRegressor reimplements predict, so I don't think that's a useful assessment!

Whoops :)

Similar 2-3% time increase for smaller samples. I assume this is small, but not insignificant? We can in any case accept this fix and speed up predict in another patch.

+1 for that.

@ogrisel
Member

ogrisel commented Jan 24, 2014

Weird, I still get the leak when I run the script of #2787 but now the leak is "stabilizing" at the last iteration:

(py27)0 [~/code/scikit-learn (pr/2790)]$ make in 1>/dev/null 2>&1 && python ~/tmp/check_memleak.py
70MB
155MB
234MB
234MB

on master I get:

(py27)0 [~/code/scikit-learn (master)]$ make in 1>/dev/null 2>&1 && python ~/tmp/check_memleak.py
70MB
155MB
235MB
315MB

@ogrisel
Member

ogrisel commented Jan 24, 2014

@jnothman does the script run without leaking on your box? I ran it on OSX with a Python 2.7 after a make clean. I also ran the tests on your branch and they pass.

BTW, maybe we could use PyMem_Malloc and friends instead of libc.malloc. That way we could use tracemalloc from Python HEAD to track the remaining leak: http://www.python.org/dev/peps/pep-0454/
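For reference, the tracemalloc workflow would look roughly like this (stdlib since Python 3.4; the "leak" here is simulated, not scikit-learn's). Note that tracemalloc only sees allocations routed through Python's allocators, which is exactly why switching away from libc.malloc would matter:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

leaky = [bytearray(10_000) for _ in range(100)]  # ~1 MB simulated leak, kept alive

after = tracemalloc.take_snapshot()
# Differential snapshot: biggest growth first, with file:line of the allocation
for stat in after.compare_to(before, "lineno")[:3]:
    print(stat)
```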

@ogrisel
Member

ogrisel commented Jan 24, 2014

Actually we should stick to libc.malloc for Python < 3.4 and use PyMem_Malloc only for Python 3.4+, as PyMem_Malloc requires the GIL up to Python 3.3: http://www.python.org/dev/peps/pep-0445/#gil-free-pymem-malloc

It might be possible to select the right malloc at compile time with a preprocessor macro in our own .h file depending on the Python version using those constants: http://stackoverflow.com/a/12348545/163740

@GaelVaroquaux
Member

Actually we should stick to libc.malloc for Python < 3.4 and use PyMem_Malloc only for Python 3.4+

This is going to add a lot of complexity for little gains, I believe.

@ogrisel
Member

ogrisel commented Jan 24, 2014

This is going to add a lot of complexity for little gains, I believe.

Well it would give us very fine control over memory management by making it possible to use the tracemalloc tool that supports differential snapshots to spot leaks along with the traceback of the faulty allocation.

@glouppe
Contributor

glouppe commented Jan 24, 2014

Actually we should stick to libc.malloc for Python < 3.4 and use PyMem_Malloc only for Python 3.4+

This is going to add a lot of complexity for little gains, I believe.

I agree. Let's not make this more complicated than it has already become.

@glouppe
Contributor

glouppe commented Jan 24, 2014

@ogrisel This is really odd. Here are the results on my box:

# this branch 
 % python leak.py 
60MB
61MB
61MB
61MB

# master
 % python leak.py                                                      
60MB
140MB
220MB
300MB

    return
elif ret == 1:
    raise RuntimeError('Cannot resize tree after building is complete')
raise MemoryError()
Contributor

Couldn't you simply check self.locked here and not change the semantics of _resize_c?

Contributor

Anyway, can you comment on why we need this at all? This is internal API and the case this additional test covers should never happen. There is no need to be defensive in my opinion.

@ogrisel
Member

ogrisel commented Jan 24, 2014

I re-pulled the tree_without_cycles branch from @jnothman's repo to make sure I had the right code, I still leak some memory on the first 2 iterations of my script:

(py27)0 [~/code/scikit-learn (tree_without_cycles)]$ python ~/tmp/check_memleak.py
70MB
155MB
234MB
234MB
234MB
234MB
234MB
234MB
234MB
234MB
234MB

@arjoly
Member

arjoly commented Jan 24, 2014

I got the following results.

ajoly at arnaud-joly-002 in ~/git/scikit-learn on 34c4908! # 0.14.1
(sklearn) ± python leak.py     
70MB
115MB
155MB
195MB
235MB
275MB
275MB
275MB
275MB
275MB
275MB
ajoly at arnaud-joly-002 in ~/git/scikit-learn on master!
(sklearn) ± python leak.py
69MB
154MB
235MB
315MB
395MB
475MB
555MB
635MB
715MB
ajoly at arnaud-joly-002 in ~/git/scikit-learn on 86fbebd!
(sklearn) ± python leak.py               
70MB
155MB
234MB
234MB
234MB
234MB
234MB
234MB
234MB
234MB
234MB

@arjoly
Member

arjoly commented Jan 24, 2014

I saw the same behavior for all branches.

It looks like the tree from 0.14.1 consumes less memory than master. :(

@ogrisel
Member

ogrisel commented Jan 24, 2014

OK, so @arjoly, you have increasing yet stabilizing memory usage on 0.14.1 as well. So the leak might be fixed in this branch after all; Python might just be declining to return some RSS to the OS for another reason.

@arjoly
Member

arjoly commented Jan 24, 2014

I saw the same behavior for all branches.

I wasn't synced with master :-(
I corrected the bench.

@ogrisel
Member

ogrisel commented Jan 24, 2014

@arjoly On master you have the leak: it never stabilizes. On the other branch (0.14.1 and @jnothman's fix) you reach stability.

@ogrisel
Member

ogrisel commented Jan 24, 2014

Here is a new version of my script that tracks leaking objects (without references):

import gc
import os
import psutil
import objgraph
import numpy as np
from sklearn.tree import ExtraTreeRegressor

X = np.random.normal(size=(100, 50))
Y = np.random.normal(size=(100, int(5e4)))

p = psutil.Process(os.getpid())

initially_without_ref = objgraph.get_leaking_objects()

def print_mem():
    print("{:.0f}MB".format(p.get_memory_info().rss / 1e6))  # older psutil API; now p.memory_info()
    currently_without_ref = objgraph.get_leaking_objects()
    print([o for o in currently_without_ref
             if o not in initially_without_ref])

print_mem()

for i in range(3):
    et = ExtraTreeRegressor(max_features=1).fit(X, Y)
    del et
    gc.collect()
    print_mem()

Now the results:

  • on this branch:
73MB
[]
159MB
[<listiterator object at 0x10d744e10>, {140694180998368: [140694180488784]}]
159MB
[<listiterator object at 0x10d744e10>, {140694180998368: [140694180488784]}]
159MB
[<listiterator object at 0x10d744e10>, {140694180998368: [140694180488784]}]

I don't know what 140694180998368 is, but it's there just once, so it might be an artifact of the function calls or the loop. The listiterator is not a real leak; it's expected when we enter the loop.

  • on master, we are leaking sklearn._tree.Tree instances:
73MB
[]
158MB
[<listiterator object at 0x1044cce10>, <sklearn.tree._tree.Tree object at 0x1061c7650>, {140266067200112: [140266066904800]}]
238MB
[<listiterator object at 0x1044cce10>, <sklearn.tree._tree.Tree object at 0x1061c7650>, {140266067200112: [140266066904800]}, <sklearn.tree._tree.Tree object at 0x1061c7710>]
318MB
[<listiterator object at 0x1044cce10>, <sklearn.tree._tree.Tree object at 0x1061c7650>, {140266067200112: [140266066904800]}, <sklearn.tree._tree.Tree object at 0x1061c7710>, <sklearn.tree._tree.Tree object at 0x1061c77d0>]
  • on 0.14.1:
74MB
[]
119MB
[<listiterator object at 0x1087b6990>, {140731809630416: [140731782125488]}]
159MB
[<listiterator object at 0x1087b6990>, {140731809630416: [140731782125488]}]
159MB
[<listiterator object at 0x1087b6990>, {140731809630416: [140731782125488]}]

So in the end this branch seems to correctly fix the regression introduced in master.

@arjoly
Member

arjoly commented Jan 24, 2014

Could someone explain to me why we can't do as before and wrap the C array into a numpy array at the last moment?
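One hedged sketch of what "wrapping at the last moment" could look like (illustrative names only, not the actual `Tree` API): build a transient, zero-copy ndarray view per call, and store nothing on `self` that points back at the wrapper, so no owner-to-array cycle ever forms:

```python
import numpy as np

class TreeLike:
    """Illustrative only -- not scikit-learn's Tree."""
    def __init__(self, n):
        self._raw = bytearray(8 * n)   # stands in for the malloc'd C value array

    def value_ndarray(self):
        # Transient, zero-copy view over the raw buffer; self keeps no
        # reference to the wrapper, so no reference cycle is created.
        return np.frombuffer(self._raw, dtype=np.float64)

t = TreeLike(4)
t.value_ndarray()[:] = [1.0, 2.0, 3.0, 4.0]
print(t.value_ndarray())   # writes are visible through a fresh view
```

The trade-off under discussion is exactly the per-call cost of constructing that fresh view inside predict.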

@glouppe
Contributor

glouppe commented Jan 24, 2014

@ogrisel Thanks for the check. So in conclusion, this PR correctly fixes the issue.

self.nodes = NULL
self.locked = False
Member

This new attribute would deserve an inline comment to explain the motivation and how it is meant to be used.

Contributor

I would remove it (see my comment above).

@ogrisel
Member

ogrisel commented Jan 24, 2014

@ogrisel Thanks for the check. So in conclusion, this PR correctly fixes the issue.

Yes, but before merging I think this PR should at least have more inline comments to explain the reference-counting magic that happens under the hood.


# XXX using (size_t)(-1) is ugly, but SIZE_MAX is not available in C89
# (i.e., older MSVC).
cdef int _resize_c(self, SIZE_t capacity=<SIZE_t>(-1)) nogil:
    """Guts of _resize. Returns 0 for success, -1 for error."""
    if self.locked:
        return 1
Member

if 0 is success and -1 is error -- what is 1?

Member Author

I'm happy to get rid of locked, but will leave a note in there for anyone
who tries to add post-building tree expansion. Comments coming.

On 25 January 2014, Peter Prettenhofer wrote (in sklearn/tree/_tree.pyx):

if 0 is success and -1 is error -- what is 1?

@coveralls

Coverage Status

Coverage remained the same when pulling 87194f7 on jnothman:tree_without_cycles into f203953 on scikit-learn:master.

@glouppe
Contributor

glouppe commented Jan 25, 2014

Awesome, thanks for your work Joel! I am merging this.

glouppe added a commit that referenced this pull request Jan 25, 2014
[MRG] Avoid reference cycles in Tree
glouppe merged commit e8fdfa6 into scikit-learn:master on Jan 25, 2014
@jnothman
Member Author

Great! Let's hope we've ironed out everything :)

On the topic of which, implementing Tree.predict without _get_value_ndarray
(by reimplementing np.take, basically) doesn't appear to make it faster.


@larsmans
Member

Sorry for not being in time for a review, but this looks like an elegant solution. Thanks!

@arjoly
Member

arjoly commented Jan 25, 2014

Thanks for the fix !!!

@ogrisel
Member

ogrisel commented Jan 26, 2014

Thanks for the fix @jnothman.

Successfully merging this pull request may close these issues.

[Regression] Memory Leak in decision trees
8 participants