[Regression] Memory Leak in decision trees #2787
Comments
Are you able to run the same test before my data structure PR was merged? On 24 January 2014 00:01, Olivier Grisel notifications@github.com wrote:
|
I will adapt the script to make it bisectable. |
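For readers unfamiliar with the idea, a bisectable check is a script whose exit code tells `git bisect run` whether a commit is good (0) or bad (1). The following is only a sketch of that shape, not the script used in this thread: `memory_growth`, `bisect_exit_code`, and the two demo workloads are hypothetical names, and it assumes `tracemalloc` can see the leaking allocations, which may not hold for memory allocated directly in C.

```python
import gc
import tracemalloc

_leaked = []  # hypothetical stand-in for memory that is never released


def leaky_workload():
    # Keeps a reference forever, so memory grows on every call.
    _leaked.append(bytearray(100_000))


def clean_workload():
    # Allocates and immediately drops the buffer.
    bytearray(100_000)


def memory_growth(workload, iterations=5):
    """Return how many traced bytes remain after repeated calls to workload."""
    gc.collect()
    tracemalloc.start()
    try:
        workload()  # warm-up, so one-time caches are not counted as a leak
        gc.collect()
        baseline, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            workload()
        gc.collect()
        current, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current - baseline


def bisect_exit_code(workload, threshold_bytes=1_000_000):
    """Map the measurement to the exit codes git bisect run expects."""
    return 1 if memory_growth(workload) > threshold_bytes else 0
```

A wrapper script would end with something like `sys.exit(bisect_exit_code(workload))`, where `workload` fits and discards an estimator, and would then be passed to `git bisect run`. The threshold is arbitrary and would need tuning to the actual size of the leak.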
Great. On 24 January 2014 00:14, Olivier Grisel notifications@github.com wrote:
|
I just added a
Looks like |
I confirm this is the culprit:
|
It looks like |
Oh wait, I should be looking at the refcount of |
@larsmans have you tried valgrind with http://svn.python.org/projects/stackless/trunk/Misc/valgrind-python.supp ? I have not tried yet myself but I have to go offline now. Maybe later. |
Well indeed, since |
Never mind my previous comment, there was a flaw in my reasoning. Still, the trick to break the cycle is, I think, to introduce an intermediate class, say |
Sorry about this. I've just skimmed your comments and will try to look into On 24 January 2014 04:44, Lars Buitinck notifications@github.com wrote:
|
@larsmans I am not sure your solution is correct. Since there is still a cycle between |
Indeed, this will work. Never use (and store) arrays within the tree, but wrap the memory segment when exposing it to the external world. Reminds me of something... ;) |
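The difference between storing a view and wrapping on demand can be sketched in pure Python. This is an illustration under stated assumptions, not the scikit-learn code: `View`, `CachingTree`, `WrappingTree`, and `freed_by_refcount_alone` are hypothetical names, and `View.base` merely stands in for an array that keeps its owner alive. The cyclic garbage collector is disabled during the check to mimic an object that can only be released through reference counting.

```python
import gc
import weakref


class View:
    """Stand-in for an array whose base keeps the owning object alive."""

    def __init__(self, owner, buffer):
        self.base = owner
        self.buffer = buffer


class CachingTree:
    """Stores the view on itself: tree -> view -> tree forms a cycle."""

    def __init__(self, n):
        self._data = bytearray(n)
        self.value = View(self, self._data)


class WrappingTree:
    """Builds a fresh view on each access and never stores it."""

    def __init__(self, n):
        self._data = bytearray(n)

    @property
    def value(self):
        return View(self, self._data)


def freed_by_refcount_alone(tree_cls):
    """Check whether dropping all external references frees the tree."""
    was_enabled = gc.isenabled()
    gc.disable()  # rule out the cycle collector for the duration of the check
    try:
        tree = tree_cls(1024)
        view = tree.value
        ref = weakref.ref(tree)
        del tree, view
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up the cycle left behind by CachingTree
```

With these definitions, `freed_by_refcount_alone(WrappingTree)` returns `True` and `freed_by_refcount_alone(CachingTree)` returns `False`: the only reference back to the wrapping tree lives in views held by callers, so it disappears as soon as they let go.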
After some thought, I think we can make it work. Since with this solution the tree gets deallocated, the trick would be to break the cycle in |
I still don't see the need for an intermediary. However, when having a first hack ( On 24 January 2014 07:22, Gilles Louppe notifications@github.com wrote:
|
You are right, this is not strictly necessary. I would be happier without adding such an intermediary. |
@glouppe In the setup I had in mind, the |
See #2790 where I've got a version without reference cycles. On 24 January 2014 08:47, Lars Buitinck notifications@github.com wrote:
|
Here is a reproduction script. I used a random tree to make it run much faster but I think this impacts all tree implementations. I used a very large output dimension to make the leak more visible:
Here is the output on the current master:
The output on sklearn 0.14.1: