garbage collection problem (revisited) #141

Closed
fperez opened this Issue · 23 comments

2 participants

@fperez
Owner

This bug had originally been reported in Launchpad by Kilian as:

https://bugs.launchpad.net/ipython/+bug/269966

and we closed it with a test. Unfortunately, when I wrote the test case and closed the bug, my test case was slightly different than Kilian's actual report. And while now the test case we have does run correctly, it turns out that Kilian's original problem appears never to have been fixed.

So we should still try to find how some of these references are being kept by %run.

What follows is Kilian's original report for completeness here.
If a variable is created within a script that was executed using the %run command, a reference to it is created somewhere
that prevents it from being garbage collected after the variable is deleted.

I am not sure if this is really a bug or just a misunderstanding on my side. However, it would be good if it were possible to delete
a variable created inside a script without deleting the whole name space using %reset.

This example is from an email thread in ipython-user: http://lists.ipython.scipy.org/pipermail/ipython-user/2008-July/005599.html

example:
create the following script named test_destructor.py and execute it using the ipython %run command:

kilian@chebang:~$ cat test_destructor.py
class C(object):
    def __del__(self):
        print 'deleting object...'

c = C()

kilian@chebang:~$ python test_destructor.py
deleting object...

now, let's try in ipython:

In [1]: run test_destructor.py

In [2]: del c

In [3]: import gc

In [4]: gc.collect()
Out[4]: 47

(object still not deleted)

In [5]: %reset
Once deleted, variables cannot be recovered. Proceed (y/[n])? y
deleting object...

Finally!

@fperez
Owner

I should add that the last behavior with %reset, which did work when Kilian reported the problem (2008-09-13), now is also broken! I just tested by doing a checkout of IPython on that date, and indeed it works as Kilian said:

git checkout `git rev-list -n 1 --before="2008-09-13" master`
python setup.py install --prefix=~/tmp/junk/

amirbar[scratch]> ip
/home/fperez/tmp/junk/lib/python2.6/site-packages/IPython/Magic.py:38: DeprecationWarning: the sets module is deprecated
  from sets import Set
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
Type "copyright", "credits" or "license" for more information.

IPython 0.9.rc1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

In [1]: run scratch.py

In [2]: del zz

In [3]: reset
Once deleted, variables cannot be recovered. Proceed (y/[n])?  y
deleting object...

But with current IPython, even %reset does not get the __del__ method called!

amirbar[scratch]> ip
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
Type "copyright", "credits" or "license" for more information.

IPython 0.11.alpha1.git -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

In [1]: run scratch.py

In [2]: del zz

In [3]: reset
Once deleted, variables cannot be recovered. Proceed (y/[n])?  y

In [4]: 

Nothing!

So not only did we not include a fix for Kilian's actual problem with an identical test (instead using another related test that might be a good idea also, but that tests something else), but we've managed to lose the ability to clear these references even with %reset.

@takluyver
Owner

Replicated in trunk (0.11 dev).

@takluyver
Owner

OK, it looks like, among other possible places, there's a reference to it in ip._user_main_module which isn't getting deleted.

@fperez
Owner

@takluyver, is this fixed by your recent %reset work? If so, let's close this guy.

@takluyver
Owner

Unfortunately not. The reference in ip._user_main_module is left behind even after a %reset. I'll have a look, although I remember getting confused about the way namespaces are handled for %run before.

@takluyver takluyver was assigned
@takluyver
Owner

OK, I can get %reset to delete the reference, but that causes a test failure in core.tests.test_run.test_tclass. That likewise has a destructor method, but that method refers to sys. By the time the garbage collector triggers the destructor, sys in that namespace is gone, so it fails with a NameError.

So, what's more important? I seem to remember you've said previously that destructor methods shouldn't rely on anything other than self and builtins, so should I change tclass.py and go ahead?

@fperez
Owner
@takluyver
Owner

I could attach a reference to the object: self.flush_stdout = sys.stdout.flush. Ugly, but it works*.

*The test still fails because it now sees the second c being deleted as IPython exits, but I assume I should change the test to accommodate that.
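For readers following along, here is a minimal sketch of the workaround described above: the instance binds the callables its destructor needs at construction time, so __del__ never has to look up a module-level name such as sys. The class name, message text, and the stream parameter are assumptions made for illustration; this is not the actual tclass.py from the test suite.

```python
import gc
import io


class TClass:
    """Illustrative stand-in for a test class with a destructor."""

    def __init__(self, name, stream):
        self.name = name
        # Bind the bound methods now. If the defining namespace is cleared
        # before the destructor runs, these attributes are still reachable
        # through self, whereas a global lookup would raise NameError.
        self._write = stream.write
        self._flush = stream.flush

    def __del__(self):
        # Relies only on self, as suggested in the thread.
        self._write("deleting object: %s\n" % self.name)
        self._flush()


buf = io.StringIO()  # stands in for sys.stdout so the output can be inspected
t = TClass("C-first", buf)
del t
gc.collect()  # not needed on CPython, harmless elsewhere
output = buf.getvalue()
```

In the thread the same idea is spelled self.flush_stdout = sys.stdout.flush; the sketch takes the stream as an argument only so that the result can be checked without capturing real stdout.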

@fperez
Owner
@takluyver
Owner

I believe it's still passing both of those things. The instance created in the first run isn't deleted until after the second run (because there's a copy in ip._main_ns_cache). The only difference is that, as IPython exits, its .reset() method deletes the instance from the second run, so you see the output from that as well.

@fperez
Owner

But is the first run's instance automatically deleted when the second run finishes, or do we have to wait for a reset to happen (either manually via %reset or on exit)? The original scenario was that Kilian was running a script repeatedly, and large objects in memory were effectively killing his session after a while...

@takluyver
Owner

Testing manually, it behaves as expected (C-first is deleted immediately after the second run), but I take the point that the automatic check doesn't check this properly. I'm adding a third run to make the test more rigorous.

@fperez
Owner
@takluyver
Owner

OK, so with those changes now merged, we're back where we were when this bug was filed. After %running a script, its namespace is stored in ip._user_main_module and ip._main_ns_cache. The references will be gone if either: you %run the same script again, you %reset (without the -s option), or you exit IPython.
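The effect described here can be reproduced outside IPython. The following is a rough sketch, not IPython's implementation: user_ns and hidden_module are hypothetical stand-ins for the interactive namespace and for a module namespace the shell keeps around after a run. It shows that deleting the user-visible name is not enough while a second namespace still refers to the object.

```python
import gc
import types

deleted = []  # records destructor calls instead of printing


class C:
    def __del__(self):
        deleted.append("deleting object...")


# Hypothetical stand-ins, not IPython attributes.
user_ns = {}
hidden_module = types.ModuleType("__fake_main__")

# "Running the script": the object becomes reachable from both namespaces.
obj = C()
user_ns["c"] = obj
hidden_module.c = obj
del obj

# Equivalent of `del c` at the prompt: the hidden reference remains.
del user_ns["c"]
gc.collect()
still_alive = len(deleted) == 0

# Equivalent of clearing the cached namespace: the last reference goes away.
hidden_module.__dict__.clear()
gc.collect()
finally_deleted = len(deleted) == 1
```

On CPython both flags end up True: the destructor only runs once the second namespace lets go of the object, which matches the behaviour Kilian observed with del followed by gc.collect().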

@fperez
Owner

Idea: how about defining a %del magic that would do extra hunting of references to a name? Most users don't need this, but for those who really need to nuke one variable without resetting their whole namespace, this would do the job.

@takluyver
Owner

Not a bad idea. Is there a better short name, though, because for automagics, del gets tricky - it's not in any namespace, because it's a statement. So it could be translated to a magic call, when the user just expected a normal del.

Also, should this be combined with %reset_selective?

@fperez
Owner
@takluyver
Owner

Hmmm. I'm not convinced that we should override plain Python. Although in this case, it's somewhat ambiguous, because our default behaviour is different to the Python interpreter, in that we keep hidden references to user-defined objects. Maybe we should call it something like xdel (extra delete).

It's also not simple to do it reliably: if I enter c as an expression, the object it refers to gets cached in output history, but not under the name c. So if I then remove c from every namespace, it still exists in history.
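To make the difficulty concrete, here is a small sketch of what "hunting" by object identity rather than by name could look like. The function purge_object and the dictionaries are hypothetical and are not how xdel is implemented; they only illustrate that the same object can sit in a cache under names other than c.

```python
import gc
import weakref


def purge_object(name, primary_ns, other_namespaces):
    """Delete `name` from primary_ns, then drop every entry in the other
    namespaces that refers to the same object, whatever its key is.

    Returns a list of (namespace_label, key) pairs that were removed.
    Raises KeyError if `name` is not in primary_ns.
    """
    obj = primary_ns.pop(name)
    removed = []
    for label, ns in other_namespaces.items():
        # Collect keys first: a dict must not change size while iterating.
        doomed = [key for key, value in ns.items() if value is obj]
        for key in doomed:
            del ns[key]
            removed.append((label, key))
    return removed


class C:
    pass


c = C()
ref = weakref.ref(c)  # lets us observe whether the object is still alive
user_ns = {"c": c, "other": 1}
# Hypothetical output cache: the object is stored under history-style keys.
output_history = {"_": c, "_3": c, "_4": 42}
del c

removed = purge_object("c", user_ns, {"output_history": output_history})
gc.collect()
alive = ref() is not None
```

Removing only the key "c" would have left the two history entries in place and the object alive. Comparing with `is` finds them regardless of the key they are stored under, at the cost of scanning every namespace that might hold a reference.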

@fperez
Owner
@takluyver
Owner
@fperez
Owner
@takluyver
Owner

I've added the magic xdel in PR #419.

@takluyver
Owner

And with xdel merged in (commit 9979966), this is as resolved as it's likely to get, at least for now, so I'm closing it.

@takluyver takluyver closed this