Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Circular references to proxy objects cause files to be locked even after leaving scope #684

Open
jpgill86 opened this issue Jul 6, 2019 · 3 comments
Assignees
Milestone

Comments

@jpgill86
Copy link
Contributor

jpgill86 commented Jul 6, 2019

With code like this,

import neo
io = neo.io.get_io(filename)
blk = io.read_block(lazy=True)
del io
del blk

the file filename will remain locked until the Python process ends.

I believe the cause is circular references to proxy objects, such as this:

anasig.segment = seg
seg.analogsignals.append(anasig)

These circular references appear all over the place in Neo and are not unique to proxy objects. I suspect that because of them, most Neo objects in memory are not destroyed even after commands like del io; del blk. Not only is this a source of a memory leak, but in the case of lazy loading using proxy objects, the file becomes locked (e.g., it cannot be deleted in Windows at least).

Manually performing garbage collection is a workaround:

import gc
gc.collect()

However, the best solution I think would be to use weak references.

@samuelgarcia
Copy link
Contributor

Hi jeffrey,
thank you to catch this.Andrew have a long date plan to make a pseudo list object to deal with relationship in neo. It is catually a mess.
At implementtation time, we could consider make weakref at least for one way bottum up.

The fact that file are still open when del to all object is strange.
Of course when rawio many object have an internal memmap to data.
So yes the file stay open when object is still alive.
Note that for proxyobject there is a link to the IO : self._rawio
I guess this could be a weakref. The ptach is easy to do.

@jpgill86
Copy link
Contributor Author

jpgill86 commented Jul 8, 2019

Andrew have a long date plan to make a pseudo list object to deal with relationship in neo. It is catually a mess.
At implementtation time, we could consider make weakref at least for one way bottum up.

Yes, this sounds like a good way to do it eventually.

Of course when rawio many object have an internal memmap to data.
So yes the file stay open when object is still alive.

Exactly. A strong reference to the RawIO persists in each proxy object, and the memmaps keep the file open. Each proxy object persists, even after del io; del blk, because of the circular references, until garbage collection runs.

Note that for proxyobject there is a link to the IO : self._rawio
I guess this could be a weakref. The ptach is easy to do.

Good point, it may be easier to solve the file lock issue (not the memory leak problem) with a few targeted weakrefs.

However, the first few things I tried didn't work.

I tried replacing in each proxy object

self._rawio = rawio

with

self._rawio = weakref.proxy(rawio)

This had the effect of allowing the file to unlock with del io, and del blk is not needed because there are no strong references to the RawIO in blk. Unfortunately, this means that the following variant of reading the file will lose all its strong references, and the RawIO will be destroyed immediately:

blk = neo.io.get_io(filename).read_block(lazy=True)

Here I left out the intermediate step of creating io, which was the only strong reference to the RawIO.

This would probably break a lot of existing code, so that's not a good route to go.

I briefly thought maybe we needed weakrefs to the proxy objects themselves in BaseFromRaw (e.g., anasig = weakref.proxy(AnalogSignalProxy(rawio=self, ...))), but this doesn't work either because the proxy objects are destroyed instantly.

I next tried adding a single strong reference to the RawIO in the seg returned by read_segment (as a first try, I did if lazy: seg.rawio = self), thinking we could then use self._rawio = weakref.proxy(rawio) in each proxy object. However, this leads us back to the original problem: The segment is never destroyed even after del blk because of all the circular references.

The best solution I can think of right now, other than a widespread overhaul of relationships that uses weakrefs in all the right places, is to add a close_file method to blk objects returned by read_block when lazy=True, which could traverse its children and get rid of every reference to the RawIO.

Does that sound like a good idea or a bad one? Would it be better to live with the current behavior that locks the file and use manual garbage collection, until relationships can be overhauled? Or is the "quick fix" of adding close_file a reasonable short term alternative?

@samuelgarcia
Copy link
Contributor

Thanks for this big investigation.

I did not figure out alll theses details.

The wiser way is working of relationship with weakref, so the Andrew's plan.

@apdavison @JuliaSprenger ?

@apdavison apdavison added this to the 0.9.0 milestone Jul 23, 2019
@apdavison apdavison modified the milestones: 0.9.0, 0.10.0 Sep 7, 2020
@samuelgarcia samuelgarcia self-assigned this Mar 18, 2021
@samuelgarcia samuelgarcia modified the milestones: 0.10.0, 0.11.0 Jul 26, 2021
@apdavison apdavison modified the milestones: 0.10.3, 0.11.0 Aug 30, 2022
@apdavison apdavison modified the milestones: 0.11.0, 0.12.0 Sep 29, 2022
@JuliaSprenger JuliaSprenger modified the milestones: 0.12.0, 0.12.1 Apr 2, 2023
@JuliaSprenger JuliaSprenger modified the milestones: 0.12.1, 0.13.0 Jul 19, 2023
@apdavison apdavison modified the milestones: 0.13.0, 0.14.0 Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants