Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak in parameter system. #1089

Closed
jgosmann opened this issue Jun 11, 2016 · 6 comments · Fixed by #1090
Closed

Memory leak in parameter system. #1089

jgosmann opened this issue Jun 11, 2016 · 6 comments · Fixed by #1090
Assignees
Labels
Milestone

Comments

@jgosmann
Copy link
Collaborator

I think there is a memory leak in the parameter system (I haven't verified it yet). We haven't noticed it yet because parameters tend to be small and at least I usually run only one simulation per process. @ikajic however is running more than one simulation and is experiencing severe memory leaks. Also because in those simulations large SPA vocabularies are stored as part of a parameter (on the bleeding-edge branch using the rewritten SPA system).

First look at params.py:76, the assigned parameters are stored in a WeakKeyIDDictionary. Note that the Params class is not instantiated per object instance, but for each parameter a such an instance is created when the parameter is defined in a class. Thus, the Params instance including the WeakKeyIDDictionary lives usually for the whole runtime of the script.

Now take a look at the implementation of WeakKeyIDDictionary. It uses two dictionary-like objects _keyrefs and _keyvalues to map object ids to the "key" and value respectively. _keyrefs is a WeakValueDictionary. As such the objects used as key (that is the objects on which parameters are be set) will be garbage collected when no other references exist. _keyvalues however is a normal dictionary. That means the value assigned to the parameter will never be garbage collected (and that can be large vocabularies as stated at the beginning).

@jgosmann jgosmann added the bug label Jun 11, 2016
@jgosmann jgosmann added this to the 2.1.1 release milestone Jun 11, 2016
@jgosmann jgosmann mentioned this issue Jun 11, 2016
12 tasks
@tbekolay
Copy link
Member

b4c37d7 is the commit that changed this; previously it was a normal WeakKeyDictionary, which shouldn't leak memory in this way. I believe the reason for the change (@hunse can correct me if I'm wrong) was to ensure that equality of keys uses an is check instead of ==? Brief example:

l1 = nengo.LIF()
l2 = nengo.LIF()
d = WeakKeyDict()
d[l1] = 'hey'
print(d[l2]) # prints 'hey' because l1 == l2
d = WeakKeyIDDict()
d[l1] = 'hey'
print(d[l2])  # raises KeyError because l1 is not l2

The fix is not super straightforward as I think it means we essentially have to implement our own garbage collection ... scan the _keyrefs to look for stale entries at some point and remove them from _keyvalues. That's an annoying piece of code to write, so I'd be happier if we could switch to something else from collections, or subclass WeakKeyDict to make it work as above (assuming I haven't missed something about what WeakKeyIDDict is doing?)

@jgosmann jgosmann self-assigned this Jun 11, 2016
@jgosmann
Copy link
Collaborator Author

is check instead of ==

I think that is the behaviour we want. (Otherwise setting parameters on one object may overwrite parameters on another.)

The fix is not super straightforward as I think it means we essentially have to implement our own garbage collection

I don't think it is as bad as you think. I'm working on a fix right now.

@tbekolay
Copy link
Member

I don't think it is as bad as you think.

I don't think it's hard to get something working. But I think it's hard to do it such that it's performant and correct in all cases... if it were, there wouldn't be 3 or 4 different garbage collection schemes currently in us by modern programming languages.

@jgosmann
Copy link
Collaborator Author

With "not as bad as you think", I was meaning to say, I don't think we need to implement our own garbage collection.

@tbekolay
Copy link
Member

I don't think we need to implement our own garbage collection.

Cool, I'll wait patiently for the fix then ;)

jgosmann added a commit that referenced this issue Jun 11, 2016
Fixes #1089.

We create a weak reference to the key with a callback here that will
free the value when no strong references to the key are left. To be
able to do this we need the id of the key which will be retrieved
with the _ref2id dictionary. Note that weak reference use the hash
function of the object they are referencing. Thus, we have to use the
id of the weak reference to index that dictionary. Finally, to support
the del operation we need to be able to map the object id to the weak
reference which requires another dictionary _id2ref (instantiating
another weak ref would give us an object with a different id). This
dictionary has another important function: It stores strong references
the weak reference to ensure that the callback gets called (it would
not if the weak reference gets garbage collected beforehand and we
need to store ids as keys in the _ref2id dictionary).
@jgosmann
Copy link
Collaborator Author

👉 #1090

hunse pushed a commit that referenced this issue Jun 22, 2016
WeakKeyIDDictionary was not removing values when the key was freed.

We create a weak reference to the key with a callback here that will
free the value when no strong references to the key are left. To be
able to do this we need the id of the key which will be retrieved
with the _ref2id dictionary. Note that weak reference use the hash
function of the object they are referencing. Thus, we have to use the
id of the weak reference to index that dictionary. Finally, to support
the del operation we need to be able to map the object id to the weak
reference which requires another dictionary _id2ref (instantiating
another weak ref would give us an object with a different id). This
dictionary has another important function: It stores strong references
the weak reference to ensure that the callback gets called (it would
not if the weak reference gets garbage collected beforehand and we
need to store ids as keys in the _ref2id dictionary).

Fixes #1089.

Also added a test for WeakKeyIDDictionary value garbage collection.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging a pull request may close this issue.

2 participants