Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse old names if object no longer exists #216

Closed
thesamovar opened this issue Feb 17, 2014 · 14 comments
Closed

Reuse old names if object no longer exists #216

thesamovar opened this issue Feb 17, 2014 · 14 comments
Assignees
Milestone

Comments

@thesamovar
Copy link
Member

Creating a new NeuronGroup (or other object) without specifying the name gives a series of names like neurongroup_1, neurongroup_2, etc. This happens even if the first object is deleted before the second one is created. This is problematic because it leads to unnecessary recompiles of weave code (the name of the group is embedded in the array name).

One thing we might also consider is making weave code not have the name of the group embedded in it, since it's not clear if this is necessary.

@thesamovar thesamovar added this to the 2.0alpha2 milestone Feb 17, 2014
@thesamovar
Copy link
Member Author

So actually this turns out not to be a problem with the naming system that already does this. The problem is that references to NeuronGroup are being kept somehow - maybe there is a circular reference somewhere? You don't get the same behaviour with NetworkOperation for example. Here's a script that demonstrates the problem:

from brian2 import *
import gc

for init_func in [lambda: NetworkOperation(lambda:None),
                  lambda: NeuronGroup(1, 'v:1'),
                  ]:    
    for i in xrange(3):
        G = init_func()
        print i, G.name
        del G
        gc.collect()
        print [obj().name for obj in Nameable.__instances__()]
        print
    print
    print '*************'
    print

@thesamovar
Copy link
Member Author

I guess this sort of thing is going to happen a lot because we have to be strict and careful about using weakref everywhere and if we ever forget it causes this problem. Do you see any way we can make this easier on ourselves?

@thesamovar
Copy link
Member Author

OK I found the problem, it was that Variable was setting owner without creating a weakref.proxy. I fixed this directly in master since it's a small change, but maybe take a look @mstimberg?

By the way, I found a cool tool for debugging these sorts of problems: http://mg.pov.lt/objgraph/

@mstimberg
Copy link
Member

OK I found the problem, it was that Variable was setting owner without creating a weakref.proxy. I fixed this directly in master since it's a small change, but maybe take a look @mstimberg?

Right, I think I run into the "owner is already a weakproxy" problem and then didn't want to be bothered with the try/except solution everywhere. I pretty sure I made a mental note of "I'll come back later and do that properly" at some point, though :). Using the weakproxy_with_fallback solution looks good to me.

There are still some tricky issues, however. Replacing an object with a new one of the same name will not work (e.g. in a loop, similar to your model explorer example), you always have to delete the old variable first and run the garbage collection. This can also make it quite annoying in ipython notebooks, where it's quite common to re-execute cells -- this will always raise an error if you explicitly set names for objects. I wonder whether we can come up with a more sophisticated system? Maybe we somehow invalidate other objects of the same name whenever we create a new one with the same name? Accessing the old object would then somehow raise an error.

Since it's really nice for debugging and documentation, I'd rather like it if giving explicit names to objects would be the recommended way of doing things in Brian. But as I mentioned before, this does not work well with interactive uses:

>>> from brian2 import *
>>> G = NeuronGroup(10, 'dv/dt = -v / tau :', name='exc')
[...]
EquationError: Parsing failed: 
dv/dt = -v / tau :
^
Expected end of text (at char 0), (line:1, col:1)
>>> G = NeuronGroup(10, 'dv/dt = -v / tau : 1', name='exc')
[...]
ValueError: An object with name exc is already defined.
>>>

And I think in this case, even garbage collection will not help since the interactive interpreter still holds some reference.

@thesamovar
Copy link
Member Author

I don't think we should necessarily encourage people to give names to all the objects. I think it's mostly going to be useful for standalone.

For IPython, we could have a cell magic that forces names to override the old one, but honestly I'm not sure how much value there is in it and it potentially complicates things (since now everything has to be aware that the old one is invalid). Deleting a variable is quickly and easily done, and you don't need to call garbage collection unless there are circular references (which is what we're trying to avoid). If the interactive interpreter still holds a reference internally, though, I'm not sure there's much that can be done about it.

If you can find a quick and easy way of invalidating the old object without having to make the code be aware of it everywhere, and you can think of a good syntax, then OK, but I'm not sure it's a high priority. We could do something like in Brian 1, where clearing an object removes it from the magic tracking, and deletes all its attributes (meaning that pretty much any attempt to use it will raise an error). I don't think this should be default behaviour though, I think raising an error if you try to create two objects with the same name should rather be the default.

@mstimberg
Copy link
Member

I don't think we should necessarily encourage people to give names to all the objects. I think it's mostly going to be useful for standalone.

I think the name mechanism has a big potential in many areas, especially when interactively investigating objects and debugging and also when functions create a network and just return the Network object (which now allows you to access the contained objects with net['name'].

Deleting a variable is quickly and easily done, and you don't need to call garbage collection unless there are circular references (which is what we're trying to avoid).

You are probably right about the garbage collection, I was confused by the weakref docs stating "when the only remaining references to a referent are weak references, garbage collection is free to destroy the referent" -- I guess "garbage collection" here refers to the general mechanism including the standard reference counting, not the cyclical reference check in gc.collect(). In general you are of course right that deleting a variable is trivial but nevertheless I already experienced that it can be highly annoying if you are forced to do so (and in an ipython notebook, adding del G to the beginning of a cell then means that this cell no longer works when called for the first time...). As a user, it seems crazy that re-executing G = NeuronGroup(..., name=...) does not work and I don't think that it's obivous that deleting G helps.

I think the relevant question is what kind of error are we protecting the user from? In the "re-execute G = NeuronGroup(..." example, I don't see anything that the user gains from not being able to do what they want.

What about checking the names for uniqueness when they are added to the network? If this raises an error, there is a genuine problem (because e.g. array names are no longer unique). Is there any potential problem we would not detect using this approach?

In some way it would be more elegant to defer all name assignment (i.e. the resolution of neurongroup* into neurongroup_7 etc.) until this point in time but this would be quite a mess to implement since a lot of code assumes that objects have a name from the beginning, so I would rather not do this. Also, if one doesn't care about the exact name then having an object named neurongroup_3 even though it could be called neurongroup (because of a lot of object deletion/recreation) isn't an issue.

@thesamovar
Copy link
Member Author

Interesting. Yes, we could raise errors at a later point but allow multiple objects to be defined with the same name. That wouldn't work for standalone because everything goes into the same global namespace. I wonder if it could end up being annoying that we would have to be potentially aware everywhere that there might be multiple objects with the same name, but if not then I'd be satisfied with this solution. One thing to be aware of is that every object that is created goes into magic_network so we'd have to check at Network.run and not at Network.add.

@mstimberg
Copy link
Member

Interesting. Yes, we could raise errors at a later point but allow multiple objects to be defined with the same name. That wouldn't work for standalone because everything goes into the same global namespace.

I think we'd have to change standalone slightly so that in build it does not go through all created Synapse objects, for example, but rather through all Synapse objects contained in a Network. Actually, do we want to support multiple Network objects in standalone? Shouldn't build rather have a network argument (defaulting to the use of the MagicNetwork)? Or maybe we should just raise an error if Network.__instances__() has more than one entry? One other problem is that all created arrays are stored in CPPStandaloneDevice but we should only generate code for arrays belonging to objects in the Network.

One thing to be aware of is that every object that is created goes into magic_network so we'd have to check at Network.run and not at Network.add.

Right, good point. Network.before_run would be the place, I guess.

@thesamovar
Copy link
Member Author

I don't want to restrict standalone to only working with a single Network. Now that users can easily modify and insert code into the generated standalone code, you can imagine uses for multiple Network objects.

We would need to modify CPPStandaloneDevice to not use string dicts I guess if names are not necessarily unique. I don't mind changing this though, I'm not sure the way that CPPStandaloneDevice works at the moment is optimal anyway.

@mstimberg
Copy link
Member

I don't want to restrict standalone to only working with a single Network. Now that users can easily modify and insert code into the generated standalone code, you can imagine uses for multiple Network objects.

Ok.

We would need to modify CPPStandaloneDevice to not use string dicts I guess if names are not necessarily unique.

I don't think we are using string dicts, we normally use Variable objects as the keys. Maybe there isn't actually any problem if we have duplicate names at some point?

@thesamovar
Copy link
Member Author

OK we can try that out.

Since this is now a bit more ambitious, I'm going to reassign to beta milestone (although maybe we'll do an alpha3 and put it there).

@thesamovar thesamovar modified the milestones: 2.0beta, 2.0alpha2, 2.0alpha3 Feb 19, 2014
@mstimberg
Copy link
Member

I think this is still something we should look into (mostly because of the interactive/ipython notebook use cases)., But since it is also something that only offers a small gain and can potentially break things (especially in standalone) we should implement and test it carefully. I'm therefore moving the milestone to post beta.

@mstimberg mstimberg modified the milestones: 2.0beta2, 2.0beta Oct 17, 2014
@mstimberg mstimberg modified the milestones: 2.0beta3, 2.0beta2 Feb 26, 2015
@mstimberg
Copy link
Member

Basically same as before: I think this is important but we shouldn't rush it, moving its milestone again.

@mstimberg mstimberg modified the milestones: 2.0beta3, 2.0beta4 Apr 30, 2015
@thesamovar thesamovar modified the milestones: 2.0beta4, 2.0 Jul 21, 2015
@thesamovar thesamovar modified the milestones: 2.0, 2.0beta4 Jul 21, 2015
@thesamovar
Copy link
Member Author

CNS meeting update: relax the unique names, instead insist that (1) names within a Network are unique, checked at run(), and (2) names are unique for a device.build(), checked within Device.

@mstimberg mstimberg self-assigned this Jul 22, 2015
mstimberg added a commit that referenced this issue Aug 4, 2015
For wildcard names, the same mechanism as before is used but explicitly given names are not checked for uniqueness at creation time. Instead, `Network.before_run` checks that all names within the network are unique. In addition, `CPPStandaloneDevice.build` checks that names are globally unique. Closes #216
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants