Index refactoring #135

Merged: 33 commits merged into master on Oct 10, 2013
Conversation

mstimberg
Member

This is not quite ready to merge yet (the usual: testing and documentation are still too sparse), but after some major wrestling it is now working (i.e. the test suite passes and cpp_standalone works) and I'd like to get some feedback on whether you see anything wrong with the approach that I missed so far.

I wrote, deleted and rewrote several approaches to the indexing system that all had their advantages and drawbacks, from fully automated nested indexing that is completely handled by codegen to almost no support from codegen, where everything (synapse indices, subgroups) is handled in the templates. I think I finally found some middle ground that is reasonably understandable and simple and should work in complicated scenarios (accessing post-/presynaptic state variables in synapses via subgroups, etc.).

The major changes/assumptions/approaches are the following:

  • Each NeuronGroup now stores an actual array containing all its indices (which is quite boring, just an arange from 0 to N). This is a bit of a waste of memory but not too bad since we know the number of neurons and can use the appropriate data type. And we don't store indices twice: the indices i and j in a Synapses class refer to the respective indices of their source/target, indexed appropriately.
  • To make the above work with standalone, devices have to implement an additional arange method that works like array but initializes the array with integers from start to stop instead of with zeros. I added this method for the C++ standalone.
  • The ItemMapping classes disappeared. There is now only a calc_indices method that returns a mapping from arbitrary (tuple/slice/array) indices to flat 1d indices appropriate for the group. For a NeuronGroup this does basically nothing but for synapses it converts from the [i, j, k] index to the synapses index. This function no longer takes care of string indices.
  • There is one special index _idx that is defined in the templates and loops from 0 to N (formerly _num_idx). If the index of an ArrayVariable is not '_idx', it has to be the name of another ArrayVariable in the variables dictionary. There is no longer a special indices dictionary or Index objects. Note that this currently does not allow for arbitrary nesting: the index of the ArrayVariable used as an index has to be _idx, it can't be another ArrayVariable, so we do not have to sort the dependencies or anything like that. Currently this is not checked, though; it will just fail in the generated code.
  • Subgroups are implemented via Subexpression objects that subtract the offset from the index array of the referred group (i.e. subgroups do not store another array). The Synapse object stores the indices of the real targets (this allows direct access to target variables) but the list-of-synapses-per-neuron structure uses relative subgroup-specific indices. All this greatly reduces the number of places that have to be "subgroup-aware": everything that is going through code generation automatically takes subgroups into account, the only remaining spots in the code (apart from Subgroup itself) are the push_spikes method of Synapses, the calc_indices method of Synapses (used when indexing i and j with numbers, not with code), the creation of synapses, and the SpikeMonitor template.
  • This indexing system also means that _presynaptic_idx and _postsynaptic_idx are automatically taken care of, we no longer need special synaptic templates for stateupdate or variable setting.
  • Finally, variable setting now works in many cases for standalone, for convenience G.v = -60*mV is automatically interpreted as G.v = '-60*mV'.
  • Because the i index of a NeuronGroup is now a standard (integer) array stored in the group I needed to get the data types right. I introduced AuxiliaryVariable objects that one can use for specifying the type of auxiliary variables (very similar to the previous OutputVariable objects). This is not used everywhere yet, though. This fixes _cond is double not bool in codegen #63.
  • There's a tricky issue surrounding the use of post- or presynaptic subexpressions in synaptic expressions: if the expression is I = x + y and is referred to as I_post in the synapse code, this will simply use the code x + y -- but these two variables could potentially be defined as synaptic variables and therefore evaluated incorrectly. I changed the way subexpressions are handled so that this is now checked and if there is a clash, x_post + y_post would be used instead.
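The core of the indexing scheme above, index arrays that are plain aranges, with synaptic i and j resolved through the source/target index arrays, can be sketched in plain NumPy (all names here are illustrative, not the actual Brian2 classes):

```python
import numpy as np

# Each group stores its own index array: just an arange of the right dtype.
N_pre, N_post = 4, 3
pre_i = np.arange(N_pre, dtype=np.int32)    # the "i" variable of the source group
post_i = np.arange(N_post, dtype=np.int32)  # the "i" variable of the target group

# A Synapses object stores, per synapse, the flat index of its source and
# target neuron (hypothetical example connectivity).
synaptic_pre = np.array([0, 0, 2, 3], dtype=np.int32)
synaptic_post = np.array([1, 2, 0, 2], dtype=np.int32)

# "i" and "j" in synaptic code are then simply the group index arrays,
# indexed through the synaptic source/target indices; nothing is stored twice.
i = pre_i[synaptic_pre]
j = post_i[synaptic_post]
print(i.tolist(), j.tolist())  # [0, 0, 2, 3] [1, 2, 0, 2]
```

Since the group index arrays are plain aranges, the extra indirection is a no-op numerically, which is exactly why an is_arange-style optimisation (discussed further down) could elide it in generated code.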

In general, with all these changes it should now be possible to use i, j, references to neuronal state variables, etc. everywhere which is pretty nice I think :)

Some known issues:

  • Using small data types to store the indices for neurons can lead to nasty problems: if a group with 100 neurons stores its indices as int8, then an expression like i + j in a synapse will lead to a wrap around... I currently reduce the number of situations where this occurs by treating all literal numbers as floats (i.e. 1 becomes 1.0) but still... Probably we should simply not care about saving some kilobytes of memory and always use int32 to be safe?
  • One currently can't use k in synaptic string expressions.
  • Related to the Subexpression issue I mentioned above, there's still some possibility for mistakes if the external namespace is manually defined for a NeuronGroup. If the Subexpression is evaluated in the context of a Synapses object, it will use the synapses' namespace, not the one from the NeuronGroup.
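The int8 wrap-around problem from the first point can be reproduced directly in NumPy:

```python
import numpy as np

# With int8 indices, an expression like i + j can exceed 127 and
# silently wrap around instead of raising an error.
i = np.array([100], dtype=np.int8)
j = np.array([100], dtype=np.int8)
total = i + j
print(total[0])  # 100 + 100 = 200, which wraps to 200 - 256 = -56
```

Storing indices as int32 sidesteps this entirely for any realistic group size.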

thesamovar and others added 25 commits September 24, 2013 22:05
standalone mode, works in runtime but not used automatically, doesn't
work in standalone because _array_i is not defined
…nGroup (including subgroups), Synapses are broken.
…illed with numbers from 0 to size-1. This will be used for the variable "i" in a `NeuronGroup`.
…factoring

Conflicts:
	brian2/groups/group.py
… translate_statement_sequence). For C++ code, SUPPORT_CODE, POINTERS, etc. have to be handled for all blocks (the variables/functions used might be different or the same)
@coveralls

Coverage Status: Coverage remained the same when pulling f58f114 on index_refactoring into 9429991 on master.

@thesamovar
Member

Wow, big changes! I'll need to go through this code quite carefully I think. Mostly it looks very good and resolves a lot of issues. One important issue though:

Each NeuronGroup now stores an actual array containing all its indices

This worries me slightly: does this mean that the main loop of a state updater will do something like this?

for(int _i=0; _i<N; _i++)
{
  int i = _array_i[_i];
  double V = _array_V[i];
  ...
}

If so, this is a problem because it means we will lose many compiler optimisations, most importantly cache predictability and vectorised instructions.

Probably we should simply not care about saving some kilobytes of memory and always use int32 to be safe?

Agreed!

@mstimberg
Member Author

This worries me slightly: does this mean that the main loop of a state updater will do something like this?

No, sorry that was phrased a bit misleadingly. The standard loop still looks like:

for(int _idx=0; _idx<N; _idx++)
{
    double V = _ptr_array_neurongroup_V[_idx];
    ...
}

"Each NeuronGroup now stores an actual array containing all its indices" only referred to the use of the i variable. So, if some of the code that is executed every time step (e.g. state update, reset, threshold, ...) contains a reference to i, this would indeed appear as:

const int i = _ptr_array_neurongroup_i[_idx];

which is admittedly suboptimal, but I don't think it will have a noticeable performance impact. Besides, I don't see many uses for i in these contexts; it's rather an initialisation thing where performance isn't that much of an issue. In principle, we could also add a flag to ArrayVariable stating that an array is a simple arange, and then C++ codegen would replace array[_idx] by _idx. Maybe I should add that the main reason for having the explicit array here is the Python code: before, every template where i (or j) was potentially used basically had an i = np.arange(N) that was executed again and again...

Probably we should simply not care about saving some kilobytes of memory and always use int32 to be safe?

Agreed!

Alright, I'll change this right now and revert the int-to-float conversion for literals.

mstimberg added 3 commits October 8, 2013 22:26
@thesamovar
Member

OK that's great then. I'll take a look at this asap.

@coveralls

Coverage Status

Coverage remained the same when pulling a971e89 on index_refactoring into 9429991 on master.

@thesamovar
Member

One thought: we don't need to add arange to Device, we can initialise them as a standard array and set them with a statement varname = _idx. This slightly reduces the work that needs to be done to implement a new device. What do you think?

@thesamovar
Member

Question: for multiple abstract code blocks, did you make sure that it all worked sensibly or just refactor the slightly hacky code I wrote for that? I wrote this on the wiki:

Multiple code blocks: currently implemented in the devices2 branch but I'm not very happy with the design. Also, there are some things I'm not sure are implemented correctly, e.g. the behaviour of known/unknown variables is a bit unclearly defined for multiple code blocks.

@thesamovar
Member

OK I went through in some detail but quite quickly. It looks great - much better! It seems to go some way towards resolving the issues mentioned in the wiki page referred to in #131. I would be happy to merge this as soon as it's ready (although would also like to see what you think about the previous two comments).

@mstimberg
Member Author

One thought: we don't need to add arange to Device, we can initialise them as a standard array and set them with a statement varname = _idx

You are right, we could do this using the standard mechanisms. I had the impression that doing so would unnecessarily complicate things, because this happens during initialisation where the normal Group mechanisms are not yet available (i.e. the code can't simply call G.i = '_idx' after creating the variable). This means one would have to create a code object, etc. Maybe there's an easy solution, though: the standard Device.arange method would already have an implementation based on Device.array and an abstract code statement setting the values, so devices don't necessarily have to implement it. RuntimeDevice would however override this method in the way it is currently implemented (using numpy.arange), and the C++ standalone could also override it with its current implementation or stay with the standard implementation.

Um, now that I wrote this, I realize that there is actually a problem with G.i = '_idx': This doesn't work for Python because _idx is slice(None) in this case...
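One way to read the proposal under discussion, sketched here with illustrative class and method names (not the actual Brian2 API), is a default Device.arange built on top of Device.array, which the runtime device then overrides with numpy.arange:

```python
import numpy as np

# Illustrative sketch only: these are not the actual Brian2 classes.
class Device:
    def array(self, name, size, dtype=np.int32):
        raise NotImplementedError

    def arange(self, name, size, dtype=np.int32):
        # Default implementation: allocate via array() and fill with
        # 0..size-1. A standalone device could instead generate code
        # equivalent to "varname = _idx" at initialisation time, or
        # override this method entirely.
        arr = self.array(name, size, dtype=dtype)
        arr[:] = np.arange(size, dtype=dtype)
        return arr

class RuntimeDevice(Device):
    def array(self, name, size, dtype=np.int32):
        return np.zeros(size, dtype=dtype)

    def arange(self, name, size, dtype=np.int32):
        # The runtime device can override arange directly with numpy.arange.
        return np.arange(size, dtype=dtype)

device = RuntimeDevice()
print(device.arange('i', 5).tolist())  # [0, 1, 2, 3, 4]
```

With this layout, only Device.array is strictly required of a new device; arange comes for free unless the device has a cheaper way to express it.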

Question: for multiple abstract code blocks, did you make sure that it all worked sensibly or just refactor the slightly hacky code I wrote for that?

No, I'm afraid I did not change it fundamentally. The only thing I changed is to make translate_statement_sequence aware of the multiple blocks and return multiple blocks but only one keyword dictionary (support code, pointers, etc.). So far it seems to work :) Maybe deal with this as part of the general codegen refactoring?

I would be happy to merge this as soon as it's ready

Ok, I'll add some more tests and documentation tomorrow but then it's good to go from my side (if the additional testing does not reveal new bugs, of course ;) )

@thesamovar
Member

So if you did something like syn.w = 'f(i,j)' then what will i and j be here? Computed or stored in arrays? How will this work for devices? Assuming it's computed, we should be able to do the same thing for G.V='i' right? So why do we need the arange?

Also, it should be straightforward to do synapses on devices now, right?

@mstimberg
Member Author

As I said before, i and j are stored -- not directly in the synapses, but as the i arrays in the respective neurongroups. So syn.w = 'f(i, j)' roughly translates to the following in C++:

for (int _idx=0; _idx<N; _idx++)
{
    const int _presynaptic_idx = _ptr_array_synapses_synaptic_pre[_idx];
    const int _postsynaptic_idx = _ptr_array_synapses_synaptic_post[_idx];
    const int i = _ptr_array_neurongroup_i[_presynaptic_idx];
    const int j = _ptr_array_neurongroup_1_i[_postsynaptic_idx];
    double w;
    w = f(i, j);
    _ptr_array_synapses_w[_idx] = w;
}

Again, for C++ the mapping through the index arrays is quite a waste, but we need the arrays for Python: at some point we have to do a np.arange(N). I think it is clearer to do it once for every NeuronGroup instead of sprinkling it around in templates wherever one could need i (and then having additional templates for everything synaptic that also include j = np.arange(N)).

For the moment, I don't see the array indexing approach costing us a lot of performance, but I see two ways to improve it: either ArrayVariable gets a new flag is_arange or something like that, and the C++ translation converts array[idx] into idx for these arrays; or Device.arange returns an ArangeArrayVariable (with a better name, I guess) which does the same thing.

Also, it should be straightforward to do synapses on devices now, right?

The main remaining problem is that we still have the second data structure (a list of dynamic arrays) storing synapses that is needed for efficient propagation. I'm wondering whether we can abstract it away. Maybe as a first simplifying step: let's not fill this data structure during synapse creation but instead calculate it in before_run?

@coveralls

Coverage Status

Coverage remained the same when pulling 43e3daa on index_refactoring into 9429991 on master.

@coveralls

Coverage Status

Coverage remained the same when pulling 494d78c on index_refactoring into 9429991 on master.

@mstimberg
Member Author

For the record: from my side this is ready for merging :)

@thesamovar
Member

OK I see now. Let's test whether or not the is_arange idea is significant and implement it later if so: it's just an optimisation.

Not quite sure I understood the issues about doing synapses, but I don't think it affects whether or not this is ready to merge so I'll go ahead and do that now.

thesamovar added a commit that referenced this pull request Oct 10, 2013
@thesamovar thesamovar merged commit ae07f07 into master Oct 10, 2013
@thesamovar thesamovar deleted the index_refactoring branch October 10, 2013 16:48
Successfully merging this pull request may close these issues: _cond is double not bool in codegen.

3 participants