
Preliminary MPI support for GeNN #158

Merged
merged 77 commits into master on Feb 28, 2018

Conversation

brad-mengchi
Contributor

@brad-mengchi brad-mengchi commented Sep 4, 2017

This now works pretty nicely as a very minimal MPI implementation for GeNN:

  • Linux only
  • Only supports spike-based communication over MPI links
  • Doesn't use multicast MPI (I believe there is a multicast API which could be helpful for larger simulations)

Essentially what's changed is:

  1. NNmodel now has separate maps containing local and remote neuron groups, representing those simulated on the local machine and those simulated on other nodes on the network (which map a neuron group goes into is determined by the host id you pass to NNmodel::addNeuronGroup)
  2. NNmodel also has separate maps of local and remote synapse groups (which map a synapse group goes into is automatically determined by the host id of its target neuron group)
  3. There is a new NeuronGroup::hasOutputToHost method which tests whether a neuron group has any outputs to a given host ID (MPI rank) - this is used to determine which remote neuron groups need to be synchronised every time step
  4. Basic data structures to hold incoming spikes, and GPU push methods, are generated for remote neuron groups which need synchronising.
  5. New mpi.cc and mpi.h files are generated for models built with the genn-buildmodels.sh -m flag. These contain methods to transmit a local neuron group's current spikes to a specific host over MPI and to receive a remote neuron group's current spikes from a specific host. Typically these are called automatically from the synchroniseMPI function, which sends and receives all required spikes each timestep.
  6. Macros are generated for each remote neuron group so user code can test whether a population is remote with #ifndef POP_NAME_REMOTE and, e.g., not try to generate afferent connectivity for it.

1 and 2 resulted in a fairly large number of search-and-replace changes, but the nice thing is that 99% of the code works just as it did before, only on the local neuron and synapse groups.
An example model using this is here: https://github.com/neworderofjamie/genn_examples/tree/master/va_benchmark_mpi - it can run on a local MPI install or using the SGE system on our cluster.

NOTE: I am going to merge this branch manually, as I don't think the changes to the userprojects are useful

…the postsynaptic neurons

Signed-off-by: Mengchi Zhang <zhan2308@purdue.edu>
* ``NNmodel::isDeviceInitRequired`` checks for remote neuron groups which have outputs to the local machine and have spike variables that should be initialised on the device
* ``genInitializeDeviceKernel`` now also initialises remote neuron group spike variables
* Previously ``StandardGeneratedSections::neuronOutputInit`` was only being used in a subset of the locations where this code runs
* ``StandardGeneratedSections::neuronOutputInit`` should only advance device spike queues
* For consistency, all host spike queues are now advanced in stepTimeCPU/stepTimeGPU
* Pull functions shouldn't be generated for remote populations which don't output to the local host
* Tidied up some auto-generated comments
@neworderofjamie neworderofjamie dismissed their stale review January 19, 2018 18:30

I've now made these changes myself!

# Conflicts:
#	lib/GNUmakefile
#	lib/include/modelSpec.h
#	lib/src/generateRunner.cc
@tnowotny
Member

tnowotny commented Jan 29, 2018

I have had a look at the example you made and had a bit of a poke around.
I know I am coming late to this (as usual), but I was surprised to see the MIMD approach (a separate executable for each MPI host). I always thought MPI was meant to be more SIMD than that. On the practical side, the disadvantage of the current solution is that it's not very scalable. If one could loop over populations E1 to E100 and, rather than a macro testing for locality, use a simple if branch, then I could see how this scales to big machines. I.e.

if (mpi_local("E1")) {
    pull ...;
}

would also allow something like

for (int i = 0; i < 100; i++) {
    std::string pop = std::string("E") + std::to_string(i);
    if (mpi_local(pop)) {
        pull...;
    }
}

With the macros this would be difficult ... but maybe it's too late now to rethink the entire design?
Maybe we can also discuss offline.

@tnowotny
Member

Argh... maybe this is rubbish. The pull commands etc. are also named at compile time, so what I was thinking wouldn't work anyway... would it?

@neworderofjamie
Contributor

neworderofjamie commented Jan 29, 2018

I think those issues are kind of two sides of the same coin. Because the structure of the network exists largely as generated code in GeNN - rather than being built in memory at runtime, as in NEST - I think it makes sense that the MPI code is more MIMD than would be typical.

However, I think the problem of not being able to loop through populations is more general than just MPI - the simulation code for the Potjans-Diesmann microcircuit (https://github.com/neworderofjamie/genn_examples/blob/master/potsjan_microcircuit/simulator.cc#L51-L114) is heading towards macro hell, and that only has 8 populations. As you say, there's not much choice as everything is compile-time. BUT I actually think the way the SpineML simulator works solves this quite neatly:

  1. Building the generated code into a dynamic library
  2. Loading it at runtime (https://github.com/genn-team/genn/blob/development/spineml/simulator/main.cc#L621-L629)
  3. Looking up functions/variables within that at runtime (https://github.com/genn-team/genn/blob/development/spineml/simulator/main.cc#L642)

This may be a good future direction for building the simulation code for larger GeNN models: the actual simulation executable would then be the same across all nodes, and it would just load a different dynamic library of generated code.

@tnowotny
Member

This looks like an interesting solution. I remember back then there was a deliberate decision to make it all compile-time and have explicitly named functions for each population etc., so that users wouldn't have to index anything and could call everything by name...

What is your gut feeling - is it worth just merging this solution (with the macros) now even if we may later do something more like the spineML2GeNN design?

@neworderofjamie
Contributor

Well, all the options are there to build models using the SpineML approach (there's a GENN_PREFERENCE to build a dynamic library), so you would be able to build models like this using the current version. The example model is in my personal GitHub, so no one ever needs to see it :)

I am keen to merge this PR into development to prevent it being lost (the MPI communications stuff and the splitting of remote and local populations are useful regardless), but perhaps I'll merge after I make the 3.1.0 release. Then I can experiment with building some models using the dynamic library approach and, if that's clearly a better fit for MPI, roll back some of the hacky bits for making filenames unique...

@tnowotny
Member

Ok - you have my blessing to merge this when it fits into your workflow. Overall it's not an awful approach. That the macros exist doesn't hurt anyone who may not want to use them ;-)


@tnowotny tnowotny left a comment


Merge when it suits best ... as discussed.

@neworderofjamie neworderofjamie removed this from the GeNN 3.1.0 milestone Jan 30, 2018
@neworderofjamie
Copy link
Contributor

neworderofjamie commented Feb 1, 2018

FYI @tnowotny, I had a go at re-implementing the microcircuit model using a shared library here: https://github.com/neworderofjamie/genn_examples/blob/master/potsjan_microcircuit/simulator_shared_library.cc. I think the result is already somewhat terser and less riddled with macros, and there's still quite a lot of boilerplate that could be moved into the SharedLibraryModel helper class.

@neworderofjamie neworderofjamie changed the base branch from development to master February 14, 2018 12:22
@neworderofjamie neworderofjamie merged commit 2026729 into master Feb 28, 2018
@neworderofjamie neworderofjamie deleted the MPI_support branch February 28, 2018 13:39