Efficient spike/spike-like event recording #372

neworderofjamie · 2020-10-12T16:20:13Z

The standard GeNN idiom for recording spikes was to use synchronous cudaMemcpy to read the number of spikes for each population in the current timestep, then use another synchronous cudaMemcpy to read that many spikes and then copy the host data into a host-side data structure. Especially with simulation with a 0.1ms timestep and lots of populations, this pretty much prevents any chance of real-time. This PR introduces very simple system which lets you allocate any remaining GPU memory for spike recording meaning that, in many simulations, you can run for a large number of timesteps without any device->host memory transfers.

Mark a neuron population for spike recording in C++:

e->setSpikeRecordingEnabled(true);

or Python:

excitatory_pop.spike_recording_enabled = True

Allocate a number of timesteps of spike recording buffer in C++:
```
allocateRecordingBuffers(10000);
```
or Python:
```
model.load(num_recording_timesteps=10000)
```

Pull the spike recording buffers in C++:

pullRecordingBuffersFromDevice();

or Python:

model.pull_recording_buffers_from_device()

Access recording data in C++ (where this helper function is in userprojects - you can do whatever you want with the data):

writeTextSpikeRecording("spikes.csv", recordSpkE,
                                        Parameters::numOutput, 10000, 0.1,
                                        ",", true);

or Python:

spike_times, spike_ids = excitatory_pop.spike_recording_data

I think recording spikes is the most common and most inefficient case but this could easily be extended in future to record subsets of neurons/state variables..

From the upcoming PyGeNN paper, this disproportionately helps Python and slow CPUs where the copying of data and sticking it in a host-side data structure component is particularly costly:

…tructs and update pointers

* Allocate and zero shared memory for block's spikes * Atomic or shared memory if spike is emitted

…mory

# Conflicts: # generate_swig_interfaces.py # include/genn/backends/cuda/backend.h # src/genn/backends/cuda/backend.cc

…rather than in nasty !init block

codecov · 2020-10-12T17:11:17Z

Codecov Report

Merging #372 into master will increase coverage by 0.20%.
The diff coverage is 94.76%.

@@            Coverage Diff             @@
##           master     #372      +/-   ##
==========================================
+ Coverage   86.25%   86.46%   +0.20%     
==========================================
  Files          70       70              
  Lines       12072    12327     +255     
==========================================
+ Hits        10413    10658     +245     
- Misses       1659     1669      +10

Impacted Files	Coverage Δ
include/genn/genn/modelSpec.h	`83.62% <ø> (ø)`
include/genn/genn/neuronGroup.h	`74.00% <50.00%> (-2.09%)`	⬇️
include/genn/genn/code_generator/backendBase.h	`90.47% <66.66%> (+0.23%)`	⬆️
src/genn/backends/single_threaded_cpu/backend.cc	`56.77% <69.23%> (+0.11%)`	⬆️
src/genn/genn/neuronGroup.cc	`92.92% <76.47%> (+0.23%)`	⬆️
src/genn/backends/opencl/backend.cc	`91.09% <86.36%> (-0.23%)`	⬇️
include/genn/backends/opencl/backend.h	`98.79% <100.00%> (+0.01%)`	⬆️
...nclude/genn/backends/single_threaded_cpu/backend.h	`76.00% <100.00%> (ø)`
include/genn/genn/code_generator/backendSIMT.h	`98.30% <100.00%> (ø)`
include/genn/genn/code_generator/modelSpecMerged.h	`98.56% <100.00%> (ø)`
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc7e03e...afcbda5. Read the comment docs.

jamesturner246 · 2020-10-13T12:49:21Z

I'm getting

In file included from /its/home/jt273/workspace/genn-recording/pygenn/genn_wrapper/include/../../../userproject/include/sharedLibraryModel.h:24:0,
                 from /its/home/jt273/workspace/genn-recording/pygenn/genn_wrapper/include/sharedLibraryModelNumpy.h:4,
                 from pygenn/genn_wrapper/generated/SharedLibraryModelNumpy_wrap.cpp:3132:
/its/home/jt273/workspace/genn-recording/pygenn/genn_wrapper/include/../../../userproject/include/spikeRecorder.h: In function ‘void writeTextSpikeRecording(const string&, const uint32_t*, unsigned int, unsigned int, double, const string&, bool, bool)’:
/its/home/jt273/workspace/genn-recording/pygenn/genn_wrapper/include/../../../userproject/include/spikeRecorder.h:223:43: error: invalid conversion from ‘int’ to ‘std::ios_base::openmode {aka std::_Ios_Openmode}’ [-fpermissive]
     std::ofstream stream(filename, append ? std::ofstream::app : 0);
                                    ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
In file included from /its/home/jt273/workspace/genn-recording/pygenn/genn_wrapper/include/../../../userproject/include/spikeRecorder.h:5:0,
                 from /its/home/jt273/workspace/genn-recording/pygenn/genn_wrapper/include/../../../userproject/include/sharedLibraryModel.h:24,
                 from /its/home/jt273/workspace/genn-recording/pygenn/genn_wrapper/include/sharedLibraryModelNumpy.h:4,
                 from pygenn/genn_wrapper/generated/SharedLibraryModelNumpy_wrap.cpp:3132:
/usr/include/c++/7/fstream:715:7: note:   initializing argument 2 of ‘std::basic_ofstream<_CharT, _Traits>::basic_ofstream(const string&, std::ios_base::openmode) [with _CharT = char; _Traits = std::char_traits<char>; std::__cxx11::string = std::__cxx11::basic_string<char>; std::ios_base::openmode = std::_Ios_Openmode]’
       basic_ofstream(const std::string& __s,
       ^~~~~~~~~~~~~~

when compiling PyGeNN. Isn't openmode a c-string?

neworderofjamie · 2020-10-13T12:52:34Z

Well that last minute addition was clearly not great 😄 can you get latest and try?

jamesturner246

Okay, it compiles and the results look sensible. Thought they looked short, but I wasn't setting save steps high enough.

API is fine, but maybe convenient for lazy people to set layer recording in the neuron population constructor? Also see the per-neuron push/pull recording buffer note.

I see you plan to do it for neuron state vars (and synapse state?) as well, which could be handy.

jamesturner246 · 2020-10-13T13:42:34Z

pygenn/genn_groups.py

@@ -278,6 +300,14 @@ def delay_slots(self):
    def size(self):
        return self.pop.get_num_neurons()



Would be cool to have a per-layer pull_recording_buffers_from_device method, so you can selectively pull recording buffers without pulling anything else.

What would be the use case for doing that? If you've enabled recording on a neuron population wouldn't you always want the data at the end of the recording period?

Good point, I suppose you would.

… implement for single-threaded CPU...

…change

neworderofjamie · 2020-10-20T10:35:22Z

Turning over the metaphorical rock by adding tests uncovered a number of loose ends/bugs (including what appears to be a bug in NVIDIA's OpenCL implementation which needs further investigation). These are all now addressed and this should be good to go.

tnowotny

Did I miss it or should we add a few words about this to the manual? - otherwise approved.

# Conflicts: # src/genn/genn/code_generator/groupMerged.cc

neworderofjamie added 23 commits July 13, 2020 17:34

basic API for controlling neuron group recording

b6ac000

only neuron groups with same recording config should be merged

5645e82

remove var recording for now to simplify PR

fe9f71c

helper to test whether model has recording enabled anywhere

75ecf25

function to allocate recording buffer

caa617c

reuse existing EGP functionality to add recording buffers to merged s…

af5bae5

…tructs and update pointers

CUDA backend support:

d6ead6c

* Allocate and zero shared memory for block's spikes * Atomic or shared memory if spike is emitted

final part of CUDA backend support - writing spikes back to global me…

74285e4

…mory

generate runner infrastructure for allocating and pulling record buffer

2fb73af

simple helper to write out raw binary spikes

ae7ad3e

whitespace

e0fb15d

SharedLibraryModel interface to allocating and pulling recording buffers

18da620

Python interface for spike recording

1ecf2e5

hacky workaround for SWIG bug (swig/swig#1840)

b9be284

fixed typos

51beeff

removed debug, convert timesteps to ms

a7a5d50

**THINK** required unpackbits functionality requires numpy 1.17

8b00cf1

Only try and hook up spike arrays if spikes are present on host

006336f

**WIP** optimized bitmask unpacking

48727b3

Merge branch 'opencl' into recording

26fe3b7

# Conflicts: # generate_swig_interfaces.py # include/genn/backends/cuda/backend.h # src/genn/backends/cuda/backend.cc

Implemented optimized bitmask unpacking and CSV writing

c6238d0

fixed bug in bit order

b94495e

moved spike recording struct building into correct class constructor …

64fe9b7

…rather than in nasty !init block

neworderofjamie added the enhancement label Oct 12, 2020

neworderofjamie added this to the GeNN 4.4.0 milestone Oct 12, 2020

neworderofjamie requested review from jamesturner246 and tnowotny October 12, 2020 16:20

neworderofjamie added 2 commits October 12, 2020 17:21

Merge branch 'master' into recording

859a8bf

option for appending to existing spike recordings

922cac9

updated Potjans microcircuit user project to use recording system

9b397d3

zero not a valid flag

37e70ea

jamesturner246 reviewed Oct 13, 2020

View reviewed changes

neworderofjamie added 11 commits October 19, 2020 15:42

spike recording test

02ff74c

makefile for *nix

585ee76

spike event recording test

b764556

clearly testing spike event recording was a good idea....

debce65

makefile for *nix

ee7b100

more to the point it also reminded me that I had totally forgotten to…

cf68bd5

… implement for single-threaded CPU...

fixed totally not-platform independent code in BackendSIMT

44ee98a

fixed broken code in OpenCL Backend::genExtraGlobalParamAllocation

98f6016

correct test name

2774731

workaround for NVIDIA OpenCL bug

2eaa1bf

fix for broken host connectivity init with OpenCL broken by previous …

3664640

…change

tnowotny approved these changes Oct 20, 2020

View reviewed changes

neworderofjamie added 2 commits October 20, 2020 12:56

Merge branch 'master' into recording

29ec2ae

# Conflicts: # src/genn/genn/code_generator/groupMerged.cc

added new section to documentation describing recording system

afcbda5

neworderofjamie merged commit a1684b0 into master Oct 20, 2020

neworderofjamie deleted the recording branch October 20, 2020 14:21

neworderofjamie mentioned this pull request Jan 20, 2021

Built in batching support #392

Merged

neworderofjamie mentioned this pull request Apr 6, 2021

Fixed bugs with spike recording genn-team/ml_genn#23

Merged

neworderofjamie mentioned this pull request Aug 22, 2022

New mlGeNN syntax genn-team/ml_genn#44

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Efficient spike/spike-like event recording #372

Efficient spike/spike-like event recording #372

neworderofjamie commented Oct 12, 2020

codecov bot commented Oct 12, 2020 •

edited

jamesturner246 commented Oct 13, 2020

neworderofjamie commented Oct 13, 2020

jamesturner246 left a comment

jamesturner246 Oct 13, 2020

neworderofjamie Oct 13, 2020

jamesturner246 Oct 13, 2020

neworderofjamie commented Oct 20, 2020

tnowotny left a comment

		@@ -278,6 +300,14 @@ def delay_slots(self):
		def size(self):
		return self.pop.get_num_neurons()

Efficient spike/spike-like event recording #372

Efficient spike/spike-like event recording #372

Conversation

neworderofjamie commented Oct 12, 2020

codecov bot commented Oct 12, 2020 • edited

Codecov Report

jamesturner246 commented Oct 13, 2020

neworderofjamie commented Oct 13, 2020

jamesturner246 left a comment

Choose a reason for hiding this comment

jamesturner246 Oct 13, 2020

Choose a reason for hiding this comment

neworderofjamie Oct 13, 2020

Choose a reason for hiding this comment

jamesturner246 Oct 13, 2020

Choose a reason for hiding this comment

neworderofjamie commented Oct 20, 2020

tnowotny left a comment

Choose a reason for hiding this comment

codecov bot commented Oct 12, 2020 •

edited