Bad performance on CUBAFixedConnectivity benchmark #68
Comments
I believe this is a genuine result: GeNN isn't doing well here because of our default parallelisation strategy for synapses. In brief, by default (and as used by brian2genn), synapse event processing is parallelised across post-synaptic neurons and loops across the entries of the queue of incoming spikes. If there are many incoming spikes and few incoming synapses per neuron, this is extremely inefficient.
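This is not GeNN's actual CUDA code — just a simplified plain-Python sketch of the two parallelisation strategies described above, with the GPU's parallel dimension written as the outer loop (function and variable names are illustrative):

```python
def propagate_post_parallel(spike_queue, post_targets, g, w):
    # Default strategy (simplified): one GPU thread per post-synaptic
    # neuron, each scanning the entire incoming-spike queue. Work per
    # neuron is O(len(spike_queue)), even if only a few of those spikes
    # actually connect to it -- inefficient for many spikes and a
    # sparse, fixed fan-in.
    for post in range(len(g)):             # parallel dimension on the GPU
        for pre in spike_queue:            # every thread loops over all spikes
            if post in post_targets[pre]:  # stands in for the per-thread
                g[post] += w               # connectivity lookup

def propagate_pre_parallel(spike_queue, post_targets, g, w):
    # Alternative: parallelise over the spiking pre-synaptic neurons and
    # visit only their outgoing synapses. Total work is
    # O(spikes * fan_out), which suits sparse connectivity.
    for pre in spike_queue:                # parallel dimension on the GPU
        for post in post_targets[pre]:
            g[post] += w                   # would need an atomic add on a GPU
```

Both variants produce the same conductance updates; only the loop order, and hence the GPU efficiency, differs.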
@tnowotny Thanks for the clarification. When you say "default synapse processing", is there a GeNN preference that might give better results in such a constructed scenario?
You can try …
Hi again - I have now run this test and indeed with the …
@tnowotny Cool. Will include that / try it out in our benchmarks once it's exposed to brian2genn.
It has been exposed in brian2genn as a preference …
PS: It might be in the 'blocksize_test' branch ... forgot whether we already merged it into master. |
Sorry, forgot to reply to this... The change was in the …
As we haven't touched this in a long while I think we can close the issue. |
In our CUBA benchmark with fixed connectivity (constant number of synapses per neuron), brian2GeNN performs surprisingly badly; see the plot below. In the other benchmarks, brian2CUDA and brian2GeNN performance is comparable, but not in this one. This behaviour is also not new; I have similar plots for this benchmark from April this year (using older brian2, GeNN and brian2GeNN versions). Does anyone have an idea why this is the case?
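For concreteness, "fixed connectivity" here means every post-synaptic neuron receives the same number of incoming synapses. A minimal plain-Python sketch of that connectivity rule (the benchmark itself builds it with Brian2's `Synapses.connect`; `fixed_fan_in` is an illustrative name, not code from the script):

```python
import random

def fixed_fan_in(n_pre, n_post, k, seed=None):
    """Return a pre -> list-of-post mapping in which every post-synaptic
    neuron receives exactly k incoming synapses, drawn uniformly at
    random (without replacement) from the pre-synaptic population."""
    rng = random.Random(seed)
    targets = {pre: [] for pre in range(n_pre)}
    for post in range(n_post):
        for pre in rng.sample(range(n_pre), k):
            targets[pre].append(post)
    return targets
```

With this rule the total synapse count is exactly `n_post * k`, so the per-neuron fan-in stays constant as the network grows — the regime in which the default post-synaptic parallelisation does worst.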
You can reproduce this behaviour by running the script below, `CUBAFixedConnectivity.py` (with either `dev = 'genn'` or `dev = 'cpp'`), which runs the benchmark for `N = 1e6` neurons and prints the `device._last_run_time` value (so no synapse creation and device memory initialisation included). The figure above also plots the `_last_run_time` values. You need to incorporate the changes from PR #65 in order to get `_last_run_time` in brian2GeNN.
I just ran these on our GeForce GTX TITAN Black (Kepler architecture) with brian2GeNN commit 8c6da48b3ae (`benchmarking` branch), brian-team/brian2@6c50e3a22d (`master`) and genn-team/genn@3b794457b81 (`3.0.0`).

I get for `dev == 'cpp'`: …

and for `dev == 'genn'`: …

Could someone reproduce this? And is this something you would have expected? To me this benchmark looks like a standard example of pre spikes, post effects.