Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StateMonitor idexing with slice syntax doesn't work #50

Closed
denisalevi opened this issue Jan 17, 2017 · 6 comments · Fixed by #202
Closed

StateMonitor idexing with slice syntax doesn't work #50

denisalevi opened this issue Jan 17, 2017 · 6 comments · Fixed by #202

Comments

@denisalevi
Copy link
Member

For some reason using [1:] style indexing for a StateMonitor returns an empty array while e.g. [1] indexing returns the correct value.

...
mon = StateMonitor(...)
...
run(2*defaultclock.dt)  # -> len(mon.t) is 2
print(mon.t[1:])  # returns []
print(mon[1])  # returns Quantity([defaultclock.dt])

This seems to apply for any kind of [3:] style indexing of the StateMonitor.
This breaks e.g. brian2.tests.test_monitor.test_state_monitor_indexing.

@denisalevi
Copy link
Member Author

The StateMonitor is also not working working for many neurons. Currently the StateMonitor is called with 1 block and as many threads as there are Neurons (or Synapses etc) to record. So if there are more Neurons then the maximum number of threads, the kernel fails to launch.

@denisalevi
Copy link
Member Author

And the global memory writes are not coalesced. Currently we have a 2D data structure of dimensions indices x record_times (vector of vectors) for each variable monitor. And we fill that in the kernel like this

monitor[tid][current_iteration] = ...

For coalesced writes we could just "transpose" the monitor data structure so we can use

monitor[current_iteration][tid] = ...

We might have to resort the monitor in the end though since it might then not fit with the format that Brian expects to read back.

@denisalevi
Copy link
Member Author

denisalevi commented Jul 11, 2018

There also seems to be a problem with the ordering of returned values. Might be related to #46 . In this test, the first assert sums over the not ordered dimension and passes. The second one failes (because of wrong ordering). The monitors return arrays are 2D arrays with shape (num_indices, num_times).

import os                                                                                             
                                                                                                      
from numpy.testing.utils import assert_allclose                                                       
                                                                                                      
from brian2 import *                                                                                  
from brian2.devices.device import reinit_devices, set_device                                          
                                                                                                      
import brian2cuda                                                                                     
                                                                                                      
directory = os.path.splitext(os.path.basename(__file__))[0]                                           
                                                                                                      
results = {}                                                                                          
                                                                                                      
n_cells = 5                                                                                           
n_recorded = 10                                                                                                                                                                                                                 
delay = np.arange(n_cells) * defaultclock.dt                                                          
                                                                                                      
for devicename in ['cpp_standalone', 'cuda_standalone']:                                              
    set_device(devicename, directory=directory + '_' + devicename)                                    
    Synapses.__instances__().clear()                                                                  
    reinit_devices()                                                                                  
    P = NeuronGroup(n_cells, model='', threshold='True')                                              
    S = Synapses(P, P,                                                                                
                 model='''w : 1''',                                                                   
                 on_pre='''w += 1''')                                                                 
    S.connect(j='i')                                                                                  
    S.pre.delay = delay                                                                               
                                                                                                      
    state_mon = StateMonitor(S, 'w', record=range(n_recorded))                                        
                                                                                                      
    run(defaultclock.dt * (n_cells + 1), report='text')                                               
                                                                                                      
    results[devicename] = state_mon.w.astype(int)                                                     
                                                                                                      
assert_allclose(results['cpp_standalone'].sum(axis=0), results['cuda_standalone'].sum(axis=0))        
assert_allclose(results['cpp_standalone'], results['cuda_standalone'])                                

EDIT 13.04.21: This test seems to be passing without issues now (I also added it to our test suite). test_statemonitor_indexing is still failing though.

@denisalevi
Copy link
Member Author

Could this be a similar issue as here: brian-team/brian2#1119?

denisalevi added a commit that referenced this issue Apr 14, 2021
Fixes #50. Issue was the same as in PR brian-team/brian2#1119. We need
to update the monitor's `N` variable on CUDA side and we need to update
the device variable since it is copied to the host variable in
`write_arrays()`. And we need to use `WRITES_TO_READ_ONLY_VARIABLES`,
otherwise `N` will be cached in brian2 and not loaded after simulation.
@denisalevi
Copy link
Member Author

This is being fixed in PR #202, follow-up issue for fixing the single block usage and optimizing statemonitor template is #201.

There were two problems that needed to be fixed here:

  1. We need the _array_<monitor>_N variables in the standalone code to hold the correct monitor size, since this value is stored in the results folder and read when indexing with slice syntax.
  2. The dev_array_<monitor>_N device variable is copied inte _array_<monitor>_N in write_arrays(). Therefore, we need to set the device variable instead of the host variable for the written array to be correct! (or finally clean up the copying and writing mechanism...).
  3. We need to mark N in WRITES_READ_ONLY_VARIABLES in the templates, otherwise brian2 will used a cached value instead of reading it from the results folder (see cpp_standalone/device.py:Device.get_value()).

The brian-team/brian2#1119 had a very similar problem, which I also fixed in PR #202.

denisalevi added a commit that referenced this issue Apr 14, 2021
Fixes #50. Issue was the same as in PR brian-team/brian2#1119. We need
to update the monitor's `N` variable on CUDA side and we need to update
the device variable since it is copied to the host variable in
`write_arrays()`. And we need to use `WRITES_TO_READ_ONLY_VARIABLES`,
otherwise `N` will be cached in brian2 and not loaded after simulation.
denisalevi added a commit that referenced this issue Aug 4, 2021
Fixes #50. Issue was the same as in PR brian-team/brian2#1119. We need
to update the monitor's `N` variable on CUDA side and we need to update
the device variable since it is copied to the host variable in
`write_arrays()`. And we need to use `WRITES_TO_READ_ONLY_VARIABLES`,
otherwise `N` will be cached in brian2 and not loaded after simulation.
denisalevi added a commit that referenced this issue Aug 6, 2021
Fixes #50. Issue was the same as in PR brian-team/brian2#1119. We need
to update the monitor's `N` variable on CUDA side and we need to update
the device variable since it is copied to the host variable in
`write_arrays()`. And we need to use `WRITES_TO_READ_ONLY_VARIABLES`,
otherwise `N` will be cached in brian2 and not loaded after simulation.
@denisalevi
Copy link
Member Author

PR #202 is merged now into update-brian2-submodule branch and will be merged into master with #217. Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant