Spike cross-correlation histogram #40
Conversation
Commits:

- Altered function and parameter names of core CCH function.
- Clarified in documentation which spike train is lagging and which is leading for the CCH calculation.
- Use new binning function in spikecorr.cch().
- Removed old code from the cch() function that is not needed with BinnedSpikeTrain.
- Simplified the way in which the smoothing kernel is supplied to the cch() function: only arrays are allowed as kernels; the option to supply a string selecting a specific numpy kernel-creation function was removed.
- Added TODOs for the future to the cch() function header.
- Long function name is better than cch.
- Changed the output of the cch to a neo.AnalogSignalArray and adjusted the corresponding unit test.
- Fixed bug in the transformation of the histogram to AnalogSignalArray (the signal axes were previously inverted) and made the corresponding corrections to the unit test.
- Fixed bug in the border_correction.
- Moved CCH function to the correct elephant module.
- Added TODO where it is unclear how to compute the default window size.
- Latest additions from code sprint.
- Removed wrong auto-import.
- Fixed bug in the time axis of the cch: the center bin is now zero and the axis is symmetric with respect to the center bin; also adjusted the time-axis check in the unit test, which no longer fails.
- Changed the definition of the histogram length to the length of the longest spike train.
- Included two more tests: one for different bin sizes and one for when a window size is given.
- Fixed bug where the normalization and window options did not work together (use of Hlen instead of Hbins).
- 3: Cross-correlation Histogram. Task-Url: http://github.com/INM-6/elephant/issues/issue/3. Signed-off-by: Michael Denker <m.denker@fz-juelich.de>
- Put all data items of the cch unit test into setup.
- Full coverage achieved with the test functions; ready for pull request. (Task-Url: http://github.com/INM-6/elephant/issues/issue/3)
- Changed definition of the time axis in the CCH to be left bins. Smaller fixes in documentation throughout. Added an example for the CCH to the documentation. (Task-Url: http://github.com/INM-6/elephant/issues/issue/3)
- Changed all cch() to the full name in unit tests. (Task-Url: http://github.com/INM-6/elephant/issues/issue/3)
- Small changes to make unit tests more legible. (Task-Url: http://github.com/INM-6/elephant/issues/issue/3)
- Change in error message.
hi Michael - this is perfect. I was sure someone must have worked on something similar; it just proves that there is much need for this functionality. Our implementations indeed differ in the representation of spikes. It would be nice to do some benchmarking to see which one is faster in several use cases (for short and long spike trains). I have another implementation which is closer to yours, but uses numba for better performance. Another question I have is why you decided to take binned spike trains as the argument. Since you are working with a sparse spike-time representation, you could just take the original spike train object. Finally, I noticed that you can use a kernel for smoothing. Another thing worth adding would be the spike dithering method.
Regarding your question on the use of the binned object: we use it in order to define the binning in a way consistent with other functions. Here, the binning parameters are important for the "binary" option (i.e., clipped spike trains), and in order to make the normalization consistent with the classic definition. In terms of profiling, it's a good idea to get a feeling for the performance. However, the main motivation for the sparse-representation function was also the memory-efficiency aspect, so possibly it's even a good idea to consolidate the two functions in terms of input parameters, and keep both approaches.
It makes sense to have two implementations. I agree that each of the methods has its use cases: yours should work best for sparse firing (low firing rates) and very fine binning, mine for high firing rates and coarse binning. I think we can safely assume that people have >= 2 GB of RAM these days, which should set the limit on how much memory we are OK to allocate. I will try to do some benchmarks later this week. BTW, it might be a good idea to have automated benchmarks akin to those used by pandas.
I also think two functions for doing the same thing is confusing. Thus, I would opt for a single version which can distinguish between the two cases. With the newest optimisation for the binning class (see #32), speed at least shouldn't be a problem. @btel it would be interesting to hear more about the pandas automatic benchmarking. Maybe raise it as a new suggestion (issue).
This would need some more design work, but I imagine one could match the positional arguments to the implementation in a multiple-dispatch way and raise an exception when a function with an appropriate signature is not found. For example:
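The code example that followed was lost in extraction. Below is a minimal sketch of the idea, using a plain type-registry with `isinstance` checks rather than a dispatch library; all class and function names here (`SpikeTrainLike`, `BinnedLike`, the `_cch_from_*` helpers) are hypothetical stand-ins, not Neo or elephant types.

```python
# Hypothetical sketch of type-based dispatch for a CCH entry point.

class SpikeTrainLike:            # stand-in for a neo.SpikeTrain-like object
    def __init__(self, times):
        self.times = times

class BinnedLike:                # stand-in for a binned spike train object
    def __init__(self, counts):
        self.counts = counts

def _cch_from_spiketrains(st1, st2):
    return "continuous implementation"

def _cch_from_binned(bst1, bst2):
    return "binned implementation"

# Registry mapping argument-type signatures to implementations.
_DISPATCH = {
    (SpikeTrainLike, SpikeTrainLike): _cch_from_spiketrains,
    (BinnedLike, BinnedLike): _cch_from_binned,
}

def cross_correlation_histogram(a, b):
    # Try each registered signature; fall through to an error if none match.
    for (ta, tb), impl in _DISPATCH.items():
        if isinstance(a, ta) and isinstance(b, tb):
            return impl(a, b)
    raise TypeError(
        "no CCH implementation for argument types "
        f"({type(a).__name__}, {type(b).__name__})")
```

A dispatch library (such as the multipledispatch package mentioned below) would replace the explicit registry with decorators, but the failure mode is the same: an unmatched signature raises an exception.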
(There is a Python library called multipledispatch, but I have no experience with it.) Regarding the performance, I think that the bottleneck is in the calculation of the time differences:
```python
# Compute the CCH at lags in -hist_bins,...,hist_bins only
for r, i in enumerate(st1_bin_idx_unique):
    timediff = st2_bin_idx_unique - i
    timediff_in_range = np.all(
        [timediff >= -hist_bins, timediff <= hist_bins], axis=0)
```
This is rather inefficient, because for each spike you create a new boolean array timediff_in_range. Since the spikes are sorted, you could use np.searchsorted instead to find the indices of the first and last spike in the window.
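A minimal sketch of the suggested np.searchsorted approach; the data values and the concrete window size below are illustrative, not taken from the PR:

```python
import numpy as np

# Illustrative data: sorted bin indices of spikes in the second train.
st2_bin_idx_unique = np.array([1, 4, 7, 9, 15, 22])
hist_bins = 5   # half-width of the CCH window, in bins
i = 8           # bin index of one spike from the first train

# Instead of building a boolean mask per spike, locate the slice of
# spikes falling within [i - hist_bins, i + hist_bins] by bisection.
left = np.searchsorted(st2_bin_idx_unique, i - hist_bins, side='left')
right = np.searchsorted(st2_bin_idx_unique, i + hist_bins, side='right')

in_window = st2_bin_idx_unique[left:right]   # -> array([4, 7, 9])
timediff = in_window - i                     # lags within [-hist_bins, hist_bins]
```

This replaces the O(n) boolean comparison per spike with two O(log n) binary searches, which is the efficiency gain the comment refers to.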
So, I think we need to clarify the terminology a bit. As I understand it, we have three concepts; let me suggest terms to distinguish them in order not to get confused:
Please correct me if you see it differently. I think indeed cch_conv and cch_sparse should be in a single function if both are deemed desirable. Both should in my opinion take the BinnedSpikeTrain as input, in order to make this conversion transparent to the user. cch_continuous is not only a different function call accepting Neo objects; it is also conceptually different, as it will lead to a different result (spikes are not "split" across bin borders). Therefore I opt to have this as a different function. In either case, I would suggest focusing on the first two, cch_conv and cch_sparse, for now, with an option to select between them.
I see the same three functions that you listed. We should probably start just with a single, most complete implementation, but keep the API open so that other implementations can be included in the future.
Replaced by #63
To complement and add to the discussion initiated by PR #39, this is an alternate draft implementation of the CCH function we have been preparing. Without having looked at #39 in detail yet, I think the main difference in terms of algorithm is that this implementation does not rely on creating a binary vector, but uses the information of the sparse matrix representation.
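To illustrate the distinction drawn above, here is a hedged numpy sketch contrasting the two strategies: cross-correlating full binned (binary) vectors versus counting lag differences between the occupied bin indices only. All variable names and data are made up for the example; this is not the PR's code, only a demonstration that the two approaches yield the same histogram.

```python
import numpy as np

# Toy binned spike trains (binary vectors over 10 bins); illustrative only.
bins1 = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
bins2 = np.array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
n = len(bins1)

# Dense approach: cross-correlate the full binary vectors.
cch_dense = np.correlate(bins1, bins2, mode='full')

# Sparse approach: work only with the indices of occupied bins.
idx1 = np.nonzero(bins1)[0]
idx2 = np.nonzero(bins2)[0]
# For np.correlate(a, v, 'full'), entry m corresponds to lag
# (idx_a - idx_v) = m - (len(v) - 1), so shift the lags by n - 1.
lags = (idx1[:, None] - idx2[None, :]).ravel()
cch_sparse = np.bincount(lags + n - 1, minlength=2 * n - 1)

# Both strategies produce the same cross-correlation histogram.
assert np.array_equal(cch_dense, cch_sparse)
```

The dense version allocates and scans full vectors regardless of how many spikes they contain, while the sparse version scales with the number of spike pairs, which is the memory-efficiency argument made in the PR description.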