This notebook compares the time spent loading spikes and stims with the time spent merging them. It demonstrates why the loading of the data isn't memoized.

In [1]:
import morphs
from ephys import core, rigid_pandas



In [2]:
block_path = morphs.paths.BLOCKS[0]

In [None]:
# Drop the OS page cache so that timing differences aren't due to cached disk reads
# This doesn't work from my notebook because I don't have sudo privileges there; run the command in a terminal instead
import os
os.system('sudo /sbin/sysctl -w vm.drop_caches=3')

In [3]:
%time spikes = morphs.data.load.ephys_data(block_path)

CPU times: user 11min 20s, sys: 5.24 s, total: 11min 25s
Wall time: 11min 29s


In [4]:
def spikes_stims(block_path):
    spikes = core.load_spikes(block_path)

    stims = rigid_pandas.load_acute_stims(block_path)

    fs = core.load_fs(block_path)
    stims['stim_duration'] = stims['stim_end'] - stims['stim_start']
    rigid_pandas.timestamp2time(stims, fs, 'stim_duration')

    for rec, rec_group in stims.groupby('recording'):
        try:
            # If every stim name parses as a float, drop the whole recording
            rec_group['stim_name'].astype(float)
            print('going to have to remove float stim recording ', rec)
            spikes = spikes[spikes['recording'] != rec]
            stims = stims[stims['recording'] != rec]
        except ValueError:
            # Stim names are not floats: drop the recording only if any stim is too long
            if (rec_group['stim_duration'] > .41).any():
                print('removing long stim recording ', rec)
                spikes = spikes[spikes['recording'] != rec]
                stims = stims[stims['recording'] != rec]

    stim_ids = stims['stim_name']
    # Pass regex explicitly: newer pandas versions default to regex=False
    stim_ids = stim_ids.str.replace('_rec', '', regex=False)
    stim_ids = stim_ids.str.replace(r'_rep\d\d', '', regex=True)
    stims['stim_id'] = stim_ids
    morphs.data.parse.stim_id(stims)
    return spikes, stims, fs

In [None]:
# Drop the OS page cache again so that timing differences aren't due to cached disk reads
os.system('sudo /sbin/sysctl -w vm.drop_caches=3')

In [5]:
%time spikes, stims, fs = spikes_stims(block_path)

CPU times: user 431 ms, sys: 358 ms, total: 789 ms
Wall time: 5.06 s


In [6]:
%%time
rigid_pandas.count_events(stims, index='stim_id')

spikes = spikes.join(rigid_pandas.align_events(spikes, stims,
                                               columns2copy=['stim_id', 'morph_dim', 'morph_pos',
                                                             'stim_presentation', 'stim_start', 'stim_duration']))

spikes['stim_aligned_time'] = (spikes['time_samples'].values.astype('int') -
                               spikes['stim_start'].values)
rigid_pandas.timestamp2time(spikes, fs, 'stim_aligned_time')

CPU times: user 11min 24s, sys: 3.5 s, total: 11min 27s
Wall time: 11min 27s


Looks like nearly all of the time is spent in the alignment. It's not worth memoizing the loading of spikes and stims, and the alignment has to be done after any shuffling I do.
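
To put rough numbers on that conclusion, here is a small sketch that uses only the wall times reported above. The helper name `fraction_saved` is made up for illustration and is not part of `morphs` or `ephys`, and the calculation assumes the reported times are representative of a typical run.

```python
# Wall times reported in the cells above, converted to seconds.
load_s = 5.06             # spikes_stims(block_path), In [5]
align_s = 11 * 60 + 27    # count_events / align_events / join cell, In [6]
full_s = 11 * 60 + 29     # morphs.data.load.ephys_data(block_path), In [3]


def fraction_saved(cached_s, total_s):
    """Share of the total wall time that caching a step of length cached_s could remove.

    Hypothetical helper for this illustration only.
    """
    return cached_s / total_s


total_s = load_s + align_s
load_share = fraction_saved(load_s, total_s)

print(f"load + align: {total_s:.2f} s (full ephys_data call: {full_s} s)")
print(f"loading share of the total: {load_share:.1%}")
print(f"alignment share of the total: {1 - load_share:.1%}")
```

With these numbers, memoizing the loading step would save under 1% of the total wall time, while the alignment accounts for the rest.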