This notebook computes the mean waveform of each cell and saves it for later use.

In [1]:
%reload_ext autoreload
%autoreload 2
%run setup_project.py

Project name: autopi_ca1
dataPath: /adata/projects/autopi_ca1
dlcModelPath: /adata/models
Reading /adata/projects/autopi_ca1/sessionList
We have 39 testing sessions in the list
See myProject and sSesList objects


First, I want to check the procedure on a single session.

In [2]:
load_spike_train_project(sSesList)

100%|███████████████████████████████████████████| 39/39 [00:26<00:00,  1.47it/s]


## Get the mean waveform of all cells for local sessions

We will get the mean of the first 2000 spikes for each neuron.
This code should be run on each computer that stores the data. Please avoid reading .dat files over the network when possible.
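
To make the idea concrete, here is a minimal NumPy sketch of averaging the first `n_spikes` spikes of one neuron and keeping the channel with the largest amplitude. It is not the spikeA implementation: the function name, the `half_window` parameter, the synthetic data and the peak-to-peak definition of "amplitude" are all assumptions made for illustration.

```python
import numpy as np

def mean_waveform_largest_channel(data, spike_samples, n_spikes=2000, half_window=30):
    """
    Hypothetical helper (not part of spikeA).

    data: 2D array (n_channels, n_samples) with the raw signal.
    spike_samples: 1D array with the spike times, in samples.
    Returns the mean waveform on the channel with the largest
    peak-to-peak amplitude, and the index of that channel.
    """
    # Keep only the first n_spikes spikes.
    spikes = np.asarray(spike_samples)[:n_spikes]
    # Drop spikes whose window would fall outside the recording.
    valid = (spikes >= half_window) & (spikes + half_window <= data.shape[1])
    spikes = spikes[valid]
    if spikes.size == 0:
        raise ValueError("no spike with a complete window")
    # Sample indices of every window: shape (n_spikes, 2 * half_window).
    idx = spikes[:, None] + np.arange(-half_window, half_window)[None, :]
    snippets = data[:, idx]            # (n_channels, n_spikes, window)
    mean_wf = snippets.mean(axis=1)    # (n_channels, window)
    # Assumption: amplitude is measured as peak-to-peak of the mean waveform.
    amplitude = mean_wf.max(axis=1) - mean_wf.min(axis=1)
    channel = int(np.argmax(amplitude))
    return mean_wf[channel], channel

# Synthetic example: 4 channels of noise, with a negative spike on channel 2.
rng = np.random.default_rng(0)
data = rng.normal(0, 0.1, size=(4, 10000))
spike_samples = np.arange(100, 9900, 100)
data[2, spike_samples] -= 5.0

wf, ch = mean_waveform_largest_channel(data, spike_samples, n_spikes=2000)
print(wf.shape, ch)
```

Using fewer spikes makes the computation faster but leaves more noise in the mean, which is the trade-off controlled by `n_spikes`.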

In [3]:
def get_mean_waveform(ses, n_spikes, overwrite=False):
    """
    Function to get the mean waveform for each neuron on the channel with the largest amplitude.
    
    It saves the waveforms and the channel with the largest amplitude for each neuron.
    
    If the file with the waveforms already exists and overwrite is False, the function simply reads the file and returns the data.
    
    It only uses channels that are listed in channel_map.npy
    
    Arguments:
    ses: spikeA session
    n_spikes: how many spikes to consider to get the mean. Fewer is faster but the mean waveforms are more noisy.
    overwrite: If True, the function will overwrite the previous data files with waveforms if they are there. 
    
    Returns a 2D numpy array with one row per neuron. Each row is the mean waveform on the channel with the largest amplitude.
    """
    print(ses.name)
    wfn= (f"{ses.fileBase}.mean_waveform_{n_spikes}.npy")
    clafn=(f"{ses.fileBase}.channel_largest_waveform_amplitude_{n_spikes}.npy")
    
    if os.path.exists(wfn) and not overwrite:
        #print('{} exists'.format(wfn))
        return np.load(wfn)
    else: # maybe there is no file available or maybe overwrite is True
        
        # only channels listed in channel_map.npy are used
        fn = os.path.join(ses.path, "channel_map.npy")
        if not os.path.exists(fn):
            raise IOError(fn + " is missing")
        cm = np.load(fn)[:,0]
    
        dfr= Dat_file_reader(ses.dat_file_names,ses.n_channels)
        %time dfr.read_data_to_ram(read_size_gb=4)
        
        for n in tqdm(ses.cg.neuron_list):
            n.set_spike_waveform(session=ses,dat_file_reader=dfr)
            n.spike_waveform.mean_waveform(block_size=60, channels=cm, n_spikes=n_spikes)
            n.spike_waveform.largest_amplitude_waveform()
        
        # get and save the largest waveform of each neuron
        waves= [np.expand_dims(n.spike_waveform.largest_wf,axis=0) for n in ses.cg.neuron_list]
        waves= np.concatenate(waves,axis=0)
        dfr.release_ram()
        np.save(wfn, waves)
        
        # get and save the channel with the largest waveform amplitude
        chan= [n.spike_waveform.max_amplitude_channel for n in ses.cg.neuron_list]
        chan = np.array(chan)
        np.save(clafn,chan)
        
        return waves

#ses= sSesList[0]
#wff= get_mean_waveform(ses, 2000,overwrite=False)
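
The function above follows a "load if cached, otherwise compute and save" pattern. A stripped-down sketch of that pattern is shown below; `load_or_compute` and `fake_compute` are hypothetical names used only for this illustration, and the file is written to a temporary directory.

```python
import os
import tempfile
import numpy as np

def load_or_compute(file_name, compute, overwrite=False):
    """Return the cached array if it exists, otherwise compute and save it."""
    if os.path.exists(file_name) and not overwrite:
        return np.load(file_name)
    result = compute()
    np.save(file_name, result)
    return result

calls = []  # records how many times the expensive step actually ran

def fake_compute():
    # Stand-in for the slow step that reads the .dat files.
    calls.append(1)
    return np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    fn = os.path.join(tmp_dir, "demo.mean_waveform_2000.npy")
    first = load_or_compute(fn, fake_compute)                   # computes and saves
    second = load_or_compute(fn, fake_compute)                  # reads the file
    third = load_or_compute(fn, fake_compute, overwrite=True)   # recomputes

print(len(calls))
```

Because `n_spikes` is part of the file name in `get_mean_waveform`, results obtained with different numbers of spikes are cached in separate files.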

In [4]:
[ (i,ses.name) for i,ses in enumerate(sSesList)]

[(0, 'mn5824-20112020-0107'),
 (1, 'mn5824-22112020-0107'),
 (2, 'mn5824-24112020-0107'),
 (3, 'mn5824-02122020-0106'),
 (4, 'mn711-28012021-0106'),
 (5, 'mn711-30012021-0106'),
 (6, 'mn711-31012021-0107'),
 (7, 'mn711-01022021-0107'),
 (8, 'mn711-02022021-0108'),
 (9, 'mn711-03022021-0107'),
 (10, 'mn711-04022021-0107'),
 (11, 'mn2739-11022021-0107'),
 (12, 'mn2739-15022021-0105'),
 (13, 'mn2739-16022021-0106'),
 (14, 'mn2739-17022021-0106'),
 (15, 'mn2739-21022021-0106'),
 (16, 'mn3246-09042021-0106'),
 (17, 'mn3246-10042021-0106'),
 (18, 'mn3246-12042021-0106'),
 (19, 'mn3246-14042021-0106'),
 (20, 'mn1173-02052021-0107'),
 (21, 'mn1173-06052021-0107'),
 (22, 'mn1173-08052021-0107'),
 (23, 'mn1173-09052021-0108'),
 (24, 'mn1173-11052021-0108'),
 (25, 'TYY9524-16082021-0106'),
 (26, 'TYY9524-18082021-0106'),
 (27, 'mn5618-07072021-0107'),
 (28, 'mn5618-12072021-0110'),
 (29, 'TYY5622-07092021-0106'),
 (30, 'TYY5622-17092021-0106'),
 (31, 'TYY5622-19092021-0106'),
 (32, 'TYY5622-20092

In [5]:
ses= sSesList[38]
wff= get_mean_waveform(ses, 2000,overwrite=False)

mn9686-01112021-0106


Get a list of sessions for which the data is stored locally.

In [6]:
sSesListLocal = get_local_sessions(sSesList)
print("{} local sessions".format(len(sSesListLocal)))

37 local sessions


In [7]:
wff = [get_mean_waveform(ses=ses, n_spikes=2000) for ses in sSesListLocal]

mn5824-20112020-0107
mn5824-22112020-0107
mn5824-24112020-0107
mn5824-02122020-0106
mn711-28012021-0106
mn711-30012021-0106
mn711-31012021-0107
mn711-01022021-0107
mn711-02022021-0108
mn711-03022021-0107
mn711-04022021-0107
mn2739-11022021-0107
mn2739-15022021-0105
mn2739-16022021-0106
mn2739-17022021-0106
mn2739-21022021-0106
mn3246-09042021-0106
mn3246-10042021-0106
mn3246-12042021-0106
mn3246-14042021-0106
mn1173-02052021-0107
mn1173-06052021-0107
mn1173-08052021-0107
mn1173-09052021-0108
mn1173-11052021-0108
mn5618-07072021-0107
mn5618-12072021-0110
TYY5622-07092021-0106
TYY5622-17092021-0106
TYY5622-19092021-0106
TYY5622-20092021-0106
mn9686-20102021-0106
mn9686-26102021-0106
mn9686-27102021-0106
mn9686-28102021-0107
mn9686-29102021-0106
mn9686-01112021-0106


Check that the channel with the largest amplitude was saved for every session.

In [8]:
[os.path.exists(f"{ses.fileBase}.channel_largest_waveform_amplitude_2000.npy") for ses in sSesList]

[True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 True]
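
A long list of booleans is hard to scan. As an alternative, the sketch below returns only the names of the sessions whose file is missing. `sessions_missing_file` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the `name` and `fileBase` attributes of a session.

```python
import os
import tempfile
from types import SimpleNamespace
import numpy as np

def sessions_missing_file(sessions, suffix):
    """Return the names of the sessions for which fileBase.suffix does not exist."""
    return [ses.name for ses in sessions
            if not os.path.exists(f"{ses.fileBase}.{suffix}")]

suffix = "channel_largest_waveform_amplitude_2000.npy"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Two fake sessions; only the first one gets its file written.
    sessions = [SimpleNamespace(name="sesA", fileBase=os.path.join(tmp_dir, "sesA")),
                SimpleNamespace(name="sesB", fileBase=os.path.join(tmp_dir, "sesB"))]
    np.save(f"{sessions[0].fileBase}.{suffix}", np.array([3, 7]))
    missing = sessions_missing_file(sessions, suffix)

print(missing)
```

An empty list then means that every session has its file.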

## Get the id for each neuron

In [9]:
def df_Session(ses):
    ''' Return a DataFrame with the id, subject and session of each neuron'''
    cellID= [ses.name + "_"+ n.name for n in ses.cg.neuron_list]
    df= pd.DataFrame({"id": cellID, 
                     "subject": ses.subject,
                     "session": ses.name })
    return df
df= [df_Session(ses) for ses in sSesList]
ci= pd.concat(df,axis=0)
fn= myProject.dataPath+"/results/ci.csv"
ci.to_csv(fn,index=False)

## Get all the data together

Once the waveforms have been calculated and stored in one file per session, we can load them all and save them together.

In [10]:
wff = [get_mean_waveform(ses, 2000,overwrite=False) for ses in sSesList]
wff = np.concatenate(wff)

mn5824-20112020-0107
mn5824-22112020-0107
mn5824-24112020-0107
mn5824-02122020-0106
mn711-28012021-0106
mn711-30012021-0106
mn711-31012021-0107
mn711-01022021-0107
mn711-02022021-0108
mn711-03022021-0107
mn711-04022021-0107
mn2739-11022021-0107
mn2739-15022021-0105
mn2739-16022021-0106
mn2739-17022021-0106
mn2739-21022021-0106
mn3246-09042021-0106
mn3246-10042021-0106
mn3246-12042021-0106
mn3246-14042021-0106
mn1173-02052021-0107
mn1173-06052021-0107
mn1173-08052021-0107
mn1173-09052021-0108
mn1173-11052021-0108
TYY9524-16082021-0106
TYY9524-18082021-0106
mn5618-07072021-0107
mn5618-12072021-0110
TYY5622-07092021-0106
TYY5622-17092021-0106
TYY5622-19092021-0106
TYY5622-20092021-0106
mn9686-20102021-0106
mn9686-26102021-0106
mn9686-27102021-0106
mn9686-28102021-0107
mn9686-29102021-0106
mn9686-01112021-0106
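
Because the neuron ids and the waveforms are saved in separate files, row `i` of one must describe the same neuron as row `i` of the other. The sketch below illustrates the concatenation and a simple alignment check on made-up data; the session names, array shapes and the `check_alignment` helper are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical per-session results: one row per neuron, 60 samples per waveform.
per_session_waveforms = {
    "sesA": np.zeros((3, 60)),
    "sesB": np.ones((5, 60)),
}

# Stack the sessions along the neuron axis.
all_waveforms = np.concatenate(list(per_session_waveforms.values()), axis=0)

# Build the id table by iterating over the sessions in the same order.
ids = pd.concat(
    [pd.DataFrame({"id": [f"{name}_{i}" for i in range(w.shape[0])],
                   "session": name})
     for name, w in per_session_waveforms.items()],
    axis=0, ignore_index=True)

def check_alignment(id_table, waveforms):
    """Raise if the id table and the waveform array have different row counts."""
    if len(id_table) != waveforms.shape[0]:
        raise ValueError(f"{len(id_table)} ids but {waveforms.shape[0]} waveforms")

check_alignment(ids, all_waveforms)
print(all_waveforms.shape, len(ids))
```

Both collections must be built from the same session list, in the same order, for this correspondence to hold.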


## Normalise to baseline


In [11]:
wff= wff-np.expand_dims(wff[:,0:10].mean(axis=1), axis=1)
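
The line above subtracts from each waveform the mean of its first 10 samples, so that every waveform starts near zero. An equivalent formulation with `keepdims=True` is sketched below on a toy array; `subtract_baseline` and `n_baseline` are illustrative names, not part of the project code.

```python
import numpy as np

def subtract_baseline(waveforms, n_baseline=10):
    """Subtract from each row the mean of its first n_baseline samples."""
    # keepdims=True keeps the shape (n_neurons, 1) so that broadcasting works.
    baseline = waveforms[:, :n_baseline].mean(axis=1, keepdims=True)
    return waveforms - baseline

# Toy example: two waveforms with different offsets and a single deflection.
toy = np.array([[1.0, 1.0, 1.0, 1.0, -4.0, 1.0],
                [2.0, 2.0, 2.0, 2.0,  7.0, 2.0]])
normalised = subtract_baseline(toy, n_baseline=4)
print(normalised)
```

After this step, waveforms recorded with different DC offsets can be compared directly.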

## Save

In [12]:
fn=myProject.dataPath+"/results/mean_waveforms.npy"
np.save(fn,wff)