This notebook runs the full analysis for a single patient, in a form that can easily be looped over multiple patients. To do so, we use the functions that are written out more explicitly in the step-by-step notebooks. 

**This is the one you should copy and edit for your own actual analyses**

In [1]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2

In [2]:
import numpy as np
import mne
from glob import glob
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import seaborn as sns
from scipy.stats import zscore, linregress
import pandas as pd
from mne.preprocessing.bads import _find_outliers

In [3]:
from LFPAnalysis import lfp_preprocess_utils, sync_utils, analysis_utils

## Load, pre-process and re-reference the neural data

In [4]:
base_dir = '/sc/arion' # this is the root directory for most un-archived data and results  
subj_ids = ['MS007']
elec_dict = {f'{x}': [] for x in subj_ids}
mne_dict = {f'{x}': [] for x in subj_ids}
photodiode_dict = {f'{x}': [] for x in subj_ids}
for subj_id in subj_ids: 
    # Set paths
    load_path = f'{base_dir}/projects/guLab/Salman/EMU/{subj_id}/neural/Day1'
    elec_path = f'{base_dir}/projects/guLab/Salman/EMU/{subj_id}/anat/'
    elec_files = glob(f'{elec_path}/*labels.csv')[0]
    save_path = f'{base_dir}/work/qasims01/MemoryBanditData/EMU/Subjects/{subj_id}'
    
    # Load electrode data (should already be manually localized!)
    elec_data = pd.read_csv(elec_files)

    # Sometimes there are extra columns with no entries: 
    elec_data = elec_data[elec_data.columns.drop(list(elec_data.filter(regex='Unnamed')))]

    # Load neural data
    mne_data = lfp_preprocess_utils.make_mne(load_path=load_path, 
                                             save_path=save_path, 
                                             elec_data=elec_data, 
                                             format='edf')
    
    # Re-reference neural data
    mne_data_reref = lfp_preprocess_utils.ref_mne(mne_data=mne_data, 
                                                  elec_data=elec_data, 
                                                  method='wm', 
                                                  site='MSSM')
    
    # Save this data so that you don't need this step again:
    mne_data_reref.save(f'{save_path}/wm_ref_ieeg.fif', overwrite=True)

    
    # Append to list 
    mne_dict[subj_id].append(mne_data_reref)
    
    photodiode_dict[subj_id].append(mne.io.read_raw_fif(f'{load_path}/photodiode.fif', preload=True))
    
    elec_dict[subj_id].append(elec_data)
    

Extracting EDF parameters from /sc/arion/projects/guLab/Salman/EMU/MS007/neural/Day1/MS007_MemBandit.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 1867007  =      0.000 ...  1823.249 secs...
Could not find a match for rhplt9.
Setting up band-stop filter

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal bandstop filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower transition bandwidth: 0.50 Hz
- Upper transition bandwidth: 0.50 Hz
- Filter length: 6759 samples (6.601 sec)

Writing /sc/arion/projects/guLab/Salman/EMU/MS007/neural/Day1/photodiode.fif
Closing /sc/arion/projects/guLab/Salman/EMU/MS007/neural/Day1/photodiode.fif
[done]


  mne_data.save(f'{load_path}/photodiode.fif', picks='dc1', overwrite=True)


Could not find a match for rhplt9.
sEEG channel type selected for re-referencing
Creating RawArray with float64 data, n_channels=114, n_times=1867008
    Range : 0 ... 1867007 =      0.000 ...  1823.249 secs
Ready.
Added the following bipolar channels:
lacas1-lmolf1, lacas10-lacas9, lacas12-lacas9, lacas2-lmolf1, lacas3-lmolf1, lacas4-lacas8, lacas5-lacas8, lacas6-lacas8, lacas7-lacas8, laglt1-lhplt5, laglt10-laglt6, laglt2-lhplt5, laglt3-lhplt5, laglt7-lhplt6, laglt8-laglt6, laglt9-laglt6, laimm1-laglt5, laimm13-laimm12, laimm2-laimm6, laimm3-lmolf6, laimm4-laimm8, laimm5-laimm6, laimm7-laimm6, lcmfo1-lcmfo4, lcmfo12-lcmfo10, lcmfo13-lcmfo10, lcmfo2-lcmfo4, lcmfo3-lcmfo4, lcmfo7-lcmfo6, lcmfo8-lcmfo6, lhplt1-laglt5, lhplt10-lhplt8, lhplt2-laglt5, lhplt3-laglt4, lhplt4-lhplt6, lhplt9-lhplt8, lmcms1-lmcms5, lmcms2-lmcms5, lmcms3-lmcms5, lmcms4-lmcms5, lmcms9-lmcms8, lmolf2-lmolf6, lmolf3-lmolf6, lmolf4-laimm6, lmolf5-laimm6, lmolf8-laimm6, lmtpt1-lhplt5, lmtpt2-lhplt5, lmtpt3-lhplt5, lm

  photodiode_dict[subj_id].append(mne.io.read_raw_fif(f'{load_path}/photodiode.fif', preload=True))
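
The electrode-table cleanup in the loading cell can be tried in isolation. The following is a minimal sketch on a toy table; the column names `label` and `x` are made up for illustration, and the real labels file may look different.

```python
import pandas as pd

# Toy electrode table; 'label' and 'x' are hypothetical column names
elec_demo = pd.DataFrame({
    'label': ['lacas1', 'lacas2'],
    'x': [1.0, 2.0],
    'Unnamed: 3': [None, None],
    'Unnamed: 4': [None, None],
})

# Keep every column whose name does not contain 'Unnamed'
cleaned = elec_demo.loc[:, ~elec_demo.columns.str.contains('Unnamed')]
print(list(cleaned.columns))
```

The boolean column mask is an alternative to the `filter`/`drop` combination used above and should select the same columns.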


## Extract behavioral information

Here, you should load your own functions for processing the behavioral data. The functions below are the ones relevant to my task, included for demonstration purposes. 


In [6]:
# Utility functions for image memorability ratings. 
import pandas as pd 
import numpy as np 
import os 
from scipy.stats import norm, zscore, linregress

# Note: Much of the following is ported from: https://github.com/cvzoya/memorability-distinctiveness

def dprime(pHit, pFA, PresentT, AbsentT, criteria=False):
    """
    Compute the signal detection theory sensitivity index d'.

    Note: from: http://nikos-konstantinou.blogspot.com/2010/02/dprime-function-in-matlab.html
    
    Parameters
    ----------
    pHit : float
        The proportion of "Hits": P(Yes|Signal)
    pFA : float
        The proportion of "False Alarms": P(Yes|Noise)
    PresentT : int
        The number of Signal Present Trials e.g. length(find(signal==1))
    AbsentT : int
        The number of Signal Absent Trials e.g. length(find(signal==0))
    criteria : bool
        If True, also return beta and C (default: False)
        
    Returns
    -------
    dPrime: float
        signal detection theory sensitivity measure 
    
    beta: float
        likelihood-ratio measure of response bias (only returned if criteria=True)
        
    C: float
        criterion location (only returned if criteria=True)
        
    """

    if pHit == 1: 
        # if 100% Hits
        pHit = 1 - (1/(2*PresentT))
    
    if pFA == 0: 
        # if 0% FA 
        pFA = 1/(2*AbsentT)
        
    # Convert to Z-scores
    
    zHit = norm.ppf(pHit) 
    zFA = norm.ppf(pFA) 
    
    # calculate d-prime 
    
    dPrime = zHit - zFA 
    
    if criteria:
        beta = np.exp((zFA**2 - zHit**2)/2)
        C = -0.5 * (zHit + zFA)    
        return dPrime, beta, C
    else:
        return dPrime

def compute_memorability_scores(hits, false_alarms, misses, correct_rejections):
    """
    Parameters
    ----------
    hits : array-like
        Number of hits for each stimulus
    false_alarms : array-like
        Number of false alarms for each stimulus
    misses : array-like
        Number of misses for each stimulus
    correct_rejections : array-like
        Number of correct rejections for each stimulus
        
    Returns
    -------
    memory_ratings : pandas DataFrame 
        DataFrame with one row per stimulus and the columns HR (hit rate), FAR (false alarm rate), ACC (accuracy) and DPRIME (d-prime)
    """

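    # Convert to float arrays so that the arithmetic below is element-wise even for plain lists
    hits, false_alarms, misses, correct_rejections = (
        np.asarray(x, dtype=float) for x in (hits, false_alarms, misses, correct_rejections))

    len_args = [len(hits), len(false_alarms), len(misses), len(correct_rejections)]
    if not all(len_args[0] == _arg for _arg in len_args[1:]):
        raise ValueError("All parameters must be the same length.")
    
    memory_ratings = pd.DataFrame(columns = ['HR', 'FAR', 'ACC', 'DPRIME'])

    nstimuli = len(hits) 

    hm = hits+misses
    fc = false_alarms+correct_rejections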

    hrs = hits/hm
    fars = false_alarms/fc
    accs = (hits+correct_rejections)/(hm+fc)

    dp = []
    for i in range(nstimuli):
        dp.append(dprime(hrs[i], fars[i], hm[i], fc[i]))

    memory_ratings['HR'] = hrs
    memory_ratings['FAR'] = fars
    memory_ratings['ACC'] = accs
    memory_ratings['DPRIME'] = dp
    
    return memory_ratings
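
As a quick check of the d-prime arithmetic, here is a minimal, self-contained sketch that mirrors the core of `dprime` above. The name `dprime_sketch` and the trial counts are illustrative only. With a hit rate of 0.8 and a false-alarm rate of 0.2, d' is about 1.68; with perfect performance on 20 signal and 20 noise trials, the 1/(2N) correction gives about 3.92 instead of infinity.

```python
from scipy.stats import norm

def dprime_sketch(p_hit, p_fa, n_signal, n_noise):
    """d' = z(hit rate) - z(false-alarm rate), with a 1/(2N) correction at the extremes."""
    # Same correction as dprime above: only p_hit == 1 and p_fa == 0 are adjusted
    if p_hit == 1:
        p_hit = 1 - 1 / (2 * n_signal)
    if p_fa == 0:
        p_fa = 1 / (2 * n_noise)
    # norm.ppf is the inverse of the standard normal CDF
    return norm.ppf(p_hit) - norm.ppf(p_fa)

d_typical = dprime_sketch(0.8, 0.2, 20, 20)
d_ceiling = dprime_sketch(1.0, 0.0, 20, 20)
print(round(d_typical, 2), round(d_ceiling, 2))
```

Note that, like `dprime`, this sketch does not adjust a hit rate of 0 or a false-alarm rate of 1; those cases have to be handled by the caller.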

To analyze the neural data with respect to the behavioral data, we need to synchronize the two recordings using the photodiode signal (or possibly TTLs, eventually). 
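
To illustrate how sync onsets can be found in a photodiode trace, here is a minimal sketch on synthetic data: smooth, z-score, then take the samples where the signal crosses zero upward. The function name `detect_rising_edges` is hypothetical, and `np.convolve` is only a stand-in for `sync_utils.moving_average`, whose implementation is not shown in this notebook.

```python
import numpy as np
from scipy.stats import zscore

def detect_rising_edges(signal, sfreq, n=11):
    """Return onset times (s) where the smoothed, z-scored signal crosses zero upward."""
    # Boxcar smoothing; a stand-in for sync_utils.moving_average (assumption)
    smoothed = np.convolve(signal, np.ones(n) / n, mode='same')
    z = zscore(smoothed)
    # A rising edge is a sample at or below zero followed by a sample above zero
    edge_ix = np.where((z[:-1] <= 0) & (z[1:] > 0))[0]
    return edge_ix / sfreq

# Synthetic photodiode trace: three 200 ms pulses sampled at 1000 Hz
sfreq = 1000
trace = np.zeros(5000)
for start in (1000, 2500, 4000):
    trace[start:start + 200] = 1.0

onsets = detect_rising_edges(trace, sfreq)
print(onsets)
```

Because of the smoothing, the detected onsets fall a few samples before the true pulse starts. That constant shift is small relative to the pulse spacing, but it is worth keeping in mind when choosing the window length `n`.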


In [13]:
slopes = {f'{x}': [] for x in subj_ids}
offsets = {f'{x}': [] for x in subj_ids}

bandit_evs = {f'{x}': [] for x in subj_ids}
memory_evs = {f'{x}': [] for x in subj_ids}

for subj_id in subj_ids:
    # Set paths
    behav_path = f'{base_dir}/projects/guLab/Salman/EMU/{subj_id}/behav/Day1'

    # Find the timestamps of the ONSET of each sync signal in the photodiode 
    # (the moving average smooths the trace so that deflections are easier to detect)
    sig = np.squeeze(sync_utils.moving_average(photodiode_dict[subj_id][0]._data, n=11))
    timestamp = np.squeeze(np.arange(len(sig))/mne_dict[subj_id][0].info['sfreq'])
    # normalize
    sig =  zscore(sig)
    # rising edge of trigger: samples where the z-scored signal crosses from <=0 to >0
    trig_ix = np.where((sig[:-1]<=0)*(sig[1:]>0))[0]
    neural_ts = timestamp[trig_ix]
    print(f'There are {len(neural_ts)} neural syncs detected')
    
    # Get the .log file and/or .csv file, depending on how your task logs the behavioral data. Eventually this should be fairly standardized across tasks.

    log_path = glob(f'{behav_path}/*.log')[0]
    csv_path = glob(f'{behav_path}/*MB_MEM*.csv')[0]
    
    # Now get the relevant timestamps from the behavioral logfiles. This will differ depending on the task.

    MB1_ts = {'trial_start': [], 
    'deck_start': [], 
    'feedback_start': [],
    'ITI_start': [],
    'ITI_stop': []}

    MEM2_ts = {'trial_start': [], 
    'face_start': [], 
    'slider_start': [],
    'slider_stop': [],
    'ITI_start': [],
    'ITI_stop': []}

    beh_ts = []

    MB1_FLAG = True 
    MEM2_FLAG = False 

    with open(log_path, 'r') as fobj:
        for ix, line in enumerate(fobj.readlines()):
            line = line.replace('\r', '')
            tokens = line[:-1].split('\t')

            if tokens[1] == 'EXP ':
                # Determine which task we are looking at 
                if tokens[2][0:3] == 'MB1':
                    MB1_FLAG = True
                    MEM2_FLAG = False 
                elif tokens[2][0:3] == 'MEM':
                    MEM2_FLAG = True
                    MB1_FLAG = False

                # Grab photodiode timestamp
                if tokens[2][0:4] =='sync':
                    if 'autoDraw = True' in tokens[2]:
                        beh_ts.append(float(tokens[0]))

                # Get MB1 deck 
                if 'MB1_left_draw' in tokens[2]:
                    if 'autoDraw = True' in tokens[2]:
                        MB1_ts['deck_start'].append(float(tokens[0]))

                # Get MB1 feedback
                if 'MB1_face' in tokens[2]:
                    if 'autoDraw = True' in tokens[2]:
                        MB1_ts['feedback_start'].append(float(tokens[0]))

                # Get MB1 ITI cross 
                if 'MB1_ITI_cross' in tokens[2]:
                    if 'autoDraw = True' in tokens[2]:
                        MB1_ts['ITI_start'].append(float(tokens[0]))
                    elif 'autoDraw = False' in tokens[2]:
                        MB1_ts['ITI_stop'].append(float(tokens[0]))

                if 'New trial (rep=0' in tokens[2]:
                    if MB1_FLAG: 
                        # remember to discard the first one later - it's pre-session 
                        MB1_ts['trial_start'].append(float(tokens[0]))
                    elif MEM2_FLAG:
                        MEM2_ts['trial_start'].append(float(tokens[0]))

                # Get MEM2 ITI
                if 'MEM2_jitter' in tokens[2]:
                    if 'autoDraw = True' in tokens[2]:
                        MEM2_ts['ITI_start'].append(float(tokens[0]))          
                    elif 'autoDraw = False' in tokens[2]:
                        MEM2_ts['ITI_stop'].append(float(tokens[0]))  

                # Get MEM2 Face
                if 'MEM2_images' in tokens[2]:
                    if 'autoDraw = True' in tokens[2]:
                        MEM2_ts['face_start'].append(float(tokens[0]))     

                # Get MEM2 slider start
                if tokens[2][:16] == 'MEM2_conf_slider':
                    if 'autoDraw = True' in tokens[2]:
                        MEM2_ts['slider_start'].append(float(tokens[0]))    

                # Get MEM2 slider stop
                if tokens[2][:16] == 'MEM2_conf_slider':
                    if 'autoDraw = False' in tokens[2]:
                        MEM2_ts['slider_stop'].append(float(tokens[0]))                                               

    beh_ts = np.array(beh_ts)
    print(f'There are {len(beh_ts)} behav syncs detected')

    # Note: fixation crosses need fixing on stop time duplicates
    MB1_ts['ITI_stop'] = np.unique(MB1_ts['ITI_stop']).tolist()
    MEM2_ts['ITI_stop'] = np.unique(MEM2_ts['ITI_stop']).tolist()

    # Get the choice times: 
    csv_data = pd.read_csv(csv_path)
    MB1_ts['choice'] = (csv_data['MB1_draw_key.started'].dropna() + csv_data['MB1_draw_key.rt'].dropna()).tolist()
    MEM2_ts['choice'] = (csv_data['MEM2_recall_key.started'].dropna() + csv_data['MEM2_recall_key.rt'].dropna()).tolist()

    # Do some corrections: 
    # Get rid of first trial start (pre-session)
    MB1_ts['trial_start'].pop(0) 
    
    subj_count = 0

    # Load the database with image DPRIME information
    database_file = f'{behav_path}/all_mem_data.xlsx' 
    all_mem_df = pd.read_excel(database_file, engine='openpyxl')
    all_mem_df = all_mem_df[['img_path', 'DPRIME']]

    # Turn into right format for modeling: 
    MB1_n = 60
    MEM2_n = 120
    li_mb1 = []
    li_mem2 = [] 

    act_rew_rate = {}
    act_rew_rate['pids'] = []

    r1_chance=30

    for elem in ['actions', 'rewards']:
        act_rew_rate[elem] = np.zeros(MB1_n).astype(int) # len(task_files), 

    # Load the merged task data 
    csv_data['trials_2.thisN'] = csv_data['trials_2.thisRepN'].shift(-1)

    ##### First, process the Bandit task: 
    # .copy() so that the assignments below modify mb_df itself rather than a view of csv_data
    mb_df = csv_data.dropna(subset=['trials_2.thisN']).copy()
    act_rew_rate['pids'].append(mb_df.participant.iloc[0])

    # Change Gender so that female = 2
    mb_df.loc[mb_df.Gender==0, 'Gender'] = 2

    # add the choice made on each trial
    mb_df['choice'] = mb_df['MB1_draw_key.keys']
    # Make the trials 1-60 
    mb_df['trials_dm'] = mb_df['trials_2.thisN'].shift(+1)
    # mb_df.trials_dm.fillna(60, inplace=True)
    # # get rid of the extra rows in the .csv that populate between trials 
    mb_df = mb_df.drop_duplicates(subset='trials_dm', keep='first')
    mb_df['reward'] = mb_df['reward']/100
    mb_df.loc[mb_df.reward==100, 'reward'] = 0
    # 0 is male, 1 is female 
    mb_df['choice'] = mb_df['choice']-1
    mb_df.rename(columns={'MB1_draw_key.rt':'draw_rt'}, inplace=True)
    mb_df.dropna(subset=['img_path'], inplace=True)

    ##### Fit RW model to DM data
    mb_df['bic'] = np.nan
    mb_df['alpha']  = np.nan
    mb_df['beta']  = np.nan
    mb_df['RPE']  = np.nan

    # RW_model = RW() 
    sub_act_rew_rate = {}
    sub_act_rew_rate['pids'] = np.array([subj_count])  

    for elem in ['actions', 'rewards']:
        sub_act_rew_rate[elem] = np.zeros([1, MB1_n]).astype(int)
    c = mb_df.choice.dropna().values.astype(int)
    r = mb_df.reward.dropna().values
    sub_act_rew_rate['actions'][0, :] = c
    sub_act_rew_rate['rewards'][0, :] = r

    # Save dict for modeling decision-making performance: 
    c = mb_df.choice.dropna().values.astype(int)
    r = mb_df.reward.dropna().values
    act_rew_rate['actions'] = c # [subj_count, :]
    act_rew_rate['rewards']= r # [subj_count, :]

    ##### Second, process the MEM2 data: 
    # .copy() so that the assignments below modify rm_df itself rather than a view of csv_data
    rm_df = csv_data.dropna(subset=['MEM2_trials.thisN']).copy()

    # add coding of memory choice: 
    rm_df['hit'] = 0
    rm_df['miss'] = 0
    rm_df['corr_reject'] = 0
    rm_df['false_alarm'] = 0

    # rename the response and confidence columns
    rm_df.rename(columns={'MEM2_recall_key.keys': 'response',
    'MEM2_conf_slider.response': 'confidence'
     }, inplace=True) 

    rm_df['trials_mem'] = rm_df['MEM2_trials.thisN'].shift(-1)
    rm_df['trials_mem'] = rm_df['trials_mem'].fillna(120)

    # Change Gender so that female = 2
    rm_df.loc[rm_df.Gender==0, 'Gender'] = 2

    rm_df = rm_df.merge(mb_df, on='img_path', how='left', indicator=True)
    # Clean up the merge
    rm_df.drop(columns=['participant_y'], inplace=True)
    rm_df.rename(columns={'participant_x': 'participant'}, inplace=True)

    hit_bool = (rm_df._merge=='both') & (rm_df.response==2)
    hits = hit_bool.sum()
    # NEW = 1, which is false in this case 
    miss_bool = (rm_df._merge=='both') & (rm_df.response==1)
    misses = miss_bool.sum()

    # or just the "left" df ('new')
    false_alarm_bool = (rm_df._merge=='left_only') & (rm_df.response==2)
    false_alarms = false_alarm_bool.sum()
    corr_reject_bool = (rm_df._merge=='left_only') & (rm_df.response==1)
    correct_rejections = corr_reject_bool.sum()

    # categorize image by memory choice
    rm_df.loc[hit_bool, 'hit'] = 1
    rm_df.loc[miss_bool, 'miss'] = 1
    rm_df.loc[false_alarm_bool, 'false_alarm'] = 1
    rm_df.loc[corr_reject_bool, 'corr_reject'] = 1

    # compute dprime for the subject
    hm = hits+misses
    fc = false_alarms+correct_rejections

    hrs = hits/hm
    fars = false_alarms/fc

    # Adjust extreme hit-rates or false-alarms
    if hrs == 0: 
        hrs = 0.5/hm
    elif hrs ==1: 
        hrs = (hm-0.5)/hm
    if fars == 0: 
        fars = 0.5/fc
    elif fars ==1: 
        fars = (fc-0.5)/fc

    dp = dprime(hrs, fars, hm, fc)

    # Add in subject-level memory characteristics ("rates")
    rm_df['hit_rate'] = hrs
    rm_df['false_alarm_rate'] = fars
    rm_df['subj_dprime'] = np.nan
    if dp != float("-inf"):
        rm_df['subj_dprime'] = dp    

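    # Look up the image DPRIME by img_path 
    # (assigning the output of a merge would align on the row index, not on the image)
    image_dprime = all_mem_df.drop_duplicates(subset='img_path').set_index('img_path')['DPRIME']
    rm_df['DPRIME'] = rm_df['img_path'].map(image_dprime)
    mb_df['DPRIME'] = mb_df['img_path'].map(image_dprime)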
    mb_df.rename(columns={'DPRIME': 'image_dprime'}, inplace=True) 

    rm_df.rename(columns={'DPRIME': 'image_dprime',
    'Gender_x': 'image_gender',
    'MEM2_recall_key.rt_x': 'recall_rt',
    'MEM2_conf_slider.rt_x': 'slider_rt'
    }, inplace=True) 


    # dm_df = pd.concat(li_mb1, axis=0, ignore_index=True)
    # one-hot code the choice (0 is male, 1 is female)
    mb_df['male'] = (mb_df.choice==0).astype(int)
    mb_df['female'] = (mb_df.choice==1).astype(int)

    rm_df['phit'] = rm_df.response - 1

    col_mask = ((rm_df.columns.str.startswith('MEM')) | (rm_df.columns.str.startswith('MB')) | (rm_df.columns.str.startswith('trials.')) | (rm_df.columns.str.startswith('trials_2')) | (rm_df.columns.str.endswith('_y')))
    rm_df = rm_df.loc[:,~col_mask]

    # Do regression to find neural timestamps for each event type
    if len(beh_ts)!=len(neural_ts):
        good_beh_ms, neural_offset = sync_utils.pulsealign(beh_ts, neural_ts, window=50, thresh=0.95)
        slope, offset, rval = sync_utils.sync_matched_pulses(good_beh_ms, neural_offset)
    else:
        slope, offset, rval = sync_utils.sync_matched_pulses(beh_ts, neural_ts)

    if rval < 0.99:
        print('sync failed')
    else: 
        print('sync succeeded')
        
    slopes[subj_id].append(slope)
    offsets[subj_id].append(offset)
    bandit_evs[subj_id].append(MB1_ts)
    memory_evs[subj_id].append(MEM2_ts)
    

There are 485 neural syncs detected
There are 486 behav syncs detected


50 blocks
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . found matches for 36 of 50 blocks
sync succeeded
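
The regression step can be pictured with a small synthetic example: given matched sync pulses on the behavioral and neural clocks, a linear fit gives the slope and offset that map any behavioral timestamp onto the neural clock. This is only a sketch using `scipy.stats.linregress`; the internals of `sync_utils.sync_matched_pulses` are not shown here and may differ, and all numbers below are invented.

```python
import numpy as np
from scipy.stats import linregress

# Hypothetical matched sync pulses on the behavioral clock (s)
beh_pulses = np.array([5.0, 15.0, 25.0, 35.0, 45.0])

# Simulate a neural clock with a small drift (slope) and a start delay (offset)
true_slope, true_offset = 1.0001, 12.5
neural_pulses = true_slope * beh_pulses + true_offset

# Fit neural time as a linear function of behavioral time
fit = linregress(beh_pulses, neural_pulses)
sync_ok = fit.rvalue >= 0.99  # same acceptance threshold as the cell above

# Map hypothetical behavioral event times onto the neural clock
feedback_beh = np.array([10.0, 20.0])
feedback_neural = fit.slope * feedback_beh + fit.intercept
print(fit.slope, fit.intercept, sync_ok, feedback_neural)
```

When the two pulse counts disagree, as with the 485 versus 486 pulses above, the pulses first have to be matched up before a fit like this is meaningful, which is the role `sync_utils.pulsealign` plays in the cell.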


## Make epochs

In [15]:
# set some windows of interest 

buf = 1.0 # this is the buffer before and after that we use to limit edge effects for TFRs

feedback_pre = 1.0 # this is the time before the feedback appears 
feedback_post = 1.5 # this is the time the feedback is present 

IED_args = {'peak_thresh':4,
           'closeness_thresh':0.25, 
           'width_thresh':0.2}

# add behavioral times of interest 
for subj_id in subj_ids:
    # Set paths
    load_path = f'{base_dir}/projects/guLab/Salman/EMU/{subj_id}/neural/Day1'
    save_path = f'{base_dir}/work/qasims01/MemoryBanditData/EMU/Subjects/{subj_id}'

    epochs = lfp_preprocess_utils.make_epochs(load_path=load_path, save_path=save_path, elec_data=elec_dict[subj_id][0], 
                                              slope=slopes[subj_id][0], offset=offsets[subj_id][0], 
                                              behav_times=bandit_evs[subj_id][0]['feedback_start'], 
                                              baseline_times=None, baseline_dur=None, fixed_baseline=[-1.0, 0],
                                              buf_s=buf, pre_s=-feedback_pre, post_s=feedback_post, downsamp_factor=2, IED_args=IED_args)
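
To make the window arithmetic concrete, here is a rough sketch of how the epoch boundaries could be derived from the settings above. It assumes that behavioral times are mapped to the neural clock as `slope * t + offset` and that the buffer is added on both sides of the pre/post window; `epoch_windows` is a hypothetical helper, and the actual `lfp_preprocess_utils.make_epochs` may do this differently.

```python
import numpy as np

def epoch_windows(behav_times, slope, offset, pre_s, post_s, buf_s):
    """Return an (n_events, 2) array of (start, stop) times on the neural clock."""
    # Assumed mapping from behavioral to neural time
    event_times = slope * np.asarray(behav_times, dtype=float) + offset
    # Pad the window of interest with the buffer on both sides
    starts = event_times - pre_s - buf_s
    stops = event_times + post_s + buf_s
    return np.column_stack([starts, stops])

# Invented event times and sync parameters, with the window settings used above
windows = epoch_windows([10.0, 20.0], slope=1.0, offset=5.0,
                        pre_s=1.0, post_s=1.5, buf_s=1.0)
print(windows)
```

With these settings each window spans 4.5 s: 1.0 s before feedback, 1.5 s after, and a 1.0 s buffer on either side that can be trimmed after time-frequency decomposition.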




