In [309]:
from scipy.io import loadmat

import mne
import pandas as pd
import numpy as np

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Dropout, Flatten, Conv1D, 
                                     MaxPooling1D, GlobalAveragePooling1D)
from tensorflow.keras import utils

import itertools

import matplotlib.pyplot as plt

from IPython.utils import io

### Common Spatial Patterns (CSP)

Next, we will implement and test CSP against our data to try to improve our predictive ability. CSP is a signal processing technique for two-class classification problems in which multivariate signals (e.g., recordings from an EEG device with 30 electrodes, like the one we are using here) are projected onto spatial components whose variance differs as much as possible between the two classes.

In practice, this collapses each trial from an array of 30 channels x thousands of samples to just a few feature values, one per CSP component. We will perform a grid search to find the ideal number of components for our problem, as well as whether a neural network or Linear Discriminant Analysis (LDA) is the better modeling tool for making class predictions from those CSP features.
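
To make the idea concrete, here is a minimal NumPy-only sketch of two-class CSP. It is not the `mne.decoding.CSP` implementation used later in this notebook; the function names (`fit_csp_filters`, `csp_features`) and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

def fit_csp_filters(X, y, n_components=4):
    """Return CSP spatial filters for two-class data.

    X has shape (n_epochs, n_channels, n_times); y holds labels 0 and 1.
    """
    class_covs = []
    for label in (0, 1):
        # Concatenate this class's epochs along time and estimate covariance
        concatenated = np.concatenate(list(X[y == label]), axis=1)
        class_covs.append(np.cov(concatenated))
    cov_0, cov_1 = class_covs

    # Whiten the summed covariance so that it becomes the identity matrix
    evals, evecs = np.linalg.eigh(cov_0 + cov_1)
    whitener = evecs / np.sqrt(evals)

    # In whitened space, class-0 eigenvalues near 1 or 0 mark directions
    # where one class has much more variance than the other
    class_evals, rotation = np.linalg.eigh(whitener.T @ cov_0 @ whitener)
    filters = (whitener @ rotation).T  # one spatial filter per row

    # Keep the components whose eigenvalues are furthest from 0.5
    order = np.argsort(np.abs(class_evals - 0.5))[::-1]
    return filters[order[:n_components]]

def csp_features(filters, X, log=True):
    """Project epochs onto the filters and summarize each as average power."""
    projected = np.einsum('fc,ect->eft', filters, X)
    power = (projected ** 2).mean(axis=2)
    return np.log(power) if log else power

# Synthetic demo: class 0 is loud on channel 0, class 1 is loud on channel 1
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((40, 5, 200))
y_demo = np.array([0, 1] * 20)
X_demo[y_demo == 0, 0, :] *= 5
X_demo[y_demo == 1, 1, :] *= 5

demo_filters = fit_csp_filters(X_demo, y_demo, n_components=2)
demo_features = csp_features(demo_filters, X_demo)
print(demo_features.shape)  # (40, 2)
```

Each trial ends up as one row of `n_components` numbers, which is the kind of small feature matrix a simple classifier can be fit on.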

### First, let's ingest our data using the data ingester we built
After running, this script will load several dictionaries into memory, as well as other needed objects:
1. raw_dict - containing MNE raw objects with all the data
2. event_dict - which indicates the sample number at which each stimulus was applied
3. y_dict - which has the type of experiment conducted in each trial
4. info - object used to create MNE raw objects, including channel names, types, and sampling frequency
5. events_explained - dictionary which provides the names for each of the five trial types
6. ch_names - list of all channel names

In [2]:
%run data_ingester.py

### Let's find the best parameters to use to create the clearest differentiation using CSP

In this grid search we'll be changing three major kinds of parameters to optimize our CSP settings:

1. A smaller subset of the preprocessing options we tested using a CNN
2. The CSP parameters to use in CSP feature extraction
3. Whether an LDA or simple neural network makes more successful predictions based on CSP features

Later, we will refine and iterate on the models we use to make predictions using this CSP data, but for now we will use a straightforward LDA and shallow neural network to test which sets of parameters lead to the most differentiable CSP features.

Here is a full explanation of the CSP parameters and models we will be testing. For a full explanation of the preprocessing options, see the preprocessing options grid search notebook:

1. Preprocessing options
2. CSP parameters
    - Number of CSP components to create (n_components) and transform data into
    - Whether covariance matrices are estimated on each class's epochs concatenated together or estimated on individual epochs and then averaged (cov_est)
    - Whether a log transform is applied to standardize features
3. Model to run through
    - LDA
    - Shallow neural network
4. Which pairwise combination of trials to compare
    - (1, 5) Word association vs imagining foot movement
    - (1, 4) Word association vs imagining hand movement
    - (2, 4) Mental subtraction vs imagining hand movement
    - (1, 3) Word association vs mental navigation
    
**Also note that these CSP models will be created individually (i.e., looking at the data for only one participant in training and validation)**

The grid search will save down the accuracy of an LDA and NN model on our test data from day 1 for each combination, and then we can select the highest-performing parameters to be used as the basis of the personalized L1 model for each subject.
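
As a quick sanity check on the size of this search, the varying options can be expanded with `itertools.product`. This is only a sketch: the `grid` dictionary is an illustrative regrouping of the options searched in this notebook (the fixed preprocessing settings each contribute a single value, so they are left out).

```python
import itertools

# Options that actually vary in this grid search
grid = {
    'scaler': ['robust', None],
    'n_components': [4, 6, 8],
    'cov_est': ['concat', 'epoch'],
    'log': [True, False],
    'model_type': ['NN', 'LDA'],
    'trial_combo': [(1, 5), (1, 4), (2, 4), (1, 3)],
}

# One dictionary per combination of settings
combos = [dict(zip(grid, values))
          for values in itertools.product(*grid.values())]

print(len(combos))  # 192
print(combos[0])
```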

In [325]:
#Preprocessing parameters to gridsearch
l_freq_filter_options = [None]
h_freq_filter_options = [40]
channels_to_drop_options = [['AFz', 'F7', 'F8']]
baseline_correction_options = [None]
projectors_to_apply_options = [slice(1)] #check that this generated best results
selected_frequency_options = [256]
tmin_options = [1] #Later start to avoid initialization of thought pattern
tmax_options = [4.5]
detrend_options = [None]
reject_options = [{'eeg': 150}] #Customize per subject later
flat_options = [{'eeg': 20}] #Customize per subject later
ica_to_exclude_options = [None] #Incorporate later if helpful in other gridsearch
scaler_options = ['robust', None] #Test no scaler for CSP
#CSP parameters to gridsearch
n_components_options = [4, 6, 8]
cov_est_options = ['concat', 'epoch']
log_options = [True, False]
#Model to run through
model_type_options = ['NN', 'LDA']
#Combinations of trial types to compare
trial_combo_options = [(1, 5), (1, 4), (2, 4), (1, 3)]

In [326]:
#Create column names for test dataframe
columns = ['l_freq_filter',
           'h_freq_filter',
           'channels_to_drop',
           'baseline_correction',
           'projectors_to_apply',
           'selected_frequency',
           'tmin',
           'tmax',
           'detrend',
           'reject',
           'flat',
           'ica_to_exclude',
           'scaler', 
           'n_components',
           'cov_est', 
           'log', 
           'model_type',
           'trial_combo']

In [327]:
#Create dataframe with all combinations of tests as rows
test_df = pd.DataFrame(itertools.product(l_freq_filter_options, 
                                         h_freq_filter_options, 
                                         channels_to_drop_options, 
                                         baseline_correction_options, 
                                         projectors_to_apply_options, 
                                         selected_frequency_options,
                                         tmin_options,
                                         tmax_options, 
                                         detrend_options,
                                         reject_options,
                                         flat_options,
                                         ica_to_exclude_options,
                                         scaler_options, 
                                         n_components_options, 
                                         cov_est_options, 
                                         log_options, 
                                         model_type_options, 
                                         trial_combo_options), 
                      columns=columns)

In [328]:
#Append columns for each subject, where we will record results for each test
subject_columns = ['sub_A', 'sub_C', 'sub_D', 'sub_E', 'sub_F', 'sub_G', 
                   'sub_H', 'sub_J', 'sub_L']

In [329]:
#Add those combos to our test_df to save highest val accuracy achieved
test_df = test_df.reindex(columns=columns + subject_columns)

In [330]:
test_df.shape

(192, 27)

### Let's test these options


In [331]:
for row in range(test_df.shape[0]):
    #skip rows that have already been completed
    if pd.isna(test_df.at[row, 'sub_A']):
    
        #Load each sessions data into an MNE raw object
        raw_dict = {}
        for key, value in data_dict.items():
            raw_dict[key] = mne.io.RawArray(value.T, info, verbose=0)

        #Filter data with bandpass. Note raw.filter applies in place
        for key, value in raw_dict.items():
            value.filter(l_freq=test_df.l_freq_filter[row], 
                         h_freq=test_df.h_freq_filter[row], 
                         method='fir', phase='zero', verbose=0)

        #Create epoch object with our raw objects and events arrays
        channels_to_keep = [ch for ch in ch_names if 
                            ch not in test_df.channels_to_drop[row]]
        epoch_dict = {}
        for key, value in raw_dict.items():
            epoch_dict[key] = mne.Epochs(value, events=event_dict[key], 
                                        event_id=events_explained, 
                                        tmin=-3, tmax=test_df.tmax[row], 
                                        baseline=test_df.baseline_correction[row],
                                        preload=True,
                                        picks=channels_to_keep, verbose=0,
                                        detrend=test_df.detrend[row],
                                        reject=test_df.reject[row],
                                        flat=test_df.flat[row],
                                        reject_tmin=test_df.tmin[row],
                                        reject_tmax=test_df.tmax[row])

        #Skip creating projectors step to save compute time if not being
        #applied in this iteration
        if test_df.projectors_to_apply[row]:
            #Create dictionary of signal space projection vectors for each
            #epoch object (n_eeg=1 requests a single EEG projector)
            proj_dict = {}
            for key, value in epoch_dict.items():
                proj_dict[key] = mne.compute_proj_epochs(value, n_eeg=1, verbose=0)
            #apply projectors
            for key, value in epoch_dict.items():
                value.add_proj(proj_dict[key][test_df.projectors_to_apply[row]], 
                               verbose=0)
                value.apply_proj(verbose=0)

        #Skip creating ICA components step to save compute time if not
        #being applied in this iteration
        if test_df.ica_to_exclude[row]:
            #create and fit ICA object to epochs
            for key, value in epoch_dict.items():
                ica = mne.preprocessing.ICA(n_components=5, method='picard', 
                                            max_iter='auto', verbose=0)
                ica.fit(value, verbose=0)
                #Apply the ICA
                ica.apply(value, exclude=test_df.ica_to_exclude[row],
                         verbose=0)

        #Resample the data at a new frequency
        for key, value in epoch_dict.items():
            value.resample(sfreq=test_df.selected_frequency[row])

        #Extract and standard scale data from all non-dropped epochs
        #Creates intermediate data dictionary
        int_data_dict = {}
        #Use robust sklearn scaler
        if test_df.scaler[row] == 'robust':
            mne_scaler = mne.decoding.Scaler(scalings='median')
            for key, value in epoch_dict.items():
                #with scalings=median implements sklearn robust scaler
                int_data_dict[key] = (mne_scaler.
                                      fit_transform(value.
                                                    get_data(tmin=test_df.tmin[row], 
                                                             tmax=test_df.tmax[row])))
        #No scaling option
        if test_df.scaler[row] is None:
            for key, value in epoch_dict.items():
                int_data_dict[key] = value.get_data(tmin=test_df.tmin[row], 
                                                      tmax=test_df.tmax[row])

        #Create updated dictionary of y values to reflect dropped epochs
        int_y_dict = {}
        for key, value in y_dict.items():
            temp_y_list = []
            for i, epoch in enumerate(epoch_dict[key].drop_log):
        #MNE drop log shows empty parens for epochs that were not dropped - 
        #these are the trials we are keeping in each iteration
                if epoch == ():
                    temp_y_list.append(value[i])
            int_y_dict[key] = temp_y_list

        #Assemble final y dict with only trials in our current combo
        #In each combo, coding 1st trial type to 0, 2nd trial type to 1
        final_y_dict = {}
        for key, value in int_y_dict.items():
            temp_y_list = []
            for y in value:
                if y == test_df.trial_combo[row][0]:
                    temp_y_list.append(0)
                if y == test_df.trial_combo[row][1]:
                    temp_y_list.append(1)
            final_y_dict[key] = np.array(temp_y_list)

        #Assemble data dict with only trials in our current combo
        final_data_dict = {}
        for key, value in int_data_dict.items():
            index_list = []
            for i, y in enumerate(int_y_dict[key]):
                if (y == test_df.trial_combo[row][0] or 
                    y == test_df.trial_combo[row][1]):
                    index_list.append(i)
            final_data_dict[key] = value[index_list]

        #Create csp_dict of csp objects
        csp_dict = {}
        for key, value in epoch_dict.items():
            #Only want to create csp objects for our train data - from session 1
            if 'sesh_1' in key:
                csp_dict[key] = mne.decoding.CSP(n_components=int(test_df.n_components[row]), 
                                                 cov_est=test_df.cov_est[row], 
                                                 log=bool(test_df.log[row]));

        #Suppress output from this noisy function with no verbose option
        with io.capture_output() as captured:
        #Fit csp objects to training data from session 1        
            for key, value in csp_dict.items():
                #Try except to deal with iterations where fails to converge
                try:
                    value.fit(X=final_data_dict[key], 
                          y=final_y_dict[key]);
                except Exception:
                    csp_dict[key] = 'CSP failed to converge'

        #Use csp objects to transform and save resulting data
        #Train test split sesh 1 data to avoid overfit on LDA
        #Store train data in sesh 1, test in sesh 2
        csp_data_dict = {}
        for key, value in csp_dict.items():
            #If except to deal with iterations where CSP fails to converge
            if value == 'CSP failed to converge':
                csp_data_dict[key] = 'CSP failed to converge'
                key2 = key.replace('1', '2')
                csp_data_dict[key2] = 'CSP failed to converge'
            else:
                X = value.transform(final_data_dict[key])
                (X_train, X_test, 
                 y_train, y_test) = train_test_split(X, 
                                                     final_y_dict[key], 
                                                     stratify=final_y_dict[key])
                csp_data_dict[key] = X_train
                key2 = key.replace('1', '2')
                csp_data_dict[key2] = X_test
                final_y_dict[key] = y_train
                final_y_dict[key2] = y_test


        #Model against our data for each subject and save the resulting score
        #First, LDA model
        if test_df.model_type[row] == 'LDA':
            #Base LDA objects on csp dict keys so only created for sesh 1
            lda_dict = {}
            for key in csp_dict.keys():
                lda_dict[key] = LinearDiscriminantAnalysis()

            #Fit LDA objects to training data from sesh 1
            for key, value in lda_dict.items():
                #If to deal with iterations where CSP fails to converge
                if csp_dict[key] == 'CSP failed to converge':
                    lda_dict[key] = 'CSP failed to converge' 
                else:
                    value.fit(csp_data_dict[key], final_y_dict[key])

            #Score on testing data from sesh 2 and save in test_df
            for key, value in lda_dict.items():
                #If to deal with iterations where CSP fails to converge
                if value == 'CSP failed to converge':
                    subject = key[:5]
                    test_df.at[row, subject] = 'CSP failed to converge'
                else:
                    subject = key[:5]
                    key2 = key.replace('1', '2')
                    test_df.at[row, subject] = value.score(csp_data_dict[key2], 
                                                       final_y_dict[key2])

        #Neural network
        if test_df.model_type[row] == 'NN':
            #Build model for each subject
            #Base NN analysis on csp dict keys so only created for sesh 1
            for key, value in csp_dict.items():      
                #If to deal with iterations where CSP fails to converge
                if csp_dict[key] == 'CSP failed to converge':
                    subject = key[:5]
                    test_df.at[row, subject] = 'CSP failed to converge'
                    
                else:
                    #Build model
                    model = Sequential()
                    #inputs are equal to n_components created via CSP
                    n_components = int(test_df.n_components[row])
                    model.add(Dense(n_components, 
                                    input_dim=n_components, 
                                    activation='relu'))
                    model.add(Dropout(0.2))
                    #Add hidden layer with half as many nodes as input
                    #Integer division because Dense needs a whole number of units
                    model.add(Dense(n_components // 2, activation='relu'))
                    model.add(Dropout(0.2))
                    #Hidden layer with 1/4 as many nodes as input
                    model.add(Dense(n_components // 4, activation='relu'))
                    model.add(Dropout(0.2))
                    #output layer
                    model.add(Dense(1, activation='sigmoid'))

                    #Compile model
                    model.compile(loss='binary_crossentropy', 
                                  optimizer='adam', 
                                  metrics=['acc'])

                    #Fit model
                    #Suppress output
                    with io.capture_output() as captured:
                        key2 = key.replace('1', '2')
                        history = model.fit(csp_data_dict[key], final_y_dict[key], 
                                            validation_data=(csp_data_dict[key2], 
                                                             final_y_dict[key2]), 
                                            epochs=3, verbose=0)

                    #Save validation accuracy into dataframe
                    subject = key[:5]
                    test_df.at[row, subject] = max(history.history['val_acc'])
        test_df.to_csv('data/csp_grid_search.csv', index=False)
        if row % 20 == 0:
            print(f'Grid search complete through row {row} of {test_df.shape[0]}')

Grid search complete through row 0 of 192
Grid search complete through row 20 of 192
Grid search complete through row 40 of 192
Grid search complete through row 60 of 192
Grid search complete through row 80 of 192
Grid search complete through row 100 of 192
Grid search complete through row 120 of 192
Grid search complete through row 140 of 192
Grid search complete through row 160 of 192
Grid search complete through row 180 of 192
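
The label bookkeeping inside the loop above (keeping labels only for epochs that survived rejection, then recoding the two trial types of interest to 0 and 1) can be isolated into a small helper. This is a sketch rather than the code that produced the results; `select_trial_combo` and the toy inputs are hypothetical names and values.

```python
import numpy as np

def select_trial_combo(X_kept, y_all, drop_log, trial_combo):
    """Align labels with surviving epochs and keep only one pair of trial types.

    X_kept holds data for the epochs that were not dropped, y_all holds a
    label for every original event, and drop_log has one entry per event
    (an empty tuple means the epoch was kept).
    """
    kept = [i for i, reasons in enumerate(drop_log) if len(reasons) == 0]
    y_kept = np.asarray(y_all)[kept]
    # Keep only the two trial types being compared
    in_combo = np.isin(y_kept, trial_combo)
    # First trial type is coded 0, second trial type is coded 1
    labels = (y_kept[in_combo] == trial_combo[1]).astype(int)
    return X_kept[in_combo], labels

# Toy example: five events, the second one was dropped
y_toy = [1, 5, 2, 1, 5]
drop_log_toy = ((), ('Fz',), (), (), ())
X_toy = np.arange(4).reshape(4, 1, 1)  # four surviving epochs

X_sel, y_sel = select_trial_combo(X_toy, y_toy, drop_log_toy, (1, 5))
print(y_sel)  # [0 0 1]
```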


**Let's look at the results**

In [336]:
#Temporarily change pandas settings to see all columns
with pd.option_context("display.max_columns", None):
    #Change this code to match the subject of interest
    display(test_df[test_df['sub_L'].apply(type) != str].
            sort_values('sub_L', ascending=False)[:10])

Unnamed: 0,l_freq_filter,h_freq_filter,channels_to_drop,baseline_correction,projectors_to_apply,selected_frequency,tmin,tmax,detrend,reject,flat,ica_to_exclude,scaler,n_components,cov_est,log,model_type,trial_combo,sub_A,sub_C,sub_D,sub_E,sub_F,sub_G,sub_H,sub_J,sub_L
28,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,robust,4,epoch,False,LDA,"(1, 5)",0.6,0.538462,0.647059,0.588235,0.75,0.75,0.454545,0.538462,1.0
117,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,,4,epoch,True,LDA,"(1, 4)",0.35,0.272727,CSP failed to converge,CSP failed to converge,CSP failed to converge,CSP failed to converge,CSP failed to converge,0.666667,1.0
85,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,robust,8,epoch,True,LDA,"(1, 4)",0.65,0.454545,0.75,0.8125,0.85,0.95,0.555556,0.666667,1.0
21,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,robust,4,epoch,True,LDA,"(1, 4)",0.5,0.5,0.705882,0.588235,0.95,0.6,0.6,0.571429,0.95
109,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,,4,concat,False,LDA,"(1, 4)",0.55,CSP failed to converge,CSP failed to converge,0.5625,CSP failed to converge,0.65,0.875,CSP failed to converge,0.95
108,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,,4,concat,False,LDA,"(1, 5)",0.65,CSP failed to converge,0.6,CSP failed to converge,0.75,0.5,0.333333,0.454545,0.95
69,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,robust,8,concat,True,LDA,"(1, 4)",0.65,0.454545,0.6875,0.8125,0.9,0.8,0.666667,0.5,0.95
38,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,robust,6,concat,True,LDA,"(2, 4)",0.75,0.545455,0.9375,0.647059,0.8,0.7,0.7,0.416667,0.95
12,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,robust,4,concat,False,LDA,"(1, 5)",0.55,0.642857,0.764706,0.555556,0.7,0.8,0.5,0.615385,0.95
101,,40,"[AFz, F7, F8]",,"slice(None, 1, None)",256,1,4.5,,{'eeg': 150},{'eeg': 20},,,4,concat,True,LDA,"(1, 4)",CSP failed to converge,0.363636,0.8125,0.875,0.9,0.75,CSP failed to converge,0.583333,0.95


### Identifying per subject ideal individual CSP model settings

We'll be using these models based on CSP covariance matrices as one of our level 1 models in our final ensemble model. We have a bit more testing to do on those level 1 models though (notably individualizing the parameters to drop epochs), so let's examine our results thus far and extract the parameters we want to lock in for each subject. There are a few patterns here in the data:

- Most subjects have a particular trial combo that seems to be most differentiable for them
    - Not the same combination for every subject
    - This is likely influenced by the nature and area of their central nervous system injury
    - This is the most important parameter to improve modeling for each participant
- Most subjects clearly have a number of CSP components that outperform the others, but that number varies
- The robust scaler is uniformly better than no scaling of the data
    - CSP often fails to fit without scaled data
    - As such not saving down as an individual parameter - will just scale in all models
- The cov_est setting does not seem to matter much, but we will take the highest performing setting for each person
- Log transforming appears to generally have a small impact on the CSP fit as well
- LDA seems to perform better than the neural network for all but a few subjects. For those subjects where the LDA doesn't perform better than NN, it is always quite close, and those are the subjects where both model types are fairly bad. I'll plan to take the parameters from the highest performing model and then run both NN and LDA models on the CSP data to get a broader perspective.

We'll save the remaining, variable highest performing parameters for each individual to be used going forward as the core of their individual L1 model. The parameters we are saving are:

1. Trial combo
2. Covariance estimate method
3. Whether log transformed
4. Number of CSP components

In [None]:
#Code to make data extractable from dataframe once opened:
#First need to open the data with read_csv

#Convert each subject's scores back to floats, leaving failure messages
#such as 'CSP failed to converge' as strings
for sub in subject_columns:
    converted = []
    for entry in test_df[sub]:
        try:
            converted.append(float(entry))
        except (TypeError, ValueError):
            converted.append(entry)
    test_df[sub] = converted
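
If the failure message itself does not need to be preserved, an alternative to the conversion above is `pd.to_numeric` with `errors='coerce'`, which turns anything non-numeric into NaN so the column stays numeric. The small Series below is made-up data for illustration only.

```python
import pandas as pd

# Made-up scores mixing floats, numeric strings, and a failure message
scores = pd.Series([0.6, 'CSP failed to converge', '0.75'], name='sub_A')

# Non-numeric entries become NaN instead of raising an error
numeric_scores = pd.to_numeric(scores, errors='coerce')

# idxmax skips NaN, so failed fits never win
best_row = numeric_scores.idxmax()
print(best_row)  # 2
```

The trade-off is that failed fits and rows that were never run both end up as NaN, so this only fits once the grid search has finished.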

In [298]:
params_to_save = ['n_components',
                  'cov_est', 
                  'log', 
                  'trial_combo']

#Create function to save the info we want for a given subject
def pull_parameters(sub):
    #identify the highest score achieved for this subject
    sub_max = max(test_df[test_df[sub].apply(type) != str][sub])
    #Get the index in the dataframe where that occurred
    #In case of tie, returns first row
    [max_index] = test_df[test_df[sub] == sub_max].head(1).index
    #Save the params we want plus the subject from that best row
    temp_df = pd.DataFrame(test_df.loc[max_index, params_to_save + [sub]]).T
    temp_df[sub] = sub
    #Duplicate the row 25 times to join with our 25 epoch dropping combos
    return pd.concat([temp_df]*25, ignore_index=True)

#Use function to create the beginning of next test dfs for each subject
sub_A_csp_df = pull_parameters('sub_A')
sub_C_csp_df = pull_parameters('sub_C')
sub_D_csp_df = pull_parameters('sub_D')
sub_E_csp_df = pull_parameters('sub_E')
sub_F_csp_df = pull_parameters('sub_F')
sub_G_csp_df = pull_parameters('sub_G')
sub_H_csp_df = pull_parameters('sub_H')
sub_J_csp_df = pull_parameters('sub_J')
sub_L_csp_df = pull_parameters('sub_L')

In [303]:

sub_C_csp_df = pull_parameters('sub_C')

### Identifying per-subject epoch rejection criteria

The shapes of each individual's EEG recordings look very different - some have far more dramatic amplitude changes than others. The algorithm we are using to drop epochs is based on setting minimum and maximum amplitude differences between peaks and valleys in the data, so we need to personalize those settings if we want to be able to drop bad epochs in each individual study participant's data.

We will quantify the percentage of epochs that are dropped from each experiment session with a range of rejection criteria, and then experiment with how dropping increasing percentages of epochs impacts our accuracy.
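
The rejection rule itself is simple enough to sketch in NumPy: an epoch is dropped if any channel's peak-to-peak amplitude exceeds the `reject` threshold or falls below the `flat` threshold. The helper name `drop_fraction` and the toy array are assumptions for illustration; the notebook itself relies on the `reject` and `flat` arguments of `mne.Epochs`.

```python
import numpy as np

def drop_fraction(epochs, reject=None, flat=None):
    """Fraction of epochs dropped by peak-to-peak thresholds.

    epochs has shape (n_epochs, n_channels, n_times).
    """
    # Peak-to-peak amplitude of every channel in every epoch
    ptp = epochs.max(axis=2) - epochs.min(axis=2)
    bad = np.zeros(len(epochs), dtype=bool)
    if reject is not None:
        bad |= (ptp > reject).any(axis=1)  # too spiky on any channel
    if flat is not None:
        bad |= (ptp < flat).any(axis=1)    # too flat on any channel
    return bad.mean()

# Toy data: 4 epochs, 2 channels, 3 samples, with known peak-to-peak values
toy_epochs = np.zeros((4, 2, 3))
toy_epochs[0, :, 1] = [10, 12]
toy_epochs[1, :, 1] = [200, 10]  # spiky on channel 0
toy_epochs[2, :, 1] = [1, 10]    # flat on channel 0
toy_epochs[3, :, 1] = [50, 50]

print(drop_fraction(toy_epochs, reject=150))          # 0.25
print(drop_fraction(toy_epochs, flat=5))              # 0.25
print(drop_fraction(toy_epochs, reject=150, flat=5))  # 0.5
```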

In [264]:
#Create list of epoch drop filters
reject_options_1 = [None]
reject_options_2 = [None, 
                  {'eeg': 40}, {'eeg': 50}, {'eeg': 60}, 
                  {'eeg': 70}, {'eeg': 80}, {'eeg': 90}, 
                  {'eeg': 105}, {'eeg': 140}, {'eeg': 175},
                   {'eeg': 135}, {'eeg': 145}, {'eeg': 150},
                   {'eeg': 155}, {'eeg': 160}, {'eeg': 167},
                   {'eeg': 146}, {'eeg': 147}, 
                   {'eeg': 130}, {'eeg': 125}, {'eeg': 120},
                   {'eeg': 115}, {'eeg': 110},
                   {'eeg': 180}, {'eeg': 185}, {'eeg': 190},
                   {'eeg': 143}, {'eeg': 110}, {'eeg': 115},
                   {'eeg': 120}, {'eeg': 125}, {'eeg': 130},
                   {'eeg': 83}, {'eeg': 86}, {'eeg': 78}, 
                   {'eeg': 100}, {'eeg': 95}]
flat_options_1 = [None]
flat_options_2 = [None, 
                {'eeg': 3}, {'eeg': 6}, {'eeg': 9}, 
                {'eeg': 11}, {'eeg': 13}, {'eeg': 15}, 
                {'eeg': 17}, {'eeg': 19}, {'eeg': 21},
                 {'eeg': 25}, {'eeg': 30}, {'eeg': 35},
                 {'eeg': 36}, {'eeg': 31}, {'eeg': 33}, 
                 {'eeg': 34}, {'eeg': 30.5}, {'eeg': 31.5},
                 {'eeg': 16}, {'eeg': 18}, {'eeg': 20},
                 {'eeg': 17.3}, {'eeg': 17.6}, {'eeg': 18.5},
                 {'eeg': 18.75}, {'eeg': 19.5}, {'eeg': 20.5},
                 {'eeg': 22}, {'eeg': 23}, {'eeg': 24}, 
                 {'eeg': 19.75},
                 {'eeg': 24.5}, {'eeg': 26}, {'eeg': 27},
                 {'eeg': 28}, {'eeg': 29}, 
                 {'eeg': 25.5}, {'eeg': 29.5}, {'eeg': 26.5}]

#Create dataframe - tests flat & spiky filters independently
rejection_settings_df = pd.concat((pd.DataFrame(itertools.product(reject_options_1,
                                                                 flat_options_2), 
                                               columns=['reject', 'flat']),
                                  pd.DataFrame(itertools.product(reject_options_2,
                                                                 flat_options_1), 
                                               columns=['reject', 'flat'])),
                                 ignore_index=True)

#reindex
rejection_settings_df = rejection_settings_df.reindex(columns=['reject', 'flat'] + 
                                                      list(csp_dict.keys()))

#Compute percentage of trials dropped for each setting
for row in range(rejection_settings_df.shape[0]): 
    epoch_dict = {}
    for key, value in raw_dict.items():
        epoch_dict[key] = mne.Epochs(value, events=event_dict[key], 
                                     event_id=events_explained, 
                                     tmin=-3, tmax=4.5, 
                                     baseline=None,
                                     preload=True,
                                     picks=channels_to_keep, verbose=0,
                                     reject=rejection_settings_df.reject[row],
                                     flat=rejection_settings_df.flat[row],
                                     reject_tmin=1,
                                     reject_tmax=4.5)
    perc_trials_dropped_dict = {}
    for key, value in y_dict.items():
        dropped = value.shape[0] - epoch_dict[key].get_data().shape[0]
        drop_percentage = dropped / value.shape[0]
        rejection_settings_df.at[row, key] = drop_percentage

In [265]:
#TODO: double check the table and swap any settings from sesh 2 to sesh 1 if needed
#view results
(rejection_settings_df[['reject', 'flat', 
                        'sub_L_sesh_1', 'sub_L_sesh_2']].
 loc[(rejection_settings_df['sub_L_sesh_1'] < 0.2) & 
     (rejection_settings_df['sub_L_sesh_1'] > 0) &
     (rejection_settings_df['reject'].isna())].sort_values(['sub_L_sesh_1']))

Unnamed: 0,reject,flat,sub_L_sesh_1,sub_L_sesh_2
10,,{'eeg': 25},0.015,0.275
30,,{'eeg': 24},0.015,0.205
32,,{'eeg': 24.5},0.015,0.255
37,,{'eeg': 25.5},0.025,0.31
33,,{'eeg': 26},0.04,0.36
39,,{'eeg': 26.5},0.075,0.405
34,,{'eeg': 27},0.11,0.425
35,,{'eeg': 28},0.165,0.55


### Testing custom drop settings per subject with CSP model

For each subject, we will test how the highest performing CSP model for that individual performs with rejection settings designed to drop the following percentages of epochs. For each approximate drop percentage we are targeting, we will use both minimum and maximum peak-to-trough settings to drop epochs that are too flat and too spiky at those percentages (checked via a grid search of combinations).

- None (keep all epochs)
- Drop 2%
- Drop 4%
- Drop 8%
- Drop 12%

The setting that results in the closest drop % to our targets in session 1 is what we will select. There is likely to be relatively wide variability in session 2 (our test data) compared to session 1, so there may be some subjects who have far more or far fewer epochs dropped in session 2. To avoid this in a production BCI device, it needs to either be trained on enough data to be resilient to inter-day swings in EEG signal, or be frequently rebiased or retrained in short calibration sessions.
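
Choosing the setting closest to each target can be sketched as a nearest-value lookup. The drop fractions below are the session 1 values for sub_L copied from the results table above; the helper name `closest_setting` is hypothetical.

```python
# Session 1 drop fractions for sub_L by flat threshold (from the table above)
flat_drop_fractions = {
    25: 0.015, 24: 0.015, 24.5: 0.015, 25.5: 0.025,
    26: 0.04, 26.5: 0.075, 27: 0.11, 28: 0.165,
}

def closest_setting(drop_fractions, target):
    """Return the threshold whose drop fraction is nearest to the target."""
    return min(drop_fractions,
               key=lambda threshold: abs(drop_fractions[threshold] - target))

picks = {target: closest_setting(flat_drop_fractions, target)
         for target in (0.04, 0.08, 0.12)}
print(picks)  # {0.04: 26, 0.08: 26.5, 0.12: 27}
```

For the 2% target, several thresholds sit equally close (0.015 and 0.025 are both half a percentage point away), so that choice is better made by hand than left to tie-breaking.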

**Creating list of drop criteria we want to test for each individual**

Based on the results of our grid search, let's manually assemble the list of settings we want to test for each individual that drop roughly the percentage of trials we are planning to drop.

In [None]:
#I want this to be a dictionary with keys equal to sub_A_flat, sub_A_reject, and then the list of values

sub_A_flat_options = [None, {'eeg': 31.5}, {'eeg': 33}, {'eeg': 34}, {'eeg': 35}]
sub_A_reject_options = [None, {'eeg': 167}, {'eeg': 155}, {'eeg': 150}, {'eeg': 143}]
sub_C_flat_options = [None, {'eeg': 16}, {'eeg': 17.6}, {'eeg': 18.5}, {'eeg': 20}]
sub_C_reject_options = [None, {'eeg': 160}, {'eeg': 145}, {'eeg': 130}, {'eeg': 125}]
sub_D_flat_options = [None, {'eeg': 18.75}, {'eeg': 19.5}, {'eeg': 20.5}, {'eeg': 21}]
sub_D_reject_options = [None, {'eeg': 190}, {'eeg': 185}, {'eeg': 150}, {'eeg': 120}]
sub_E_flat_options = [None, {'eeg': 19}, {'eeg': 19.75}, {'eeg': 21}, {'eeg': 22}]
sub_E_reject_options = [None, {'eeg': 150}, {'eeg': 143}, {'eeg': 125}, {'eeg': 105}]
sub_F_flat_options = [None, {'eeg': 24}, {'eeg': 24.5}, {'eeg': 26}, {'eeg': 27}]
sub_F_reject_options = [None, {'eeg': 90}, {'eeg': 86}, {'eeg': 82}, {'eeg': 78}]
sub_G_flat_options = [None, {'eeg': 26}, {'eeg': 27}, {'eeg': 29}, {'eeg': 30}]
sub_G_reject_options = [None, {'eeg': 160}, {'eeg': 150}, {'eeg': 100}, {'eeg': 80}]
sub_H_flat_options = [None, {'eeg': 18}, {'eeg': 18.75}, {'eeg': 19.5}, {'eeg': 20}]
sub_H_reject_options = [None, {'eeg': 185}, {'eeg': 175}, {'eeg': 167}, {'eeg': 150}]
sub_J_flat_options = [None, {'eeg': 17.3}, {'eeg': 18}, {'eeg': 18.75}, {'eeg': 19.75}]
sub_J_reject_options = [None, {'eeg': 155}, {'eeg': 143}, {'eeg': 130}, {'eeg': 105}]
sub_L_flat_options = [None, {'eeg': 24}, {'eeg': 26}, {'eeg': 26.5}, {'eeg': 27}]
sub_L_reject_options = [None, {'eeg': 100}, {'eeg': 90}, {'eeg': 83}, {'eeg': 78}]
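
One possible shape for the dictionary described in the comment at the top of the cell above is sketched here, with keys such as `sub_A_flat` and `sub_A_reject`. Only two subjects are filled in, using the values listed above; the name `drop_options` and the helper `rejection_combos` are assumptions for illustration.

```python
import itertools

# Keys follow the '<subject>_flat' / '<subject>_reject' pattern
drop_options = {
    'sub_A_flat': [None, {'eeg': 31.5}, {'eeg': 33}, {'eeg': 34}, {'eeg': 35}],
    'sub_A_reject': [None, {'eeg': 167}, {'eeg': 155}, {'eeg': 150}, {'eeg': 143}],
    'sub_L_flat': [None, {'eeg': 24}, {'eeg': 26}, {'eeg': 26.5}, {'eeg': 27}],
    'sub_L_reject': [None, {'eeg': 100}, {'eeg': 90}, {'eeg': 83}, {'eeg': 78}],
}

def rejection_combos(subject):
    """Return every (reject, flat) pairing to test for one subject."""
    return list(itertools.product(drop_options[f'{subject}_reject'],
                                  drop_options[f'{subject}_flat']))

sub_A_combos = rejection_combos('sub_A')
print(len(sub_A_combos))  # 25
```

Five reject settings crossed with five flat settings give the 25 epoch-dropping combinations that each subject's best CSP parameters are paired with.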

### Model refinement

Ensemble model testing - later
- Testing LDA on individual model vs NN on individual model vs NN on a model that ingests all data but applies higher sample weights

**I think if `transform_into` is set to CSP space it will yield data in a shape that I should still feed into a CNN - don't think that really belongs in this test set. Do that separately.**

**Grid search for reject and flat settings to maintain 90%, 75%, 60% of epochs for each trial participant, right now we're dropping way more with some participants**

Grid search sending in a shorter period of data into the CSP

Grid search parameters for LDA if it is performing better than simple NN

Test pulling more time shifted samples

Resampling our training data with time-shifted windows (e.g., including -0.3 to 4.9, 0 to 5.2, and 0.3 to 5.5) to give our model more data to train on
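
A rough sketch of how such time-shifted windows could be cut from longer epochs is shown below. Everything here is an assumption for illustration: the helper name `shifted_windows`, the toy sampling rate, and the array shapes. It also assumes each epoch was extracted with enough margin to contain every window.

```python
import numpy as np

def shifted_windows(epochs, sfreq, epoch_tmin, starts, duration):
    """Cut one fixed-length window per start time from every epoch.

    epochs has shape (n_epochs, n_channels, n_times) and begins at
    epoch_tmin seconds relative to the stimulus.
    """
    n_samples = int(round(duration * sfreq))
    crops = []
    for start in starts:
        first = int(round((start - epoch_tmin) * sfreq))
        crops.append(epochs[:, :, first:first + n_samples])
    # Stack the crops so each window becomes an extra training example
    return np.concatenate(crops, axis=0)

# Toy data: 2 epochs, 3 channels, 10 s at 10 Hz starting 1 s before stimulus
toy = np.arange(2 * 3 * 100, dtype=float).reshape(2, 3, 100)
toy_labels = np.array([0, 1])

augmented = shifted_windows(toy, sfreq=10, epoch_tmin=-1,
                            starts=(-0.3, 0.0, 0.3), duration=5.2)
# Labels repeat once per window, in the same order as the stacked crops
augmented_labels = np.tile(toy_labels, 3)
print(augmented.shape)  # (6, 3, 52)
```

Because windows cut from the same trial overlap heavily, they should stay on the same side of any train/test split.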

When doing the individual models, I can set the rejection criteria for epochs to match the individual in question better - if I have time.

Read this article for other preprocessing ideas after getting dropping epochs set up: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5915520/

Another thing I could play around with:

- EEGLib
    - https://www.sciencedirect.com/science/article/pii/S2352711021000753
    - Primarily used for feature extraction after the data has been processed, but it does have some preprocessing capability
    - Appears to be written to allow visual inspection of data and then creation of features based on the selected point - certainly worth investigating