In [35]:
import mne
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import (Dense, Dropout, Flatten, Conv1D, 
                                     MaxPooling1D, GlobalAveragePooling1D)
from tensorflow.keras.regularizers import l2

import itertools
import copy

from IPython.utils import io

from ast import literal_eval

### Purpose of this notebook

In this notebook, we'll be testing and assembling the final ensemble model that we will use as our final predictor for each test subject.

The ensemble model will include the following level 1 models for each subject:
1. A CNN model with a larger filter size and fewer output nodes to capture larger features
2. A CNN model with a smaller filter size and more output nodes to capture smaller features
3. An LDA model with a small number of CSP components
4. A NN looking at CSP transformed data

First, as usual, we'll use our data ingester to bring our data into the notebook.

In [20]:
%run data_ingester.py

### We will assemble an L1 model dataframe that generates all the inputs for our ensemble

One row per model, so each subject will have multiple rows in the dataframe (one for each of the L1 model types). All L1 models for each subject will have the same general MNE preprocessing settings, so we can do the MNE preprocessing work just once for each subject - which would be extremely important in a production device to reduce computation time for ongoing predictions.

In addition, we need to add a few columns to the dataframe:

1. Model type ('LDA', 'NN_CSP' or 'CNN_LF', 'CNN_SF' respectively)
2. train_output (array to be passed to ensemble model)
    - For the LDA model this will be an array of probabilities between 0 and 1 with a length equal to the number of trials of the relevant trial type for that subject.
    - For the CNN models and NN models this will be an array with a length equal to the number of trials, and a width equal to the number of nodes in the 2nd to last dense layer of the model
3. test_output (array to be passed to ensemble model)
    - Outputs are as above, but for either test data from session 1 (if this is not our final test on session 2 data), or output on session 2 data if the ensemble model is finalized and we are calculating our scores
4. y_train and y_test
    - We need to save the true values for the ensemble model to use in its fitting and scoring.


### We will also have an ensemble_df

This dataframe will contain combinations of our subjects and the permutations of the ensemble model parameters we want to test, as well as a columns where the ensemble model's scores can be recorded.

**First let's import our csp models**

In [105]:
#Import final CSP model params
final_csp_models = pd.read_csv('data/csp_models_not_overfit.csv')

#Reset several columns away from strings and as literal_evals so can be read
final_csp_models['trial_combo'] = (final_csp_models['trial_combo'].
                                apply(lambda x: literal_eval(x)))

#Use list comprehension b/c NaN can appear in list
final_csp_models['flat'] = [None if pd.isna(x) else literal_eval(x)
                         for x in final_csp_models['flat']]
final_csp_models['reject'] = [None if pd.isna(x) else literal_eval(x)
                         for x in final_csp_models['reject']]

#Add our three needed output columns
final_csp_models['model_type'] = 'LDA'
final_csp_models['train_output'] = None
final_csp_models['test_output'] = None

Unfortunately ast.literal_eval cannot evaluate slice expressions, which we also need to convert away from a string. I could write a complicated find expression to extract the data, but it is faster to just manually re-enter it.

In [106]:
final_csp_models['projectors_to_apply']

0    slice(None, 1, None)
1    slice(None, 1, None)
2    slice(None, 1, None)
3       slice(1, 2, None)
4    slice(None, 1, None)
5    slice(None, 1, None)
6    slice(None, 1, None)
7    slice(None, 1, None)
8       slice(1, 2, None)
Name: projectors_to_apply, dtype: object

In [107]:
final_csp_models['projectors_to_apply'] = [slice(None, 1, None),
                                           slice(None, 1, None),
                                           slice(None, 1, None),
                                           slice(1, 2, None),
                                           slice(1, 2, None),
                                           slice(None, 1, None),
                                           slice(None, 1, None),
                                           slice(1, 2, None),
                                           slice(1, 2, None)]

final_csp_models['y_train'] = None
final_csp_models['y_test'] = None

**Let's import our CNN models**

In [108]:
#Import final CNN model params
cnn_models = pd.read_csv('data/CNN_models_most_general.csv')

#Reset several columns away from strings and as literal_evals so can be read
cnn_models['trial_combo'] = (cnn_models['trial_combo'].
                                apply(lambda x: literal_eval(x)))

#Use list comprehension b/c NaN can appear in list
cnn_models['flat'] = [None if pd.isna(x) else literal_eval(x)
                         for x in cnn_models['flat']]
cnn_models['reject'] = [None if pd.isna(x) else literal_eval(x)
                         for x in cnn_models['reject']]

#Add our three needed output columns
cnn_models['model_type'] = 'CNN'
cnn_models['train_output'] = None
cnn_models['test_output'] = None

cnn_models['y_train'] = None
cnn_models['y_test'] = None

**And finally import our NN models run on CSP data**

In [109]:
#Import final CNN model params
nn_csp_models = pd.read_csv('data/NN_CSP_models_not_overfit.csv')

#Reset several columns away from strings and as literal_evals so can be read
nn_csp_models['trial_combo'] = (nn_csp_models['trial_combo'].
                                apply(lambda x: literal_eval(x)))

#Use list comprehension b/c NaN can appear in list
nn_csp_models['flat'] = [None if pd.isna(x) else literal_eval(x)
                         for x in nn_csp_models['flat']]
nn_csp_models['reject'] = [None if pd.isna(x) else literal_eval(x)
                         for x in nn_csp_models['reject']]

#Add our three needed output columns
nn_csp_models['model_type'] = 'NN_CSP'
nn_csp_models['train_output'] = None
nn_csp_models['test_output'] = None

nn_csp_models['y_train'] = None
nn_csp_models['y_test'] = None

**Combine into L1 model dataframe**

In [110]:
L1_model_df = pd.concat((final_csp_models, 
                         cnn_models, 
                         nn_csp_models), 
                        ignore_index=True)

#Replace NaN with None so MNE can read it
L1_model_df.replace(np.nan, None, inplace=True)

### Let's construct the ensemble_df test frame
Each row of this frame will use a different set of parameters for the final ensemble model construction, and have space for the resulting training and test scores to be recorded.

In [79]:
hidden_layers_options = [1, 2]
l2_reg_alpha_options = [0.0001, 0.001]
dropout_ratio_options = [0.2, 0.4]
#How number of nodes reduces with each dense layer
dense_reduction_ratio_options = [0, 0.25]
#How many epochs to run through
epochs_options = [3, 6]


columns = ['hidden_layers',
           'l2_reg_alpha',
           'dropout_ratio',
           'dense_reduction',
           'epochs']

#Create dataframe of our first set of variable params
ensemble_test_df = pd.DataFrame(itertools.
                               product(hidden_layers_options,
                                       l2_reg_alpha_options,
                                       dropout_ratio_options,
                                       dense_reduction_ratio_options,
                                       epochs_options), 
                           columns=columns)

#Create dataframe with just our subjects, add results columns
temp_dict = {'subject': final_csp_models.subject.unique()}
temp_frame = pd.DataFrame(temp_dict)
temp_frame['train_score'] = None
temp_frame['test_score'] = None
temp_frame['true_pos_rate'] = None
temp_frame['true_neg_rate'] = None

#Get number of permutations of variable parameters
permutations = len(list(itertools.
                        product(hidden_layers_options,
                                l2_reg_alpha_options,
                                dropout_ratio_options,
                                dense_reduction_ratio_options,
                                epochs_options)))

#Duplicate our temp frame to match the number of variable
#permutations to run each permutation for each subject
temp_columns = temp_frame.columns
temp_frame = pd.DataFrame(np.repeat(temp_frame.values, 
                                     permutations, 
                                     axis=0))
temp_frame.columns = temp_columns

#Concat variable params with itself 9 times times to get
#right shape to combine with all params for all subjects
ensemble_test_df = pd.concat((ensemble_test_df, 
                              ensemble_test_df,
                              ensemble_test_df,
                              ensemble_test_df,
                              ensemble_test_df,
                              ensemble_test_df,
                              ensemble_test_df,
                              ensemble_test_df,
                              ensemble_test_df),
                              ignore_index=True)

#Join our two grids to assemble full ensemble test grid
ensemble_test_df = temp_frame.join(ensemble_test_df)

ensemble_test_df.shape

(288, 10)

### Build the functions that will build and test our ensemble models

The has two nested iterators:

First, it iterates through the subjects. All of the L1 models for each subject use the same MNE parameters, so we can do those operations first and use the results for all rows where that subject is mentioned.

Then, it iterates through the rows which match that subject and performs the needed calculations to create the correct output for the L1 model on that row.

Finally, when it has iterated through all the rows for that individual, it perfoms the ensemble calculations at that point.

In [60]:
def ensemble_input_creator(L1_model_df, final_test, savefile):
    """A function to create all the needed inputs for our ensemble
    model inside the L1_model_df.
    
    The L1_model_df should have one row per L1 model, and specify the
    subject to which that L1 model applies. All L1 models for each subject
    must use the same parameters for MNE preprocessing steps.
    
    final_test: True, False. When set to False, a train-test split
    will be conducted on session 1 data and the saved score in the 
    ensemble_df accuracy column for each subject will be the 
    score on the test data from session 1. When set to True, the model
    will be trained on the entirety of the session 1 data, and the test score
    shown is the final accuracy against unseen session 2 data."""
    
    
    #First we populate our L1_model_df with output from each L1 model 
    #to be used as input into each ensemble model
    
    #For each row in ensemble we iterate across all subjects
    for sub in L1_model_df['subject'].unique():

        #Create list of index numbers for all L1 models for this subject
        i_list = L1_model_df.loc[L1_model_df['subject'] == sub].index

        #First we will perform the MNE preprocessing for this subject
        #All L1 models for each subject use same MNE params, so we'll
        #refer to the first row for this subject
        i1 = i_list[0]

        #Load each sessions data into an MNE raw object
        raw_dict = {}
        for key, value in data_dict.items():
            raw_dict[key] = mne.io.RawArray(value.T, info, verbose=0)

        #Filter data with bandpass. Note raw.filter applies in place
        for key, value in raw_dict.items():
            value.filter(l_freq=L1_model_df.l_freq_filter[i1], 
                         h_freq=L1_model_df.h_freq_filter[i1], 
                         method='fir', phase='zero', verbose=0)

        #Create epoch object with our raw objects and events arrays
        channels_to_keep = [ch for ch in ch_names if 
                            ch not in L1_model_df.channels_to_drop[i1]]
        epoch_dict = {}
        for key, value in raw_dict.items():
            epoch_dict[key] = mne.Epochs(value, events=event_dict[key], 
                                        event_id=events_explained, 
                                        tmin=-3, 
                                        tmax=L1_model_df.tmax[i1], 
                                        baseline=L1_model_df.baseline_correction[i1],
                                        preload=True,
                                        picks=channels_to_keep, verbose=0,
                                        detrend=L1_model_df.detrend[i1],
                                        reject=L1_model_df.reject[i1],
                                        flat=L1_model_df.flat[i1],
                                        reject_tmin=L1_model_df.tmin[i1],
                                        reject_tmax=L1_model_df.tmax[i1])

        #Skip creating projectors step to save compute time if not being
        #applied in this iteration
        if L1_model_df.projectors_to_apply[i1]:
            #Create dictionary of signal space projection vectors for each epoch
            proj_dict = {}
            for key, value in epoch_dict.items():
                proj_dict[key] = mne.compute_proj_epochs(value, 
                                                         n_eeg=2, 
                                                         verbose=0)
            #apply projectors
            for key, value in epoch_dict.items():
                value.add_proj(proj_dict[key][L1_model_df.projectors_to_apply[i1]], 
                               verbose=0)
                value.apply_proj(verbose=0)

        #Skip creating ICA components step to save compute time if not
        #being applied in this iteration
        if L1_model_df.ica_to_exclude[i1]:
            #create and fit ICA object to epochs
            for key, value in epoch_dict.items():
                ica = mne.preprocessing.ICA(n_components=5, method='picard', 
                                            max_iter='auto', verbose=0)
                ica.fit(value, verbose=0)
                #Apply the ICA
                ica.apply(value, exclude=L1_model_df.ica_to_exclude[i1],
                         verbose=0)

        #Resample the data at a new frequency, happens inplace
        for key, value in epoch_dict.items():
            value.resample(sfreq=L1_model_df.selected_frequency[i1])

        #Extract and standard scale data from all non-dropped epochs
        #Creates intermediate data dictionary
        int_data_dict = {}
        #Use robust sklearn scaler
        if L1_model_df.scaler[i1] == 'robust':
            mne_scaler = mne.decoding.Scaler(scalings='median')
            for key, value in epoch_dict.items():
                #with scalings=median implements sklearn robust scaler
                int_data_dict[key] = (mne_scaler.
                                      fit_transform(value.
                                                    get_data(tmin=L1_model_df.tmin[i1], 
                                                             tmax=L1_model_df.tmax[i1])))
        #No scaling option
        if L1_model_df.scaler[i1] is None:
            for key, value in epoch_dict.items():
                int_data_dict[key] = value.get_data(tmin=L1_model_df.tmin[i1], 
                                                      tmax=L1_model_df.tmax[i1])

        #Create updated dictionary of y values to reflect dropped epochs
        int_y_dict = {}
        for key, value in y_dict.items():
            temp_y_list = []
            for i, epoch in enumerate(epoch_dict[key].drop_log):
                #MNE drop log shows empty parens for epochs that were not dropped - 
                #these are the trials we are keeping in each iteration
                if epoch == ():
                    temp_y_list.append(value[i])
            int_y_dict[key] = temp_y_list

        #Assemble final y dict with only trials in our current combo
        #In each combo, coding 1st trial type to 0, 2nd trial type to 1
        final_y_dict = {}
        for key, value in int_y_dict.items():
            temp_y_list = []
            for y in value:
                if y == L1_model_df.trial_combo[i1][0]:
                    temp_y_list.append(0)
                if y == L1_model_df.trial_combo[i1][1]:
                    temp_y_list.append(1)
            final_y_dict[key] = np.array(temp_y_list)

        #Assemble data dict with only trials in our current combo
        final_data_dict = {}
        for key, value in int_data_dict.items():
            index_list = []
            for i, y in enumerate(int_y_dict[key]):
                if (y == L1_model_df.trial_combo[i1][0] or 
                    y == L1_model_df.trial_combo[i1][1]):
                    index_list.append(i)
            final_data_dict[key] = value[index_list]

        #If this isn't our final test on sesh 2 data,
        #We need to train test split our sesh 1 data
        if final_test == False:
            #Train test split the sesh 1 data and y for the subject of current row
            for key, value in final_data_dict.items():
                if (sub in key) and ('sesh_1' in key):
                    sub_X = value
            for key, value in final_y_dict.items():
                if (sub in key) and ('sesh_1' in key):
                    sub_y = value
            (X_train, X_test, 
             y_train, y_test) = train_test_split(sub_X, 
                                                 sub_y, 
                                                 stratify=sub_y,
                                                 random_state=23)

            #Save y_train and y_test into df for ensemble model to use
            L1_model_df.at[i1, 'y_train'] = y_train
            L1_model_df.at[i1, 'y_test'] = y_test
            

            #For subject, reset X and y in final dicts with train values,
            #test values will be passed to L1 models for validation
            for key, value in final_data_dict.items():
                if (sub in key) and ('sesh_1' in key):
                    final_data_dict[key] = X_train
            for key, value in final_y_dict.items():
                if (sub in key) and ('sesh_1' in key):
                    final_y_dict[key] = y_train
                    
        if final_test == True:
            #Need to save down true y's for the ensemble model to fit and score
            for key, value in final_y_dict.items():
                if (sub in key) and ('sesh_1' in key):
                    L1_model_df.at[i1, 'y_train'] = value
                if (sub in key) and ('sesh_2' in key):
                    L1_model_df.at[i1, 'y_test'] = value
                    

            
            
            
        #We're now complete with the pre-processing common to all models
        #And will begin iterating over specific rows for this subject
        for row in i_list:
        
        
        
            #If this is a model built on top of CSP, do CSP preprocessing
            if ((L1_model_df.model_type[row] == 'LDA') or 
                (L1_model_df.model_type[row] == 'NN_CSP')):
                #Create csp_dict of csp objects
                csp_dict = {}
                for key, value in epoch_dict.items():
                    #Only need to create CSP objects for our subject,
                    #and only in session 1
                    if (sub in key) and ('sesh_1' in key):
                        csp_dict[key] = mne.decoding.CSP(n_components=int(L1_model_df.n_components[row]), 
                                                         cov_est=L1_model_df.cov_est[row], 
                                                         log=bool(L1_model_df.log[row]));

                #Suppress output from this noisy function
                with io.capture_output() as captured:
                #Fit csp objects to training data from session 1        
                    for key, value in csp_dict.items():
                        value.fit(X=final_data_dict[key], 
                                  y=final_y_dict[key]);

                #Use csp object to transform data
                csp_data_dict = {}
                for key, value in csp_dict.items():
                    #Use CSP object from sesh_1 to tranform both seshs data
                    csp_data_dict[key] = value.transform(final_data_dict[key])
                    key2 = key.replace('sesh_1', 'sesh_2')
                    csp_data_dict[key2] = value.transform(final_data_dict[key2])
                    #In non-final tests also need to transform X_test
                    #CSP tranform happens in place so need to deepcopy X_test
                    if final_test == False:
                        X_test_CSP = copy.deepcopy(X_test)
                        X_test_CSP = value.transform(X_test_CSP)
                        
                        
                        
                        
                        
            #Its now time to iterate through each of the L1 models for this subject
            #Recall i_list is the indexes of all rows in L1 models for this subject
            #First, the LDA models        
            if L1_model_df.model_type[row] == 'LDA':
                #Create LDA object for our subject (just one, fed from csp_dict)
                for key, value in csp_dict.items():
                    LDA = LinearDiscriminantAnalysis()
                    #Remember train data assigned to dicts when final_test=False
                    LDA.fit(csp_data_dict[key], final_y_dict[key])

                    if final_test == False:
                        #Save down probabilities for train and test
                        L1_model_df.at[row, 'train_output'] = LDA.predict_proba(csp_data_dict[key])
                        L1_model_df.at[row, 'test_output'] = LDA.predict_proba(X_test_CSP)

                    #For final tests, fit and score against sesh1, sesh2
                    if final_test == True:
                        L1_model_df.at[row, 'train_output'] = LDA.predict_proba(csp_data_dict[key])
                        key2 = key.replace('sesh_1', 'sesh_2')
                        L1_model_df.at[row, 'test_output'] = LDA.predict_proba(csp_data_dict[key2])
            
            
            
            
            
            
            #Now for our NN models on CSP data:
            if L1_model_df.model_type[row] == 'NN_CSP':
                #Model against our data for each subject and save the resulting score
                for key, value in csp_dict.items():

                    #Build model
                    model = Sequential()
                    #inputs are equal to n_components created via CSP
                    model.add(Dense(int(L1_model_df.n_components[row]), 
                                    input_dim=int(L1_model_df.n_components[row]), 
                                    activation='relu'))
                    model.add(Dropout(0.2))
                    #Add hidden layer with half as many nodes as input
                    model.add(Dense(int(L1_model_df.n_components[row]/2), 
                                    activation='relu'))
                    model.add(Dropout(0.2))
                    #Hidden layer with 1/4 as many nodes as input
                    model.add(Dense(int(L1_model_df.n_components[row]/4), 
                                    activation='relu'))
                    model.add(Dropout(0.2))
                    #output layer
                    model.add(Dense(1, activation='sigmoid'))

                    #Suppress output
                    with io.capture_output() as captured:
                        #Compile model
                        model.compile(loss='binary_crossentropy', 
                                      optimizer='adam', 
                                      metrics=['acc'])

                        #Fit model
                        history = model.fit(csp_data_dict[key], 
                                            final_y_dict[key], 
                                            epochs=3, verbose=0)

                    #Create extractor to save output of penultimate dense layer
                    extractor = Model(inputs=model.inputs,
                                      outputs=model.get_layer(index=-3).output)
                    
                    if final_test == False:
                    #Save down probabilities for train and test
                        L1_model_df.at[row, 
                                       'train_output'] = np.array(extractor(csp_data_dict[key]))
                        L1_model_df.at[row, 
                                       'test_output'] = np.array(extractor(X_test_CSP))
                        

                    #For final tests, fit and score against sesh1, sesh2
                    if final_test == True:
                        L1_model_df.at[row, 
                                       'train_output'] = np.array(extractor(csp_data_dict[key]))
                        key2 = key.replace('1', '2')
                        L1_model_df.at[row, 
                                       'test_output'] = np.array(extractor(csp_data_dict[key2]))
                                                                  
            
            
            
            
            
            
            #Now for our CNN models
            if L1_model_df.model_type[row] == 'CNN':

                #Create sample weight lists, higher weight for subject
                #Remember final_y_dict has correct y_train whether final or not
                sample_weights = []
                for key, value in final_y_dict.items():
                    if (sub in key) and ('sesh_1' in key):
                        temp = [L1_model_df.sample_weight[row]] * len(value)
                        sample_weights += temp
                    if (sub not in key) and ('sesh_1' in key):
                        temp = [1] * len(value)
                        sample_weights += temp
                        
                if final_test == True:
                    #Need to get X_test and y_test into variables
                    for key, value in final_y_dict.items():
                        if (sub in key) and ('sesh_2' in key):
                            y_test = value
                    for key, value in final_data_dict.items():
                        if (sub in key) and ('sesh_2' in key):
                            X_test = value
                
                #Concat all sesh 1 values together into X_train and y_train for CNNs
                #Remember for both final_test instances these are correct
                X_train = np.concatenate(([final_data_dict[key] for 
                                           key in final_data_dict.keys()
                                          if 'sesh_1' in key]), axis=0)
                y_train = np.concatenate(([final_y_dict[key] for 
                                           key in final_y_dict.keys()
                                          if 'sesh_1' in key]))


                #Reshape all data tensors to feed into neural network
                #Rename result to avoid overwriting X_test and y_test
                #Otherwise misaligns shapes when final_test=False and
                #More than 1 CNN model per subject
                X_train_NN = np.reshape(X_train, (X_train.shape[0],
                                               X_train.shape[2],
                                               X_train.shape[1]))
                X_test_NN = np.reshape(X_test, (X_test.shape[0],
                                             X_test.shape[2],
                                             X_test.shape[1]))
                y_train_NN = np.reshape(y_train, len(y_train))
                y_test_NN = np.reshape(y_test, len(y_test))
                sample_weights = np.reshape(sample_weights, 
                                            len(sample_weights))


                #Model against our data and save the resulting val score
                model = Sequential()
                model.add(Conv1D(filters=L1_model_df.input_filter_count[row], 
                                 kernel_size=int(L1_model_df.input_kernel_size[row]), 
                                 strides=int(L1_model_df.input_strides[row]), 
                                 padding='same', 
                                 activation='relu', 
                                 input_shape=(X_train_NN.shape[1], 
                                              X_train_NN.shape[2])))
                model.add(Dropout(0.2))
                model.add(MaxPooling1D(int(L1_model_df.pool_size[row])))
                model.add(Conv1D(filters=L1_model_df.hidden_filters[row], 
                                 kernel_size=int(L1_model_df.hidden_kernel_size[row]), 
                                 strides=int(L1_model_df.hidden_strides[row]), 
                                 padding='same', 
                                 activation='relu'))
                model.add(Dropout(0.2))
                model.add(MaxPooling1D(int(L1_model_df.pool_size[row])))
                model.add(GlobalAveragePooling1D())
                model.add(Dense(L1_model_df.nodes[row], activation='relu'))
                model.add(Dropout(0.2))
                model.add(Dense(1, activation='sigmoid'))

                #compile & fit model
                model.compile(loss='binary_crossentropy', 
                              optimizer='adam', 
                              weighted_metrics=['accuracy'])
                #Suppress output
                with io.capture_output() as captured:
                    history = model.fit(x=X_train_NN, 
                                        y=y_train_NN, 
                                        sample_weight=sample_weights, 
                                        batch_size=60, epochs=3, 
                                        validation_data=(X_test_NN, 
                                                         y_test_NN), 
                                        verbose=0, workers=8)

                #Save down the output of our penultimate dense layer
                extractor = Model(inputs=model.inputs,
                                  outputs=model.get_layer(index=-3).output)
                L1_model_df.at[row, 'test_output'] = (np.array
                                                      (extractor(X_test_NN)))

                #Run model on just data for subject, not all sesh 1
                #To get train output for ensemble
                for key, value in final_data_dict.items():
                        if (sub in key) and ('sesh_1' in key):
                            X_train_small = value
                #Reshape X_train_small to feed into neural network
                X_train_small = np.reshape(X_train_small, 
                                           (X_train_small.shape[0],
                                            X_train_small.shape[2],
                                            X_train_small.shape[1]))
                L1_model_df.at[row, 'train_output'] = (np.array
                                                       (extractor
                                                        (X_train_small)))
                
                
    #Save the results down
    L1_model_df.to_csv(f'data/{savefile}.csv', index=False)

### Let's create our ensemble training and final scoring function

This function will enable us to train our ensemble model, and then calculate our final scores against either test data, or, on our final test, data from session 2 as well.

In [78]:
def ensemble_test(L1_model_df, ensemble_df, savefile):
    """Uses the outputs of the L1 model to train ensemble model.
    
    The ensemble_df dataframe should include the settings we want to
    test for the ensemble model, and has columns to fill in the resulting
    accuracy on the test set.
    
    This function iterates through all rows in the ensemble_df.
    On each row it applies the ensemble parameters listed on 
    that row to construct a NN model, and then uses that model
    to make a final prediction for each subject. The model is
    fit using the L1 outputs in the L1_model_df, and a score is
    then recorded for each subject."""
    for row in range(ensemble_df.shape[0]):
        #Get subject of current row
        sub = ensemble_df['subject'][row]
        
        #Pull all needed inputs from L1_model_df
        #y values come from first row of L1 matching subject 
        y_train = (L1_model_df.loc[L1_model_df['subject'] == sub, 
                                   'y_train'].values[0])
        y_test = (L1_model_df.loc[L1_model_df['subject'] == sub, 
                                   'y_test'].values[0])
        #Reshape y test and train to feed into NN
        y_train = np.reshape(y_train, len(y_train))
        y_test = np.reshape(y_test, len(y_test))
        
        #Concatenate X_train and test from all matching sub rows of L1 DF
        X_train = L1_model_df.loc[L1_model_df['subject'] == sub, 
                                  'train_output'].values
        X_test = L1_model_df.loc[L1_model_df['subject'] == sub, 
                                  'test_output'].values
        
        #Concatenate together into single array for entry into NN
        X_train = np.concatenate((X_train), axis=1)
        X_test = np.concatenate((X_test), axis=1)
        
        
        #Now let's build our NN to make final predictions
        model = Sequential()
        
        #Input nodes in first layer is equal to combined input nodes from L1
        input_nodes = X_train.shape[1]
        
        #Add first two layers
        model.add(Dense(input_nodes, 
                        input_dim=input_nodes, 
                        activation='relu', 
                        kernel_regularizer=l2(ensemble_df.l2_reg_alpha[row])))
        model.add(Dropout(ensemble_df.dropout_ratio[row]))
        model.add(Dense(int((1 - ensemble_df.dense_reduction[row]) * 
                             input_nodes), 
                        activation='relu', 
                        kernel_regularizer=l2(ensemble_df.l2_reg_alpha[row])))

        #Add 1-2 more dense hidden layers depending on settings
        if ensemble_df.hidden_layers[row] >= 2:
            model.add(Dropout(ensemble_df.dropout_ratio[row]))
            model.add(Dense(int(((1 - ensemble_df.dense_reduction[row])**2) * 
                             input_nodes), 
                            activation='relu', 
                            kernel_regularizer=l2(ensemble_df.l2_reg_alpha[row])))
        if ensemble_df.hidden_layers[row] >= 3:
            model.add(Dropout(ensemble_df.dropout_ratio[row]))
            model.add(Dense(int(((1 - ensemble_df.dense_reduction[row])**3) * 
                             input_nodes), 
                            activation='relu', 
                            kernel_regularizer=l2(ensemble_df.l2_reg_alpha[row])))
        
        #Add output layer
        model.add(Dropout(ensemble_df.dropout_ratio[row]))
        model.add(Dense(1, 
                        activation='sigmoid', 
                        kernel_regularizer=l2(ensemble_df.l2_reg_alpha[row])))

        # Compile it
        model.compile(loss='binary_crossentropy', 
                      optimizer='adam', 
                      metrics=['acc'])

        # Fit it
        history = model.fit(X_train, y_train, 
                            validation_data=(X_test, y_test), 
                            epochs=ensemble_df.epochs[row], 
                            verbose=0)
        
        #Save the best val and training accuracy achieved by our model
        ensemble_df.at[row, 'train_score'] = max(history.history['acc'])
        ensemble_df.at[row, 'test_score'] = max(history.history['val_acc'])
        
        #Calculate true positive and negative rates
        #First generate predictions
        test_pred = model.predict(X_test, verbose=0).flatten()
        
        #Round those prediction and convert to integers
        test_pred = [int(round(x)) for x in test_pred]

        #Then calculate rates of true positivity and negativity
        ensemble_df.at[row, 'true_pos_rate'] = (sum([1 for pred, actual in 
                                                    zip(test_pred, y_test)
                                                    if (pred == actual) &
                                                     (actual == 1)])
                                                / sum(y_test))
        ensemble_df.at[row, 'true_neg_rate'] = (sum([1 for pred, actual in 
                                                    zip(test_pred, y_test)
                                                    if (pred == actual) &
                                                     (actual == 0)])
                                                / (len(y_test) - 
                                                   sum(y_test)))
        
        ensemble_df.to_csv(f'data/{savefile}.csv', index=False)
        if row % 50 == 0:
            print(f'Grid search complete through row {row} of {ensemble_df.shape[0]}')

### Let's run our functions and look at our results

In [71]:
ensemble_input_creator(L1_model_df=L1_model_df, 
                       final_test=False,
                       savefile='L1_inputs_to_ensemble')
ensemble_test(L1_model_df=L1_model_df, 
              ensemble_df=ensemble_test_df,
              savefile='ensemble_grid_search')

Grid search complete through row 0 of 288
Grid search complete through row 50 of 288
Grid search complete through row 100 of 288
Grid search complete through row 150 of 288
Grid search complete through row 200 of 288
Grid search complete through row 250 of 288


In [81]:
ensemble_test_df.sort_values('train_score', ascending=False).head(20)

Unnamed: 0,subject,train_score,test_score,true_pos_rate,true_neg_rate,hidden_layers,l2_reg_alpha,dropout_ratio,dense_reduction,epochs
73,sub_D,0.95,1.0,1.0,1.0,1,0.001,0.2,0.0,6
81,sub_D,0.916667,1.0,1.0,1.0,2,0.0001,0.2,0.0,6
91,sub_D,0.9,1.0,1.0,1.0,2,0.001,0.2,0.25,6
67,sub_D,0.883333,0.9,0.8,1.0,1,0.0001,0.2,0.25,6
249,sub_J,0.87037,0.736842,0.777778,0.7,2,0.001,0.2,0.0,6
89,sub_D,0.866667,0.9,0.8,1.0,2,0.001,0.2,0.0,6
75,sub_D,0.866667,1.0,1.0,1.0,1,0.001,0.2,0.25,6
77,sub_D,0.85,0.9,0.8,1.0,1,0.001,0.4,0.0,6
235,sub_J,0.833333,0.631579,0.555556,0.7,1,0.001,0.2,0.25,6
93,sub_D,0.816667,0.95,0.9,0.9,2,0.001,0.4,0.0,6


**OK, let's modify our ensemble settings to what seem to be the best options for not overfitting, and run our final model against the session 2 data**

Unfortunately we didn't get very clear signal from running our ensemble against test data. Many of our models have fairly wide divergence between train and test scores, and subject D in particular appears set to be wildly overfit on session 1 data. Its performing very well on our test split, but I'd be very surprised if it did as well against our real session 2 data.

I suspect I'm going to need to drop the CNN models - based on everything I've seen to this point, they are causing a large part of the overfitting on session 1 data.

I'm locking in settings for my ensemble model and going to run my models against the real test data from session 2 - but am anticipating re-running afterward with just my CSP based models.

In [82]:
hidden_layers_options = [1]
l2_reg_alpha_options = [0.001]
dropout_ratio_options = [0.2]
#How number of nodes reduces with each dense layer
dense_reduction_ratio_options = [0]
#How many epochs to run through
epochs_options = [3]


columns = ['hidden_layers',
           'l2_reg_alpha',
           'dropout_ratio',
           'dense_reduction',
           'epochs']

#Create dataframe of our first set of variable params
ensemble_test_df_2 = pd.DataFrame(itertools.
                               product(hidden_layers_options,
                                       l2_reg_alpha_options,
                                       dropout_ratio_options,
                                       dense_reduction_ratio_options,
                                       epochs_options), 
                           columns=columns)

#Create dataframe with just our subjects, add results columns
temp_dict = {'subject': final_csp_models.subject.unique()}
temp_frame = pd.DataFrame(temp_dict)
temp_frame['train_score'] = None
temp_frame['test_score'] = None
temp_frame['true_pos_rate'] = None
temp_frame['true_neg_rate'] = None

#Get number of permutations of variable parameters
permutations = len(list(itertools.
                        product(hidden_layers_options,
                                l2_reg_alpha_options,
                                dropout_ratio_options,
                                dense_reduction_ratio_options,
                                epochs_options)))

#Duplicate our temp frame to match the number of variable
#permutations to run each permutation for each subject
temp_columns = temp_frame.columns
temp_frame = pd.DataFrame(np.repeat(temp_frame.values, 
                                     permutations, 
                                     axis=0))
temp_frame.columns = temp_columns

#Concat variable params with itself 9 times times to get
#right shape to combine with all params for all subjects
ensemble_test_df_2 = pd.concat((ensemble_test_df_2, 
                              ensemble_test_df_2,
                              ensemble_test_df_2,
                              ensemble_test_df_2,
                              ensemble_test_df_2,
                              ensemble_test_df_2,
                              ensemble_test_df_2,
                              ensemble_test_df_2,
                              ensemble_test_df_2),
                              ignore_index=True)

#Join our two grids to assemble full ensemble test grid
ensemble_test_df_2 = temp_frame.join(ensemble_test_df_2)

ensemble_test_df_2.shape

(9, 10)

In [83]:
ensemble_input_creator(L1_model_df=L1_model_df, 
                       final_test=True,
                       savefile='L1_inputs_to_ensemble_final_run_all_4')
ensemble_test(L1_model_df=L1_model_df, 
              ensemble_df=ensemble_test_df_2,
              savefile='ensemble_grid_search_final_run_all_4')

Grid search complete through row 0 of 9


In [85]:
ensemble_test_df_2.sort_values('test_score', ascending=False).head(10)

Unnamed: 0,subject,train_score,test_score,true_pos_rate,true_neg_rate,hidden_layers,l2_reg_alpha,dropout_ratio,dense_reduction,epochs
7,sub_J,0.739726,0.72,0.972222,0.487179,1,0.001,0.2,0,3
1,sub_C,0.5375,0.658228,0.475,0.846154,1,0.001,0.2,0,3
4,sub_F,0.712329,0.585714,0.72973,0.272727,1,0.001,0.2,0,3
5,sub_G,0.526316,0.529412,1.0,0.0,1,0.001,0.2,0,3
8,sub_L,0.701299,0.514286,0.540541,0.363636,1,0.001,0.2,0,3
2,sub_D,0.7375,0.5,0.0,1.0,1,0.001,0.2,0,3
6,sub_H,0.55,0.5,0.885714,0.057143,1,0.001,0.2,0,3
3,sub_E,0.557143,0.492308,0.0,1.0,1,0.001,0.2,0,3
0,sub_A,0.619718,0.487179,0.871795,0.102564,1,0.001,0.2,0,3


In [93]:
ensemble_test_df_2.test_score.mean()

0.5541252096494039

**As I suspected, it looks like we still overfit to session 1 badly**

Subject D is the best evidence of this - that model was achieving 100% accuracy with more epochs on the training data, but is no better than a coinflip on session 2 data - it predicted 0 every time on the test data after the shifts to the EEG signal in session 2. Unfortunately our models just are not solving this problem well - a general CNN or LDA applied to all subjects would likely perform better. More work is needed.