# Assignment 8: Machine Learning
Please submit this assignment to Canvas as a Jupyter notebook (.ipynb). The assignment introduces machine learning techniques used in the analysis of EEG data.

In [1]:
# imports
import numpy as np
import pandas as pd
import cmlreaders as cml
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import KFold
from scipy.stats import ttest_ind

This assignment is designed to familiarize you with multivariate analysis of intracranial EEG data. For each subject, you will train a logistic-regression classifier to discriminate subsequently recalled vs non-recalled studied items using the distribution of spectral power across electrodes as the features. After completing the assignment you should be able to fit an L2-penalized logistic regression classifier to intracranial electrophysiological recordings. You should also be able to construct a receiver operating characteristic (ROC) curve and compute area under the curve (AUC) to assess classifier performance, and compare train and test performance.

You will use data from the following 20 FR1/catFR1 subjects in the intracranial EEG (iEEG) dataset:

In [2]:
subs_FR = ['R1380D', 'R1111M', 'R1332M', 'R1377M', 'R1065J', 'R1385E', 'R1189M',
           'R1108J', 'R1390M', 'R1236J', 'R1391T', 'R1401J', 'R1361C', 'R1060M',
           'R1350D', 'R1378T', 'R1375C', 'R1383J', 'R1354E', 'R1292E']
print(len(subs_FR))

20


For each of these subjects, use the following processing steps:
* Load EEG with CMLReader.load_eeg from a bipolar montage loaded using CMLReader.load('pairs').
* Apply a Butterworth notch filter around 60 Hz (freqs = [58, 62]) when extracting the voltage.
* Calculate power at the frequencies specified in Question 1 below with a Morlet wavelet with wavenumber (keyword “width”) of 6 for each encoding event (from time 0 until 1.6 seconds after the encoding event onset) using a 1-second buffer.
* For each frequency, channel, and encoding event, average the power over the entire 1600 ms encoding period (but not over the buffer period!)
* Log-transform the average encoding power values as in the final step of the previous problem.
* In some cases you may notice artifacts in the data that manifest in power values of zero. These would produce problems in the transformation and classification, so please exclude any events with this issue from all analyses.
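
The last three steps above (averaging over the encoding period without the buffer, excluding zero-power events, and log-transforming) can be sketched on synthetic data as follows. This is a minimal illustration, not the reference solution: the array layout `(frequencies, events, channels, samples)`, the 500 Hz sampling rate, and the use of `np.log10` are all assumptions, and the power values are random numbers rather than real wavelet output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic power array: (frequencies, events, channels, samples).
sr = 500                              # assumed sampling rate in Hz
buf = int(1.0 * sr)                   # 1 s buffer on each side, in samples
n_samples = int(1.6 * sr) + 2 * buf   # encoding period plus both buffers
power = rng.gamma(shape=2.0, scale=1.0, size=(8, 30, 4, n_samples))
power[:, 5, 2, :] = 0.0               # simulate an artifact: zero power for one event

# Drop the buffer samples, then average over the 1600 ms encoding period.
mean_power = power[..., buf:-buf].mean(axis=-1)   # (freqs, events, channels)

# Flag any event that has a zero power value at any frequency or channel.
bad_events = (mean_power == 0).any(axis=(0, 2))

# Exclude flagged events before the log transform to avoid log(0) = -inf.
# The log base is an assumption; use whichever base the earlier assignment used.
log_power = np.log10(mean_power[:, ~bad_events, :])

print(f"dropped {bad_events.sum()} event(s); log_power shape: {log_power.shape}")
```

Keep the `bad_events` mask so that the same events can be removed from the recall labels and session labels as well.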

You will train an L2-penalized logistic regression classifier on the time-frequency (TF) data obtained during item encoding for every subject. Throughout this assignment, unless otherwise specified, we will use the default parameters for the *LogisticRegression* classifier in sklearn.
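
As a small illustration of what the default settings imply, the sketch below fits *LogisticRegression* twice on synthetic data: once with the defaults (L2 penalty, `C=1.0`) and once with a much smaller `C`. The data and variable names are hypothetical; the point is only that `C` is the inverse regularization strength, so a smaller value shrinks the weights more.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical synthetic data: 200 events, 10 z-scored features.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)

# Default settings: L2 penalty with inverse regularization strength C = 1.0.
clf_default = LogisticRegression().fit(X, y)

# Smaller C means a stronger penalty, which shrinks the weights toward zero.
clf_strong = LogisticRegression(C=0.01).fit(X, y)

norm_default = np.linalg.norm(clf_default.coef_)
norm_strong = np.linalg.norm(clf_strong.coef_)
print(f"||w|| at C=1.0: {norm_default:.3f}, at C=0.01: {norm_strong:.3f}")
```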

## Question 1, Generate features: 
* For your input features, extract spectral power as in Assignment 2, using Morlet wavelets at 8 frequencies logarithmically spaced between 3 and 180 Hz (np.logspace(np.log10(3), np.log10(180), 8)), for each recorded electrode pair (“channel”). For each of these frequencies, average the power over the 1600 ms word encoding period.
* For each subject, create an $X_{N \times p}$ matrix of spectral power patterns ($N$ = number of encoding events concatenated across sessions; $p$ = number of frequencies $\times$ number of channels) and obtain the $y_{N \times 1}$ vector of labels (1: recalled, 0: non-recalled). The pair $(X, y)$ will be our dataset.
* Z-score the features across observations (i.e., events) within each session. Since we're performing leave-one-session-out cross validation in this assignment, z-scoring within-session prevents information from leaking between train sessions and test sessions through the z-score statistics.
* Some subjects will have different sets of electrodes for different recording sessions. For these subjects, drop sessions so that you keep the largest possible group of sessions that all share the same recording contacts (a single subject may have several groups of sessions with different electrode sets). The reasons a subject might have different active recording electrodes across sessions are:
    * some subjects have so many electrodes implanted that not all of them could be recorded from simultaneously; these subjects will sometimes have different "montages" in which different electrodes are connected or disconnected; or
    * some subjects have multiple implant surgeries, with different electrodes in place after each operation, so the same subject will again have different sets of electrodes across sessions.
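
The feature-construction steps above can be sketched on synthetic data. This is an illustrative sketch under stated assumptions, not the expected solution: the input layout `(events, frequencies, channels)`, the variable names, and the helper `largest_consistent_sessions` are all hypothetical, and the channel labels are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical log-power array for one subject: (events, frequencies, channels).
n_events, n_freqs, n_chans = 60, 8, 5
log_power = rng.normal(loc=2.0, scale=3.0, size=(n_events, n_freqs, n_chans))
sessions = np.repeat([0, 1, 2], 20)             # session label for each event
recalled = rng.integers(0, 2, size=n_events)    # 1: recalled, 0: non-recalled

# Flatten frequency x channel into p = n_freqs * n_chans features per event.
X = log_power.reshape(n_events, n_freqs * n_chans)
y = recalled

# Z-score each feature across events, separately within each session.
X_z = np.empty_like(X)
for sess in np.unique(sessions):
    mask = sessions == sess
    X_z[mask] = (X[mask] - X[mask].mean(axis=0)) / X[mask].std(axis=0)


def largest_consistent_sessions(session_channels):
    """Return the sessions in the largest group sharing an identical channel set.

    session_channels maps a session id to an iterable of channel labels.
    """
    groups = {}
    for sess, chans in session_channels.items():
        groups.setdefault(frozenset(chans), []).append(sess)
    return max(groups.values(), key=len)


# Made-up example: sessions 0 and 1 share a montage, session 2 does not.
keep = largest_consistent_sessions({
    0: ["A1-A2", "A2-A3"],
    1: ["A1-A2", "A2-A3"],
    2: ["B1-B2", "B2-B3"],
})
print(X_z.shape, keep)
```

Grouping by `frozenset` ignores channel order, so in real data the channels should still be put in a common order before sessions are concatenated.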

In [3]:
# Question 1
### YOUR CODE HERE

## Question 2, Cross-validation:
* Use leave-one-session-out cross-validation to train and test L2-penalized logistic regression classifiers. For each cross-validation iteration, (1) hold out one session ($X_{test}$, $y_{test}$), (2) train the model on the encoding events from all other sessions ($X_{train}$, $y_{train}$), and (3) use the trained model to predict recall for the encoding events in the held-out session. For each encoding event in the held-out session, you should obtain a predicted probability that the item will be subsequently recalled. Repeat this procedure until every session the subject has completed has been held out once; at that point you will have a predicted probability for every encoding event, each produced by a model trained without any events from that event's own session. Do this separately for each subject. Use the default penalty parameter (C) of 1.0 (you will optimize this parameter for some subjects in Part 2).
* For the first three subjects in the list above, plot a histogram of the predicted cross-validated probabilities across all encoding events, using different colors for words that were subsequently recalled and words that were not recalled. How strongly do the neural features predict subsequent recall?
* Hint: since different sessions have different numbers of events (subjects can discontinue a session partway through), you'll need to implement leave-one-session-out cross validation without using the KFold sklearn class used in the examples from the intro material.
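
One way to structure a leave-one-session-out loop with unequal session sizes is to build a boolean mask per session, as in the sketch below. It runs on synthetic data with hypothetical variable names (`sessions`, `probs`) and is meant only to show the shape of the loop, not a complete solution.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Hypothetical synthetic dataset: three sessions with unequal event counts.
sessions = np.concatenate([np.full(40, 0), np.full(55, 1), np.full(30, 2)])
n_events, n_features = sessions.size, 12
X = rng.normal(size=(n_events, n_features))
# Labels depend weakly on the first feature so there is some signal to learn.
y = (X[:, 0] + rng.normal(scale=2.0, size=n_events) > 0).astype(int)

# One cross-validated probability per event, filled in session by session.
probs = np.full(n_events, np.nan)
for sess in np.unique(sessions):
    test = sessions == sess                  # held-out session
    clf = LogisticRegression(C=1.0)          # L2 penalty is the default
    clf.fit(X[~test], y[~test])              # train on all other sessions
    # Column 1 of predict_proba is the probability of class 1 (recalled).
    probs[test] = clf.predict_proba(X[test])[:, 1]

print(f"mean P(recall) for recalled items:     {probs[y == 1].mean():.3f}")
print(f"mean P(recall) for non-recalled items: {probs[y == 0].mean():.3f}")
```

The two groups `probs[y == 1]` and `probs[y == 0]` are what the histogram in this question would show in different colors.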

In [4]:
# Question 2
### YOUR CODE HERE

## Question 3, Construct across-subject ROCs and AUCs using sklearn functions:
* To assess the performance of a classifier, we will use the area under the receiver operating characteristic (ROC) curve (AUC). Using sklearn’s ROC curve function, calculate the ROC curve and the corresponding AUC for each subject. Plot all the subjects’ ROC curves in one plot, and plot all the subjects’ AUCs in one histogram. To compute a subject-level ROC curve and AUC value, pool all predictions across the outer cross-validation folds and compute the ROC curve and AUC with the pooled predictions.
* How good is the performance? Run a statistical test to determine if the between-subject average performance is reliably above chance.
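
The sketch below shows the sklearn calls involved, using simulated labels and probabilities in place of real cross-validated predictions. The simulation function `fake_subject` is hypothetical, and the one-sample t-test against the chance level of 0.5 (`scipy.stats.ttest_1samp`, which is not among the imports at the top of the notebook) is one reasonable choice of test, not necessarily the only acceptable one.

```python
import numpy as np
from scipy.stats import ttest_1samp
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(3)


def fake_subject(n_events=200):
    """Simulate pooled labels and predicted probabilities for one subject."""
    y = rng.integers(0, 2, size=n_events)
    # Predicted probabilities are slightly higher for recalled items.
    noise = rng.normal(scale=0.15, size=n_events)
    probs = np.clip(0.5 + 0.1 * (y - 0.5) + noise, 0.0, 1.0)
    return y, probs


rocs, aucs = [], []
for subject in range(20):
    y, probs = fake_subject()
    fpr, tpr, thresholds = roc_curve(y, probs)   # points on the ROC curve
    rocs.append((fpr, tpr))
    aucs.append(roc_auc_score(y, probs))         # area under that curve
aucs = np.array(aucs)

# Test whether the mean AUC across subjects differs from chance (0.5).
t_stat, p_value = ttest_1samp(aucs, popmean=0.5)
print(f"mean AUC = {aucs.mean():.3f}, t = {t_stat:.2f}, p = {p_value:.2g}")
```

Each `(fpr, tpr)` pair can be passed to `plt.plot` to draw one subject's ROC curve, and `aucs` can be passed to `plt.hist`.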

In [5]:
# Question 3
### YOUR CODE HERE

## Question 4, Train and Test AUCs
Report mean train AUCs and mean test AUCs across cross-validation folds for all subjects with two overlapping histograms (two histograms in the same plot). In contrast to how the test AUCs were computed at the subject level in the previous questions, here you can compute the train and test AUCs for a given outer fold using only the predictions from that fold, and then average those fold-level AUCs together.
* What is the mean difference across subjects in cross-validated AUC scores between training and testing?
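
A fold-level comparison of train and test AUCs for a single subject might look like the following sketch. The data are synthetic and the variable names are hypothetical; with real data the same loop would run once per subject, giving one mean train AUC and one mean test AUC each.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)

# Hypothetical synthetic subject: four sessions of 50 events, 30 features.
sessions = np.repeat([0, 1, 2, 3], 50)
X = rng.normal(size=(sessions.size, 30))
y = (X[:, 0] + rng.normal(scale=2.0, size=sessions.size) > 0).astype(int)

train_aucs, test_aucs = [], []
for sess in np.unique(sessions):
    test = sessions == sess
    clf = LogisticRegression().fit(X[~test], y[~test])
    # Train AUC: score the model on the same events it was fit to.
    train_aucs.append(roc_auc_score(y[~test], clf.predict_proba(X[~test])[:, 1]))
    # Test AUC: score the model on the held-out session only.
    test_aucs.append(roc_auc_score(y[test], clf.predict_proba(X[test])[:, 1]))

mean_train, mean_test = np.mean(train_aucs), np.mean(test_aucs)
gap = mean_train - mean_test
print(f"train AUC = {mean_train:.3f}, test AUC = {mean_test:.3f}, gap = {gap:.3f}")
```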

In [6]:
# Question 4
### YOUR CODE HERE