### Clinical BCI Challenge-WCCI2020
- [Website Link](https://sites.google.com/view/bci-comp-wcci/?fbclid=IwAR37WLQ_xNd5qsZvktZCT8XJerHhmVb_bU5HDu69CnO85DE3iF0fs57vQ6M)
- [Dataset Link](https://github.com/5anirban9/Clinical-Brain-Computer-Interfaces-Challenge-WCCI-2020-Glasgow)
- [FBCSP GitHub Repo Link](https://github.com/jesus-333/FBCSP-Python)
 
I modified the FBCSP source code to report Cohen's kappa instead of accuracy as the evaluation measure, and made a few other small changes.
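For reference, Cohen's kappa corrects the observed agreement `p_o` (plain accuracy) for the agreement `p_e` expected by chance: `kappa = (p_o - p_e) / (1 - p_e)`. The snippet below is a small NumPy sketch with made-up labels, not the code from the modified FBCSP source; the function name `cohen_kappa` is my own. scikit-learn's `sklearn.metrics.cohen_kappa_score` computes the same quantity.

```python
import numpy as np

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e). Illustrative helper, not from the FBCSP repo."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.union1d(y_true, y_pred)
    # observed agreement, i.e. plain accuracy
    p_o = np.mean(y_true == y_pred)
    # chance agreement from the marginal label frequencies
    p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in classes)
    return (p_o - p_e) / (1.0 - p_e)

# made-up labels: 1 = left-hand, 2 = right-hand
y_true = [1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [1, 1, 1, 2, 2, 2, 2, 1]

accuracy = np.mean(np.asarray(y_true) == np.asarray(y_pred))
kappa = cohen_kappa(y_true, y_pred)
print('accuracy:', accuracy, ' kappa:', kappa)
```

With two balanced classes, chance level is 0.5 for accuracy but 0 for kappa, so the two numbers are not directly comparable.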

In [2]:
from FBCSP.FBCSP_V4 import FBCSP_V4 as FBCSP 

In [3]:
import mne
from scipy.io import loadmat
import scipy
import sklearn
import numpy as np
import pandas as pd
import glob
from mne.decoding import CSP
import os

In [4]:
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC, SVC
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV, StratifiedShuffleSplit
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as lda

In [5]:
import warnings
warnings.filterwarnings('ignore') # to ignore warnings

In [6]:
verbose = False                    # global variable to suppress output display of MNE functions
mne.set_log_level(verbose=verbose) # to suppress large info outputs

In [7]:
verbose_clf = False # control output of FBCSP function
freqs_band = np.linspace(8, 32, 7) # filter bank choice
cv = 10
train_ratio = 0.75 # 75:25 train-validation split
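To make the filter bank choice concrete: `np.linspace(8, 32, 7)` yields seven band edges 4 Hz apart. The sketch below assumes that the FBCSP class pairs consecutive edges into sub-bands; that pairing is my reading of the parameter, not something shown in this notebook.

```python
import numpy as np

freqs_band = np.linspace(8, 32, 7)  # same call as in the cell above

# Assumption: consecutive edges form the (low, high) limits of each sub-band
bands = list(zip(freqs_band[:-1], freqs_band[1:]))
for low, high in bands:
    print('{:4.1f} - {:4.1f} Hz'.format(low, high))
```

Under that assumption the bank has six sub-bands covering 8 to 32 Hz.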

In [8]:
n_jobs = None  # for multicore parallel processing: set to 1 if it causes memory issues, or to -1 to use all cores

## Data Loading and Conversion to MNE Datatypes
[Mike Cohen Tutorials link for EEG Preprocessing](https://www.youtube.com/watch?v=uWB5tjhataY&list=PLn0OLiymPak2gDD-VDA90w9_iGDgOOb2o)

In [9]:
current_folder = globals()['_dh'][0]  # a hack to get the path of the folder containing this Jupyter notebook
data_path = os.path.join(current_folder, 'Data')

In [None]:
training_files   = sorted(glob.glob(data_path + '/*T.mat'))  # sorted so the subject order is reproducible
len(training_files)     # if this returns zero, no files were found

In [11]:
def get_mne_epochs(filepath, verbose=verbose, t_start=2, fs=512, mode='train'):
    '''
    Read the EEG data from a .mat file and convert it to an MNE-Python compatible Epochs
    object. Each trial covers the [0, 8] sec range with cue onset at 3 sec. The first
    t_start (2) seconds are dropped and t = 0 is set at cue onset, so the output data is in
    the [-1.0, 5.0] sec range. Details can be found in the preprocessing section of the
    attached document.
    '''
    mat_data = loadmat(filepath) # read .mat file
    eeg_data= mat_data['RawEEGData']
    idx_start = fs*t_start      
    eeg_data = eeg_data[:, :, idx_start:]
    event_id = {'left-hand': 1, 'right-hand': 2}
    channel_names = ['F3', 'FC3', 'C3', 'CP3', 'P3', 'FCz', 'CPz', 'F4', 'FC4', 'C4', 'CP4', 'P4']
    info = mne.create_info(ch_names=channel_names, sfreq=fs, ch_types='eeg')
    epochs = mne.EpochsArray(eeg_data, info, verbose=verbose, tmin=t_start-3.0)
    epochs.set_montage('standard_1020')
    epochs.filter(1., None)  # high-pass at 1 Hz
    epochs.apply_baseline(baseline=(-.250, 0)) # subtract the mean of the 250 ms before cue onset
    
    if mode == 'train': # this is only applicable for training data
        epochs.event_id = event_id
        epochs.events[:,2] = mat_data['Labels'].ravel()    
    return epochs 

def get_labels(filepath):
    mat_data = loadmat(filepath) # read .mat file
    return mat_data['Labels'].ravel()
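The time bookkeeping in `get_mne_epochs` can be checked with plain arithmetic. This is a NumPy-only sketch that uses the values from the docstring (8 sec trials, cue at 3 sec, 512 Hz); the variable names `cue_time` and `trial_len` are mine.

```python
import numpy as np

fs = 512         # sampling rate in Hz
t_start = 2      # seconds dropped from the start of each trial
cue_time = 3.0   # cue onset within the original trial (from the docstring)
trial_len = 8    # original trial length in seconds (from the docstring)

n_samples = (trial_len - t_start) * fs   # samples left after dropping the first 2 sec
tmin = t_start - cue_time                # same expression as the tmin passed to EpochsArray
times = tmin + np.arange(n_samples) / fs

print('samples per trial:', n_samples)
print('first time point:', times[0], ' last time point:', times[-1])
print('index of cue onset (t = 0):', int(np.argmin(np.abs(times))))
```

The last sample sits one sample period before 5.0 sec, which is why the range is usually written as [-1.0, 5.0].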

In [12]:
epochs, labels = get_mne_epochs(training_files[0], verbose=verbose), get_labels(training_files[0])
data = epochs.get_data()
print('Shape of EEG Data: ', data.shape, '\t Shape of Labels: ', labels.shape) 

Shape of EEG Data:  (80, 12, 3072) 	 Shape of Labels:  (80,)


### Training Data

In [13]:
# loading original data
epochs_list_train = []
for i in training_files:
    epochs_list_train.append(get_mne_epochs(i, verbose=verbose))

### Bandpass filtering of data

In [15]:
for epoch in epochs_list_train:
    epoch.filter(7.0, 32.0)
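As a rough illustration of what a 7-32 Hz band-pass does, the sketch below filters two synthetic sine waves with SciPy. Note the swap: it uses a zero-phase Butterworth filter as a stand-in, whereas MNE's `filter` designs an FIR filter by default, so the exact attenuation will differ from what the notebook applies.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 512
t = np.arange(0, 4, 1 / fs)
in_band = np.sin(2 * np.pi * 10 * t)    # 10 Hz, inside the 7-32 Hz pass band
out_band = np.sin(2 * np.pi * 50 * t)   # 50 Hz, outside the pass band

# 4th-order Butterworth band-pass, applied forward and backward (zero phase)
sos = butter(4, [7.0, 32.0], btype='bandpass', fs=fs, output='sos')
filtered_in = sosfiltfilt(sos, in_band)
filtered_out = sosfiltfilt(sos, out_band)

def rms(x):
    return np.sqrt(np.mean(x ** 2))

mid = slice(fs, 3 * fs)  # look at the central 2 sec to avoid edge effects
ratio_in = rms(filtered_in[mid]) / rms(in_band[mid])
ratio_out = rms(filtered_out[mid]) / rms(out_band[mid])
print('10 Hz amplitude kept:', round(ratio_in, 3))
print('50 Hz amplitude kept:', round(ratio_out, 3))
```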

## FBCSP 
The FBCSP class takes the training set as a dictionary at initialization. The keys of the dictionary must be the labels of the two classes, and each value must be a NumPy array of shape "n. trials x n. channels x n. samples". The class must also receive the sampling frequency of the signal.

The original FBCSP class has built-in random train/validation splitting, so I didn't split the data manually here.
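A minimal sketch of the expected input layout, using synthetic data instead of the challenge recordings. The helper `build_class_dict` is hypothetical (it is not part of the FBCSP repo); the notebook itself builds the dictionary by indexing the MNE epochs by event name.

```python
import numpy as np

def build_class_dict(data, labels, label_names):
    """Split a (trials, channels, samples) array into one array per class label."""
    data = np.asarray(data)
    labels = np.asarray(labels)
    return {name: data[labels == code] for code, name in label_names.items()}

rng = np.random.default_rng(0)
data = rng.standard_normal((80, 12, 2048))   # synthetic: 80 trials, 12 channels, 4 sec at 512 Hz
labels = np.repeat([1, 2], 40)               # 40 trials per class

data_dict = build_class_dict(data, labels, {1: 'left-hand', 2: 'right-hand'})
fs = 512                                     # sampling frequency passed alongside the dictionary

for name, trials in data_dict.items():
    print(name, trials.shape)
```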

In [16]:
epochs = epochs_list_train[0]
data, labels = epochs.get_data(), epochs.events[:,-1]

In [17]:
data_dict = {'left-hand':  epochs['left-hand'].get_data()[:,:,256+512:-256], # [0.5, 4.5] sec data
             'right-hand': epochs['right-hand'].get_data()[:,:,256+512:-256]}
fs = epochs.info['sfreq']
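The slice `256+512:-256` corresponds to the [0.5, 4.5] sec window because the epochs start at -1.0 sec and are sampled at 512 Hz. The helper below (`window_to_slice`, a name of my own) does the conversion explicitly so the window is easier to change.

```python
fs = 512          # sampling rate in Hz
tmin = -1.0       # time of the first sample relative to cue onset
n_samples = 3072  # samples per epoch, as printed earlier in the notebook

def window_to_slice(t_from, t_to, tmin=tmin, fs=fs):
    """Sample slice covering [t_from, t_to) in seconds relative to cue onset."""
    start = int(round((t_from - tmin) * fs))
    stop = int(round((t_to - tmin) * fs))
    return slice(start, stop)

win = window_to_slice(0.5, 4.5)
print('start index:', win.start)                        # equals 256 + 512
print('stop index from the end:', win.stop - n_samples)
print('window length in samples:', win.stop - win.start)
```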

In [18]:
valid_scores_lda = []
valid_scores_svm = []
print('-'*15, 'FBCSP with LDA', '-'*15)
fbcsp_clf_lda = FBCSP(data_dict, fs, freqs_band=freqs_band, classifier=lda(), 
                    train_ratio=train_ratio)
valid_scores_lda.append(fbcsp_clf_lda.valid_score)
print('-'*15, 'FBCSP with Linear SVM', '-'*15)
fbcsp_clf_svm = FBCSP(data_dict, fs, freqs_band=freqs_band, classifier=LinearSVC(), 
                    train_ratio=train_ratio)
valid_scores_svm.append(fbcsp_clf_svm.valid_score)

--------------- FBCSP with LDA ---------------
Features used for classification:  8
Score on Training set:  0.9
Score on Validation set:  0.797979797979798 

--------------- FBCSP with Linear SVM ---------------
Features used for classification:  8
Score on Training set:  0.9
Score on Validation set:  0.6875 



In [19]:
# incorporating cross validation: repeat the random train/validation split cv times
valid_scores_lda = []
valid_scores_svm = []

for _ in range(cv):
    fbcsp_clf_lda = FBCSP(data_dict, fs, freqs_band=freqs_band, classifier=lda(), 
                        train_ratio=train_ratio, print_var=verbose_clf)
    valid_scores_lda.append(fbcsp_clf_lda.valid_score)
    
    fbcsp_clf_svm = FBCSP(data_dict, fs, freqs_band=freqs_band, classifier=LinearSVC(), 
                        train_ratio=train_ratio, print_var=verbose_clf)
    valid_scores_svm.append(fbcsp_clf_svm.valid_score)

In [20]:
print("FBCSP-LDA Cross Validation Score:", np.mean(valid_scores_lda))
print("FBCSP-SVM Cross Validation Score:", np.mean(valid_scores_svm)) 
# no grid search is done here, so we report the mean score rather than the max

FBCSP-LDA Cross Validation Score: 0.7217547114539595
FBCSP-SVM Cross Validation Score: 0.722064737136882
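Averaging the validation score over several random 75:25 splits is repeated random sub-sampling rather than k-fold cross validation. The sketch below shows the same scheme on synthetic features. It uses a nearest-class-mean classifier and plain accuracy as stand-ins, so it illustrates the evaluation loop only, not FBCSP or the kappa score.

```python
import numpy as np

def repeated_holdout(features, labels, n_repeats=10, train_ratio=0.75, seed=0):
    """Validation accuracy over repeated random train/validation splits."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_ratio * len(labels)))
    scores = []
    for _ in range(n_repeats):
        order = rng.permutation(len(labels))
        train, valid = order[:n_train], order[n_train:]
        # stand-in classifier: assign each validation trial to the nearest class mean
        classes = np.unique(labels[train])
        means = np.stack([features[train][labels[train] == c].mean(axis=0)
                          for c in classes])
        dists = np.linalg.norm(features[valid][:, None, :] - means[None, :, :], axis=2)
        pred = classes[np.argmin(dists, axis=1)]
        scores.append(np.mean(pred == labels[valid]))
    return np.array(scores)

# synthetic data: 80 trials, 8 features, class 2 shifted by 1.0 in every feature
rng = np.random.default_rng(42)
labels = np.repeat([1, 2], 40)
features = rng.standard_normal((80, 8)) + (labels[:, None] == 2) * 1.0

scores = repeated_holdout(features, labels)
print('number of splits:', len(scores))
print('mean validation accuracy: {:.2f}'.format(scores.mean()))
```

Because the validation sets of different repeats overlap, the spread of these scores understates the true uncertainty.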


### It's Training Time

In [21]:
def training_function(subject_index=0):
    # runs cv random train/validation splits for one subject and prints the mean scores
    print('-'*25, 'Training for Subject:', subject_index+1, '-'*25)
    epochs = epochs_list_train[subject_index]
    data_dict = {'left-hand':  epochs['left-hand'].get_data()[:,:,256+512:-256], # [0.5, 4.5] sec data
             'right-hand': epochs['right-hand'].get_data()[:,:,256+512:-256]}
    fs = epochs.info['sfreq']
    labels = epochs.events[:,-1]
    valid_scores_lda = []
    valid_scores_svm = []

    for _ in range(cv):
        fbcsp_clf_lda = FBCSP(data_dict, fs, freqs_band=freqs_band, classifier=lda(), 
                            train_ratio=train_ratio, print_var=verbose_clf)
        valid_scores_lda.append(fbcsp_clf_lda.valid_score)
        
        fbcsp_clf_svm = FBCSP(data_dict, fs, freqs_band=freqs_band, classifier=LinearSVC(), 
                            train_ratio=train_ratio, print_var=verbose_clf)
        valid_scores_svm.append(fbcsp_clf_svm.valid_score)
        
    print("FBCSP-LDA Cross Validation Score: {:.2f}".format(np.mean(valid_scores_lda)))
    print("FBCSP-SVM Cross Validation Score: {:.2f}".format(np.mean(valid_scores_svm))) 
    print()

In [22]:
for subject in range(len(training_files)):
    training_function(subject)

------------------------- Training for Subject: 1 -------------------------
FBCSP-LDA Cross Validation Score: 0.71
FBCSP-SVM Cross Validation Score: 0.79

------------------------- Training for Subject: 2 -------------------------
FBCSP-LDA Cross Validation Score: 0.73
FBCSP-SVM Cross Validation Score: 0.79

------------------------- Training for Subject: 3 -------------------------
FBCSP-LDA Cross Validation Score: 0.64
FBCSP-SVM Cross Validation Score: 0.62

------------------------- Training for Subject: 4 -------------------------
FBCSP-LDA Cross Validation Score: 0.77
FBCSP-SVM Cross Validation Score: 0.64

------------------------- Training for Subject: 5 -------------------------
FBCSP-LDA Cross Validation Score: 0.75
FBCSP-SVM Cross Validation Score: 0.67

------------------------- Training for Subject: 6 -------------------------
FBCSP-LDA Cross Validation Score: 0.77
FBCSP-SVM Cross Validation Score: 0.78

------------------------- Training for Subject: 7 --------------------

### Results
Winners for individual subjects
- FBCSP-LDA:  3, 4, 5, 7, 8
- FBCSP-SVM:  All others
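
The winner per subject can also be picked programmatically. The sketch below uses only the scores for subjects 1 to 6 printed above, since the output for the remaining subjects is cut off.

```python
# mean validation scores copied from the printed output (subjects 1-6 only)
scores = {
    1: {'FBCSP-LDA': 0.71, 'FBCSP-SVM': 0.79},
    2: {'FBCSP-LDA': 0.73, 'FBCSP-SVM': 0.79},
    3: {'FBCSP-LDA': 0.64, 'FBCSP-SVM': 0.62},
    4: {'FBCSP-LDA': 0.77, 'FBCSP-SVM': 0.64},
    5: {'FBCSP-LDA': 0.75, 'FBCSP-SVM': 0.67},
    6: {'FBCSP-LDA': 0.77, 'FBCSP-SVM': 0.78},
}

# pick the classifier with the highest score for each subject
winners = {subject: max(result, key=result.get) for subject, result in scores.items()}
for subject, name in winners.items():
    print('Subject {}: {}'.format(subject, name))
```

Several of these gaps are only a few hundredths, which is small compared with the variation between random splits.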