### Clinical BCI Challenge-WCCI2020
- [Website link](https://sites.google.com/view/bci-comp-wcci/?fbclid=IwAR37WLQ_xNd5qsZvktZCT8XJerHhmVb_bU5HDu69CnO85DE3iF0fs57vQ6M)
- [Dataset link](https://github.com/5anirban9/Clinical-Brain-Computer-Interfaces-Challenge-WCCI-2020-Glasgow)
- [FBCSP GitHub repo link](https://github.com/jesus-333/FBCSP-Python)

I modified the FBCSP source code to report Cohen's kappa instead of accuracy as the evaluation measure, along with a few other small changes. For cross-subject analysis, I also adjusted parts of the implementation.

The linear SVM classifier raised errors when making predictions through the `evaluateTrial()` method, so I replaced it with LogisticRegression.

In [39]:
from FBCSP.FBCSP_V4_CS import FBCSP_V4 as FBCSP 

In [40]:
import mne
from scipy.io import loadmat
import scipy
import sklearn
import numpy as np
import pandas as pd
import glob
from mne.decoding import CSP
import os

In [41]:
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC, SVC
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV, StratifiedShuffleSplit
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as lda
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import cohen_kappa_score as kappa_score

In [42]:
import warnings
warnings.filterwarnings('ignore') # to ignore warnings

In [43]:
verbose = False                    # global variable to suppress output display of MNE functions
mne.set_log_level(verbose=verbose) # to suppress large info outputs

In [62]:
verbose_clf = True # control output of FBCSP function
freqs_band = np.linspace(8, 32, 7) # filter bank choice
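
To make the filter bank choice concrete, the sketch below lists the band edges that `np.linspace(8, 32, 7)` produces. It assumes that the FBCSP implementation pairs consecutive edges into sub-bands, which is not confirmed here; `band_pairs` is a hypothetical name used only for illustration.

```python
import numpy as np

freqs_band = np.linspace(8, 32, 7)   # same call as in the cell above

# Assumption: consecutive edges are paired into sub-bands by the FBCSP code.
band_pairs = list(zip(freqs_band[:-1], freqs_band[1:]))

for low, high in band_pairs:
    print(f"{low:.0f}-{high:.0f} Hz")
```

Under that assumption, the seven edges give six sub-bands, each 4 Hz wide, spanning 8 to 32 Hz.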

In [45]:
# use Cohen's kappa as the evaluation metric
kappa = sklearn.metrics.make_scorer(sklearn.metrics.cohen_kappa_score) # kappa scorer
acc = sklearn.metrics.make_scorer(sklearn.metrics.accuracy_score)      # accuracy scorer
scorer = kappa          # assign `acc` here instead to score by accuracy
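
For reference, Cohen's kappa rescales the observed agreement p_o by the agreement expected by chance p_e, as kappa = (p_o - p_e) / (1 - p_e). The following is a minimal NumPy sketch of that formula on made-up toy labels (not challenge data). `manual_kappa` is a hypothetical helper written for illustration; in this notebook `sklearn.metrics.cohen_kappa_score` does the actual scoring.

```python
import numpy as np

def manual_kappa(y_true, y_pred):
    """Cohen's kappa computed directly from its definition."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.union1d(y_true, y_pred)
    # observed agreement: fraction of trials where both label sets match
    p_o = np.mean(y_true == y_pred)
    # chance agreement: product of the marginal class frequencies, summed over classes
    p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in classes)
    return (p_o - p_e) / (1 - p_e)

toy_true = [1, 1, 2, 2]   # toy labels, 1 = left-hand, 2 = right-hand
toy_pred = [1, 2, 2, 2]
toy_kappa = manual_kappa(toy_true, toy_pred)
print(toy_kappa)
```

In this toy case three of four predictions are correct (75% accuracy), but half of that agreement is expected by chance, so kappa comes out at 0.5.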

## Data Loading and Conversion to MNE Datatypes
[Mike Cohen Tutorials link for EEG Preprocessing](https://www.youtube.com/watch?v=uWB5tjhataY&list=PLn0OLiymPak2gDD-VDA90w9_iGDgOOb2o)

In [46]:
current_folder = globals()['_dh'][0]  # a hack to get the path of the folder in which this Jupyter notebook is located
data_path = os.path.join(current_folder, 'Data')

In [None]:
training_files   = glob.glob(data_path + '/*T.mat')
len(training_files)     # if this returns zero, no files were found

## Let's Append Epochs

In [48]:
def get_mne_epochs_complete(files_paths, verbose=verbose, t_start=2, fs=512, mode='train'):
    '''
    Similar to get_mne_epochs, but appends the trials from all given files
    together to give a single Epochs object.
    '''
    eeg_data, labels = [], []
    for filepath in files_paths:
        mat_data = loadmat(filepath)
        eeg_data.extend(mat_data['RawEEGData'])
        if mode == 'train':  # labels are only available for training data
            labels.extend(mat_data['Labels'].ravel())

    idx_start = fs*t_start      # index of the first sample to keep
    eeg_data = np.array(eeg_data)
    eeg_data = eeg_data[:, :, idx_start:]   # drop the first t_start seconds of every trial
    event_id = {'left-hand': 1, 'right-hand': 2}
    channel_names = ['F3', 'FC3', 'C3', 'CP3', 'P3', 'FCz', 'CPz', 'F4', 'FC4', 'C4', 'CP4', 'P4']
    info = mne.create_info(ch_names=channel_names, sfreq=fs, ch_types='eeg')
    epochs = mne.EpochsArray(eeg_data, info, verbose=verbose, tmin=t_start-3.0)
    epochs.set_montage('standard_1020')
    epochs.filter(1., None) # 1 Hz high-pass, required by ICA; band-pass filtering is applied later
    epochs.apply_baseline(baseline=(-.250, 0)) # baseline correction using the 250 ms before t = 0

    if mode == 'train':
        epochs.event_id = event_id
        epochs.events[:,2] = labels
    return epochs

### Data Loading with Band Pass Filtering

In [49]:
# load the training files of all subjects and band-pass filter to 7-32 Hz
training_epochs_all = get_mne_epochs_complete(training_files).filter(7,32)

In [50]:
epochs = training_epochs_all.copy()
data, labels = epochs.get_data(), epochs.events[:,-1]
print('Shape of EEG Data: ', data.shape, '\t Shape of Labels: ', labels.shape) 

Shape of EEG Data:  (640, 12, 3072) 	 Shape of Labels:  (640,)
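
The shape above reads as 640 trials, 12 channels, and 3072 samples per trial, i.e. 6 s at 512 Hz. The arithmetic below is a small sketch of how sample indices map to epoch times, given `tmin = t_start - 3.0 = -1.0` s from `get_mne_epochs_complete`, and of the window selected by the slice `256+512:-256` used in the cross-validation cells. The variable names are illustrative only.

```python
fs = 512                 # sampling rate in Hz
t_start = 2              # seconds dropped from the start of each raw trial
tmin = t_start - 3.0     # epoch time of the first kept sample, in seconds
n_samples = 3072         # samples per epoch, from the printed shape

start_idx = 256 + 512            # first sample kept by the slice
stop_idx = n_samples - 256       # exclusive stop implied by the trailing -256

window_start = tmin + start_idx / fs   # epoch time of the first kept sample
window_stop = tmin + stop_idx / fs     # epoch time just past the last kept sample
window_len = stop_idx - start_idx      # number of samples kept

print(window_start, window_stop, window_len)
```

The slice therefore keeps 2048 samples (4 s), covering epoch times from 0.5 s up to, but not including, 4.5 s.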


### Training with Leave One Group Out CV


In [58]:
cv = LeaveOneGroupOut()

In [59]:
# `groups` argument for scikit-learn's leave-one-group-out cross-validation:
# each subject gets a unique identifier (1 to 8), repeated once per trial,
# since each training file holds 80 trials
groups = np.repeat(np.arange(1, 9), 80)
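
As a rough illustration of what `LeaveOneGroupOut` does with this `groups` array, the sketch below builds the same kind of split by hand on a toy example with 3 subjects and 2 trials each. It uses plain NumPy rather than scikit-learn, and `leave_one_group_out` is a hypothetical helper written only for this illustration.

```python
import numpy as np

def leave_one_group_out(groups):
    """Yield (train_idx, valid_idx) pairs, holding out one group at a time."""
    groups = np.asarray(groups)
    indices = np.arange(len(groups))
    for held_out in np.unique(groups):
        valid_mask = groups == held_out
        yield indices[~valid_mask], indices[valid_mask]

toy_groups = np.repeat(np.arange(1, 4), 2)   # subject id per trial: [1, 1, 2, 2, 3, 3]
splits = list(leave_one_group_out(toy_groups))

for train_idx, valid_idx in splits:
    print("train:", train_idx, "valid:", valid_idx)
```

With the 8 subjects used here, the same scheme gives 8 folds, each validating on the 80 trials of one held-out subject and training on the remaining 560.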

In [67]:
fs = epochs.info['sfreq']
valid_scores_lda = []

for i, (train_idx, valid_idx) in enumerate(cv.split(epochs, y=labels, groups=groups), start=1):
    print('-'*20, "Iteration:", i, '-'*20)
    train_epochs = epochs[train_idx]
    valid_epochs = epochs[valid_idx]

    # keep only the [0.5, 4.5] sec window of every trial
    valid_data, valid_labels = valid_epochs.get_data()[:,:,256+512:-256], valid_epochs.events[:,-1]

    data_dict_train = {'left-hand':  train_epochs['left-hand'].get_data()[:,:,256+512:-256],
                       'right-hand': train_epochs['right-hand'].get_data()[:,:,256+512:-256]}

    # using LDA as classifier
    fbcsp_clf_lda = FBCSP(data_dict_train, fs, freqs_band=freqs_band,
                          classifier=lda(), print_var=verbose_clf)
    preds_fbcsp_clf_lda = fbcsp_clf_lda.evaluateTrial(valid_data)[0]
    valid_scores_lda.append(kappa_score(preds_fbcsp_clf_lda, valid_labels))

    print()

-------------------- Iteration: 1 --------------------
Features used for classification:  8
Score on Training set:  0.5178571428571428

-------------------- Iteration: 2 --------------------
Features used for classification:  8
Score on Training set:  0.5

-------------------- Iteration: 3 --------------------
Features used for classification:  8
Score on Training set:  0.5678571428571428

-------------------- Iteration: 4 --------------------
Features used for classification:  8
Score on Training set:  0.5285714285714286

-------------------- Iteration: 5 --------------------
Features used for classification:  8
Score on Training set:  0.43214285714285716

-------------------- Iteration: 6 --------------------
Features used for classification:  8
Score on Training set:  0.4392857142857143

-------------------- Iteration: 7 --------------------
Features used for classification:  8
Score on Training set:  0.5642857142857143

-------------------- Iteration: 8 --------------------
Feature

In [70]:
fs = epochs.info['sfreq']
valid_scores_logreg = []

for i, (train_idx, valid_idx) in enumerate(cv.split(epochs, y=labels, groups=groups), start=1):
    print('-'*20, "Iteration:", i, '-'*20)
    train_epochs = epochs[train_idx]
    valid_epochs = epochs[valid_idx]

    # keep only the [0.5, 4.5] sec window of every trial
    valid_data, valid_labels = valid_epochs.get_data()[:,:,256+512:-256], valid_epochs.events[:,-1]

    data_dict_train = {'left-hand':  train_epochs['left-hand'].get_data()[:,:,256+512:-256],
                       'right-hand': train_epochs['right-hand'].get_data()[:,:,256+512:-256]}

    # using Logistic Regression as classifier
    fbcsp_clf_logreg = FBCSP(data_dict_train, fs, freqs_band=freqs_band,
                             classifier=LogisticRegression(), print_var=verbose_clf)
    preds_fbcsp_clf_logreg = fbcsp_clf_logreg.evaluateTrial(valid_data)[0]
    valid_scores_logreg.append(kappa_score(preds_fbcsp_clf_logreg, valid_labels))

    print()

-------------------- Iteration: 1 --------------------
Features used for classification:  8
Score on Training set:  0.5142857142857142

-------------------- Iteration: 2 --------------------
Features used for classification:  8
Score on Training set:  0.5285714285714286

-------------------- Iteration: 3 --------------------
Features used for classification:  8
Score on Training set:  0.575

-------------------- Iteration: 4 --------------------
Features used for classification:  8
Score on Training set:  0.525

-------------------- Iteration: 5 --------------------
Features used for classification:  8
Score on Training set:  0.4464285714285714

-------------------- Iteration: 6 --------------------
Features used for classification:  8
Score on Training set:  0.4464285714285714

-------------------- Iteration: 7 --------------------
Features used for classification:  8
Score on Training set:  0.5642857142857143

-------------------- Iteration: 8 --------------------
Features used for c

In [71]:
print("FBCSP-LDA    Cross Validation Score:", np.mean(valid_scores_lda))
print("FBCSP-Logreg Cross Validation Score:", np.mean(valid_scores_logreg)) 
# no grid search is done here, so we report the mean over folds rather than the best fold

FBCSP-LDA    Cross Validation Score: 0.39999999999999997
FBCSP-Logreg Cross Validation Score: 0.39374999999999993


### Results
FBCSP-LDA wins the cross-subject task by a narrow margin, with a mean kappa of about 0.400 against about 0.394 for FBCSP-LogReg.