In [12]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import sys
sys.path.append('../')
from neuroIDBench.datasets.cogBciFlanker import COGBCIFLANKER
from neuroIDBench.preprocessing.erp import ERP
from neuroIDBench.featureExtraction.features import AutoRegressive
from neuroIDBench.featureExtraction.features import PowerSpectralDensity
from neuroIDBench.featureExtraction.twinNeural import TwinNeuralNetwork
from neuroIDBench.datasets import utils
from sklearn.pipeline import make_pipeline
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from neuroIDBench.evaluations.multi_session_open_set import MultiSessionOpenSet
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from neuroIDBench.analysis.plotting import Plots 
import warnings
warnings.filterwarnings('ignore')
import os



COG-BCI Flanker and Lee2019 are ERP-based datasets that were generated as part of the following studies:

[1] Hinss, M. F., Jahanpour, E. S., Somon, B., Pluchon, L., Dehais, F., & Roy, R. N. (2022). COG-BCI database: A multi-session and multi-task EEG cognitive dataset for passive brain-computer interfaces (Version 1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6874129

[2] Lee, M. H., Kwon, O. Y., Kim, Y. J., Kim, H. K., Lee, Y. E., Williamson, J., … Lee, S. W. (2019). EEG dataset and OpenBMI toolbox for three BCI paradigms: An investigation into BCI illiteracy. GigaScience, 8(5), 1–16. https://doi.org/10.1093/gigascience/giz002

| Name    | #Subj | #Chan | Sampling Rate | #Sessions |
|---------|-------|-------|---------------|-----------|
| COG-BCI |  29   | 32    |    512Hz      |    3      |

**Description of COG-BCI Flanker Task**

The dataset consists of recordings from 29 participants who completed three separate sessions
spaced 7 days apart. In each trial, participants are shown a stimulus consisting
of five arrows positioned at the center of a computer screen. They are instructed to
respond to the central arrow while disregarding the surrounding (flanker) arrows.
The flanker arrows can point in the same direction as the central target (congruent condition)
or in the opposite direction (incongruent condition).
At the end of each trial, the participant receives feedback on
their performance, indicating whether the response was correct,
incorrect, or a miss. A total of 120 trials are conducted, with each complete run lasting
approximately 10 minutes.


| Name    | #Subj | #Chan | Sampling Rate | #Sessions |
|---------|-------|-------|---------------|-----------|
| Lee2019 |  54   |  62   |    1000Hz     |    2      |

**Description of ERP Task in Lee2019**

The ERP speller interface followed a standard row-column layout with 36 evenly spaced symbols (A to Z, 1 to 9, and _). Additional settings, namely random-set presentation and face stimuli, were incorporated to enhance signal quality by minimizing adjacency-distraction errors and by presenting a familiar face image. Each sequence consisted of 12 stimulus flashes with a stimulus-time interval of 80 ms and an inter-stimulus interval (ISI) of 135 ms. For each target character, a maximum of five sequences (60 flashes) was presented without prolonged inter-sequence intervals. After five sequences, 4.5 s were allotted for the participant to identify and locate the next target character.

During training, participants copy-spelled a given sentence ("NEURAL NETWORKS AND DEEP LEARNING") without feedback. In the test session, participants copy-spelled "PATTERN RECOGNITION MACHINE LEARNING", and the real-time EEG data were analyzed using the classifier obtained from the training session. The EEG data consisted of 1,980 trials for the training phase and 2,160 trials for the test phase.

### Creating instances of the COG-BCI Flanker dataset and the ERP paradigm with default parameters

The default parameters of the dataset and the ERP paradigm are:<br><br>
    <i>Number of Subjects=29</i><br>
    <i>Sample_duration=1 second (-200 ms to 800 ms)</i><br>
    <i>Baseline_Correction=True</i>
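
The epoch window and baseline correction listed above can be illustrated with a short NumPy sketch. This is not the neuroIDBench implementation: the function name `epoch_and_baseline` and the synthetic signal are hypothetical, and the sketch assumes the window is given in seconds relative to stimulus onset and that the baseline is the mean of the pre-stimulus interval.

```python
import numpy as np

def epoch_and_baseline(signal, event_samples, sfreq, tmin=-0.2, tmax=0.8):
    """Cut fixed-length epochs around events and subtract the pre-stimulus mean.

    signal: array of shape (n_channels, n_samples)
    event_samples: iterable of stimulus-onset sample indices
    Assumes tmin < 0 so that a pre-stimulus baseline interval exists.
    """
    start_off = int(round(tmin * sfreq))  # negative: samples before onset
    stop_off = int(round(tmax * sfreq))   # samples after onset
    epochs = []
    for onset in event_samples:
        start, stop = onset + start_off, onset + stop_off
        if start < 0 or stop > signal.shape[1]:
            continue  # skip events too close to the recording edges
        epoch = signal[:, start:stop].astype(float)
        # Baseline: per-channel mean over the pre-stimulus samples
        baseline = epoch[:, :-start_off].mean(axis=1, keepdims=True)
        epochs.append(epoch - baseline)
    return np.stack(epochs)

# Synthetic 32-channel recording at 512 Hz with a constant offset of 5
rng = np.random.default_rng(0)
sfreq = 512
signal = rng.standard_normal((32, 10 * sfreq)) + 5.0
epochs = epoch_and_baseline(signal, [1024, 2048, 3072], sfreq)
print(epochs.shape)  # (3, 32, 512)
```

At 512 Hz, the 1-second window from -200 ms to 800 ms yields 512 samples per epoch, of which the first 102 form the baseline interval.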

In [4]:
dataset1=COGBCIFLANKER()
paradigm=ERP()

### Initializing pipelines for 4 shallow classifiers (LDA, LR, KNN, and RF) with AR and PSD features, plus a Twin Neural Network, using default parameters

The default parameters of the AR coefficients and the Twin Neural Network are:<br><br>
    <i>AR Order=1</i><br>
    <i>Batch_Size=192</i><br>
    <i>Epochs=100</i><br>
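
To give a feel for what AR and PSD features look like, here is a minimal sketch that estimates AR coefficients by least squares and a Welch power spectral density with `scipy.signal.welch`. The helpers `ar_coefficients` and `epoch_features` are hypothetical. The estimator, the Welch segment length, and the concatenation of both feature sets are assumptions made for illustration, and the `AutoRegressive` and `PowerSpectralDensity` classes in neuroIDBench may compute their features differently.

```python
import numpy as np
from scipy.signal import welch

def ar_coefficients(x, order=1):
    """Least-squares estimate of AR coefficients for a 1-D signal."""
    x = np.asarray(x, dtype=float)
    # Column k holds x[t-k-1] for every predicted sample x[t], t = order..len(x)-1
    lagged = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return coeffs

def epoch_features(epoch, sfreq, order=1):
    """Concatenate per-channel AR coefficients and Welch PSD values.

    epoch: array of shape (n_channels, n_samples)
    """
    ar = np.concatenate([ar_coefficients(ch, order) for ch in epoch])
    nperseg = min(256, epoch.shape[-1])  # assumed segment length
    _, psd = welch(epoch, fs=sfreq, nperseg=nperseg, axis=-1)
    return np.concatenate([ar, psd.ravel()])

# Sanity check on a synthetic AR(1) process with coefficient 0.8
rng = np.random.default_rng(0)
noise = rng.standard_normal(20000)
x = np.zeros_like(noise)
for t in range(1, len(x)):
    x[t] = 0.8 * x[t - 1] + noise[t]
estimate = ar_coefficients(x, order=1)

# Feature vector for one synthetic 32-channel, 512-sample epoch
features = epoch_features(rng.standard_normal((32, 512)), sfreq=512)
```

With order 1, each channel contributes a single AR coefficient, so most of the feature vector in this sketch comes from the PSD bins.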

In [6]:
pipeline={}
pipeline['AR+PSD+LDA']=make_pipeline(AutoRegressive(), PowerSpectralDensity(), LDA())
pipeline['AR+PSD+LR']=make_pipeline(AutoRegressive(), PowerSpectralDensity(), LogisticRegression())
pipeline['AR+PSD+KNN']=make_pipeline(AutoRegressive(), PowerSpectralDensity(), KNeighborsClassifier())
pipeline['AR+PSD+RF']=make_pipeline(AutoRegressive(), PowerSpectralDensity(), RandomForestClassifier())
pipeline['TNN']=make_pipeline(TwinNeuralNetwork())

### Creating the authentication pipeline for multi-session evaluation under the unknown-attacker scenario

In [10]:
evaluation=MultiSessionOpenSet(paradigm=paradigm, datasets=dataset1, overwrite=False)
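
The idea behind a multi-session, unknown-attacker (open-set) evaluation can be sketched as a data split. The function `open_set_multi_session_split` below is hypothetical, and its protocol is an assumption: impostor subjects are divided into two disjoint groups so that test-time attackers never appear in training, and training and testing use different sessions. The exact protocol used by `MultiSessionOpenSet` may differ.

```python
import numpy as np

def open_set_multi_session_split(subjects, sessions, target,
                                 train_session, test_session, seed=0):
    """Boolean train/test masks for authenticating one target subject.

    Assumption: half of the impostor subjects are "known" (training only)
    and the rest are "unknown" attackers (testing only).
    """
    subjects = np.asarray(subjects)
    sessions = np.asarray(sessions)
    impostors = np.unique(subjects[subjects != target])
    rng = np.random.default_rng(seed)
    rng.shuffle(impostors)
    half = len(impostors) // 2
    known, unknown = impostors[:half], impostors[half:]
    is_target = subjects == target
    # Train: target and known impostors, enrollment session only
    train = (sessions == train_session) & (is_target | np.isin(subjects, known))
    # Test: target and unseen impostors, later session only
    test = (sessions == test_session) & (is_target | np.isin(subjects, unknown))
    return train, test

# Six subjects, two sessions, two epochs per subject and session
subjects = np.repeat(np.arange(6), 4)
sessions = np.tile([1, 1, 2, 2], 6)
train, test = open_set_multi_session_split(
    subjects, sessions, target=0, train_session=1, test_session=2
)
```

In this sketch the only subject present in both masks is the target, which is what makes the attackers "unknown" at test time.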

### Executing the Multi Session Authentication Pipeline

In [None]:
results=evaluation.process(pipeline)

### Statistical Analysis of the performance of Multi Session Evaluation across the two datasets

Comparative analysis of the average EER and of the FNMR at FMR thresholds of 1%, 0.1%, and 0.01% across the multi-session datasets. The evaluation uses the multi-session evaluation scheme, with a focus on the classifiers’ performance in the unknown-attacker scenario.

In [None]:
# Summarize results per evaluation type, dataset, and pipeline.
# EER is reported as mean ± std (in %); FRR at 1%, 0.1%, and 0.01% FAR as mean (in %).
multi_session_df=results.groupby(['eval Type','dataset','pipeline']).agg({
    'eer': lambda x: f'{np.mean(x)*100:.3f} ± {np.std(x)*100:.3f}',
    'frr_1_far': lambda x: f'{np.mean(x)*100:.3f}',
    'frr_0.1_far': lambda x: f'{np.mean(x)*100:.3f}',
    'frr_0.01_far': lambda x: f'{np.mean(x)*100:.3f}'
}).reset_index()

display(multi_session_df)
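
For readers unfamiliar with the metrics aggregated above, the following sketch shows one common way to compute the EER and the FNMR at fixed FMR thresholds from match scores, using `sklearn.metrics.roc_curve`. The function `eer_and_fnmr` is hypothetical and is not taken from neuroIDBench. It assumes that higher scores indicate a genuine user, and it treats the `frr_*_far` columns as FNMR at the corresponding FMR.

```python
import numpy as np
from sklearn.metrics import roc_curve

def eer_and_fnmr(labels, scores, fmr_targets=(0.01, 0.001, 0.0001)):
    """Return the EER and the FNMR at each target FMR.

    labels: 1 for genuine attempts, 0 for impostor attempts
    scores: higher means more likely genuine
    """
    fmr, tpr, _ = roc_curve(labels, scores)  # false positive rate = FMR
    fnmr = 1.0 - tpr
    # EER: operating point where FMR and FNMR are closest
    idx = np.argmin(np.abs(fmr - fnmr))
    eer = (fmr[idx] + fnmr[idx]) / 2.0
    # FNMR at a target FMR: lowest FNMR among thresholds with FMR <= target
    fnmr_at = {t: float(fnmr[fmr <= t].min()) for t in fmr_targets}
    return float(eer), fnmr_at

# Perfectly separable toy scores: impostors in [0, 0.4], genuine in [0.6, 1]
labels = np.r_[np.zeros(100), np.ones(100)]
scores = np.r_[np.linspace(0.0, 0.4, 100), np.linspace(0.6, 1.0, 100)]
eer, fnmr_at = eer_and_fnmr(labels, scores)
```

With few impostor scores, an FMR of 0.01% cannot be resolved on the ROC curve, so this sketch falls back to the strictest available operating point.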