### Project Overview

This project implements a complete motor imagery EEG classification pipeline on the BCI Competition IV-2a dataset.  
EEG signals are loaded from GDF files, while the corresponding trial labels are obtained from separate MAT files and aligned using trial onset markers.

The EEG data are epoched into fixed time windows corresponding to motor imagery tasks and bandpass filtered to retain task-relevant frequency components.  
Spatial features are extracted using the Common Spatial Patterns (CSP) algorithm. Since CSP is inherently binary, a one-vs-rest strategy is employed to extend it to four-class motor imagery classification.

For each motor imagery class, a dedicated CSP model is trained against all remaining classes. The resulting log-variance features from all CSP models are concatenated to form a discriminative feature vector.  
A Linear Discriminant Analysis (LDA) classifier is then trained on the training sessions and evaluated on the independent test sessions.

The final performance is reported using subject-wise classification accuracy, as well as the mean and standard deviation of accuracy across all subjects.


### Data Loading (GDF + MAT labels)

We load EEG signals from GDF files and read class labels from MAT files.
Epochs are extracted using trial-start markers, then labels are aligned with epochs.

In [26]:
import warnings
warnings.filterwarnings("ignore", message="Channel names are not unique*", category=RuntimeWarning)

import mne
import numpy as np
import scipy.io as sio


### Helper: Load epochs from GDF and labels from MAT

In [27]:
def load_epochs_and_labels(gdf_path, mat_path,
                          tmin=2.0, tmax=6.0,
                          l_freq=8.0, h_freq=30.0):
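    raw = mne.io.read_raw_gdf(gdf_path, preload=True, verbose=False)

    # make channel names unique by appending the channel index
    raw.rename_channels({ch: f"{ch}_{i:02d}" for i, ch in enumerate(raw.ch_names)})

    # map annotation descriptions (e.g. "768") to integer event codes
    events, event_id = mne.events_from_annotations(raw)

    # keep only the channels typed as EEG
    raw.pick("eeg")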

    # --- labels from MAT ---
    mat = sio.loadmat(mat_path)
    y = mat["classlabel"].squeeze().astype(int)   # classes: 1..4

    # --- epoch on trial start (768) ---
    epochs = mne.Epochs(
        raw, events,
        event_id={"trial": event_id["768"]},
        tmin=tmin, tmax=tmax,
        baseline=None, preload=True,
        verbose=False
    )

    # --- bandpass filter ---
    epochs = epochs.filter(l_freq, h_freq, verbose=False)
    X = epochs.get_data()  # (trials, ch, samples)

    # align lengths just in case
    n = min(len(X), len(y))
    return X[:n], y[:n]
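
The `min(len(X), len(y))` truncation keeps both arrays the same length, but it would not notice an epoch dropped from the middle of a recording, which would shift every later label by one. A minimal sketch of onset-based alignment follows; the function name `align_labels` and the toy arrays are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def align_labels(trial_onsets, kept_onsets, y):
    """Keep only the labels whose trial onset survived epoching."""
    trial_onsets = np.asarray(trial_onsets)
    y = np.asarray(y)
    if len(trial_onsets) != len(y):
        raise ValueError("expected exactly one label per trial-start marker")
    keep = np.isin(trial_onsets, kept_onsets)  # True where the trial is still present
    return y[keep]

# Toy example: five trials, the third one was dropped during epoching.
trial_onsets = np.array([100, 2100, 4100, 6100, 8100])
kept_onsets = np.array([100, 2100, 6100, 8100])
y = np.array([1, 2, 3, 4, 1])

y_aligned = align_labels(trial_onsets, kept_onsets, y)
print(y_aligned)  # [1 2 4 1]
```

Inside the helper above, `trial_onsets` would presumably be `events[events[:, 2] == event_id["768"], 0]` and `kept_onsets` would be `epochs.events[:, 0]`. For the A01 files shown here both sessions yield 288 epochs and 288 labels, so the simpler truncation has no effect.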


### Single-subject demo (Train: A01T, Test: A01E)

In [29]:
base = "./dataset/BCICIV_2a"

XT, yT = load_epochs_and_labels(f"{base}/A01T.gdf", f"{base}/A01T.mat")
XE, yE = load_epochs_and_labels(f"{base}/A01E.gdf", f"{base}/A01E.mat")

XT.shape, np.unique(yT, return_counts=True), XE.shape, np.unique(yE, return_counts=True)

Used Annotations descriptions: [np.str_('1023'), np.str_('1072'), np.str_('276'), np.str_('277'), np.str_('32766'), np.str_('768'), np.str_('769'), np.str_('770'), np.str_('771'), np.str_('772')]
Used Annotations descriptions: [np.str_('1023'), np.str_('1072'), np.str_('276'), np.str_('277'), np.str_('32766'), np.str_('768'), np.str_('783')]


((288, 25, 1001),
 (array([1, 2, 3, 4]), array([72, 72, 72, 72])),
 (288, 25, 1001),
 (array([1, 2, 3, 4]), array([72, 72, 72, 72])))
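
The 1001 samples per epoch follow from the window length and the dataset's 250 Hz sampling rate, with both window endpoints included. A short check of that arithmetic (the variable names are only for illustration):

```python
sfreq = 250.0          # Hz, sampling rate of the BCI Competition IV-2a recordings
tmin, tmax = 2.0, 6.0  # seconds relative to the trial-start marker

# Both endpoints of the window are included, hence the "+ 1".
n_samples = int(round((tmax - tmin) * sfreq)) + 1
print(n_samples)  # 1001
```

The 25 channels are consistent with the dataset's 22 EEG and 3 EOG channels all being kept, which would happen if the EOG channels are typed as EEG on load. This is an inference from the shape, not something the notebook states.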

### Binary Class Selection (Left vs Right) + CSP input formatting

The MAT labels are 1=left, 2=right, 3=foot, 4=tongue.
We keep only classes {1, 2} and relabel them to {0, 1}.


In [30]:
# keep left(1) vs right(2)
idxT = np.where((yT == 1) | (yT == 2))[0]
idxE = np.where((yE == 1) | (yE == 2))[0]

XT_lr, yT_lr = XT[idxT], yT[idxT]
XE_lr, yE_lr = XE[idxE], yE[idxE]

# relabel: left=0, right=1
yT_lr = np.where(yT_lr == 1, 0, 1)
yE_lr = np.where(yE_lr == 1, 0, 1)

# CSP format: (ch, samples, trials) per class (TRAIN only)
XA = XT_lr[yT_lr == 0].transpose(1, 2, 0)
XB = XT_lr[yT_lr == 1].transpose(1, 2, 0)

XA.shape, XB.shape, XT_lr.shape, XE_lr.shape

((25, 1001, 72), (25, 1001, 72), (144, 25, 1001), (144, 25, 1001))

### CSP Training (on Train) + Feature Extraction (Train/Test)

In [17]:
from filters.csp import CSP

In [31]:
m = 2
csp = CSP(m=m, reg=0.0)
csp.fit(XA, XB)

# features need: (ch, samples, trials)
FT = csp.compute_features(XT_lr.transpose(1, 2, 0))  # (trials_train, 2m)
FE = csp.compute_features(XE_lr.transpose(1, 2, 0))  # (trials_test,  2m)

FT.shape, FE.shape

((144, 4), (144, 4))
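
The internals of `filters.csp.CSP` are not shown in this notebook. As a rough illustration of what a CSP fit and a log-variance feature step typically do, here is a minimal NumPy sketch of the textbook formulation (whitening followed by diagonalisation). It is an assumption about the general technique, not the notebook's implementation: the names `fit_csp` and `log_var_features` are hypothetical, there is no regularisation, and trials are laid out as `(trials, ch, samples)` rather than `(ch, samples, trials)`.

```python
import numpy as np

def fit_csp(XA, XB, m=2):
    """Return 2m spatial filters (rows) from two trial sets shaped (trials, ch, samples)."""
    def mean_cov(X):
        covs = []
        for trial in X:
            c = trial @ trial.T
            covs.append(c / np.trace(c))      # trace-normalised spatial covariance
        return np.mean(covs, axis=0)

    CA, CB = mean_cov(XA), mean_cov(XB)
    # Whiten the composite covariance CA + CB ...
    evals, evecs = np.linalg.eigh(CA + CB)
    P = (evecs / np.sqrt(evals)).T            # P @ (CA + CB) @ P.T == identity
    # ... then diagonalise class A in the whitened space (eigenvalues ascending).
    _, B = np.linalg.eigh(P @ CA @ P.T)
    W = B.T @ P                               # all spatial filters, one per row
    return np.vstack([W[:m], W[-m:]])         # first m and last m filters

def log_var_features(W, X):
    """Log of the normalised variance of each CSP component, shape (trials, 2m)."""
    Z = np.einsum("fc,tcs->tfs", W, X)        # project every trial onto the filters
    var = Z.var(axis=2)
    return np.log(var / var.sum(axis=1, keepdims=True))

# Synthetic data: class A is strong on channel 0, class B on channel 3.
rng = np.random.default_rng(0)
XA = rng.standard_normal((30, 4, 200)); XA[:, 0] *= 3.0
XB = rng.standard_normal((30, 4, 200)); XB[:, 3] *= 3.0

W = fit_csp(XA, XB, m=1)
FA, FB = log_var_features(W, XA), log_var_features(W, XB)
print(W.shape, FA.shape, FB.shape)
```

With `m=1` the first filter yields low variance for class A trials and the last filter yields high variance for them, which is what makes the two feature columns separable by a linear classifier such as LDA.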

### Train on T, Evaluate on E (LDA)

Here we train an LDA classifier on the CSP features of the training session (`A01T`) and report the classification accuracy and confusion matrix on the evaluation session (`A01E`). No additional train/test split is performed.

In [32]:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, confusion_matrix

clf = LinearDiscriminantAnalysis()
clf.fit(FT, yT_lr)

y_pred = clf.predict(FE)

acc = accuracy_score(yE_lr, y_pred)
cm = confusion_matrix(yE_lr, y_pred)

acc, cm

(0.8263888888888888,
 array([[71,  1],
        [24, 48]]))
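
The confusion matrix shows that the errors are not symmetric. A short sketch that recomputes the accuracy and the per-class recall from the matrix printed above (rows are true classes, columns are predictions, with 0=left and 1=right):

```python
import numpy as np

cm = np.array([[71, 1],
               [24, 48]])  # rows: true class, columns: predicted class

accuracy = np.trace(cm) / cm.sum()      # correct predictions / all predictions
recall = np.diag(cm) / cm.sum(axis=1)   # per-class fraction of trials recovered

print(round(accuracy, 3))   # 0.826
print(np.round(recall, 3))  # [0.986 0.667]
```

For this subject almost every left-hand trial is recognised, while about a third of the right-hand trials are predicted as left.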

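### Subject-wise Evaluation (Train: T, Test: E)

For each subject A01 to A09, we repeat the left-vs-right pipeline: CSP and LDA are trained on the `T` session and evaluated on the `E` session.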

In [33]:
subjects = [f"A0{i}" for i in range(1, 10)]
accs = []

for subj in subjects:
    XT, yT = load_epochs_and_labels(f"{base}/{subj}T.gdf", f"{base}/{subj}T.mat")
    XE, yE = load_epochs_and_labels(f"{base}/{subj}E.gdf", f"{base}/{subj}E.mat")

    # left vs right only
    idxT = np.where((yT == 1) | (yT == 2))[0]
    idxE = np.where((yE == 1) | (yE == 2))[0]

    XT_lr, yT_lr = XT[idxT], yT[idxT]
    XE_lr, yE_lr = XE[idxE], yE[idxE]

    yT_lr = np.where(yT_lr == 1, 0, 1)
    yE_lr = np.where(yE_lr == 1, 0, 1)

    XA = XT_lr[yT_lr == 0].transpose(1, 2, 0)
    XB = XT_lr[yT_lr == 1].transpose(1, 2, 0)

    csp = CSP(m=2, reg=0.0)
    csp.fit(XA, XB)

    FT = csp.compute_features(XT_lr.transpose(1, 2, 0))
    FE = csp.compute_features(XE_lr.transpose(1, 2, 0))

    clf = LinearDiscriminantAnalysis()
    clf.fit(FT, yT_lr)

    y_pred = clf.predict(FE)
    acc = accuracy_score(yE_lr, y_pred)
    accs.append(acc)

    print(f"{subj} T->E accuracy: {acc:.3f}")

mean_acc = float(np.mean(accs))
std_acc  = float(np.std(accs))

mean_acc, std_acc

A01 T->E accuracy: 0.826
A02 T->E accuracy: 0.653
...

(0.7523148148148148, 0.11606436237705253)
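
Note that `np.std` defaults to `ddof=0`, so the spread reported above is the population standard deviation over the nine subjects. The sample standard deviation (`ddof=1`) would be larger by a factor of sqrt(9/8), roughly 1.06. A small sketch of the difference, using only the two per-subject accuracies visible in the output above rather than the full set:

```python
import numpy as np

# Only the two per-subject values visible in the truncated output (A01, A02).
accs = np.array([0.826, 0.653])

pop_std = np.std(accs)             # ddof=0, NumPy's default
sample_std = np.std(accs, ddof=1)  # ddof=1, divides by n - 1

print(pop_std < sample_std)  # True
```

Which variant to report is a design choice; it is worth stating explicitly when comparing against published results.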

### 4-class CSP (One-vs-Rest)

For four-class motor imagery, we train one CSP model per class using a one-vs-rest strategy (class k vs all other classes).  
For each CSP model we extract log-variance features (first `m` + last `m` components), then concatenate features from all four models and train a multi-class classifier (LDA).  
Training is done on `A??T` and evaluation is performed on `A??E`.
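
To make the one-vs-rest split concrete, the sketch below counts the trials on each side for a balanced session of 288 trials (72 per class, as in the class counts shown for A01) and computes the length of the concatenated feature vector. The label array is synthetic and only mirrors those counts.

```python
import numpy as np

CLASSES = [1, 2, 3, 4]
m = 2
y = np.repeat(CLASSES, 72)  # synthetic labels: 288 balanced trials

for k in CLASSES:
    n_pos = int(np.sum(y == k))  # trials of class k
    n_neg = int(np.sum(y != k))  # trials of all other classes pooled together
    print(k, n_pos, n_neg)       # every class: 72 vs 216

n_features = len(CLASSES) * 2 * m  # four CSP models, 2m components each
print(n_features)  # 16
```

Each CSP model therefore sees a 1:3 class ratio, and the LDA classifier receives a 16-dimensional feature vector per trial when `m=2`.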


In [None]:
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, confusion_matrix

# labels in MAT: 1=left, 2=right, 3=foot, 4=tongue
CLASSES = [1, 2, 3, 4]

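def ovr_csp_features(X, y, m=2, reg=0.0):
    """Fit one CSP per class (class k vs rest) on X, y.

    X has shape (trials, ch, samples). Returns the fitted CSP models and
    the concatenated training features of shape (trials, 4*2m).
    """
    from filters.csp import CSP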

    csps = []
    feats = []

    for k in CLASSES:
        # class k vs rest
        idx_pos = np.where(y == k)[0]
        idx_neg = np.where(y != k)[0]

        X_pos = X[idx_pos]
        X_neg = X[idx_neg]

        XA = X_pos.transpose(1, 2, 0)  # (ch, samp, trials)
        XB = X_neg.transpose(1, 2, 0)

        csp = CSP(m=m, reg=reg)
        csp.fit(XA, XB)
        csps.append(csp)

        Fk = csp.compute_features(X.transpose(1, 2, 0))  # (trials, 2m)
        feats.append(Fk)

    F = np.concatenate(feats, axis=1)  # (trials, 4*2m)
    return csps, F

def ovr_csp_transform(csps, X):
    """Apply already-fitted CSP models to X (trials, ch, samples) and concatenate the features."""
    feats = []
    Xt = X.transpose(1, 2, 0)  # (ch, samp, trials)
    for csp in csps:
        feats.append(csp.compute_features(Xt))  # (trials, 2m)
    return np.concatenate(feats, axis=1)       # (trials, 4*2m)


In [35]:
# --- Single-subject 4-class: Train on T, Test on E ---
base = "./dataset/BCICIV_2a"

XT, yT = load_epochs_and_labels(f"{base}/A01T.gdf", f"{base}/A01T.mat")
XE, yE = load_epochs_and_labels(f"{base}/A01E.gdf", f"{base}/A01E.mat")

m = 2
csps, FT = ovr_csp_features(XT, yT, m=m, reg=0.0)
FE = ovr_csp_transform(csps, XE)

clf = LinearDiscriminantAnalysis()
clf.fit(FT, yT)
y_pred = clf.predict(FE)

acc = accuracy_score(yE, y_pred)
cm = confusion_matrix(yE, y_pred, labels=CLASSES)

acc, cm




(0.7361111111111112,
 array([[62,  3,  2,  5],
        [19, 46,  4,  3],
        [ 1,  1, 49, 21],
        [ 1,  0, 16, 55]]))

In [36]:
subjects = [f"A0{i}" for i in range(1, 10)]
accs = []

for subj in subjects:
    XT, yT = load_epochs_and_labels(f"{base}/{subj}T.gdf", f"{base}/{subj}T.mat")
    XE, yE = load_epochs_and_labels(f"{base}/{subj}E.gdf", f"{base}/{subj}E.mat")

    csps, FT = ovr_csp_features(XT, yT, m=2, reg=0.0)
    FE = ovr_csp_transform(csps, XE)

    clf = LinearDiscriminantAnalysis()
    clf.fit(FT, yT)

    y_pred = clf.predict(FE)
    acc = accuracy_score(yE, y_pred)
    accs.append(acc)

    print(f"{subj} 4-class T->E accuracy: {acc:.3f}")

mean_acc = float(np.mean(accs))
std_acc  = float(np.std(accs))

mean_acc, std_acc


A01 4-class T->E accuracy: 0.736
A02 4-class T->E accuracy: 0.517
...

(0.6296296296296297, 0.11344954402632265)