# Assignment: Training EEGNet on P300 EEG Data

In this assignment, you will work with real EEG data from a P300 speller experiment and implement the EEGNet architecture to detect P300 responses. The emphasis of this assignment is on understanding and implementing the EEGNet model rather than extensive signal preprocessing.

**Instructions:**
- Complete the provided code scaffolding
- Fill in missing logic where indicated
- Focus especially on the EEGNet architecture and training


## Part 1: Loading and Inspecting the Dataset

In this section, you will load the EEG dataset and inspect its basic structure. The dataset contains continuous EEG recordings along with stimulus and label information.

In [36]:
import scipy.io as sio
import numpy as np

# Load the dataset
# TODO: Update the path if needed
data = sio.loadmat('Subject_A_Train.mat')

# Inspect available keys
print(data.keys())


dict_keys(['__header__', '__version__', '__globals__', 'Signal', 'TargetChar', 'Flashing', 'StimulusCode', 'StimulusType'])


## Part 2: Understanding the Experimental Design

The P300 speller paradigm is based on detecting brain responses to rare target stimuli. In this section, you will identify how stimulus timing and labels are encoded in the data.

In [37]:
# TODO: Identify which variables correspond to
# 1. Continuous EEG signal
# 2. Stimulus onset information
# 3. Target vs non-target labels

# Hint: Look for variables related to stimulus codes and stimulus types


In [51]:
signal = np.array(data['Signal'], dtype=np.float32)
onsets = np.where(data['StimulusCode'].reshape(-1) == 1)[0]
labels = data['StimulusType'].reshape(-1)[onsets]
fs = 240

## Part 3: EEG Epoch Extraction

EEGNet does not operate on continuous EEG. Instead, the signal must be segmented into short epochs following each stimulus. This step converts raw EEG into trials suitable for supervised learning.

In [55]:
def extract_epochs(signal, stimulus_onsets, labels, fs, t_start=0.0, t_end=0.8):
    """
    Extract EEG epochs around each stimulus onset.

    Parameters:
    - signal: continuous EEG array of shape (time, channels)
    - stimulus_onsets: indices where stimuli occur
    - labels: target/non-target labels per stimulus
    - fs: sampling frequency in Hz
    - t_start: start time (seconds) relative to stimulus
    - t_end: end time (seconds) relative to stimulus

    Returns:
    - epochs: array of shape (num_trials, channels, time)
    - y: corresponding labels
    """
    # TODO: Implement epoch extraction logic
    # Hint: Convert time window to samples using fs

    start_sample = int(t_start*fs)
    end_sample = int(t_end*fs)
    num_samples = end_sample-start_sample

    epochs = []
    y_labels = []
    indices = np.where(onsets>0)[0]

    if len(indices) > 0:
        clean_indices = [indices[0]]
        for i in range(1, len(indices)):
            if indices[i]-clean_indices[-1]>(fs/2):
                clean_indices.append(indices[i])

        for idx in clean_indices:
            start = idx+start_sample
            end = idx+end_sample
            if end <= signal.shape[0]:
                epochs.append(signal[start:end,:].T)
                y_labels.append(labels[idx])

    return np.array(epochs), np.array(y_labels)

X_raw, y = extract_epochs(signal, onsets, labels, fs)

## Part 4: Preparing Data for EEGNet

In this section, you will perform minimal preprocessing to make the data compatible with EEGNet. Extensive signal processing is not required.

In [41]:
def prepare_for_eegnet(epochs):
    """
    Prepare EEG epochs for input into EEGNet.

    Expected input shape: (trials, channels, time)
    Expected output shape: (trials, 1, channels, time)
    """
    # TODO: Add singleton dimension required by Conv2D
    # Hint: Use numpy.expand_dims
    return np.expand_dims(epochs, axis=1)

X = prepare_for_eegnet(X_raw)

## Part 5: Implementing EEGNet

This is the core part of the assignment. You will implement the EEGNet architecture as discussed in class. Focus on matching the block structure and understanding the role of each layer.

In [42]:
signal.shape

(85, 7794, 64)

In [43]:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Conv2D, DepthwiseConv2D,
                                     SeparableConv2D, BatchNormalization,
                                     AveragePooling2D, Dropout, Flatten, Dense)

def EEGNet(nb_classes, Chans, Samples, F1=8, D=2, F2=16, dropoutRate=0.5):
    """
    EEGNet architecture.

    Parameters:
    - nb_classes: number of output classes
    - Chans: number of EEG channels
    - Samples: number of time samples per epoch
    - F1: number of temporal filters
    - D: depth multiplier for spatial filters
    - F2: number of pointwise filters
    """

    inputs = Input(shape=(1, Chans, Samples))

    # Block 1: Temporal Convolution
    block1 = Conv2D(F1,(1,64), padding='same', use_bias=False, data_format='channels_first')(inputs)
    block1 = BatchNormalization(axis=1)(block1)

    # Block 1: Spatial Convolution
    block1 = DepthwiseConv2D((Chans,1), use_bias=False,
                             depth_multiplier=D,
                             data_format='channels_first',
                             depthwise_constraint=tf.keras.constraints.MaxNorm(1.))(block1)
    block1 = BatchNormalization(axis=1)(block1)
    block1 = tf.keras.layers.Activation('elu')(block1)
    block1 = AveragePooling2D((1,4), data_format='channels_first')(block1)
    block1 = Dropout(dropoutRate)(block1)

    # Block 2: Separable Convolution
    block2 = SeparableConv2D(F2,(1,16), use_bias=False, padding='same', data_format='channels_first')(block1)
    block2 = BatchNormalization(axis=1)(block2)
    block2 = tf.keras.layers.Activation('elu')(block2)
    block2 = AveragePooling2D((1,8), data_format='channels_first')(block2)
    block2 = Dropout(dropoutRate)(block2)

    # Classification
    flatten = Flatten()(block2)
    dense = Dense(nb_classes, activation='softmax')(flatten)

    return Model(inputs=inputs, outputs=dense)

model = EEGNet(nb_classes=2, Chans=64, Samples=192)
model.summary()

## Part 6: Training the Model

In this section, you will train EEGNet to distinguish between P300 and non-P300 EEG epochs.

In [49]:
# TODO: Split the dataset into training and validation sets
# TODO: Compile the model with an appropriate loss and optimizer
# Hint: Use categorical cross-entropy and Adam optimizer
# TODO: Train the model and store the training history

from sklearn.model_selection import train_test_split
import tensorflow as tf

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=50,
                    batch_size=16)

ValueError: With n_samples=0, test_size=0.2 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters.