# Epileptic Seizure Classification with an Autoencoder and Classification of the Latent Space
This notebook classifies time-series EEG data to detect epileptic seizures. It is based on the preprocessed CHB-MIT Scalp EEG Database and combines an autoencoder with a classification model.<br>
The code is structured as follows:
1. [Imports](#1-imports)
2. [Load Preprocessed Dataset](#2-load-preprocessed-dataset)
3. [Split Dataset](#3-split-dataset)
4. [Normalize Dataset](#4-normalize-dataset)
5. [Autoencoder](#5-autoencoder) <br>
5.1 [Define Autoencoder-Model](#51-define-autoencoder-model) <br>
5.2 [Compile Autoencoder-Model](#52-compile-autoencoder-model) <br>
5.3 [Train Autoencoder](#53-fit-autoencoder-model) <br>
5.4 [Visualize Reconstruction](#54-visualize-reconstruction)
6. [Split Autoencoder at Latent Space](#6-seperate-encoder)
7. [Binary Classification](#7-binary-classification) <br>
7.1 [Define Classification-Model](#71-define-classification-model) <br>
7.2 [Compile Classification-Model](#72-compile-classification-model) <br>
7.3 [Train Classifier](#73-fit-classification-model) <br>
8. [Validate Results](#8-validate-results)
9. [Conclusion](#9-conclusion)

## 1. Imports
Import required libraries. <br>
External packages can be installed with the `pip install -r requirements.txt` command or with the notebook cell below.

In [None]:
! pip install -r ../requirements.txt

In [None]:
# Import built-in libraries
import warnings
warnings.filterwarnings("ignore", message=".*The 'nopython' keyword.*") # Suppress SHAP warnings

# Import datascience libraries
import numpy as np

# Import preprocessing-libraries, classification metrics
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.metrics import f1_score, roc_auc_score, precision_score, recall_score
from imblearn.metrics import geometric_mean_score

# Import visualization libraries
import plotly.graph_objects as go
from prettytable import PrettyTable

# Import neural network framework & layers
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D, UpSampling1D, Dense, Flatten, BatchNormalization
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

# Import explainability library
import shap


## 2. Load Preprocessed Dataset
The preprocessed dataset created by the notebook `00_Preprocessing.ipynb` is loaded, and the NumPy arrays for the features and labels are extracted. <br>
To check the class distribution in the dataset, the classes are printed together with their sample counts.

In [None]:
dataset = np.load('../00_Data/Processed-Data/classification_dataset_max.npz') # Load compressed numpy array
X = dataset["features"] # Extract feature-array from compressed file
y = dataset["labels"] # Extract label-array from compressed file

In [None]:
print("Shapes: \n X:", X.shape, "y:", y.shape)
print("Unique Values:", np.unique(y, return_counts=True))

## 3. Split Dataset
In order to validate and test the trained classifier, the dataset must be split into a `train`, `test`, and `validation` subset (60 %, 20 %, and 20 % of the samples). <br>
To preserve the class distribution within each split, the `stratify` option is enabled.

In [None]:
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, shuffle=True, stratify=np.ravel(y), random_state=34)
X_test, X_val, y_test, y_val = train_test_split(X_rest, y_rest, test_size=0.5, shuffle=True, stratify=np.ravel(y_rest), random_state=34)
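
The effect of the `stratify` option can be checked on a small example. The following sketch uses synthetic data (the arrays `X_demo` and `y_demo` and the 10 % positive rate are assumptions made for illustration, not properties of the CHB-MIT dataset) and applies the same two-stage split as above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced labels (assumption: 10 % positive class)
rng = np.random.default_rng(34)
y_demo = np.zeros(1000, dtype=int)
y_demo[:100] = 1
X_demo = rng.normal(size=(1000, 8))

# Same two-stage split as above: 60 % train, 20 % test, 20 % validation
X_tr, X_rest_demo, y_tr, y_rest_demo = train_test_split(
    X_demo, y_demo, test_size=0.4, shuffle=True, stratify=y_demo, random_state=34)
X_te, X_va, y_te, y_va = train_test_split(
    X_rest_demo, y_rest_demo, test_size=0.5, shuffle=True, stratify=y_rest_demo, random_state=34)

split_sizes = {"train": len(y_tr), "test": len(y_te), "val": len(y_va)}
positive_rates = {"train": y_tr.mean(), "test": y_te.mean(), "val": y_va.mean()}
print(split_sizes)
print(positive_rates)  # the positive rate should be (nearly) identical in all three splits
```

Without `stratify`, the positive rate of small splits can drift noticeably from the overall rate, which matters for rare classes such as seizure segments.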

## 4. Normalize Dataset
When working with neural networks, it is important to normalize the data before training and testing. This speeds up training, avoids numerical instabilities, and usually improves the generalization of the neural network. EEG data adds a further requirement, because the individual channels differ in their characteristics and value ranges. Therefore, the normalization is fitted channel by channel on the training subset and then applied to the test and validation splits.

In [None]:
def normalize_features(X_train:np.ndarray, X_test:np.ndarray, X_val:np.ndarray, use_standard_scaler:bool=False) -> tuple:
    """Normalize each channel (last axis) with a scaler that is fitted on the train subset only."""
    if use_standard_scaler:
        scaler = StandardScaler() # Create Z-Score normalizer
    else:
        scaler = MinMaxScaler() # Create Min-Max normalizer
    X_train_norm = np.zeros(shape=(X_train.shape), dtype='float32') # Create empty array for normalized train-data
    X_test_norm = np.zeros(shape=(X_test.shape), dtype='float32') # Create empty array for normalized test-data
    X_val_norm = np.zeros(shape=(X_val.shape), dtype='float32') # Create empty array for normalized val-data
    for feature_col in range(X_train.shape[2]): # Iterate over channels (last axis) in dataset
        # Each slice has shape (samples, timesteps); the scaler computes its statistics per column, i.e. per time step of the current channel
        X_train_norm[:,:,feature_col] = scaler.fit_transform(X_train[:,:,feature_col]) # Fit and apply normalizer on current channel in train subset
        X_test_norm[:,:,feature_col] = scaler.transform(X_test[:,:,feature_col]) # Apply normalizer on current channel in test subset
        X_val_norm[:,:,feature_col] = scaler.transform(X_val[:,:,feature_col]) # Apply normalizer on current channel in val subset
    return X_train_norm, X_test_norm, X_val_norm

In [None]:
X_train_normalized, X_test_normalized, X_val_normalized = normalize_features(X_train, X_test, X_val, True)
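
Slicing `X_train[:, :, feature_col]` yields a `(samples, timesteps)` matrix, and scikit-learn scalers compute their statistics per column. The function above therefore standardizes every time step of a channel separately. If a single mean and standard deviation per channel is preferred, one possible alternative is to compute the statistics over the sample and time axes together. The sketch below illustrates this with plain NumPy; it is not the method used in this notebook, `zscore_per_channel` and the synthetic arrays are hypothetical, and axis 2 is assumed to be the channel axis.

```python
import numpy as np

def zscore_per_channel(train, *others, eps=1e-8):
    """Z-score every channel with one mean/std pair estimated on the training split only."""
    mean = train.mean(axis=(0, 1), keepdims=True)  # shape (1, 1, channels)
    std = train.std(axis=(0, 1), keepdims=True)    # shape (1, 1, channels)
    return tuple(((arr - mean) / (std + eps)).astype("float32") for arr in (train, *others))

rng = np.random.default_rng(0)
# Synthetic data: 3 channels with very different offsets and value ranges
channel_scale = np.array([1.0, 50.0, 0.01])
channel_offset = np.array([0.0, 100.0, -5.0])
train_demo = rng.normal(size=(200, 64, 3)) * channel_scale + channel_offset
test_demo = rng.normal(size=(50, 64, 3)) * channel_scale + channel_offset

train_norm, test_norm = zscore_per_channel(train_demo, test_demo)
print(train_norm.mean(axis=(0, 1)))  # close to 0 for every channel
print(train_norm.std(axis=(0, 1)))   # close to 1 for every channel
```

As in `normalize_features`, the test data is transformed with the training statistics, so no information from the test split leaks into the normalization.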

## 5. Autoencoder
The following section contains the data preparation, construction, and training of the autoencoder. <br>

<b>What is an autoencoder?</b><br>
An autoencoder is a neural network architecture that is used for unsupervised machine learning tasks. It consists of two main components: the encoder and the decoder. The encoder takes the input data and transforms it into a lower-dimensional representation, the so-called "latent space". The decoder takes this representation and tries to reconstruct the original input data. The goal during training is to minimize the reconstruction error. <br>

<b>How can autoencoders be used for the detection of epileptic seizures in EEG-data?</b><br>
Autoencoders can be used in two ways to detect epileptic seizures in EEG data: via the reconstruction error or via the latent space. <br>
In the first option, the autoencoder is trained only on data that does not contain any epileptic seizures, so the reconstruction error is minimized for "normal" data. If a sample with an active seizure is then passed through the autoencoder, the reconstruction error is expected to be higher. By defining an error threshold, a binary classification can be performed to separate normal samples from samples with an epileptic seizure.
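
The threshold idea can be sketched with NumPy alone. In the following illustration, the arrays stand in for inputs and autoencoder reconstructions; the data is synthetic, and both the helper `reconstruction_error` and the choice of the 99th percentile as threshold are assumptions, not part of this notebook.

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Mean squared error per sample, averaged over time steps and channels."""
    return np.mean((x - x_hat) ** 2, axis=(1, 2))

rng = np.random.default_rng(1)
# Synthetic stand-ins with shape (samples, timesteps, channels)
x_normal = rng.normal(size=(500, 32, 4))
x_seizure = rng.normal(size=(20, 32, 4))
recon_normal = x_normal + rng.normal(scale=0.1, size=x_normal.shape)    # small reconstruction error
recon_seizure = x_seizure + rng.normal(scale=1.0, size=x_seizure.shape)  # large reconstruction error

err_normal = reconstruction_error(x_normal, recon_normal)
err_seizure = reconstruction_error(x_seizure, recon_seizure)

# Assumption: threshold = 99th percentile of the errors on seizure-free data
threshold = np.percentile(err_normal, 99)
flag_normal = err_normal > threshold
flag_seizure = err_seizure > threshold
print(f"threshold = {threshold:.4f}")
print("false alarms:", int(flag_normal.sum()), "| flagged seizure samples:", int(flag_seizure.sum()))
```

With real EEG data the two error distributions overlap far more than in this toy example, so the threshold trades false alarms against missed seizures.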

The second option is to use the latent space for the classification. The autoencoder is trained on the complete dataset with the same task of minimizing the reconstruction error. By separating the decoder from the autoencoder, the latent space is exposed. Because the data differs when an epileptic seizure is present, the representation of these samples is assumed to differ in the reduced space as well. Based on this assumption, the latent representation can be classified, for example with a clustering approach or, as in this notebook, with a supervised classifier.

The following code contains the second approach.
### 5.1 Define Autoencoder-Model
**Adam** is used as the optimizer and the **mean squared error (MSE)** is used as the loss function.

*Note: This project was built on an M1 Mac. To ensure the best performance, the legacy version of the Adam optimizer was used. This can be changed when using Windows or newer versions of TensorFlow!*

In [None]:
def build_and_compile_ae(train_shape:tuple, initial_lr:float=0.0001):
    inputs = Input(shape=(train_shape[1], train_shape[2])) # (timesteps, channels)
    # Encoder: two convolution + pooling stages reduce the temporal resolution by a factor of 4
    E1 = Conv1D(filters=24, kernel_size=3, activation='relu', padding='same')(inputs)
    E2 = MaxPooling1D(pool_size=2, padding='same')(E1)
    E3 = Conv1D(filters=12, kernel_size=3, activation='relu', padding='same')(E2)
    E4 = MaxPooling1D(pool_size=2, padding='same')(E3)
    latent_vector = Conv1D(filters=6, kernel_size=3, activation='relu', padding='same')(E4)
    # Decoder: mirrors the encoder and restores the temporal resolution via upsampling
    D1 = Conv1D(filters=6, kernel_size=3, activation='relu', padding='same')(latent_vector)
    D2 = UpSampling1D(size=2)(D1)
    D3 = Conv1D(filters=12, kernel_size=3, activation='relu', padding='same')(D2)
    D4 = UpSampling1D(size=2)(D3)
    D5 = Conv1D(filters=24, kernel_size=3, activation='relu', padding='same')(D4)
    ae_outputs = Dense(train_shape[2])(D5) # Linear projection back to the original number of channels

    autoencoder = Model(inputs=inputs, outputs=ae_outputs)
    opt = tf.keras.optimizers.legacy.Adam(learning_rate=initial_lr) # Use the learning rate passed by the caller

    autoencoder.compile(optimizer=opt, loss='mse')
    return autoencoder, inputs, latent_vector
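
One detail of this architecture is worth checking before training: `MaxPooling1D` with `padding='same'` rounds the sequence length up when halving it, while `UpSampling1D` doubles it exactly. The reconstruction therefore only has the same length as the input if the number of time steps is divisible by 4. The following sketch reproduces this arithmetic in plain Python; `reconstructed_length` is a hypothetical helper, and the example lengths are arbitrary.

```python
import math

def reconstructed_length(timesteps: int, n_pool: int = 2, pool_size: int = 2) -> int:
    """Sequence length after n_pool 'same'-padded poolings followed by n_pool upsamplings."""
    length = timesteps
    for _ in range(n_pool):
        length = math.ceil(length / pool_size)  # MaxPooling1D(pool_size, padding='same')
    for _ in range(n_pool):
        length *= pool_size                     # UpSampling1D(size=pool_size)
    return length

for timesteps in (256, 250, 130):
    print(timesteps, "->", reconstructed_length(timesteps))
```

If input and output lengths differ, the input cannot be used directly as the reconstruction target; padding or cropping the windows to a multiple of 4 would be one way to handle that case.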

### 5.2 Compile Autoencoder-Model
In order to train the neural network, the model defined in the previous step is built and compiled. Furthermore, two **callbacks** are defined: **Early Stopping** and a **dynamic Learning Rate**. Both callbacks monitor the validation loss: Early Stopping ends the fitting process after a defined number of epochs without improvement, and the dynamic Learning Rate reduces the learning rate when the validation loss stops improving. Finally, the model summary is printed to show the structure and the number of parameters and to help detect errors in the definition.

In [None]:
autoencoder, ae_inputs, latent_vector = build_and_compile_ae(X_train.shape, 0.001)

earlystopper_ae = EarlyStopping(monitor='val_loss', patience=10, start_from_epoch=10, restore_best_weights=True, verbose=1)
reduce_lr_ae = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=0.0000001, verbose=1, cooldown=10)

autoencoder.summary()

### 5.3 Fit Autoencoder-Model
Fitting the neural network means training the weights and biases of each layer so that the model makes good predictions. The normalized training data is used for the training step, and a validation is performed after every epoch on the normalized validation split to detect overfitting. Unlike a "normal" training process with features and labels, the feature data is used as both input and target. After training, the progression of the training and validation loss is plotted to inspect the fit process and to detect overfitting or other issues during training. In addition, an example of the original and the reconstructed data is plotted to show the capabilities of the autoencoder. Finally, the model is saved as an .h5 file so that the model and/or its weights can be loaded for later testing without retraining.

In [None]:
history_ae = autoencoder.fit(
    X_train_normalized, 
    X_train_normalized,
    epochs=500,
    batch_size=50,
    validation_data=(X_val_normalized, X_val_normalized),
    shuffle=True,
    verbose=1,
    callbacks=[earlystopper_ae, reduce_lr_ae]
)

In [None]:
fig = go.Figure(
    data = [
        go.Scatter(y=history_ae.history['loss'], name="train"),
        go.Scatter(y=history_ae.history['val_loss'], name="val"),
    ],
    layout = {"yaxis": {"title": "Loss [MSE]"}, "xaxis": {"title": "Epoch"}, "title": "Reconstruction Loss over Epochs"}
)

fig.show()

### 5.4 Visualize Reconstruction

In [None]:
# Plot every channel of the first normalized training sample
a = np.transpose(X_train_normalized, (0,2,1)) # Reorder to (samples, channels, timesteps)
data = []
for i in a[0]:
    data.append(
        go.Scatter(y=i)
    )

fig = go.Figure(
    data = data,
    layout = {"yaxis": {"title": "Normalized Amplitude"}, "xaxis": {"title": "Time Step"}, "title": "Original Sample (Normalized)"}
)

fig.show()

In [None]:
# Plot every channel of the reconstruction of the same training sample
pred = autoencoder.predict(X_train_normalized[:1])
a = np.transpose(pred, (0,2,1)) # Reorder to (samples, channels, timesteps)
data = []
for i in a[0]:
    data.append(
        go.Scatter(y=i)
    )

fig = go.Figure(
    data = data,
    layout = {"yaxis": {"title": "Normalized Amplitude"}, "xaxis": {"title": "Time Step"}, "title": "Reconstructed Sample"}
)

fig.show()

In [None]:
autoencoder.save('../99_Assets/01_Saved Models/03_autoencoder_latent_space_clf.h5')
# autoencoder = tf.keras.models.load_model('../99_Assets/01_Saved Models/03_autoencoder_latent_space_clf.h5')

## 6. Seperate Encoder
In order to classify the latent space, the autoencoder model must be split at the latent vector so that the encoder can be used on its own. To prevent the already trained weights and biases from being overwritten, the `trainable` attribute of every encoder layer is set to `False`.

In [None]:
encoder = Model(inputs=ae_inputs, outputs=latent_vector)

for layer in encoder.layers:
    layer.trainable = False

## 7. Binary Classification
After the creation, training and splitting of the autoencoder, the classification of the latent space can be performed. 

### 7.1 Define Classification-Model
Because the latent space is two-dimensional (time steps, convolutional filters), convolutional layers are also used in the first layers of the classification model. The information is further reduced with decreasing filter counts and pooling layers and is finally flattened into a one-dimensional representation. This representation is processed by dense layers and finally reduced to a single neuron with a sigmoid activation function, which limits the output to values between 0 and 1.

In [None]:
def build_and_compile_clf(encoder):
    classification_model = tf.keras.Sequential()
    classification_model.add(encoder)
    classification_model.add(Conv1D(filters=6, kernel_size=3, padding='same'))
    classification_model.add(MaxPooling1D(pool_size=2, padding='same'))
    classification_model.add(BatchNormalization())
    classification_model.add(Conv1D(filters=4, kernel_size=3, padding='same'))
    classification_model.add(MaxPooling1D(pool_size=2, padding='same'))
    classification_model.add(BatchNormalization())
    classification_model.add(Conv1D(filters=2, kernel_size=3, padding='same'))
    classification_model.add(MaxPooling1D(pool_size=2, padding='same'))
    classification_model.add(BatchNormalization())
    classification_model.add(Flatten())
    classification_model.add(Dense(32))
    classification_model.add(Dense(1, activation='sigmoid'))

    opt = tf.keras.optimizers.legacy.Adam(learning_rate=0.001)

    classification_model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['binary_accuracy'])
    return classification_model

### 7.2 Compile Classification-Model
In order to train the neural network, the model defined in the previous step is built and compiled. Furthermore, two **callbacks** are defined: **Early Stopping** and a **dynamic Learning Rate**. Both callbacks monitor the validation loss: Early Stopping ends the fitting process after a defined number of epochs without improvement, and the dynamic Learning Rate reduces the learning rate when the validation loss stops improving. Finally, the model summary is printed to show the structure and the number of parameters and to help detect errors in the definition.

In [None]:
clf = build_and_compile_clf(encoder)

earlystopper_clf = EarlyStopping(monitor='val_loss', patience=25, restore_best_weights=True, verbose=1)
reduce_lr_clf = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=0.0000001, verbose=1, cooldown=10)

clf.summary()

### 7.3 Fit Classification-Model
Fitting the neural network means training the weights and biases of each layer so that the model makes good predictions. The normalized training data is used for the training step, and a validation is performed after every epoch on the normalized validation split to detect overfitting. After training, the progression of the training and validation loss is plotted to inspect the fit process and to detect overfitting or other issues during training. Finally, the model can be saved as an .h5 file so that the model and/or its weights can be loaded for later testing without retraining.

In [None]:
history_clf = clf.fit(
    X_train_normalized, 
    y_train, 
    epochs=1250, 
    batch_size=50,
    validation_data=(X_val_normalized, y_val),
    verbose=1, 
    callbacks=[earlystopper_clf, reduce_lr_clf]
)

In [None]:
fig = go.Figure(
    data = [
        go.Scatter(y=history_clf.history['loss'], name="train"),
        go.Scatter(y=history_clf.history['val_loss'], name="val"),
    ],
    layout = {"yaxis": {"title": "Loss [Binary Crossentropy]"}, "xaxis": {"title": "Epoch"}, "title": "Model Loss over Epochs"}
)

fig.show()

In [None]:
# clf.save('../99_Saved Models/latent_clf.h5')
# clf = tf.keras.models.load_model('../99_Saved Models/latent_clf.h5')

## 8. Validate Results
To ensure correct training without overfitting and to demonstrate the generalizability of the model, a validation step is performed last. It is based on the test subset, which the neural network has not seen during training. The obtained results can therefore be used as an estimate of how well the model predicts unseen data. Since the classes may be imbalanced, depending on the dataset, accuracy is not used as the deciding metric.

The F1 score, the G-Mean, the area under the ROC curve (AUC), and the basic Precision and Recall are calculated in the following section.
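
For a binary problem, these threshold-based metrics all follow from the four cells of the confusion matrix. The sketch below computes them with NumPy on a small hand-made example; `binary_metrics` and the demo arrays are hypothetical, the G-Mean is taken as the geometric mean of recall and specificity, and the helper assumes that both classes occur and that at least one sample is predicted positive.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Precision, recall, specificity, F1 and G-Mean from the binary confusion matrix."""
    y_true = np.ravel(y_true).astype(bool)
    y_pred = np.ravel(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)    # seizure correctly detected
    tn = np.sum(~y_true & ~y_pred)  # normal sample correctly rejected
    fp = np.sum(~y_true & y_pred)   # false alarm
    fn = np.sum(y_true & ~y_pred)   # missed seizure
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "precision": float(precision),
        "recall": float(recall),
        "specificity": float(specificity),
        "f1": float(2 * precision * recall / (precision + recall)),
        "g_mean": float(np.sqrt(recall * specificity)),
    }

y_true_demo = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred_demo = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])
metrics_demo = binary_metrics(y_true_demo, y_pred_demo)
print(metrics_demo)
```

Unlike accuracy, both the F1 score and the G-Mean drop sharply when the minority class is handled poorly, which is why they are preferred for imbalanced seizure data.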

In [None]:
y_test_probabilities = np.ravel(clf.predict(X_test_normalized)) # Sigmoid outputs between 0 and 1
y_test_predictions = (y_test_probabilities >= 0.5).astype(int) # Threshold at 0.5 to obtain class labels
f1score = f1_score(y_test, y_test_predictions)
gm = geometric_mean_score(y_test, y_test_predictions, average="binary")
auc = roc_auc_score(y_test, y_test_probabilities) # AUC is a ranking metric, so it is computed on the probabilities
precision = precision_score(y_test, y_test_predictions)
recall = recall_score(y_test, y_test_predictions)
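
The AUC is a ranking metric, so it is usually computed from the continuous sigmoid outputs. On thresholded 0/1 predictions the ROC curve collapses to a single operating point, and the AUC becomes the mean of recall and specificity. The sketch below illustrates the difference with a pairwise definition of the AUC in NumPy; `pairwise_auc` and the demo values are hypothetical.

```python
import numpy as np

def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    y_true = np.ravel(y_true).astype(bool)
    y_score = np.ravel(y_score).astype(float)
    pos = y_score[y_true][:, None]   # scores of positive samples as a column
    neg = y_score[~y_true][None, :]  # scores of negative samples as a row
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

y_true_demo = np.array([0, 0, 0, 1, 1, 1])
scores_demo = np.array([0.1, 0.2, 0.45, 0.4, 0.7, 0.9])  # e.g. sigmoid outputs
labels_demo = (scores_demo >= 0.5).astype(int)           # thresholded predictions

auc_from_scores = pairwise_auc(y_true_demo, scores_demo)
auc_from_labels = pairwise_auc(y_true_demo, labels_demo)
print(f"AUC from scores: {auc_from_scores:.3f}")
print(f"AUC from labels: {auc_from_labels:.3f}")
```

The two values differ because thresholding discards the ordering of the samples within each predicted class.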

In [None]:
data = [["F1-Score", "G-Mean", "AUC", "Precision", "Recall"], [f1score, gm, auc, precision, recall]]
table = PrettyTable(data[0])
table.add_rows(data[1:])
print(table)

## 9. Conclusion