# Epileptic Seizure Classification with an Autoencoder and Classification of the Latent Space
This notebook classifies time-series EEG data to detect epileptic seizures, based on the preprocessed CHB-MIT Scalp EEG Database. It uses an autoencoder model followed by clustering of the latent space.<br>
The code is structured as follows:
1. [Imports](#1-imports)
2. [Load Preprocessed Dataset](#2-load-preprocessed-dataset)
3. [Split Dataset](#3-split-dataset)
4. [Normalize Dataset](#4-normalize-dataset)
5. [Autoencoder](#5-autoencoder) <br>
5.1 [Define Autoencoder-Model](#51-define-autoencoder-model) <br>
5.2 [Compile Autoencoder-Model](#52-compile-autoencoder-model) <br>
5.3 [Train Autoencoder](#53-fit-autoencoder-model) <br>
5.4 [Visualize Reconstruction Error](#54-visualize-reconstruction)
6. [Split Autoencoder at Latent Space](#6-seperate-encoder)
7. [Clustering of Latent Space](#7-clustering-of-latent-space) <br>
7.1 [Define Cluster](#71-define-cluster) <br>
7.2 [Clustering](#72-clustering) <br>
7.3 [Visualize Cluster](#73-visualize-clusters) <br>
8. [Conclusion](#8-conclusion)

## 1. Imports
Import required libraries. <br>
External packages can be installed via the `pip install -r requirements.txt` command or the notebook-cell below.

In [None]:
! pip install -r ../requirements.txt

In [43]:
# Import built-in libraries
import warnings
warnings.filterwarnings("ignore", message=".*The 'nopython' keyword.*") # Suppress SHAP warnings

# Import datascience libraries
import numpy as np

# Import preprocessing-libraries, classification metrics
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.metrics import f1_score, roc_auc_score, precision_score, recall_score
from imblearn.metrics import geometric_mean_score

# Import visualization libraries
import plotly.graph_objects as go
from prettytable import PrettyTable

# Import neural network framework & layers
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D, UpSampling1D, Dense, Flatten, BatchNormalization
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
from tslearn.clustering import TimeSeriesKMeans

# Import explainability library
import shap


## 2. Load Preprocessed Dataset
The preprocessed dataset, created with the notebook `00_Preprocessing.ipynb`, is loaded and the NumPy arrays for the features and labels are extracted. <br>
To ensure a usable class distribution in the dataset, the classes and their respective counts are printed.

In [2]:
dataset = np.load('../00_Data/Processed-Data/classification_dataset_max.npz') # Load compressed numpy array
X = dataset["features"] # Extract feature-array from compressed file
y = dataset["labels"] # Extract label-array from compressed file

In [79]:
channels = ['F8-T8', 'T7-FT9', 'F4-C4', 'C3-P3', 'P7-T7', 'P7-O1', 'T8-P8', 'FP1-F7', 'P8-O2', 'T7-P7', 'C4-P4', 'FT10-T8', 'P4-O2', 'F7-T7', 'CZ-PZ', 'FP2-F8', 'P3-O1', 'FP1-F3','FP2-F4', 'FZ-CZ', 'F3-C3', 'FT9-FT10', 'age', 'gender']

In [3]:
print("Shapes: \n X:", X.shape, "y:", y.shape)
print("Unique Values:", np.unique(y, return_counts=True))

Shapes: 
 X: (16999, 1000, 24) y: (16999, 1)
Unique Values: (array([0, 1], dtype=int8), array([9375, 7624]))


## 3. Split Dataset
To validate and test the trained model, the dataset must be split into `train`, `test`, and `validation` subsets (60 %, 20 %, and 20 % of the samples). <br>
To preserve the class distribution within each split, the `stratify` option is enabled.

In [4]:
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, shuffle=True, stratify=np.ravel(y), random_state=34)
X_test, X_val, y_test, y_val = train_test_split(X_rest, y_rest, test_size=0.5, shuffle=True, stratify=np.ravel(y_rest), random_state=34)
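
A quick way to confirm that stratification worked is to compare the class shares of each split with those of the full dataset. The helper below is a minimal sketch; `class_fractions` and `y_demo` are hypothetical names that are not part of the notebook, and the demo labels are synthetic stand-ins built from the class counts printed in section 2.

```python
import numpy as np

def class_fractions(labels: np.ndarray) -> dict:
    """Return the share of each class label in `labels`."""
    values, counts = np.unique(np.ravel(labels), return_counts=True)
    return {int(v): float(c) / float(counts.sum()) for v, c in zip(values, counts)}

# Synthetic stand-in labels with the class counts printed in section 2
y_demo = np.concatenate([
    np.zeros(9375, dtype=np.int8),  # class 0
    np.ones(7624, dtype=np.int8),   # class 1
])

full_fractions = class_fractions(y_demo)
print(full_fractions)
```

Applied to `y_train`, `y_test`, and `y_val`, the resulting shares should be close to those of `y` if the stratified split behaved as intended.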

## 4. Normalize Dataset
When working with neural networks, it is important to normalize the data before training and testing. This speeds up training, avoids numerical instabilities, and can improve the generalization of the neural network. With EEG data, however, there are additional requirements due to the different characteristics and value ranges of the individual channels. Therefore, the normalization is fitted channel by channel on the training subset and then applied to the test and validation splits.

In [5]:
def normalize_features(X_train:np.ndarray, X_test:np.ndarray, X_val:np.ndarray, use_standard_scaler:bool=False) -> tuple:
    if(use_standard_scaler):
        scaler = StandardScaler() # Create Z-Score normalizer
    else:
        scaler = MinMaxScaler() # Create Min-Max normalizer
    X_train_norm = np.zeros(shape=(X_train.shape), dtype='float32') # Create empty array for normalized train-data
    X_test_norm = np.zeros(shape=(X_test.shape), dtype='float32') # Create empty array for normalized test-data
    X_val_norm = np.zeros(shape=(X_val.shape), dtype='float32') # Create empty array for normalized val-data
    for feature_col in range(X_train.shape[2]): # Iterate over features in dataset
        X_train_norm[:,:,feature_col] = scaler.fit_transform(X_train[:,:,feature_col]) # Fit and apply normalizer on current feature in train subset
        X_test_norm[:,:,feature_col] = scaler.transform(X_test[:,:,feature_col]) # Apply normalizer on current feature in test subset
        X_val_norm[:,:,feature_col] = scaler.transform(X_val[:,:,feature_col]) # Apply normalizer on current feature in val subset
    return X_train_norm, X_test_norm, X_val_norm

In [6]:
X_train_normalized, X_test_normalized, X_val_normalized = normalize_features(X_train, X_test, X_val, True)
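
Note that the function above passes 2-D slices of shape `(samples, timesteps)` to the scaler, so scikit-learn computes a separate mean and standard deviation for every timestep within each channel. If a single scale per channel is preferred, the statistics can be pooled over samples and timesteps instead. The following is a minimal NumPy sketch of that variant, not the method used in this notebook; `fit_channel_stats`, `apply_channel_stats`, and the demo arrays are hypothetical.

```python
import numpy as np

def fit_channel_stats(X_train: np.ndarray) -> tuple:
    """Mean and standard deviation per channel, pooled over samples and timesteps."""
    mean = X_train.mean(axis=(0, 1), keepdims=True)  # shape (1, 1, channels)
    std = X_train.std(axis=(0, 1), keepdims=True)
    std = np.where(std == 0, 1.0, std)  # avoid division by zero for constant channels
    return mean, std

def apply_channel_stats(X: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Apply z-score normalization with previously fitted statistics."""
    return ((X - mean) / std).astype("float32")

# Synthetic demo data with shape (samples, timesteps, channels)
rng = np.random.default_rng(0)
X_demo_train = rng.normal(loc=5.0, scale=3.0, size=(8, 100, 4))
X_demo_test = rng.normal(loc=5.0, scale=3.0, size=(4, 100, 4))

mean, std = fit_channel_stats(X_demo_train)               # fit on the train split only
X_demo_train_norm = apply_channel_stats(X_demo_train, mean, std)
X_demo_test_norm = apply_channel_stats(X_demo_test, mean, std)
```

As in the notebook, the statistics come from the training split only, which avoids leaking information from the test and validation data.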

## 5. Autoencoder
The following section covers the data preparation, construction, and training of the autoencoder. <br>

<b>What is an autoencoder?</b><br>
An autoencoder is a neural network architecture that is used for unsupervised machine learning tasks. It consists of two main components: the encoder and the decoder. The encoder takes the input data and transforms it into a lower-dimensional representation, the so-called "latent space". The decoder takes this representation and tries to reconstruct the original input data. The main objective during training is to minimize the reconstruction error. <br>

<b>How can autoencoders be used for the detection of epileptic seizures in EEG-data?</b><br>
There are two ways to use autoencoders for the detection of epileptic seizures in EEG data: the reconstruction error and the latent space. <br>
By training the autoencoder only on data that does not contain any epileptic seizures, the reconstruction error for "normal" data is minimized. If a sample with an active seizure is then passed through the autoencoder, its reconstruction error is expected to be higher. By defining an error threshold, a binary classification can be performed to separate normal samples from samples with an epileptic seizure.
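
The thresholding idea can be sketched with NumPy alone. This is an illustration under stated assumptions, not code from this notebook: the "reconstructions" are synthetic stand-ins for autoencoder output, the function names are hypothetical, and taking the 99th percentile of the errors on normal data as the threshold is only one possible choice.

```python
import numpy as np

def reconstruction_error(X: np.ndarray, X_reconstructed: np.ndarray) -> np.ndarray:
    """Mean squared error per sample, averaged over timesteps and channels."""
    return np.mean((X - X_reconstructed) ** 2, axis=(1, 2))

def classify_by_threshold(errors: np.ndarray, threshold: float) -> np.ndarray:
    """Label a sample 1 (seizure) if its error exceeds the threshold, else 0."""
    return (errors > threshold).astype(np.int8)

rng = np.random.default_rng(1)
X_normal = rng.normal(size=(50, 20, 3))
X_seizure = rng.normal(size=(10, 20, 3))

# Stand-in reconstructions: small error for normal data, large error for seizures
rec_normal = X_normal + rng.normal(scale=0.1, size=X_normal.shape)
rec_seizure = X_seizure + rng.normal(scale=1.0, size=X_seizure.shape)

err_normal = reconstruction_error(X_normal, rec_normal)
err_seizure = reconstruction_error(X_seizure, rec_seizure)

# Assumption: threshold at the 99th percentile of the errors on normal data
threshold = np.percentile(err_normal, 99)
pred_normal = classify_by_threshold(err_normal, threshold)
pred_seizure = classify_by_threshold(err_seizure, threshold)
```

In practice the threshold would be chosen on a validation split and evaluated on the test split.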

The second option is to use the latent space for the classification. The autoencoder is trained on the complete dataset with the same objective of minimizing the reconstruction error. By separating the decoder from the autoencoder, the latent space is exposed. Because the data differs when an epileptic seizure is present, the representation of these samples is expected to differ in the reduced space as well. Based on this assumption, the samples can be classified with a clustering approach.

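The following code implements a variant of the second approach: the autoencoder is trained only on the samples labeled as seizures (`y == 1`), and its latent space is clustered afterwards.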
### 5.1 Define Autoencoder-Model
**Adam** is used as the optimizer and the **mean squared error (MSE)** is used as the loss function.

*Note: This project was built on an M1 Mac. To ensure the best performance, the legacy version of the Adam optimizer was used. This can be changed when using Windows or newer versions of TensorFlow!*

In [19]:
X_train_anomalies = X_train_normalized[np.where(y_train == 1)[0]]

X_val_anomalies = X_val_normalized[np.where(y_val == 1)[0]]

X_test_anomalies = X_test_normalized[np.where(y_test == 1)[0]]

In [8]:
def build_and_compile_ae(train_shape:tuple, initial_lr:float=0.0001):
    inputs = Input(shape=(train_shape[1], train_shape[2])) # Input shape: (timesteps, channels)
    # Encoder
    E1 = Conv1D(filters=24, kernel_size=3, activation='relu', padding='same')(inputs)
    E2 = MaxPooling1D(pool_size=2, padding='same')(E1) # Halve the number of timesteps
    E3 = Conv1D(filters=12, kernel_size=3, activation='relu', padding='same')(E2)
    E4 = MaxPooling1D(pool_size=2, padding='same')(E3) # Halve the number of timesteps again
    latent_vector = Conv1D(filters=6, kernel_size=3, activation='relu', padding='same')(E4) # Latent space
    # Decoder
    D1 = Conv1D(filters=6, kernel_size=3, activation='relu', padding='same')(latent_vector)
    D2 = UpSampling1D(size=2)(D1) # Double the number of timesteps
    D3 = Conv1D(filters=12, kernel_size=3, activation='relu', padding='same')(D2)
    D4 = UpSampling1D(size=2)(D3) # Restore the original number of timesteps
    D5 = Conv1D(filters=24, kernel_size=3, activation='relu', padding='same')(D4)
    ae_outputs = Dense(train_shape[2])(D5) # Linear projection back to the input channels

    autoencoder = Model(inputs=inputs, outputs=ae_outputs)
    opt = tf.keras.optimizers.legacy.Adam(learning_rate=initial_lr) # Use the passed learning rate instead of a hard-coded value

    autoencoder.compile(optimizer=opt, loss='mse')
    return autoencoder, inputs, latent_vector
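
The layer shapes and parameter counts of the encoder can be checked by hand. The sketch below uses two hypothetical helpers (not part of the notebook) that apply the usual formulas: a `Conv1D` layer has `kernel_size * in_channels * filters + filters` parameters, and a pooling layer with `padding='same'` and pool size 2 yields `ceil(length / 2)` timesteps.

```python
import math

def conv1d_params(kernel_size: int, in_channels: int, filters: int) -> int:
    """Trainable parameters of a Conv1D layer: weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters

def pooled_length(length: int, pool_size: int = 2) -> int:
    """Output length of MaxPooling1D with padding='same' and stride = pool_size."""
    return math.ceil(length / pool_size)

timesteps, channels = 1000, 24  # shape of one sample, see section 2

params_e1 = conv1d_params(3, channels, 24)   # first encoder convolution: 1752
len_e2 = pooled_length(timesteps)            # 500 timesteps after the first pooling
params_e3 = conv1d_params(3, 24, 12)         # second encoder convolution: 876
len_e4 = pooled_length(len_e2)               # 250 timesteps after the second pooling
params_latent = conv1d_params(3, 12, 6)      # latent convolution: 222

latent_shape = (len_e4, 6)
compression = (timesteps * channels) / (latent_shape[0] * latent_shape[1])
print(latent_shape, compression)             # (250, 6) 16.0
```

Because 1000 is divisible by 4, the two `UpSampling1D` layers in the decoder restore the original 1000 timesteps exactly.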

### 5.2 Compile Autoencoder-Model
The model defined in the previous step is built and compiled so that it can be trained. Furthermore, two **callbacks** are defined: **Early Stopping** and a **dynamic learning rate**. Both callbacks monitor the validation loss: the first stops the fitting process after a defined number of epochs without improvement, and the second reduces the learning rate to enable better results. Finally, the model summary is printed to show the structure and number of parameters and to help detect errors in the definition.
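
To get a feeling for the dynamic learning rate, the sequence of values it can take is easy to enumerate. The sketch below assumes the reduction rule `new_lr = max(old_lr * factor, min_lr)` and uses the settings of this notebook (initial rate 0.001, factor 0.5, minimum 1e-7); `lr_reduction_steps` is a hypothetical helper, and the sketch ignores when a reduction is triggered (patience and cooldown).

```python
def lr_reduction_steps(initial_lr: float, factor: float, min_lr: float) -> list:
    """List every learning rate reached by repeated reductions down to min_lr."""
    rates = [initial_lr]
    while rates[-1] > min_lr:
        rates.append(max(rates[-1] * factor, min_lr))  # never drop below min_lr
    return rates

rates = lr_reduction_steps(initial_lr=1e-3, factor=0.5, min_lr=1e-7)
n_reductions = len(rates) - 1
print(n_reductions)  # 14 reductions until the minimum is reached
print(rates[-1])
```

With early stopping active, training will usually end long before the minimum learning rate is reached.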

In [9]:
autoencoder, ae_inputs, latent_vector = build_and_compile_ae(X_train.shape, 0.001)

earlystopper_ae = EarlyStopping(monitor='val_loss', patience=10, start_from_epoch=10, restore_best_weights=True, verbose=1)
reduce_lr_ae = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=0.0000001, verbose=1, cooldown=10)

autoencoder.summary()

Metal device set to: Apple M1 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 1000, 24)]        0         
                                                                 
 conv1d (Conv1D)             (None, 1000, 24)          1752      
                                                                 
 max_pooling1d (MaxPooling1D  (None, 500, 24)          0         
 )                                                               
                                                                 
 conv1d_1 (Conv1D)           (None, 500, 12)           876       
                                                                 
 max_pooling1d_1 (MaxPooling  (None, 250, 12)          0         
 1D)                                                             
                                              

### 5.3 Fit Autoencoder-Model
Fitting the neural network means training the weights and biases of each layer so that the model produces good predictions. The normalized seizure samples of the training split are used for training, and a validation is performed after every epoch on the corresponding samples of the normalized validation split to detect and prevent overfitting. Unlike a "normal" training process that uses features and labels, the feature data is used as both input and target. After training, the training and validation loss are plotted over the epochs to make overfitting or other issues during training easy to spot. In addition, an example of the original and reconstructed data is plotted to show the capabilities of the autoencoder. Finally, the model is saved as an .h5 file so that the model and/or weights can be loaded for later testing without retraining.

In [10]:
history_ae = autoencoder.fit(
    X_train_anomalies, 
    X_train_anomalies,
    epochs=1000,
    batch_size=50,
    validation_data=(X_val_anomalies, X_val_anomalies),
    shuffle=True,
    verbose=1,
    callbacks=[earlystopper_ae, reduce_lr_ae]
)

2023-07-02 12:00:31.298619: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz


[Training log: Epoch 1/500 through Epoch 77/500; the output is truncated at epoch 78]

In [11]:
fig = go.Figure(
    data = [
        go.Scatter(y=history_ae.history['loss'], name="train"),
        go.Scatter(y=history_ae.history['val_loss'], name="val"),
    ],
    layout = {"yaxis": {"title": "Loss [MSE]"}, "xaxis": {"title": "Epoch"}, "title": "Reconstruction Loss over Epochs"}
)

fig.show()

### 5.4 Visualize Reconstruction

In [78]:
a = np.transpose(X_train_anomalies, (0,2,1))
data = []
for i in range(len(a[0])):
    data.append(
        go.Scatter(y=a[0,i], name=channels[i])
    )

fig = go.Figure(
    data = data,
    layout = {"yaxis": {"title": "Normalized Channel Value"}, "xaxis": {"title": "Timestep"}, "title": "Original EEG-Sample"}
)

fig.show()

In [80]:
pred = autoencoder.predict(X_train_anomalies[:1])
a = np.transpose(pred, (0,2,1))
data = []
for i in range(len(a[0])):
    data.append(
        go.Scatter(y=a[0,i], name=channels[i])
    )

fig = go.Figure(
    data = data,
    layout = {"yaxis": {"title": "Normalized Channel Value"}, "xaxis": {"title": "Timestep"}, "title": "Reconstructed EEG-Sample"}
)

fig.show()



In [16]:
# autoencoder.save('../99_Assets/01_Saved Models/04_autoencoder_latent_space_clustering.h5')
autoencoder = tf.keras.models.load_model('../99_Assets/01_Saved Models/04_autoencoder_latent_space_clustering.h5')

## 6. Seperate Encoder & Create Dataset
After training the autoencoder, it is split at the latent space and a new model is formed containing only the encoder. With this, a dataset of the latent space is created for the test samples, which can be used for clustering in the next step.

In [18]:
encoder = Model(inputs=ae_inputs, outputs=latent_vector)

In [25]:
cluster_space = encoder.predict(X_test_anomalies)



## 7. Clustering of latent space
After the creation, training and splitting of the autoencoder, a clustering of the latent space can be performed. 

### 7.1 Define Cluster
Because the latent representation is still a time series (250 timesteps with 6 channels per sample), `TimeSeriesKMeans` is applied for clustering. Since the main goal of clustering is to distinguish between local and generalized seizures, two clusters are formed.

In [52]:
ts_kmeans = TimeSeriesKMeans(n_clusters=2, metric="euclidean", max_iter=50, random_state=0)
ts_kmeans.fit(cluster_space)
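
With the Euclidean metric, clustering time series amounts to ordinary k-means on the flattened series, as far as I understand the tslearn implementation. The following NumPy sketch of Lloyd's algorithm illustrates that idea on synthetic data; `kmeans_flat` is a hypothetical helper and not a replacement for `TimeSeriesKMeans` (for example, it uses a plain random initialization).

```python
import numpy as np

def kmeans_flat(series: np.ndarray, n_clusters: int = 2, max_iter: int = 50, seed: int = 0):
    """K-means with Euclidean distance on time series of shape (samples, timesteps, channels)."""
    X = series.reshape(len(series), -1)  # flatten each series into one vector
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(max_iter):
        # Distance of every sample to every center, shape (samples, n_clusters)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (keep it if the cluster is empty)
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers.reshape((n_clusters,) + series.shape[1:])

# Synthetic latent-like data: two well-separated groups of 20 series each
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=1.0, size=(20, 25, 6))
group_b = rng.normal(loc=5.0, scale=1.0, size=(20, 25, 6))
demo_series = np.concatenate([group_a, group_b])

demo_labels, demo_centers = kmeans_flat(demo_series, n_clusters=2)
```

On real latent vectors the clusters are rarely this clearly separated, so the result depends more strongly on the initialization.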

### 7.2 Clustering
Using the latent space of the test split as well as the defined TimeSeriesKMeans, the clustering is now performed and the results are stored.

In [53]:
pred = ts_kmeans.predict(cluster_space)

In [56]:
cluster_1 = X_test_anomalies[np.where(pred == 0)[0]]
cluster_2 = X_test_anomalies[np.where(pred == 1)[0]]

### 7.3 Visualize Clusters
Due to the high dimensionality and the time-series format, the clusters cannot be visualized directly. Therefore, samples from the two clusters are compared in order to draw conclusions about the distinction made.

In [81]:
a = np.transpose(cluster_1, (0,2,1))
data = []
for i in range(len(a[0])):
    data.append(
        go.Scatter(y=a[0,i], name=channels[i])
    )

fig = go.Figure(
    data = data,
    layout = {"yaxis": {"title": "Normalized Channel Value"}, "xaxis": {"title": "Timestep"}, "title": "EEG-Sample - Cluster 1"}
)

fig.show()

In [82]:
a = np.transpose(cluster_2, (0,2,1))
data = []
for i in range(len(a[0])):
    data.append(
        go.Scatter(y=a[0,i], name=channels[i])
    )

fig = go.Figure(
    data = data,
    layout = {"yaxis": {"title": "Normalized Channel Value"}, "xaxis": {"title": "Timestep"}, "title": "EEG-Sample - Cluster 2"}
)

fig.show()
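
As a complement to comparing individual samples, the latent vectors can be projected onto two dimensions to get a rough overview of the clusters. The sketch below performs a principal component analysis via singular value decomposition with NumPy; it is an optional addition under the assumption that a linear projection is informative here, and `pca_2d` and the demo data are hypothetical.

```python
import numpy as np

def pca_2d(latent: np.ndarray) -> np.ndarray:
    """Project latent time series of shape (samples, timesteps, channels) onto two principal components."""
    flat = latent.reshape(len(latent), -1)   # one vector per sample
    centered = flat - flat.mean(axis=0)      # center every dimension
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T               # coordinates on the first two components

# Synthetic stand-in for the latent space
rng = np.random.default_rng(7)
latent_demo = rng.normal(size=(30, 25, 6))
coords = pca_2d(latent_demo)
print(coords.shape)
```

Called with `cluster_space`, the two columns of the result could be drawn as a scatter plot, for example with `go.Scatter(x=coords[:, 0], y=coords[:, 1], mode="markers", marker={"color": pred})`, so that the cluster assignment is shown as color.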

## 8. Conclusion