# Deep Learning Exercise 9 - Time Series Anomaly Detection

This exercise is about detecting anomalous time series. It does not focus on anomalous points within a time series, but on whole time series that differ from the standard.

The data come from the [Time Series Classification Website](https://www.timeseriesclassification.com/dataset.php). We will use sensor data from an ECG dataset, taking the *Normal* class as the baseline and treating all other classes as anomalies.


[Open in Google Colab](https://colab.research.google.com/github/jplatos/VSB-FEI-Deep-Learning/blob/master/dl_09_time_series_anomalies.ipynb) [Download from GitHub](https://raw.githubusercontent.com/jplatos/VSB-FEI-Deep-Learning/main/dl_09_time_series_anomalies.ipynb)


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.io.arff import loadarff 
import plotly.express as px
pd.options.plotting.backend = "plotly"
from sklearn.model_selection import train_test_split
import tensorflow as tf
import tensorflow.keras as keras
from sklearn.metrics import mean_absolute_error

tf.version.VERSION

### Dataset preparation
The dataset was taken from the link above, with the train and test parts joined together. Each record contains 140 measurements of an ECG signal and belongs to one of 5 classes: class 1 is *Normal*, and the other classes represent different heart problems.

Part of class 1 will be used as training and validation data. The remaining data will be used for testing. The goal is to show how to detect a non-standard time series that differs from the normal run.

In [None]:
df = pd.read_feather('https://github.com/jplatos/VSB-FEI-Deep-Learning/raw/main/datasets/ecg5000.feather')

In [None]:
df.shape

In [None]:
df

In [None]:
df = df.sample(frac=1).reset_index(drop=True) # shuffle and reset data index

In [None]:
df # see the target value change

#### See the frequency of each class in the data

In [None]:
df.target.value_counts().plot.bar()

#### Extract the normal class and the anomaly part of the data and drop the target values

Then reshape the data into the form expected by recurrent models, *(number of records, number of time steps, number of features)*. Here each record is passed as a single time step with 140 features.

In [None]:
normal = df[df.target==1].drop(columns=['target']).values
normal.shape

In [None]:
anomaly = df[df.target!=1].drop(columns=['target']).values
anomaly.shape

In [None]:
sh = normal.shape
normal = np.reshape(normal, (sh[0],1, sh[1]))
normal.shape

In [None]:
sh = anomaly.shape
anomaly = np.reshape(anomaly, (sh[0],1, sh[1]))
anomaly.shape
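As a minimal sketch of what the reshape above does, the toy example below uses a made-up array of 3 records with 5 values each (the real data has 140 values per record). The array `toy` and its sizes are assumptions for illustration only.

```python
import numpy as np

# Hypothetical toy data: 3 records with 5 measurements each.
toy = np.arange(15, dtype=float).reshape(3, 5)

# Add a middle axis of length 1, as in the cells above:
# (records, values) -> (records, 1, values).
sh = toy.shape
toy_3d = np.reshape(toy, (sh[0], 1, sh[1]))

print(toy_3d.shape)                          # (3, 1, 5)
print(np.array_equal(toy_3d[:, 0, :], toy))  # True: the values are unchanged
```

Keras recurrent layers read a 3D input as *(batch, time steps, features)*, so with this layout each record is presented as one time step with all its values as features.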

Split the *normal* data into three groups: *train*, *validation*, and *test*. The ratio between them is *70%:12%:18%*.

In [None]:
train, test = train_test_split(normal, test_size=0.3, random_state=42)
val, test = train_test_split(test, test_size=0.6, random_state=42)
train.shape, val.shape, test.shape
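The ratio follows from applying two splits in a row: 30% is set aside first, and 60% of that part becomes the test set. The short sketch below reproduces the arithmetic; the helper name `split_fractions` is hypothetical and not part of the exercise.

```python
def split_fractions(first_test_size, second_test_size):
    """Return the (train, validation, test) fractions of a two-step split."""
    holdout = first_test_size          # part set aside by the first split
    train = 1.0 - holdout
    test = holdout * second_test_size  # part of the holdout kept for testing
    val = holdout - test               # the rest becomes validation data
    return round(train, 2), round(val, 2), round(test, 2)

print(split_fractions(0.3, 0.6))  # (0.7, 0.12, 0.18)
```

The actual record counts returned by `train_test_split` may differ from these fractions by a record or so because set sizes are rounded to whole numbers.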

### Model preparation

The model is very simple: it is an *autoencoder* built from LSTM layers. The first layer encodes the input and the second compresses it into a smaller representation. The decoder then reconstructs the data in its original form.


In [None]:
seq_len = 140
features = 1

model = keras.Sequential([
    # encoder
    # only the first layer needs the input shape; later layers infer it
    keras.layers.LSTM(128, input_shape=train[0].shape, return_sequences=True),
    keras.layers.LSTM(32, return_sequences=True),
    # decoder
    keras.layers.LSTM(32, return_sequences=True),
    keras.layers.LSTM(128, return_sequences=True),
    # linear output layer restores the original seq_len values
    keras.layers.Dense(seq_len, activation='linear')
])
model.summary()

In [None]:
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=['mae']
)

# Note: patience equals the number of epochs used below, so training is never stopped early.
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=100, restore_best_weights=True)


In [None]:
history = model.fit(train, train, validation_data=(val, val), epochs=100, callbacks=[early_stopping])

In [None]:
px.line(history.history)

### Training evaluation
Let's look at the reconstruction quality for the record at index 42 in the training, testing, and anomaly datasets. As can be seen, the reconstruction of *Normal* records from the train and test sets is good, while the reconstruction of the anomaly shows bigger differences.

In [None]:
index = 42
database = np.asarray([train[index], test[index], anomaly[index]])
predicted = model.predict(database)

In [None]:
px.line({'Real':database[0][0], 'Predicted':predicted[0][0]}, title='Train')

In [None]:
px.line({'Real':database[1][0], 'Predicted':predicted[1][0]}, title='Test')

In [None]:
px.line({'Real':database[2][0], 'Predicted':predicted[2][0]}, title='Anomaly')

### Reconstruction error measure
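The reconstruction error is the measure used for anomaly detection. Mean absolute error (MAE) is a good choice. Sometimes the sum of absolute errors is used instead, but for series of equal length it differs from MAE only by a constant factor, so the results are the same up to the scale of the threshold.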
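The relation between the two measures can be checked with a small NumPy sketch. The arrays `real` and `pred` are synthetic stand-ins for one record and its reconstruction, not output of the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for one real record and its reconstruction (140 values each).
real = rng.normal(size=140)
pred = real + rng.normal(scale=0.1, size=140)

abs_err = np.abs(real - pred)
mae = abs_err.mean()  # mean absolute error
sae = abs_err.sum()   # sum of absolute errors

# For a fixed length, the two differ only by the factor 140.
print(np.isclose(sae, mae * real.size))  # True
```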

First of all, let's see the histogram of reconstruction errors on the train data.

In [None]:
train_pred = model.predict(train)
differences = [mean_absolute_error(real, pred) for (real, pred) in zip(train, train_pred)]
px.histogram(differences, title='Train data reconstruction error')

Then look at the reconstruction error on the test dataset.

In [None]:
test_pred = model.predict(test)
differences = [mean_absolute_error(real, pred) for (real, pred) in zip(test, test_pred)]
px.histogram(differences, title='Test data reconstruction error')

#### Selection of the proper threshold
The threshold that distinguishes a normal time series from an anomalous one is the critical part. A threshold that is too high misclassifies many anomalies as normal (false normals). A threshold that is too low leads to a high false anomaly rate.

In [None]:
def evaluate_prediction(model, datasets, names, threshold):
  results = [f"{'Dataset':>10}{'Normal':>10}{'Anomaly':>10}\n"]
  for (name, dataset) in zip(names, datasets):
    predicted = model.predict(dataset)
    differences = [mean_absolute_error(real, pred) for (real, pred) in zip(dataset, predicted)]
    results.append(f'{name:>10}{sum(l<=threshold for l in differences):>10}{sum(l>threshold for l in differences):>10}\n')
  print(*results, sep='')  # each row already ends with a newline, so no extra separator


In [None]:
threshold = 0.2
evaluate_prediction(model, [train, test, anomaly], ['Train', 'Test', 'Anomaly'], threshold)
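One possible way to choose the threshold, sketched below, is to take a high percentile of the reconstruction errors on normal data and then check how many records fall on each side. This is an assumption for illustration, not the method used in this exercise: the error arrays are synthetic, and the helper names `threshold_from_percentile` and `error_rates` are hypothetical.

```python
import numpy as np

def threshold_from_percentile(normal_errors, q=99.0):
    """Value below which q percent of the normal errors fall."""
    return float(np.percentile(normal_errors, q))

def error_rates(normal_errors, anomaly_errors, thr):
    """Return (false anomaly rate, false normal rate) for the threshold thr."""
    false_anomaly = float(np.mean(np.asarray(normal_errors) > thr))
    false_normal = float(np.mean(np.asarray(anomaly_errors) <= thr))
    return false_anomaly, false_normal

# Synthetic stand-ins for reconstruction errors (not results from the model).
rng = np.random.default_rng(0)
normal_errors = rng.gamma(shape=2.0, scale=0.04, size=1000)
anomaly_errors = rng.gamma(shape=6.0, scale=0.08, size=1000)

thr = threshold_from_percentile(normal_errors, q=99.0)
fa, fn = error_rates(normal_errors, anomaly_errors, thr)
print(f"threshold={thr:.3f} false anomaly={fa:.3f} false normal={fn:.3f}")

# Raising the threshold can only lower the false anomaly rate
# and can only raise the false normal rate.
for t in (0.1, 0.2, 0.4):
    print(t, error_rates(normal_errors, anomaly_errors, t))
```

With real data, `normal_errors` would come from the validation set rather than the training set, so that the threshold is not tuned on records the model has already fitted.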

### Evaluation
As can be seen, a threshold of 0.2 leads to good results: less than 1% of training samples, 2% of test samples, and 11% of anomaly samples are misclassified.

Everything depends on the configuration of the LSTM autoencoder and the size of the compressed representation.

## Tasks
1. Try different sizes of the LSTM autoencoder.
2. Try a Conv1D autoencoder to better cover the specificity of the encoder.
3. Select a proper threshold value.

### References
1. [LSTM Autoencoder for Anomaly Detection for ECG data by Abhishek Shah](https://medium.com/@jwbtmf/lstm-autoencoder-for-anomaly-detection-for-ecg-data-5c0b07d00e50)
2. [Real-Time Anomaly Detection With Python by Anthony Cavin](https://towardsdatascience.com/real-time-anomaly-detection-with-python-36e3455e84e2)