
In [1]:
! python3 -c 'import tensorflow as tf; print(tf.__version__)'  # for Python 3

2021-03-23 17:13:44.410731: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-03-23 17:13:44.410789: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2.4.1


# Sequence-to-Sequence Prediction Problems
Sequence prediction often involves forecasting the next value in a real-valued sequence or outputting a class label for an input sequence.

This is often framed as mapping one input time step to one output time step (one-to-one), or multiple input time steps to one output time step (many-to-one).

Sequence-to-sequence (seq2seq) problems are harder: the model takes a sequence as input and must produce a sequence as output. One approach to seq2seq prediction that has proven very effective is the Encoder-Decoder LSTM.

## Encoder-Decoder LSTM
The LSTM network can be organized into an architecture called the Encoder-Decoder LSTM, which lets the model both accept variable-length input sequences and predict variable-length output sequences.

In this architecture, an encoder LSTM model reads the input sequence step by step. After it has read the entire input sequence, the hidden state or output of this model is a learned internal representation of the whole sequence as a fixed-length vector. This vector is then provided as input to the decoder model, which interprets it as each step of the output sequence is generated.

The architecture is composed of two models: one reads the input sequence and encodes it into a fixed-length vector, and the second decodes that vector and outputs the predicted sequence. Using the two models in concert gives the architecture its name, Encoder-Decoder LSTM, and it is designed specifically for seq2seq problems.

The innovation of this architecture is the fixed-size internal representation at the heart of the model, which input sequences are read into and output sequences are read from. For this reason, the method may be referred to as sequence embedding.

    … RNN Encoder-Decoder, consists of two recurrent neural networks (RNN) that act as an encoder and a decoder pair. The encoder maps a variable-length source sequence to a fixed-length vector, and the decoder maps the vector representation back to a variable-length target sequence.

    — Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, 2014.

The Encoder-Decoder LSTM was developed for natural language processing problems, where it demonstrated state-of-the-art performance, specifically in the area of text translation called statistical machine translation.

    The proposed RNN Encoder-Decoder naturally generates a continuous-space representation of a phrase. […] From the visualization, it is clear that the RNN Encoder-Decoder captures both semantic and syntactic structures of the phrases

    — Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, 2014.

## Keras implementation

For a given dataset of sequences, an encoder-decoder LSTM is configured to read the input sequence, encode it, decode it, and recreate it. The performance of the model is evaluated based on the model’s ability to recreate the input sequence.

Once the model achieves a desired level of performance recreating the sequence, the decoder part of the model may be removed, leaving just the encoder model. This model can then be used to encode input sequences to a fixed-length vector.

The resulting vectors can then be used in a variety of applications, not least as a compressed representation of the sequence as an input to another supervised learning model.

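We can think of the model as being composed of two key parts: the encoder and the decoder.

One or more LSTM layers can be used to implement the encoder model. The output of this model is a fixed-size vector holding the internal representation of the input sequence. The number of memory cells in the last encoder layer defines the length of this fixed-size vector.

The decoder turns this vector back into a sequence: the vector is repeated once per output time step (`RepeatVector` in the models below) and passed through one or more LSTM layers that output a value at every step.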
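
To make the fixed-length idea concrete, here is a minimal NumPy sketch of a plain tanh recurrent encoder. It is not an LSTM and not the Keras model used in this notebook: the weights are random and untrained, and the names `encode`, `W_x` and `W_h` are hypothetical. It only illustrates that the size of the encoding equals the number of hidden units, whatever the length of the input.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 1   # values per time step
n_units = 8      # hidden units; this sets the length of the encoding

# Random, untrained weights: for illustration only.
W_x = rng.normal(scale=0.5, size=(n_features, n_units))
W_h = rng.normal(scale=0.5, size=(n_units, n_units))
b = np.zeros(n_units)

def encode(sequence):
    """Run a simple tanh RNN over `sequence` and return the last hidden state."""
    h = np.zeros(n_units)
    for x_t in sequence:                       # one update per time step
        h = np.tanh(x_t @ W_x + h @ W_h + b)   # new hidden state
    return h

short = rng.normal(size=(5, n_features))    # 5 time steps
long = rng.normal(size=(50, n_features))    # 50 time steps

print(encode(short).shape)  # (8,)
print(encode(long).shape)   # (8,)
```

Both inputs map to a vector of length `n_units`, which is the property the encoder relies on.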

In [None]:
from os import getcwd
import os.path
import numpy as np
import pandas as pd
from pathlib import Path
from tensorflow import keras
from tensorflow.keras import layers
from matplotlib import pyplot as plt


In [None]:
datapath = Path("/data01/home/baroncelli/phd/repos/phd/rtapipe/analysis/notebook_dataset_generation_for_models_output")
datapath

In [None]:
currentdir = getcwd()
currentdir

In [None]:
outdir = Path(currentdir).joinpath("notebook_lstm_output")
outdir

In [None]:
master_url_root = "https://raw.githubusercontent.com/numenta/NAB/master/data/"

df_small_noise_url_suffix = "artificialNoAnomaly/art_daily_small_noise.csv"
df_small_noise_url = master_url_root + df_small_noise_url_suffix
df_small_noise = pd.read_csv(
    df_small_noise_url, parse_dates=True, index_col="timestamp"
)

df_daily_jumpsup_url_suffix = "artificialWithAnomaly/art_daily_jumpsup.csv"
df_daily_jumpsup_url = master_url_root + df_daily_jumpsup_url_suffix
df_daily_jumpsup = pd.read_csv(
    df_daily_jumpsup_url, parse_dates=True, index_col="timestamp"
)

In [None]:
print(df_small_noise.head())

print(df_daily_jumpsup.head())

In [None]:
fig, ax = plt.subplots()
df_small_noise.plot(legend=False, ax=ax)
plt.show()

In [None]:
fig, ax = plt.subplots()
df_daily_jumpsup.plot(legend=False, ax=ax)
plt.show()

In [None]:
training_mean = df_small_noise.mean()
training_std = df_small_noise.std()
df_training_value = (df_small_noise - training_mean) / training_std
print("Number of training samples:", len(df_training_value))
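
The standardization above can be checked on a toy array. The sketch below uses NumPy only and made-up numbers; `train` and `test` are hypothetical stand-ins for the two data frames. It assumes, as is common practice, that any data scored later is scaled with the training mean and standard deviation rather than its own.

```python
import numpy as np

train = np.array([10.0, 12.0, 11.0, 13.0, 14.0])  # made-up "normal" values
test = np.array([11.0, 30.0])                     # made-up values with a jump

mean = train.mean()
std = train.std(ddof=1)  # ddof=1 matches the default of pandas' DataFrame.std()

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics

print(train_scaled.mean(), train_scaled.std(ddof=1))
print(test_scaled)
```

After scaling, the training values have mean 0 and unit standard deviation, while the jump in `test` shows up as a value many standard deviations away.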

In [None]:
df_training_value.head()

In [None]:
TIME_STEPS = 288

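# Generate training sequences for use in the model: one window of
# `time_steps` consecutive rows per starting index.
def create_sequences(values, time_steps=TIME_STEPS):
    output = []
    # The "+ 1" keeps the last full window, which ends at the final row.
    for i in range(len(values) - time_steps + 1):
        output.append(values[i : (i + time_steps)])
    return np.stack(output)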


x_train = create_sequences(df_training_value.values)
print("Training input shape: ", x_train.shape)
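
The windowing step is easier to follow on a tiny series. The sketch below is a self-contained variant with a hypothetical helper, `make_windows`, applied to ten made-up values. Note that the number of windows depends on the loop bound: with `len(values) - time_steps + 1` iterations, every full window is included.

```python
import numpy as np

def make_windows(values, time_steps):
    """Stack every full window of `time_steps` consecutive rows."""
    n_windows = len(values) - time_steps + 1
    return np.stack([values[i : i + time_steps] for i in range(n_windows)])

toy = np.arange(10, dtype=float).reshape(-1, 1)  # 10 time steps, 1 feature
windows = make_windows(toy, time_steps=4)

print(windows.shape)         # (7, 4, 1)
print(windows[0].ravel())    # [0. 1. 2. 3.]
print(windows[-1].ravel())   # [6. 7. 8. 9.]
```

The result has shape `(samples, time_steps, features)`, which is the layout the Keras layers below expect as input.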

## Convolutional Autoencoder model

In [None]:
modelConv = keras.Sequential(
    [
        layers.Input(shape=(x_train.shape[1], x_train.shape[2])),
        layers.Conv1D(
            filters=32, kernel_size=7, padding="same", strides=2, activation="relu"
        ),
        layers.Dropout(rate=0.2),
        layers.Conv1D(
            filters=16, kernel_size=7, padding="same", strides=2, activation="relu"
        ),
        layers.Conv1DTranspose(
            filters=16, kernel_size=7, padding="same", strides=2, activation="relu"
        ),
        layers.Dropout(rate=0.2),
        layers.Conv1DTranspose(
            filters=32, kernel_size=7, padding="same", strides=2, activation="relu"
        ),
        layers.Conv1DTranspose(filters=1, kernel_size=7, padding="same"),
    ]
)
modelConv.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mse")
modelConv.summary()
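
The sequence length through the convolutional autoencoder can be worked out by hand. The sketch below assumes the usual Keras rules for `padding="same"`: a `Conv1D` layer outputs `ceil(n / stride)` steps and a `Conv1DTranspose` layer outputs `n * stride` steps. The helper names are hypothetical.

```python
import math

def conv1d_same_length(n, stride):
    """Output length of Conv1D with padding='same'."""
    return math.ceil(n / stride)

def conv1d_transpose_same_length(n, stride):
    """Output length of Conv1DTranspose with padding='same'."""
    return n * stride

n = 288  # TIME_STEPS
lengths = [n]
for stride in (2, 2):        # the two Conv1D layers
    n = conv1d_same_length(n, stride)
    lengths.append(n)
for stride in (2, 2, 1):     # the three Conv1DTranspose layers
    n = conv1d_transpose_same_length(n, stride)
    lengths.append(n)

print(lengths)  # [288, 144, 72, 144, 288, 288]
```

The output length matches the input only because 288 is divisible by 4. Under the same rules, a window of 290 steps would come back as 292 steps, and the reconstruction loss could not be computed against the input.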

## LSTM Autoencoder

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Input, Dropout
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import RepeatVector
from tensorflow.keras.layers import TimeDistributed
from tensorflow.keras.models import Model

from sklearn.preprocessing import MinMaxScaler, StandardScaler

In [None]:
modelLSTM = Sequential()
modelLSTM.add(LSTM(64, activation='relu', input_shape=(x_train.shape[1], x_train.shape[2]), return_sequences=True))
modelLSTM.add(LSTM(32, activation='relu', return_sequences=False))
modelLSTM.add(RepeatVector(x_train.shape[1]))
modelLSTM.add(LSTM(32, activation='relu', return_sequences=True))
modelLSTM.add(LSTM(64, activation='relu', return_sequences=True))
modelLSTM.add(TimeDistributed(Dense(x_train.shape[2])))

modelLSTM.compile(optimizer='adam', loss='mse')
modelLSTM.summary()
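
The roles of `RepeatVector` and `TimeDistributed(Dense(...))` are easiest to see from their shapes. The NumPy sketch below mimics only those two steps, with small made-up sizes and random weights; the decoder LSTM layers that sit between them in the model are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

batch, time_steps, n_units, n_features = 2, 5, 3, 1

# Stand-in for the encoder output: one fixed-length vector per sample.
encoded = rng.normal(size=(batch, n_units))

# RepeatVector(time_steps): copy each vector once per output time step.
repeated = np.repeat(encoded[:, np.newaxis, :], time_steps, axis=1)

# TimeDistributed(Dense(n_features)): the same weights applied at every step.
W = rng.normal(size=(n_units, n_features))
b = np.zeros(n_features)
decoded = repeated @ W + b

print(repeated.shape)  # (2, 5, 3)
print(decoded.shape)   # (2, 5, 1)
```

The final shape `(batch, time_steps, n_features)` matches the input windows, so the output can be compared with the input step by step.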

In [None]:
modelLSTM2 = Sequential()
modelLSTM2.add(LSTM(128, input_shape=(x_train.shape[1], x_train.shape[2])))
modelLSTM2.add(Dropout(rate=0.2))
modelLSTM2.add(RepeatVector(x_train.shape[1]))
modelLSTM2.add(LSTM(128, return_sequences=True))
modelLSTM2.add(Dropout(rate=0.2))
modelLSTM2.add(TimeDistributed(Dense(x_train.shape[2])))

modelLSTM2.compile(optimizer='adam', loss='mae')
modelLSTM2.summary()

## Model Training

In [None]:
# Checkpoint file prefixes. Create only the parent directories:
# calling mkdir on the prefix itself would create a directory named "cp.ckpt".
checkpoint_path_lstm = Path("./training_lstm/cp.ckpt")
checkpoint_path_lstm.parent.mkdir(exist_ok=True, parents=True)

checkpoint_path_lstm2 = Path("./training_lstm2/cp.ckpt")
checkpoint_path_lstm2.parent.mkdir(exist_ok=True, parents=True)

checkpoint_path_conv = Path("./training_conv/cp.ckpt")
checkpoint_path_conv.parent.mkdir(exist_ok=True, parents=True)

In [None]:
# Create a callback that saves the model's weights
cp_callback_lstm = keras.callbacks.ModelCheckpoint(filepath=str(checkpoint_path_lstm), save_weights_only=True, verbose=1)
cp_callback_lstm2 = keras.callbacks.ModelCheckpoint(filepath=str(checkpoint_path_lstm2), save_weights_only=True, verbose=1)
cp_callback_conv = keras.callbacks.ModelCheckpoint(filepath=str(checkpoint_path_conv), save_weights_only=True, verbose=1)

In [None]:
epochs = 20

In [None]:
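# Restore weights saved by a previous training run. This cell and the next two
# fail if no checkpoint has been written yet, so skip them on a first run.
modelLSTM.load_weights(str(checkpoint_path_lstm))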

In [None]:
modelLSTM2.load_weights(str(checkpoint_path_lstm2))

In [None]:
modelConv.load_weights(str(checkpoint_path_conv))

In [None]:
# modelLSTMHistory = modelLSTM.fit(x_train, x_train, epochs=2, batch_size=128, validation_split=0.1, verbose=1, callbacks=[cp_callback_lstm])

In [None]:
# modelLSTM2History = modelLSTM2.fit(x_train, x_train, epochs=epochs, batch_size=128, validation_split=0.1, verbose=1, callbacks=[cp_callback_lstm2])

In [None]:
modelConvHistory = modelConv.fit(x_train, x_train, epochs=50, batch_size=128, validation_split=0.1,
    callbacks=[
        keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, mode="min"),
        cp_callback_conv
    ],
)

In [None]:
# Plot training and validation loss for every model trained in this session.
# Models whose weights were loaded from a checkpoint have no history object.
histories = [
    ("CONV", "modelConvHistory", "grey"),
    ("LSTM", "modelLSTMHistory", "orange"),
    ("LSTM2", "modelLSTM2History", "green"),
]
for name, variable, color in histories:
    history = globals().get(variable)
    if history is None:
        continue
    plt.plot(history.history["loss"], label=f"{name} Training Loss", color=color)
    plt.plot(history.history["val_loss"], label=f"{name} Validation Loss", color=color, linestyle="--")

plt.legend()
plt.show()

## Loss 

In [None]:
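def plotLoss(pred_data, real, labels=None):
    """Plot a histogram of per-sample reconstruction MAE for each prediction array."""
    if labels is None:
        labels = [f"model {i}" for i in range(len(pred_data))]
    for pred, label in zip(pred_data, labels):
        # Average the absolute error over the time axis: one value per sample.
        train_mae_loss = np.mean(np.abs(pred - real), axis=1)
        plt.hist(train_mae_loss, bins=50, alpha=0.5, label=label)
        # Use the largest training error as the reconstruction loss threshold.
        threshold = np.max(train_mae_loss)
        print(f"Reconstruction error threshold ({label}):", threshold)
    plt.xlabel("Train MAE loss")
    plt.ylabel("No of samples")
    plt.legend()
    plt.show()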

In [None]:
autoencoder_predictions_on_training = modelConv.predict(x_train)


In [None]:
lstm_predictions_on_training = modelLSTM.predict(x_train)

In [None]:
lstm2_predictions_on_training = modelLSTM2.predict(x_train)

In [None]:
print(autoencoder_predictions_on_training.shape)
print(lstm2_predictions_on_training.shape)

In [None]:
plotLoss([autoencoder_predictions_on_training, lstm_predictions_on_training, lstm2_predictions_on_training], x_train, labels=["Conv","Lstm","Lstm2"])

In [None]:
plt.plot(x_train[0], label="Input")
plt.plot(autoencoder_predictions_on_training[0], label="Conv")
plt.plot(lstm_predictions_on_training[0], label="Lstm")
plt.plot(lstm2_predictions_on_training[0], label="Lstm2")
plt.legend()
plt.show()


In [None]:
# Get train MAE loss for the convolutional autoencoder.
x_train_pred = modelConv.predict(x_train)
train_mae_loss = np.mean(np.abs(x_train_pred - x_train), axis=1)

plt.hist(train_mae_loss, bins=50)
plt.xlabel("Train MAE loss")
plt.ylabel("No of samples")
plt.show()

# Get reconstruction loss threshold.
threshold = np.max(train_mae_loss)
print("Reconstruction error threshold: ", threshold)
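
One way such a threshold could be used is to flag windows whose reconstruction error exceeds it. The sketch below is an assumption about the intended next step, not code from this notebook: the data is random, and `reconstruct` is a hypothetical stand-in for `model.predict` that always returns zeros.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 100 training windows and 20 test windows, 8 steps, 1 feature.
x_train_toy = rng.normal(size=(100, 8, 1))
x_test_toy = rng.normal(size=(20, 8, 1))
x_test_toy[5] += 10.0  # inject an obvious anomaly into one test window

def reconstruct(x):
    """Hypothetical stand-in for model.predict: always predicts zeros."""
    return np.zeros_like(x)

def mae_per_window(x, x_pred):
    """Mean absolute error over the time axis, one value per window."""
    return np.mean(np.abs(x_pred - x), axis=1)

# Threshold: the largest reconstruction error seen on the training windows.
threshold = np.max(mae_per_window(x_train_toy, reconstruct(x_train_toy)))

test_loss = mae_per_window(x_test_toy, reconstruct(x_test_toy))
anomalies = (test_loss > threshold).ravel()

print("Flagged test windows:", np.where(anomalies)[0])
```

Taking the maximum training error as the threshold is a simple choice that is sensitive to outliers in the training data; a high percentile is a common alternative.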

In [None]:
plt.plot(x_train[0])
plt.plot(x_train_pred[0])
plt.show()

In [None]:
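# Minimal single-layer LSTM autoencoder.
n_in = x_train.shape[1]  # assumption: one input window of TIME_STEPS steps
model = keras.Sequential()
# Encoder: compress the window into a 50-dimensional vector.
model.add(layers.LSTM(50, activation='relu', input_shape=(n_in, 1)))
# Repeat the vector once per output time step.
model.add(layers.RepeatVector(n_in))
# Decoder: emit one hidden state per step, then map each to a single value.
model.add(layers.LSTM(50, activation='relu', return_sequences=True))
model.add(layers.TimeDistributed(layers.Dense(1)))
model.compile(optimizer='adam', loss='mse')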
