<a id='AlternativeModelsEnsemble-RNN-GRUTop'></a>
# Train Ensembles of Alternative Recurrent Neural Network Models

This notebook compares the performance of alternative recurrent neural network (RNN) models: the [traditional RNN](#RNN) and the [gated recurrent unit (GRU)](#GRU).

- Architecture, kept constant across the LSTM, GRU and RNN models for a fair comparison:
    - input layer - 300 units
    - hidden layer - 100 units
    - output layer - 2 units (softplus activation, to ensure variance predictions are positive)
- batch size = 32
- Number of base learners in ensemble = 10
- 3:1:1 train:validation:test dataset split
- 2000 epochs
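
The 3:1:1 split follows from the two fractions set in the training cells: `test_split = 0.2` is held out first, and `validation_split = 0.25` is then taken from what remains. The sketch below only checks that arithmetic; it does not touch the data. Note that Keras takes the `validation_split` portion from the last samples of the arrays passed to `fit`, before shuffling.

```python
from fractions import Fraction

test_split = Fraction(1, 5)        # 0.2 of all data held out as the test set
validation_split = Fraction(1, 4)  # 0.25 of the remaining data used for validation

# Fractions of the full dataset that end up in each subset
test = test_split
validation = (1 - test_split) * validation_split
train = (1 - test_split) * (1 - validation_split)

# Express the three fractions as a ratio relative to the test set
ratio = [int(part / test) for part in (train, validation, test)]

print(train, validation, test)  # 3/5 1/5 1/5
print(ratio)                    # [3, 1, 1]
```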

The model from the epoch with the lowest validation loss is saved and later used to make predictions on the previously unseen experimental test set.
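
As a rough illustration of this selection rule, the sketch below picks the epoch a checkpoint with `monitor='val_loss'`, `mode='min'` and `save_best_only=True` would keep: the checkpoint is overwritten only when the validation loss strictly improves. The helper `best_epoch` and the loss values are hypothetical, made up for the example, and are not results from this notebook.

```python
def best_epoch(val_losses):
    """Return (epoch_index, loss) of the first strict minimum in val_losses."""
    best_index, best_loss = None, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:  # strict improvement: the saved model is replaced
            best_index, best_loss = epoch, loss
    return best_index, best_loss


# Hypothetical validation losses for five epochs (illustrative values only)
val_losses = [-1.2, -3.5, -4.1, -3.9, -4.1]
epoch, loss = best_epoch(val_losses)
print(epoch, loss)  # 2 -4.1
```

A later epoch that only ties the best loss (epoch 4 here) does not replace the saved model.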

The minimum achievable loss is ln(minimum_variance)/2 = -6.91 (for a minimum variance chosen to be 1e-6).
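
The floor on the loss can be checked with a small sketch. It assumes the loss has the Gaussian negative log-likelihood form used in the referenced deep-ensembles paper (arXiv:1612.01474) with the constant term dropped, and that the variance is clamped at the minimum value. The actual `MeanVarianceLogLikelyhoodLoss` in `LSTMutils` is not shown here and may differ in detail; `gaussian_nll` and `MIN_VARIANCE` are illustrative names.

```python
import math

MIN_VARIANCE = 1e-6  # assumed variance floor, as quoted in the text


def gaussian_nll(y_true, mean, variance):
    """Per-sample Gaussian negative log-likelihood, constant term dropped."""
    variance = max(variance, MIN_VARIANCE)  # clamp to the variance floor
    return 0.5 * math.log(variance) + 0.5 * (y_true - mean) ** 2 / variance


# A perfect mean prediction with the variance at its floor gives the lowest loss
min_loss = gaussian_nll(y_true=0.3, mean=0.3, variance=0.0)
print(round(min_loss, 2))  # -6.91
```

Under these assumptions the squared-error term vanishes and only `0.5 * ln(1e-6)` remains, so a training loss approaching -6.91 indicates near-perfect mean predictions with the variance at its floor.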

<a id='GRU'></a>
# Train Ensemble of Gated Recurrent Unit (GRU) Networks
[return to top](#AlternativeModelsEnsemble-RNN-GRUTop)

In [None]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
from LSTMutils import MeanVarianceLogLikelyhoodLoss
from sklearn.model_selection import train_test_split
import LSTMutils

# input parameters
NumEnsemble = 10
SequenceLength = 250
validation_split = 0.25
batch_size = 32
NumEpochs = 2000
test_split = 0.2

# set random seeds
np.random.seed(42)
tf.random.set_seed(42)

# read experimental dataset
ExperimentalData = LSTMutils.ExperimentalData(SequenceLength=SequenceLength)
unused, concentrations, df_data, unused = ExperimentalData.ReadData()

# split data into stratified train and test sets, size defined by the test_split variable
# the split will always be the same provided the data is in the same order, the same random_state is used,
# and, perhaps surprisingly, the labels used for stratification always have the same type (str is used here)
df_train, df_test = train_test_split(df_data, test_size=test_split, train_size=1-test_split, random_state=42, shuffle=True, stratify=concentrations)

# normalise time series data
df_norm_train, df_norm_test, unused = ExperimentalData.NormalizeData(df_train,df_test)
    
# Define y as the last element in X, and ensure X and y are the correct shape
X_train, y_train = ExperimentalData.Shape(df_norm_train)

# train NumEnsemble base learners, minimizing the negative log-likelihood loss for mean and variance predictions
# implementation follows this work: doi.org/10.48550/arXiv.1612.01474
for i in range(NumEnsemble):
    
    model = keras.models.Sequential([
        keras.layers.GRU(300, input_shape=(SequenceLength, 1), return_sequences=True),
        keras.layers.GRU(100, return_sequences=True),
        keras.layers.GRU(2, activation='softplus', return_sequences=True),
    ])
    
    # save the model at the epoch which gives the lowest loss on the validation dataset
    checkpoint_filepath = r"../Models/AlternativeModels/GRU/EnsembleModel" + str(i+1)
    model_checkpoint_callback = keras.callbacks.ModelCheckpoint(
        filepath=checkpoint_filepath,
        monitor='val_loss',
        mode='min',
        save_best_only=True)
        
    model.compile(optimizer="adam",loss = MeanVarianceLogLikelyhoodLoss)

    history = model.fit(X_train, y_train, batch_size=batch_size, validation_split=validation_split, epochs=NumEpochs, callbacks=[model_checkpoint_callback])

    # plot loss vs epochs
    Evaluation = LSTMutils.ModelTrainingEvaluation()
    Evaluation.PlotLossHistory(history)

    # load the best model (lowest validation loss) and evaluate it on the training data
    bestModel = keras.models.load_model(checkpoint_filepath, custom_objects={"MeanVarianceLogLikelyhoodLoss": MeanVarianceLogLikelyhoodLoss})
    bestModel.evaluate(X_train, y_train, batch_size=batch_size)

<a id='RNN'></a>
# Train Ensemble of RNN Networks
[return to top](#AlternativeModelsEnsemble-RNN-GRUTop)

In [None]:
import pandas as pd
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
import math
from LSTMutils import MeanVarianceLogLikelyhoodLoss
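from sklearn.model_selection import train_test_split
import LSTMutils  # needed for LSTMutils.ExperimentalData and LSTMutils.ModelTrainingEvaluation below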

# input parameters
NumEnsemble = 10
SequenceLength = 250
validation_split = 0.25
batch_size = 32
NumEpochs = 2000
test_split = 0.2

# set random seeds
np.random.seed(42)
tf.random.set_seed(42)

# read experimental dataset
ExperimentalData = LSTMutils.ExperimentalData(SequenceLength=SequenceLength)
unused, concentrations, df_data, unused = ExperimentalData.ReadData()

# split data into stratified train and test sets, size defined by the test_split variable
# the split will always be the same provided the data is in the same order, the same random_state is used,
# and, perhaps surprisingly, the labels used for stratification always have the same type (str is used here)
df_train, df_test = train_test_split(df_data, test_size=test_split, train_size=1-test_split, random_state=42, shuffle=True, stratify=concentrations)

# normalise time series data
df_norm_train, df_norm_test, unused = ExperimentalData.NormalizeData(df_train,df_test)
    
# Define y as the last element in X, and ensure X and y are the correct shape
X_train, y_train = ExperimentalData.Shape(df_norm_train)

# train NumEnsemble base learners, minimizing the negative log-likelihood loss for mean and variance predictions
# implementation follows this work: doi.org/10.48550/arXiv.1612.01474
for i in range(NumEnsemble):
    
    model = keras.models.Sequential([
        keras.layers.SimpleRNN(300, input_shape=(SequenceLength, 1), return_sequences=True),
        keras.layers.SimpleRNN(100, return_sequences=True),
        keras.layers.SimpleRNN(2, activation='softplus', return_sequences=True),
    ])
    
    # save the model at the epoch which gives the lowest loss on the validation dataset
    checkpoint_filepath = r"../Models/AlternativeModels/RNN/EnsembleModel" + str(i+1)
    model_checkpoint_callback = keras.callbacks.ModelCheckpoint(
        filepath=checkpoint_filepath,
        monitor='val_loss',
        mode='min',
        save_best_only=True)
        
    model.compile(optimizer="adam",loss = MeanVarianceLogLikelyhoodLoss)

    history = model.fit(X_train, y_train, batch_size=batch_size, validation_split=validation_split, epochs=NumEpochs, callbacks=[model_checkpoint_callback])

    # plot loss vs epochs
    Evaluation = LSTMutils.ModelTrainingEvaluation()
    Evaluation.PlotLossHistory(history)
    
    # load the best model (lowest validation loss) and evaluate it on the training data
    bestModel = keras.models.load_model(checkpoint_filepath, custom_objects={"MeanVarianceLogLikelyhoodLoss": MeanVarianceLogLikelyhoodLoss})
    bestModel.evaluate(X_train, y_train, batch_size=batch_size)