# GRU neural network for EMS demand predictions
This notebook contains the code for applying neural network models to smart city data <br>


In [None]:
# GPU check:

import tensorflow as tf
tf.test.gpu_device_name()


'/device:GPU:0'

In [None]:
from tensorflow.python.client import device_lib
print("Show System RAM Memory: \n \n")
!cat /proc/meminfo | egrep "MemTotal"
device_lib.list_local_devices()

Show System RAM Memory: 
 

MemTotal:       26751688 kB


[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 559503576835004651, name: "/device:XLA_CPU:0"
 device_type: "XLA_CPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 3420919100551386495
 physical_device_desc: "device: XLA_CPU device", name: "/device:XLA_GPU:0"
 device_type: "XLA_GPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 7183241010192726250
 physical_device_desc: "device: XLA_GPU device", name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 15473775744
 locality {
   bus_id: 1
   links {
   }
 }
 incarnation: 7109277902584850429
 physical_device_desc: "device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:04.0, compute capability: 7.0"]

In [None]:
# get additional info about the hardware in the cloud
%cat /proc/cpuinfo
%cat /proc/meminfo

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 79
model name	: Intel(R) Xeon(R) CPU @ 2.20GHz
stepping	: 0
microcode	: 0x1
cpu MHz		: 2200.000
cache size	: 56320 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa
bogomips	: 4400.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 b

Tutorial on Google Colab and GPU access: <br>
https://www.youtube.com/watch?v=f1UK8KPt-KU

In [None]:
# this allows for accessing files stored in your google drive using the path "/gdrive/My Drive/"
# mounting google drive locally:

from google.colab import drive
drive.mount('/gdrive')

Mounted at /gdrive


$\textbf{Background:}$ Tensors are data structures that you can think of as multi-dimensional arrays. They are represented as n-dimensional arrays of base datatypes such as strings or integers, and they generalize vectors and matrices to higher dimensions. The rank of a Tensor is its number of dimensions; the shape of a Tensor gives the number of elements along each dimension. A scalar is a 0-d Tensor, a vector or list is a 1-d Tensor, and a matrix is a 2-d Tensor; stacking matrices gives Tensors of rank 3 or higher.
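
To make rank and shape concrete, here is a minimal sketch that uses NumPy arrays as a stand-in for TensorFlow tensors (an assumption made only to keep the example lightweight; `tf.constant` objects expose the same `shape` idea). The array names are hypothetical.

```python
import numpy as np

# Hypothetical example arrays, one per rank discussed above
scalar = np.array(5)                         # rank 0, shape ()
vector = np.array([1, 2, 3])                 # rank 1, shape (3,)
matrix = np.array([[1, 2, 3], [4, 5, 6]])    # rank 2, shape (2, 3)
cube = np.zeros((8, 1, 60))                  # rank 3, e.g. (samples, time steps, features)

for name, arr in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("cube", cube)]:
    # ndim is the rank, shape lists the size of each dimension
    print(name, "rank:", arr.ndim, "shape:", arr.shape)
```

The rank-3 layout in the last array is the one Keras recurrent layers expect as input later in this notebook.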

$\textbf{Neural Networks in TensorFlow:}$ TensorFlow ships with Keras, a high-level API that provides a powerful, intuitive framework for building and training deep learning models. <br> 
Tensors flow through abstract types called $\textit{Layers}$, the building blocks of neural networks. Layers implement common neural network operations, hold the weights that are updated during training, and define inter-layer connectivity. <br>
<br>
Conveniently, TensorFlow predefines a number of Layers that are commonly used in neural networks, for example Dense. Rather than wiring layers together by hand, this notebook uses the Sequential model from Keras, which creates a network by stacking layers like building blocks: here, one or more recurrent layers followed by a single Dense output layer.

# Implementation

In [None]:
## -- Packages  -- ##

# General
import pandas as pd
import numpy as np

# Time formatting
import datetime

# Load and save data
import pickle
# progress bar
from tqdm import tqdm

# Plotting
import matplotlib.pyplot as plt
%matplotlib inline
#import tikzplotlib as tkz

In [None]:
##  NN libraries ##
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow.keras.backend as K

from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from math import sqrt


## ML libraries ##
# scikit-learn wrapper for Keras models, imported from tensorflow.keras so that it matches the
# tf.keras models built below (later TensorFlow releases deprecate this wrapper in favour of SciKeras)
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import TimeSeriesSplit
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score

from sklearn.preprocessing import MinMaxScaler

### Load data

In [None]:
# load the hourly emergency dispatch series for the Bronx (a pickled pandas Series)
filename = '/gdrive/My Drive/Colab Notebooks/emergency_dispatches_bronx_H'
with open(filename, 'rb') as infile:
    emergency_ts = pickle.load(infile)

In [None]:
emergency_ts.shape

(8760,)

### Preprocessing

In [None]:
## Remove the last 15 days (360 hourly observations): they are either erroneous or reserved as the test set for the robustness checks
emergency_ts_final = emergency_ts.iloc[0:-360]
emergency_ts_final.shape

(8400,)

In [None]:
## Set parameters
input_lags = 60 # 2.5 times the seasonal period of 24 hours
output_lags = 24 # we predict 24 hours ahead
n_test = 24 # size of the test set; equal to output_lags


In [None]:
## Split data in train and test set
train = emergency_ts_final[0:-n_test]
test = emergency_ts_final[-n_test:]
print(train.shape)
print(test.shape)

(8376,)
(24,)


In [None]:
## Create lagged values for both the input window (60) and the output window (24)
data = train.copy()
n_train = len(data)

##Create lagged values for input
df = pd.DataFrame()
for i in range(input_lags,0,-1):
    df['t-' + str(i)] = data.shift(i)

##Create lagged values for output
for j in range(0,output_lags,1):
    df['t+' + str(j)] = data.shift(-j)
    
df = df[input_lags:(n_train-output_lags+1)]
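
The windowing logic above is easier to follow on a tiny series. The sketch below repeats the same `shift`-based construction with hypothetical sizes (`toy_in = 3` input lags, `toy_out = 2` output steps) on a stand-in series of ten integers; none of these names appear in the original notebook.

```python
import pandas as pd

toy = pd.Series(range(10))   # stand-in for the hourly series: 0, 1, ..., 9
toy_in, toy_out = 3, 2       # hypothetical window sizes

frame = pd.DataFrame()
# input window: columns t-3, t-2, t-1
for i in range(toy_in, 0, -1):
    frame['t-' + str(i)] = toy.shift(i)
# output window: columns t+0, t+1
for j in range(toy_out):
    frame['t+' + str(j)] = toy.shift(-j)

# drop the rows at both ends that contain NaN because the window runs off the series
frame = frame[toy_in:(len(toy) - toy_out + 1)]
print(frame)
print("rows:", len(frame))   # n - toy_in - toy_out + 1
```

The number of usable rows is `n - input_lags - output_lags + 1`; with the notebook's values that is 8376 - 60 - 24 + 1 = 8293, which matches the training shapes printed further down.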

In [None]:
df.head()

FIRST_ASSIGNMENT_DATETIME,t-60,t-59,t-58,t-57,t-56,t-55,t-54,t-53,t-52,t-51,t-50,t-49,t-48,t-47,t-46,t-45,t-44,t-43,t-42,t-41,t-40,t-39,t-38,t-37,t-36,t-35,t-34,t-33,t-32,t-31,t-30,t-29,t-28,t-27,t-26,t-25,t-24,t-23,t-22,t-21,...,t-16,t-15,t-14,t-13,t-12,t-11,t-10,t-9,t-8,t-7,t-6,t-5,t-4,t-3,t-2,t-1,t+0,t+1,t+2,t+3,t+4,t+5,t+6,t+7,t+8,t+9,t+10,t+11,t+12,t+13,t+14,t+15,t+16,t+17,t+18,t+19,t+20,t+21,t+22,t+23
2019-01-03 12:00:00,41.0,48.0,36.0,55.0,48.0,27.0,47.0,46.0,58.0,51.0,42.0,50.0,62.0,44.0,47.0,58.0,44.0,53.0,44.0,43.0,44.0,38.0,45.0,35.0,34.0,38.0,26.0,23.0,17.0,21.0,27.0,39.0,37.0,53.0,67.0,50.0,69.0,59.0,66.0,50.0,...,60.0,45.0,41.0,34.0,36.0,24.0,19.0,29.0,21.0,23.0,31.0,26.0,42.0,57.0,63.0,56.0,65,56.0,47.0,59.0,58.0,55.0,48.0,47.0,44.0,36.0,37.0,40.0,39.0,31.0,22.0,17.0,13.0,17.0,31.0,37.0,46.0,48.0,38.0,54.0
2019-01-03 13:00:00,48.0,36.0,55.0,48.0,27.0,47.0,46.0,58.0,51.0,42.0,50.0,62.0,44.0,47.0,58.0,44.0,53.0,44.0,43.0,44.0,38.0,45.0,35.0,34.0,38.0,26.0,23.0,17.0,21.0,27.0,39.0,37.0,53.0,67.0,50.0,69.0,59.0,66.0,50.0,70.0,...,45.0,41.0,34.0,36.0,24.0,19.0,29.0,21.0,23.0,31.0,26.0,42.0,57.0,63.0,56.0,65.0,56,47.0,59.0,58.0,55.0,48.0,47.0,44.0,36.0,37.0,40.0,39.0,31.0,22.0,17.0,13.0,17.0,31.0,37.0,46.0,48.0,38.0,54.0,68.0
2019-01-03 14:00:00,36.0,55.0,48.0,27.0,47.0,46.0,58.0,51.0,42.0,50.0,62.0,44.0,47.0,58.0,44.0,53.0,44.0,43.0,44.0,38.0,45.0,35.0,34.0,38.0,26.0,23.0,17.0,21.0,27.0,39.0,37.0,53.0,67.0,50.0,69.0,59.0,66.0,50.0,70.0,51.0,...,41.0,34.0,36.0,24.0,19.0,29.0,21.0,23.0,31.0,26.0,42.0,57.0,63.0,56.0,65.0,56.0,47,59.0,58.0,55.0,48.0,47.0,44.0,36.0,37.0,40.0,39.0,31.0,22.0,17.0,13.0,17.0,31.0,37.0,46.0,48.0,38.0,54.0,68.0,51.0
2019-01-03 15:00:00,55.0,48.0,27.0,47.0,46.0,58.0,51.0,42.0,50.0,62.0,44.0,47.0,58.0,44.0,53.0,44.0,43.0,44.0,38.0,45.0,35.0,34.0,38.0,26.0,23.0,17.0,21.0,27.0,39.0,37.0,53.0,67.0,50.0,69.0,59.0,66.0,50.0,70.0,51.0,55.0,...,34.0,36.0,24.0,19.0,29.0,21.0,23.0,31.0,26.0,42.0,57.0,63.0,56.0,65.0,56.0,47.0,59,58.0,55.0,48.0,47.0,44.0,36.0,37.0,40.0,39.0,31.0,22.0,17.0,13.0,17.0,31.0,37.0,46.0,48.0,38.0,54.0,68.0,51.0,55.0
2019-01-03 16:00:00,48.0,27.0,47.0,46.0,58.0,51.0,42.0,50.0,62.0,44.0,47.0,58.0,44.0,53.0,44.0,43.0,44.0,38.0,45.0,35.0,34.0,38.0,26.0,23.0,17.0,21.0,27.0,39.0,37.0,53.0,67.0,50.0,69.0,59.0,66.0,50.0,70.0,51.0,55.0,53.0,...,36.0,24.0,19.0,29.0,21.0,23.0,31.0,26.0,42.0,57.0,63.0,56.0,65.0,56.0,47.0,59.0,58,55.0,48.0,47.0,44.0,36.0,37.0,40.0,39.0,31.0,22.0,17.0,13.0,17.0,31.0,37.0,46.0,48.0,38.0,54.0,68.0,51.0,55.0,56.0


In [None]:
## splitting the training set into labels and features
X_train = df.iloc[:,:input_lags] # from the beginning to input_lags
Y_train = df.iloc[:,input_lags:] # from input_lags to the end

## Use the last window of the training set as the features for the test set: the final 36 input lags
## of the last training row followed by its 24 output values give the 60 most recent observations.
X_test = X_train.iloc[len(X_train) - 1,:][output_lags:]
# pd.concat replaces Series.append, which was removed in pandas 2.0
X_test = pd.concat([X_test, Y_train.iloc[len(Y_train) - 1,:]]).values.reshape(1,input_lags)
Y_test = test[:output_lags].values.reshape(1,output_lags)

X_train = X_train.values # 60 steps back (60 lags)
Y_train = Y_train.values # 24 steps ahead

print("X_train: " + "type: " + str(type(X_train)) + "\tshape: " + str(X_train.shape))
print("Y_train: " + "type: " + str(type(Y_train)) + "\tshape: " + str(Y_train.shape))
print("X_test: " + "type: " + str(type(X_test)) + "\tshape: " + str(X_test.shape))
print("Y_test: " + "type: " + str(type(Y_test)) + "\tshape: " + str(Y_test.shape))

X_train: type: <class 'numpy.ndarray'>	shape: (8293, 60)
Y_train: type: <class 'numpy.ndarray'>	shape: (8293, 24)
X_test: type: <class 'numpy.ndarray'>	shape: (1, 60)
Y_test: type: <class 'numpy.ndarray'>	shape: (1, 24)


$\textbf{When Should You Use Normalization And Standardization:}$

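Normalization (rescaling to a fixed range such as [0, 1]) is a good technique to use when you do not know the distribution of your data or when you know the distribution is not Gaussian (a bell curve). It is useful when your data has varying scales and the algorithm you are using does not make assumptions about the distribution of your data, such as k-nearest neighbors and artificial neural networks. Standardization (rescaling to zero mean and unit variance) is the usual alternative when the data is roughly Gaussian. Note that the normalization code in the next cell is commented out, so the models below are trained on the raw counts.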
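
As a minimal sketch of what min-max normalization does, the example below implements the scaling by hand in NumPy rather than with `MinMaxScaler`. The values and the helper names `scale` and `unscale` are hypothetical; the point is that the minimum and maximum are taken from the training data only and then reused for later data and for undoing the scaling.

```python
import numpy as np

train_vals = np.array([10.0, 20.0, 40.0, 50.0])   # hypothetical training values
test_vals = np.array([30.0, 60.0])                # hypothetical later values

# "fit": take the range from the training data only, to avoid leaking test information
lo, hi = train_vals.min(), train_vals.max()

def scale(x):
    """Map values to [0, 1] relative to the training range."""
    return (x - lo) / (hi - lo)

def unscale(z):
    """Undo the scaling, e.g. to report forecasts in original units."""
    return z * (hi - lo) + lo

train_scaled = scale(train_vals)
test_scaled = scale(test_vals)       # values above the training max land above 1
restored = unscale(test_scaled)
print(train_scaled, test_scaled, restored)
```

A later value outside the training range is mapped outside [0, 1], which is one reason min-max scaling is sensitive to extreme observations.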

In [None]:
## normalizing the data

# MinMaxScaler() transforms features by scaling each feature to a given range (set via the feature_range argument)
# Because the range is bounded by the observed minimum and maximum, a single extreme value compresses
# all other values into a narrow band. MinMaxScaler is therefore sensitive to outliers

#scaler = MinMaxScaler(feature_range=(0, 1))
#df_x = df.iloc[:,0:24]
# the method fit_transform() computes the min and the max used for scaling and then carries out the transformation
#df_x_scaled = scaler.fit_transform(df_x)
# later, inverse_transform() can be used to undo the scaling to the feature_range

# normalizing the entire dataset
#df_normalized = scaler.fit_transform(df)

# Building the GRU neural network model

# Method 1 (Keras)

TensorFlow 2.0 implementation <br>
(tf.keras is TensorFlow's implementation of the Keras API specification. This is a high-level API to build and train models that includes first-class support for TensorFlow-specific functionality)


TF 2 keras RNN tutorial
https://www.tensorflow.org/guide/keras/rnn <br>
TF 2 time series forecasting tutorial
https://www.tensorflow.org/tutorials/structured_data/time_series

In [None]:
# configuring the inputs for the model
# For Keras, the input has to be in the shape (samples, time steps, features)
# Here each sample is 1 time step with n features, where n = X_train.shape[1] (the 60 input lags)

X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])

In [None]:
X_train.shape

(8293, 1, 60)

In [None]:
X_test.shape

(1, 1, 60)
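
The `(samples, time steps, features)` convention allows two readings of a lag window. The sketch below contrasts them on a hypothetical array of 2 samples with 6 lags each; the notebook uses the first layout, and the second is shown only for comparison, not as the author's method.

```python
import numpy as np

windows = np.arange(12, dtype=float).reshape(2, 6)   # hypothetical: 2 samples, 6 lags each

# Layout used in this notebook: 1 time step carrying all lags as features.
# A recurrent layer then performs a single recurrence step per sample.
as_features = windows.reshape(windows.shape[0], 1, windows.shape[1])

# Alternative layout: 6 time steps with 1 feature each.
# A recurrent layer would then iterate over the lags one by one.
as_steps = windows.reshape(windows.shape[0], windows.shape[1], 1)

print(as_features.shape)   # (2, 1, 6)
print(as_steps.shape)      # (2, 6, 1)
```

Both arrays hold the same numbers; only the axis that the recurrent layer iterates over differs.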

In [None]:
# creating a leaky_relu activation function

def my_leaky_relu(x):
    return tf.nn.leaky_relu(x)
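
For reference, a NumPy sketch of what the activation computes. `tf.nn.leaky_relu` uses a default slope of `alpha=0.2` for negative inputs; the function name `leaky_relu_np` is hypothetical and only mirrors that behaviour for illustration.

```python
import numpy as np

def leaky_relu_np(x, alpha=0.2):
    """Identity for non-negative inputs, slope `alpha` for negative inputs."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, x, alpha * x)

out = leaky_relu_np([-10.0, -1.0, 0.0, 2.0])
print(out)
```

Unlike the default `tanh`, this activation is unbounded, which matters here because the targets are raw (unnormalized) counts.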

## Gridsearch CV for the optimal hyperparameters

Guide to Hyperparameter tuning: <br>
https://towardsdatascience.com/simple-guide-to-hyperparameter-tuning-in-neural-networks-3fe03dad8594 <br>
Dropout regularization for RNNs: <br>
https://machinelearningmastery.com/how-to-reduce-overfitting-with-dropout-regularization-in-keras/


## One step Gridsearch CV for all hyperparameters

**Blogpost Hyperparametertuning LSTM/GRU:**

https://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/

**Overview of Gradient Descent Algorithms:**

https://ruder.io/optimizing-gradient-descent/index.html#adagrad

In [None]:
# creating the parameter grid as a dictionary

##
batch_size = [100, 200, 400]
epochs = [500, 1000, 1500]
neurons = [1000, 1500, 2000]
dropout = [0.0]
learning_rate = [0.01, 0.001, 0.0005]
optimizer = ['Adam']


param_grid_cv = dict(batch_size=batch_size, epochs=epochs, neurons=neurons, dropout = dropout, optimizer=optimizer, learning_rate = learning_rate)

In [None]:
# refined grid

batch_size = [150, 250]
epochs = [750, 1000]
neurons = [1500, 2000]
dropout = [0.0]
learning_rate = [0.01, 0.001, 0.0001]
optimizer = ['Adam', 'Adadelta']

param_grid_cv = dict(batch_size=batch_size, epochs=epochs, neurons=neurons, dropout = dropout, optimizer=optimizer, learning_rate = learning_rate)
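
Before launching a grid search it helps to count how many models will be trained. The sketch below enumerates the refined grid with `itertools.product`, assuming the learning rates 0.01, 0.001 and 0.0001 and the 3 cross-validation splits used further down; the variable names are hypothetical.

```python
from itertools import product

# Refined grid, restated so that the example is self-contained
grid = {
    "batch_size": [150, 250],
    "epochs": [750, 1000],
    "neurons": [1500, 2000],
    "dropout": [0.0],
    "learning_rate": [0.01, 0.001, 0.0001],
    "optimizer": ["Adam", "Adadelta"],
}
n_cv_splits = 3   # assumption: matches inner_splits in the cross-validation cell

combos = list(product(*grid.values()))
n_fits = len(combos) * n_cv_splits
print("parameter combinations:", len(combos))
print("model fits:", n_fits)
```

With 750 to 1000 epochs per fit, each additional grid value multiplies an already large training budget, which is a reason to refine the grid in stages.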

In [None]:
# setting up the model
# the default GRU activation is tanh(); here it is replaced by the leaky ReLU defined above

def model(neurons = 128, epochs = 100, batch_size =100, dropout = 0.0, learning_rate = 0.001, optimizer="Adam"):
    # epochs and batch_size are not used in this function; KerasRegressor passes them on to fit()
    net = tf.keras.Sequential()

    # GRU layer with `neurons` units (=dimensionality of the output space = number of neurons)
    # return_sequences=True would return the hidden state output for each input time step.
    # With return_sequences=False only the last hidden state is returned,
    # so the output of the GRU layer is a 2D tensor of shape (batch_size, neurons)
    net.add(tf.keras.layers.GRU(units=neurons, return_sequences = False, input_shape = (1,60), dropout = dropout, activation = my_leaky_relu))

    # output layer with 24 neurons (one per forecast hour)
    net.add(tf.keras.layers.Dense(24))
    # look up the optimizer class by name so that the `optimizer` grid entry takes effect
    # (previously Adam was always used, whatever the grid value)
    opt_class = getattr(tf.keras.optimizers, optimizer)
    # the compile() method configures the model for training
    net.compile(optimizer=opt_class(learning_rate = learning_rate),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
    
    return net


In [None]:
# using the KerasRegressor as a wrapper to carry out the GridSearchCV
model_cv = KerasRegressor(build_fn = model, verbose=1)

In [None]:
# k fold CV (NEW GRID for EMS data)
inner_splits = 3
inner_loop = TimeSeriesSplit(n_splits = inner_splits).split(X_train,Y_train)

# n_jobs=-1 uses all available cores for parallel processing; set e.g. n_jobs=4 to use 4 cores

grid_cv = GridSearchCV(estimator = model_cv, param_grid = param_grid_cv, cv = inner_loop, verbose = 3, n_jobs=-1)
grid_result = grid_cv.fit(X_train,Y_train)
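
`TimeSeriesSplit` produces expanding training windows that always precede their validation block. The pure-Python sketch below reproduces the fold boundaries for 8293 samples and 3 splits, assuming scikit-learn's default settings (no gap, default test size); `expanding_window_splits` is a hypothetical helper, not a scikit-learn function.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train, test) index ranges in the style of TimeSeriesSplit defaults."""
    test_size = n_samples // (n_splits + 1)
    first_test_start = n_samples - n_splits * test_size
    for start in range(first_test_start, n_samples, test_size):
        # training data is everything before the validation block
        yield range(0, start), range(start, start + test_size)

folds = list(expanding_window_splits(8293, 3))
for k, (train_idx, test_idx) in enumerate(folds, start=1):
    print("fold", k, "train:", len(train_idx), "validation:", len(test_idx))
```

Each fold validates on data that lies strictly after its training data, so no future observations leak into training, unlike shuffled k-fold.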

In [None]:
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: -62.414964 using {'batch_size': 250, 'dropout': 0.0, 'epochs': 750, 'learning_rate': 0.01, 'neurons': 2000, 'optimizer': 'Adam'}
-92.200343 (29.365974) with: {'batch_size': 150, 'dropout': 0.0, 'epochs': 750, 'learning_rate': 0.01, 'neurons': 1500, 'optimizer': 'Adam'}
-161.357597 (86.164358) with: {'batch_size': 150, 'dropout': 0.0, 'epochs': 750, 'learning_rate': 0.01, 'neurons': 1500, 'optimizer': 'Adadelta'}
-245.919278 (238.326770) with: {'batch_size': 150, 'dropout': 0.0, 'epochs': 750, 'learning_rate': 0.01, 'neurons': 2000, 'optimizer': 'Adam'}
-286.490025 (206.873506) with: {'batch_size': 150, 'dropout': 0.0, 'epochs': 750, 'learning_rate': 0.01, 'neurons': 2000, 'optimizer': 'Adadelta'}
-75.131762 (8.137463) with: {'batch_size': 150, 'dropout': 0.0, 'epochs': 750, 'learning_rate': 0.001, 'neurons': 1500, 'optimizer': 'Adam'}
-76.444692 (7.812122) with: {'batch_size': 150, 'dropout': 0.0, 'epochs': 750, 'learning_rate': 0.001, 'neurons': 1500, 'optimizer': 'Adadelta'}
-7
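
The scores above are negative because the Keras scikit-learn wrapper reports the negated loss, so that larger is better for `GridSearchCV`. Assuming the loss is the mean squared error configured in `compile()`, the best score can be turned back into an RMSE in the original units (dispatches per hour):

```python
from math import sqrt

best_score = -62.414964        # best mean validation score reported above
best_mse = -best_score         # undo the sign flip
best_rmse = sqrt(best_mse)     # same units as the target series
print(round(best_rmse, 2))
```

This is a cross-validation estimate averaged over the 3 folds, not the error on the held-out 24-hour test window.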

# most recent gridsearch result:




## Fitting the optimal model with all gridsearched parameters

In [None]:
#  1 HL manually chosen lowest GSCV loss
# NOTE: the dict below does not match this cell, which uses 2000 units, Adadelta, learning rate 0.01, 750 epochs and batch size 150
#{'batch_size': 500, 'dropout': 0.0, 'epochs': 250, 'learning_rate': 0.001, 'neurons': 1000, 'optimizer': 'Adagrad'}
# fitting the GRU model with the manually chosen hyperparameters
GRU_model_2 = tf.keras.Sequential()

# Add a GRU layer with 2000 units (=dimensionality of the output space = number of neurons)
# return_sequences=True would return the hidden state output for each input time step.
# With return_sequences=False only the last hidden state is returned,
# so the output of the GRU layer is a 2D tensor of shape (batch_size, 2000)
#LSTM_model.add(tf.keras.layers.LSTM(units=1000, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))
GRU_model_2.add(tf.keras.layers.GRU(units=2000, return_sequences = False, input_shape = (1,60), dropout = 0.0, activation= my_leaky_relu))


# output layer 
GRU_model_2.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
GRU_model_2.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate = 0.01),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
GRU_model_2.fit(X_train, Y_train, epochs=750, batch_size=150, verbose=1)
    


In [None]:
#  2 HL manually chosen lowest GSCV loss
# NOTE: the dict below does not match this cell, which uses 2000 units per layer, Adadelta, learning rate 0.01, 750 epochs and batch size 150
#{'batch_size': 500, 'dropout': 0.0, 'epochs': 250, 'learning_rate': 0.001, 'neurons': 1000, 'optimizer': 'Adagrad'}
# fitting the 2-layer GRU model with the manually chosen hyperparameters
Deep_GRU_model_2 = tf.keras.Sequential()

# Add a first GRU layer with 2000 units (=dimensionality of the output space = number of neurons)
# return_sequences=True returns the hidden state output for each input time step, which the stacked layer needs as input.
# return_state=True (not used here) would additionally return the final hidden state; a GRU has no separate cell state.
# The output of this first GRU layer is a 3D tensor of shape (batch_size, timesteps, 2000)
#LSTM_model.add(tf.keras.layers.LSTM(units=1000, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))
Deep_GRU_model_2.add(tf.keras.layers.GRU(units=2000, return_sequences = True, input_shape = (1,60), dropout = 0.0, activation= my_leaky_relu))

Deep_GRU_model_2.add(tf.keras.layers.GRU(units=2000, return_sequences = False, input_shape = (1,60), dropout = 0.0, activation= my_leaky_relu))


# output layer 
Deep_GRU_model_2.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
Deep_GRU_model_2.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate = 0.01),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
Deep_GRU_model_2.fit(X_train, Y_train, epochs=750, batch_size=150, verbose=1)
    


In [None]:
### 1HL GRU with hyperparameters from the new grid
# best grid result: {'batch_size': 250, 'dropout': 0.0, 'epochs': 750, 'learning_rate': 0.01, 'neurons': 2000, 'optimizer': 'Adam'}
# fitting the optimal GRU model with the gridsearched hyperparameters
GRU_model = tf.keras.Sequential()

# Add a GRU layer with 2000 units (=dimensionality of the output space = number of neurons)
# return_sequences=True would return the hidden state output for each input time step.
# With return_sequences=False only the last hidden state is returned,
# so the output of the GRU layer is a 2D tensor of shape (batch_size, 2000)
#LSTM_model.add(tf.keras.layers.LSTM(units=1000, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))
GRU_model.add(tf.keras.layers.GRU(units=2000, return_sequences = False, input_shape = (1,60), dropout = 0.0, activation=my_leaky_relu))


# output layer 
GRU_model.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
GRU_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.01),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
GRU_model.fit(X_train, Y_train, epochs=750, batch_size=250, verbose=1)
    

Deep GRU network (input layer - 2 stacked GRU hidden layers - output layer)


In [None]:
# To stack GRU layers, we need to change the configuration of the prior GRU layer to output a 3D array as input for the subsequent layer.
# We can do this by setting the return_sequences argument on the layer to True (the default is False). 
# This will return one output for each input time step and provide a 3D array.

# fitting the optimal GRU model with the gridsearched hyperparameters
Deep_GRU_model = tf.keras.Sequential()

# Add a first GRU layer with 2000 units (=dimensionality of the output space = number of neurons)
# return_sequences=True returns the hidden state output for each input time step, which the stacked layer needs as input.
# return_state=True (not used here) would additionally return the final hidden state; a GRU has no separate cell state.
# The output of this first GRU layer is a 3D tensor of shape (batch_size, timesteps, 2000)
Deep_GRU_model.add(tf.keras.layers.GRU(units=2000, return_sequences = True, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))

# 2nd GRU hidden layer
Deep_GRU_model.add(tf.keras.layers.GRU(units=2000, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))


# output layer with 24 neurons
Deep_GRU_model.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
Deep_GRU_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.01),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
Deep_GRU_model.fit(X_train, Y_train, epochs=750, batch_size=250, verbose=1)



Deep GRU network (input layer - 3 stacked GRU layers - output layer). Note: the cell below builds its layers with tf.keras.layers.LSTM, not GRU.

In [None]:
# To stack GRU layers, we need to change the configuration of the prior GRU layer to output a 3D array as input for the subsequent layer.
# We can do this by setting the return_sequences argument on the layer to True (the default is False). 
# This will return one output for each input time step and provide a 3D array.

# fitting the optimal GRU model with the gridsearched hyperparameters
Deep_LSTM_model2 = tf.keras.Sequential()

# Add a first recurrent layer with 1000 units (=dimensionality of the output space = number of neurons); this cell uses LSTM layers
# return_sequences=True returns the hidden state output for each input time step, which the stacked layer needs as input.
# return_state=True (not used here) would additionally return the final hidden state and, for an LSTM, the cell state.
# The output of this first layer is a 3D tensor of shape (batch_size, timesteps, 1000)
Deep_LSTM_model2.add(tf.keras.layers.LSTM(units=1000, return_sequences = True, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))

# 2nd GRU hidden layer
Deep_LSTM_model2.add(tf.keras.layers.LSTM(units=1000, return_sequences = True, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))

# 3rd GRU hidden layer
Deep_LSTM_model2.add(tf.keras.layers.LSTM(units=1000, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))


# output layer with 24 neurons
Deep_LSTM_model2.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
Deep_LSTM_model2.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate = 0.001),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
Deep_LSTM_model2.fit(X_train, Y_train, epochs=250, batch_size=500, verbose=1)

Deep GRU network (input layer - 5 stacked GRU layers - output layer). Note: the cell below builds its layers with tf.keras.layers.LSTM, not GRU.

In [None]:
# To stack GRU layers, we need to change the configuration of the prior GRU layer to output a 3D array as input for the subsequent layer.
# We can do this by setting the return_sequences argument on the layer to True (the default is False). 
# This will return one output for each input time step and provide a 3D array.

# fitting the optimal GRU model with the gridsearched hyperparameters
Deep_LSTM_model3 = tf.keras.Sequential()

# Add a first recurrent layer with 1500 units (=dimensionality of the output space = number of neurons); this cell uses LSTM layers
# return_sequences=True returns the hidden state output for each input time step, which the stacked layer needs as input.
# return_state=True (not used here) would additionally return the final hidden state and, for an LSTM, the cell state.
# The output of this first layer is a 3D tensor of shape (batch_size, timesteps, 1500)
Deep_LSTM_model3.add(tf.keras.layers.LSTM(units=1500, return_sequences = True, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))

# 2nd GRU hidden layer
Deep_LSTM_model3.add(tf.keras.layers.LSTM(units=1500, return_sequences = True, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))

# 3rd GRU hidden layer
Deep_LSTM_model3.add(tf.keras.layers.LSTM(units=1500, return_sequences = True, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))

# 4th GRU hidden layer
Deep_LSTM_model3.add(tf.keras.layers.LSTM(units=1500, return_sequences = True, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))

# 5th GRU hidden layer
Deep_LSTM_model3.add(tf.keras.layers.LSTM(units=1500, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))

# output layer 
Deep_LSTM_model3.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
Deep_LSTM_model3.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.001),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
Deep_LSTM_model3.fit(X_train, Y_train, epochs=500, batch_size=250, verbose=1)

## Undoing the normalization, making predictions and computing the test error

In [None]:
# making predictions (regular GRU)
y_train_pred = GRU_model.predict(X_train)
y_test_pred = GRU_model.predict(X_test)

# making predictions (deep GRU 2 hidden layers)
y_train_pred_deep = Deep_GRU_model.predict(X_train)
y_test_pred_deep = Deep_GRU_model.predict(X_test)

# making predictions (deep GRU 3 hidden layers)
y_train_pred_deep2 = Deep_LSTM_model2.predict(X_train)
y_test_pred_deep2 = Deep_LSTM_model2.predict(X_test)

# making predictions (deep GRU 5 hidden layers)
y_train_pred_deep3 = Deep_LSTM_model3.predict(X_train)
y_test_pred_deep3 = Deep_LSTM_model3.predict(X_test)

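# invert predictions
# (disabled: the normalization cell above is commented out, so no `scaler` exists and the
# predictions are already on the original scale)
#y_train_pred = scaler.inverse_transform(y_train_pred)
#y_test_pred = scaler.inverse_transform(y_test_pred)

#invert originals
#y_train_orig = scaler.inverse_transform(Y_train)
#y_test_orig = scaler.inverse_transform(Y_test)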

In [None]:
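# MSE GRU model 2 forecast
# predictions of GRU_model_2 are not computed in the prediction cell above, so compute them here
y_test_pred2 = GRU_model_2.predict(X_test)
mse_GRU = mean_squared_error(Y_test, y_test_pred2)
mse_GRU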

In [None]:
# RMSE GRU model 2 forecast
rmse_GRU = sqrt(mse_GRU)
rmse_GRU

In [None]:
# mae GRU model 2
mae_GRU = mean_absolute_error(Y_test, y_test_pred2)
mae_GRU

In [None]:
# custom MAPE function
def mean_absolute_percentage_error(y_true, y_pred): 
    """Calculates MAPE given y_true and y_pred"""
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

y_true = Y_test
y_pred = y_test_pred2

In [None]:
# MAPE GRU model 2 1 HL

mean_absolute_percentage_error(y_true,y_pred)
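
To see how the four error measures used in this section relate, the sketch below evaluates them on a hypothetical 4-hour forecast in plain NumPy. The numbers are made up for illustration and are not results from the models above.

```python
import numpy as np

y_true_toy = np.array([[50.0, 40.0, 20.0, 10.0]])   # hypothetical observed counts
y_pred_toy = np.array([[45.0, 44.0, 20.0, 15.0]])   # hypothetical forecasts

err = y_true_toy - y_pred_toy                 # 5, -4, 0, -5
mse = np.mean(err ** 2)                       # average squared error
rmse = np.sqrt(mse)                           # back in the units of the series
mae = np.mean(np.abs(err))                    # average absolute error
mape = np.mean(np.abs(err / y_true_toy)) * 100   # relative error in percent

print(mse, rmse, mae, mape)
```

The same absolute error of 5 contributes 10% at the busy hour (50 calls) and 50% at the quiet hour (10 calls), so MAPE weights low-demand hours heavily; it is undefined when an observed count is zero.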

In [None]:
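# MSE deep GRU model 2 forecast
# y_test_pred_deep2 holds the predictions of Deep_LSTM_model2, so predict with Deep_GRU_model_2 directly
mse_GRU = mean_squared_error(Y_test, Deep_GRU_model_2.predict(X_test))
mse_GRU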

In [None]:
# RMSE deep GRU model 2 forecast
rmse_GRU = sqrt(mse_GRU)
rmse_GRU

In [None]:
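# mae deep GRU model 2
# y_test_pred_deep2 holds the predictions of Deep_LSTM_model2, so predict with Deep_GRU_model_2 directly
mae_GRU = mean_absolute_error(Y_test, Deep_GRU_model_2.predict(X_test))
mae_GRU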

In [None]:
# MAPE inputs for deep GRU model 2 (mean_absolute_percentage_error is defined above)
y_true = Y_test
y_pred = Deep_GRU_model_2.predict(X_test)

In [None]:
# MAPE GRU model 2 2 HL

mean_absolute_percentage_error(y_true,y_pred)

In [None]:
# MSE  forecast
mse_GRU = mean_squared_error(Y_test, y_test_pred)
mse_GRU

In [None]:
# RMSE GRU forecast
rmse_GRU = sqrt(mse_GRU)
rmse_GRU

In [None]:
# mae GRU
mae_GRU = mean_absolute_error(Y_test, y_test_pred)
mae_GRU



In [None]:
# MAPE GRU 1 HL
y_true = Y_test
y_pred = y_test_pred

mean_absolute_percentage_error(y_true,y_pred)

In [None]:
# MSE Deep GRU forecast (2HL)
mse_GRU1 = mean_squared_error(Y_test, y_test_pred_deep)
mse_GRU1

In [None]:
# RMSE Deep GRU forecast (2HL)
rmse_GRU1 = sqrt(mse_GRU1)
rmse_GRU1

In [None]:
# mae DeepGRU (2HL)
mae_GRU1 = mean_absolute_error(Y_test, y_test_pred_deep)
mae_GRU1

In [None]:
# MAPE GRU 2 HL
y_true = Y_test
y_pred = y_test_pred_deep

mean_absolute_percentage_error(y_true,y_pred)

In [None]:
# MSE Deep GRU forecast (3HL)
mse_LSTM2 = mean_squared_error(Y_test, y_test_pred_deep2)
mse_LSTM2

In [None]:
# RMSE Deep GRU forecast (3HL)
rmse_LSTM2 = sqrt(mse_LSTM2)
rmse_LSTM2

In [None]:
# mae DeepGRU (3HL)
mae_LSTM2 = mean_absolute_error(Y_test, y_test_pred_deep2)
mae_LSTM2

In [None]:
# MSE Deep LSTM forecast (5HL)
mse_LSTM3 = mean_squared_error(Y_test, y_test_pred_deep3)
mse_LSTM3

In [None]:
# RMSE Deep GRU forecast (5HL)
rmse_LSTM3 = sqrt(mse_LSTM3)
rmse_LSTM3

In [None]:
# MAE DeepGRU (5HL)
# stored as mae_LSTM3 so the 3HL value in mae_LSTM2 is not overwritten
mae_LSTM3 = mean_absolute_error(Y_test, y_test_pred_deep3)
mae_LSTM3

In [None]:
# MAPE GRU 3 HL
y_true = Y_test
y_pred = y_test_pred_deep2

mean_absolute_percentage_error(y_true,y_pred)

In [None]:
# MAPE GRU 5 HL
y_true = Y_test
y_pred = y_test_pred_deep3

mean_absolute_percentage_error(y_true,y_pred)
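
The metric cells above repeat the same four calculations for each model. A possible way to collect them in one place is sketched below. It assumes plain NumPy instead of the scikit-learn functions used in the notebook, and the helper name `forecast_metrics` and the toy arrays are hypothetical stand-ins for `Y_test` and the prediction arrays.

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and MAPE (in percent) as a dict."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    return {
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "mae": float(np.mean(np.abs(err))),
        "mape": float(np.mean(np.abs(err / y_true)) * 100),
    }

# toy stand-ins for Y_test and the model predictions (illustration only)
y_true_demo = np.array([[10.0, 20.0, 40.0]])
preds_demo = {
    "GRU (1HL)": np.array([[12.0, 18.0, 40.0]]),
    "Deep GRU (2HL)": np.array([[11.0, 19.0, 42.0]]),
}

# one dict of metrics per model
results_demo = {name: forecast_metrics(y_true_demo, pred)
                for name, pred in preds_demo.items()}
for name, metrics in results_demo.items():
    print(name, metrics)
```

Keeping the results in a dictionary keyed by model name also avoids reusing a variable name for two different models.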

## Prediction intervals using quantile regression

In [None]:
# quantile regression loss = tilted loss = pinball loss
def tilted_loss(q,y,f):
    e = (y-f)
    return K.mean(K.maximum(q*e, (q-1)*e), axis=-1)
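
To see why this loss yields quantiles, the sketch below re-implements it in NumPy and searches for the constant forecast that minimizes it on synthetic data. The minimizer should land close to the empirical quantile for each `q`. The function name `pinball_loss`, the synthetic normal sample, and the candidate grid are assumptions made for illustration.

```python
import numpy as np

def pinball_loss(q, y, f):
    """NumPy version of the tilted loss: mean of max(q*e, (q-1)*e), e = y - f."""
    e = np.asarray(y, dtype=float) - f
    return float(np.mean(np.maximum(q * e, (q - 1) * e)))

rng = np.random.default_rng(0)
y_demo = rng.normal(loc=30.0, scale=5.0, size=5000)  # synthetic data

candidates = np.linspace(10.0, 50.0, 401)  # constant forecasts, step 0.1
best = {}
for q in (0.025, 0.5, 0.975):
    losses = [pinball_loss(q, y_demo, c) for c in candidates]
    best[q] = float(candidates[int(np.argmin(losses))])
    # compare the loss minimizer with the empirical quantile
    print(q, best[q], float(np.quantile(y_demo, q)))
```

With `q = 0.975` an under-prediction is penalized 39 times more heavily than an over-prediction of the same size (0.975 versus 0.025), which pushes the fitted value towards the upper tail.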

Two separate PI networks (lower and upper bound) for the GRU with 1 hidden layer

In [None]:
## build lower bound with different loss (95% PI)
q = 0.025

# fitting the GRU model with the gridsearched hyperparameters
lower_PI_model = tf.keras.Sequential()

# Add a GRU layer with 2000 units (= dimensionality of the output space = number of neurons).
# return_sequences=False returns only the hidden state of the last input time step,
# so the output of the GRU is a 2D tensor of shape (batch_size, 2000).
lower_PI_model.add(tf.keras.layers.GRU(units=2000, return_sequences=False, input_shape=(1, 60), activation=my_leaky_relu, dropout=0.0))

# output layer with 24 neurons (one per forecast step)
lower_PI_model.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
lower_PI_model.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate=0.01),
              loss=lambda y, f: tilted_loss(q, y, f))
lower_PI_model.fit(X_train, Y_train, epochs=300, batch_size=150, verbose=1)


In [None]:
## build upper bound with different loss (95% PI)
q = 0.975

# fitting the GRU model with the gridsearched hyperparameters
upper_PI_model = tf.keras.Sequential()

# Add a GRU layer with 2000 units (= dimensionality of the output space = number of neurons).
# return_sequences=False returns only the hidden state of the last input time step,
# so the output of the GRU is a 2D tensor of shape (batch_size, 2000).
upper_PI_model.add(tf.keras.layers.GRU(units=2000, return_sequences=False, input_shape=(1, 60), activation=my_leaky_relu, dropout=0.0))

# output layer with 24 neurons (one per forecast step)
upper_PI_model.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
upper_PI_model.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate=0.01),
              loss=lambda y, f: tilted_loss(q, y, f))
upper_PI_model.fit(X_train, Y_train, epochs=300, batch_size=150, verbose=1)
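
Both PI cells pass a `lambda` that reads the global `q`, and `q` is reassigned between the lower-bound and upper-bound cells. A Python closure looks up `q` when it is called, not when it is created, so a loss that is evaluated again after the reassignment may silently use the new quantile. The sketch below shows the effect with a NumPy stand-in for the loss and one possible remedy, a factory function that binds `q` at creation time. The names `pinball_loss` and `make_quantile_loss` are hypothetical.

```python
import numpy as np

def pinball_loss(q, y, f):
    """NumPy stand-in for the tilted loss."""
    e = np.asarray(y, dtype=float) - np.asarray(f, dtype=float)
    return float(np.mean(np.maximum(q * e, (q - 1) * e)))

def make_quantile_loss(q):
    """Return a loss(y, f) with q fixed at the value passed in here."""
    def loss(y, f):
        return pinball_loss(q, y, f)
    return loss

y_demo, f_demo = [10.0], [8.0]  # forecast is 2 units too low

q = 0.025
late_bound = lambda y, f: pinball_loss(q, y, f)  # reads q when called
fixed = make_quantile_loss(q)                     # remembers q = 0.025

q = 0.975  # q is reassigned, as in the upper-bound cell

late_value = late_bound(y_demo, f_demo)   # now evaluated with q = 0.975
fixed_value = fixed(y_demo, f_demo)       # still evaluated with q = 0.025
print(late_value, fixed_value)
```

The same pattern should carry over to the Keras loss, for example by wrapping `tilted_loss` in such a factory and passing the returned function to `compile()`, though that variant is not tested here.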

In [None]:
# predictions from PI models

lower_pred = lower_PI_model.predict(X_test)
upper_pred = upper_PI_model.predict(X_test)

PIs for the deep GRU model (2HL), two networks

In [None]:
## build lower bound with different loss (95% PI)
q = 0.025

# fitting the deep GRU model (2HL) with the gridsearched hyperparameters
lower_PI_model_deep = tf.keras.Sequential()

# First GRU layer with 1000 units (= dimensionality of the output space = number of neurons).
# return_sequences=True returns the hidden state for each input time step,
# so its output is a 3D tensor of shape (batch_size, timesteps, 1000).
lower_PI_model_deep.add(tf.keras.layers.GRU(units=1000, return_sequences=True, input_shape=(1, 60), activation=my_leaky_relu, dropout=0.0))

# Second GRU layer: return_sequences=False keeps only the last hidden state, shape (batch_size, 1000).
# input_shape is only needed on the first layer.
lower_PI_model_deep.add(tf.keras.layers.GRU(units=1000, return_sequences=False, activation=my_leaky_relu, dropout=0.0))

# output layer with 24 neurons (one per forecast step)
lower_PI_model_deep.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
lower_PI_model_deep.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate=0.01),
              loss=lambda y, f: tilted_loss(q, y, f))
lower_PI_model_deep.fit(X_train, Y_train, epochs=300, batch_size=200, verbose=1)

In [None]:
## build upper bound with different loss (95% PI)
q = 0.975

# fitting the deep GRU model (2HL) with the gridsearched hyperparameters
upper_PI_model_deep = tf.keras.Sequential()

# First GRU layer with 1000 units (= dimensionality of the output space = number of neurons).
# return_sequences=True returns the hidden state for each input time step,
# so its output is a 3D tensor of shape (batch_size, timesteps, 1000).
upper_PI_model_deep.add(tf.keras.layers.GRU(units=1000, return_sequences=True, input_shape=(1, 60), activation=my_leaky_relu, dropout=0.0))

# Second GRU layer: return_sequences=False keeps only the last hidden state, shape (batch_size, 1000).
# input_shape is only needed on the first layer.
upper_PI_model_deep.add(tf.keras.layers.GRU(units=1000, return_sequences=False, activation=my_leaky_relu, dropout=0.0))

# output layer with 24 neurons (one per forecast step)
upper_PI_model_deep.add(tf.keras.layers.Dense(24))
# the compile() method configures the model for training
upper_PI_model_deep.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate=0.01),
              loss=lambda y, f: tilted_loss(q, y, f))
upper_PI_model_deep.fit(X_train, Y_train, epochs=300, batch_size=200, verbose=1)

In [None]:
# predictions from PI models

lower_pred_deep = lower_PI_model_deep.predict(X_test)
upper_pred_deep = upper_PI_model_deep.predict(X_test)
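
Because the lower and upper bounds come from two independently trained networks, nothing forces the lower prediction to stay below the upper one at every time step. A small check such as the following can flag and repair crossed bounds before they are exported or plotted. This is a sketch under the assumption that swapping crossed values is an acceptable repair. The helper name `fix_crossing` and the toy arrays are hypothetical.

```python
import numpy as np

def fix_crossing(lower, upper):
    """Return (lower, upper, n_crossed) with bounds swapped where lower > upper."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n_crossed = int(np.sum(lower > upper))
    # element-wise min/max puts the smaller value in the lower bound
    return np.minimum(lower, upper), np.maximum(lower, upper), n_crossed

# toy bounds with the same (1, n_steps) layout as the model predictions
lower_demo = np.array([[22.0, 21.0, 15.0, 14.0]])
upper_demo = np.array([[50.0, 46.0, 14.0, 38.0]])  # third step is crossed

lo_fixed, hi_fixed, n_crossed = fix_crossing(lower_demo, upper_demo)
print(n_crossed)
print(lo_fixed, hi_fixed)
```

A non-zero `n_crossed` can also be read as a hint that one of the two networks has not converged well for those time steps.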

# Exporting the PI bounds

In [None]:
upper_pred_1hl = pd.Series(upper_pred[0])
lower_pred_1hl = pd.Series(lower_pred[0])
upper_pred_2hl = pd.Series(upper_pred_deep[0])
lower_pred_2hl = pd.Series(lower_pred_deep[0])

In [None]:
# export the PI bounds for plotting in R
GRU_ems_pi_df = pd.DataFrame()

# adding one column per PI bound (1HL and 2HL models)
GRU_ems_pi_df["upper_bound_1hl"] = upper_pred_1hl
GRU_ems_pi_df["lower_bound_1hl"] = lower_pred_1hl
GRU_ems_pi_df["upper_bound_2hl"] = upper_pred_2hl
GRU_ems_pi_df["lower_bound_2hl"] = lower_pred_2hl

GRU_ems_pi_df


   upper_bound_1hl  lower_bound_1hl  upper_bound_2hl  lower_bound_2hl
0         50.33651        22.262304        50.464931        22.007679
1        46.607979        21.321659        46.405609        17.912231
2        41.474297        15.419744        40.987724        13.768996
3        38.642899         9.931201        44.039425        13.168715
4        37.261223        13.019662        32.470146        13.720415
5        40.333721        14.157906        36.401173        12.688701
6        43.842426        14.100899        44.542225        16.298822
7        46.362923        18.130318        45.785686        18.691341
8        56.363609        21.443565         56.10183        20.461889
9        66.476547        25.135046        65.836899         24.39127


In [None]:
# export the PI bounds to google drive
from google.colab import drive

drive.mount('/drive')

GRU_ems_pi_df.to_csv('/drive/My Drive/Colab Notebooks/GRU_ems_pis.csv', index=False)

Mounted at /drive


PI measure calculations: interval coverage (PCIP) and mean prediction interval width (MPIW)

In [None]:
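def PCIP(upper, lower, test_set):
    """Share of test observations that lie strictly inside the prediction interval."""
    # use the test_set argument rather than the global Y_test
    n = len(test_set[0,:])
    count = 0

    for i in range(n):
        if (upper[0,i] > test_set[0,i] and lower[0,i] < test_set[0,i]):
            count = count + 1

    return count/n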
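
The loop in `PCIP` can also be written as a single vectorized comparison. The sketch below assumes NumPy arrays of equal shape and uses the same strict inequalities as the loop. The name `picp` and the toy arrays are hypothetical.

```python
import numpy as np

def picp(upper, lower, y_true):
    """Fraction of observations strictly inside (lower, upper)."""
    upper = np.asarray(upper, dtype=float)
    lower = np.asarray(lower, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    inside = (lower < y_true) & (y_true < upper)  # boolean mask per time step
    return float(np.mean(inside))

# toy values in the same (1, n_steps) layout as Y_test (illustration only)
y_demo = np.array([[30.0, 25.0, 18.0, 40.0]])
lower_demo = np.array([[22.0, 21.0, 19.0, 14.0]])
upper_demo = np.array([[50.0, 46.0, 41.0, 38.0]])

coverage_demo = picp(upper_demo, lower_demo, y_demo)
print(coverage_demo)
```

For a 95% PI the coverage would ideally be close to 0.95. With only 24 time steps each observation shifts the result by about 0.042, so small deviations should not be over-interpreted.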

In [None]:
# reshape PI values
lower_PI_values_2HL = lower_PI_preds_2HL_95.values.reshape(1,24)
upper_PI_values_2HL = upper_PI_preds_2HL_95.values.reshape(1,24)

In [None]:
# DGRU (2HL) 
#PCIP(upper = upper_pred_dgru3, lower = lower_pred_dgru3, test_set = Y_test)
PCIP(upper = upper_PI_values_2HL, lower = lower_PI_values_2HL, test_set = Y_test)

0.9583333333333334

In [None]:
def MPIW(upper, lower):
    """Mean prediction interval width: average of (upper - lower) over all time steps."""
    n = len(upper[0,:])
    diff = 0

    for i in range(n):
        diff = diff + (upper[0,i] - lower[0,i])

    return diff/n

In [None]:
# MPIW for the best DGRU model (2HL)
MPIW(upper = upper_PI_values_2HL, lower = lower_PI_values_2HL)
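
The interval width depends on the scale of the target, which makes it hard to compare across datasets. One common variant divides the mean width by the range of the observed values. The sketch below shows a vectorized mean width and this normalized form, assuming NumPy arrays of equal shape. The names `mpiw` and `nmpiw` and the toy arrays are hypothetical and not part of the notebook.

```python
import numpy as np

def mpiw(upper, lower):
    """Mean prediction interval width."""
    upper = np.asarray(upper, dtype=float)
    lower = np.asarray(lower, dtype=float)
    return float(np.mean(upper - lower))

def nmpiw(upper, lower, y_true):
    """Mean width divided by the range of the observed values."""
    y_true = np.asarray(y_true, dtype=float)
    return mpiw(upper, lower) / float(y_true.max() - y_true.min())

# toy values for illustration only
y_demo = np.array([[30.0, 25.0, 18.0, 40.0]])
lower_demo = np.array([[22.0, 21.0, 19.0, 14.0]])
upper_demo = np.array([[50.0, 46.0, 41.0, 38.0]])

width_demo = mpiw(upper_demo, lower_demo)
norm_width_demo = nmpiw(upper_demo, lower_demo, y_demo)
print(width_demo, norm_width_demo)
```

Coverage and width are best read together: a very wide interval reaches high coverage trivially, so a narrower interval at similar coverage is usually the more informative one.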

# Plotting the predictions

plot 2 PI networks

In [None]:
# GRU model (1HL) predictions vs test data

plt.xlim(0,24)
#plt.plot(np.transpose(y_test_pred2), label="GRU_predictions")
plt.plot(np.transpose(Y_test), label = "Test data")
plt.fill_between(x = np.arange(0,24), y1= np.transpose(lower_pred).reshape(24), y2=np.transpose(upper_pred).reshape(24), color = "b", alpha = 0.10)

# axis labels, title and legend
plt.xlabel('Time steps')
plt.ylabel("EMS demand")
plt.title("GRU predictions (1HL) vs test data", fontweight="bold")
#legend = plt.legend(["GRU predictions","Test data","95% PI"],loc='lower right')

#save_results_to = '/Users/Manu/Dropbox/MScThesis-Conor-Manu/Latex/'
#tkz.save(save_results_to + "GRU_predictions.tex")

plt.show()
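
`fill_between` expects 1-D arrays for `y1` and `y2`, which is why the (1, 24) prediction arrays are transposed and reshaped before plotting. For an array with a single row, the shorter forms in the sketch below should give the same 1-D result. The array `lower_demo` is a hypothetical stand-in for `lower_pred`.

```python
import numpy as np

# stand-in with the same (1, 24) shape as the PI predictions
lower_demo = np.arange(24, dtype=float).reshape(1, 24)

via_transpose = np.transpose(lower_demo).reshape(24)  # form used in the plot cells
via_ravel = lower_demo.ravel()                        # flatten to 1-D
via_index = lower_demo[0]                             # take the single row

print(via_transpose.shape, via_ravel.shape, via_index.shape)
print(np.array_equal(via_transpose, via_ravel), np.array_equal(via_ravel, via_index))
```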


In [None]:
# GRU model (2HL) predictions vs test data

plt.xlim(0,24)
#plt.plot(np.transpose(y_test_pred_deep2), label="GRU_predictions")
plt.plot(np.transpose(Y_test), label = "Test data")
plt.fill_between(x = np.arange(0,24), y1= np.transpose(lower_pred_deep).reshape(24), y2=np.transpose(upper_pred_deep).reshape(24), color = "b", alpha = 0.10)

# axis labels, title and legend
plt.xlabel('Time steps')
plt.ylabel("EMS demand")
plt.title("Deep GRU predictions (2HL) vs test data", fontweight="bold")
#legend = plt.legend(["GRU predictions","Test data","95% PI"],loc='lower right')

#save_results_to = '/Users/Manu/Dropbox/MScThesis-Conor-Manu/Latex/'
#tkz.save(save_results_to + "GRU_predictions.tex")

plt.show()


plot single PI network

In [None]:
# GRU model (1HL) predictions vs test data

plt.xlim(0,24)
plt.plot(np.transpose(y_test_pred), label="GRU_predictions")
plt.plot(np.transpose(Y_test), label = "Test data")
#plt.fill_between(x = np.arange(0,24), y1= lower_PI_preds_95, y2= upper_PI_preds_95, color = 'xkcd:sky blue')

# axis labels, title and legend
plt.xlabel('Time steps')
plt.ylabel("EMS demand")
plt.title("GRU predictions vs test data", fontweight="bold")
legend = plt.legend(["GRU predictions","Test data","95% PI"],loc='lower right')

#save_results_to = '/Users/Manu/Dropbox/MScThesis-Conor-Manu/Latex/'
#tkz.save(save_results_to + "GRU_predictions.tex")

plt.show()


In [None]:
# Deep GRU model (2HL) predictions vs test data (1 PI network)

plt.xlim(0,24)
plt.plot(np.transpose(y_test_pred_deep), label="Deep GRU_predictions")
plt.plot(np.transpose(Y_test), label = "Test data")
#plt.fill_between(x = np.arange(0,24), y1= lower_PI_preds_2HL_95, y2= upper_PI_preds_2HL_95, color = 'xkcd:sky blue')

# axis labels, title and legend
plt.xlabel('Time steps')
plt.ylabel("EMS demand")
plt.title("Deep GRU predictions (2HL) vs test data", fontweight="bold")
legend = plt.legend(["GRU predictions","Test data","95% PI"],loc='lower right')

#save_results_to = '/Users/Manu/Dropbox/MScThesis-Conor-Manu/Latex/'
#tkz.save(save_results_to + "GRU_predictions.tex")

plt.show()

In [None]:
# Deep GRU model (3HL) predictions vs test data

plt.xlim(0,24)
plt.plot(np.transpose(y_test_pred_deep2), label="Deep GRU_predictions")
plt.plot(np.transpose(Y_test), label = "Test data")
#plt.fill_between(x = np.arange(0,24), y1= np.transpose(lower_pred_dgru3).reshape(24), y2=np.transpose(upper_pred_dgru3).reshape(24), color = "b", alpha = 0.10)

# axis labels, title and legend
plt.xlabel('Time steps')
plt.ylabel("EMS demand")
plt.title("Deep GRU predictions (3HL) vs test data", fontweight="bold")
legend = plt.legend(["GRU predictions","Test data","95% PI"],loc='lower right')

#save_results_to = '/Users/Manu/Dropbox/MScThesis-Conor-Manu/Latex/'
#tkz.save(save_results_to + "GRU_predictions.tex")

plt.show()

In [None]:
# Deep GRU model (5HL) predictions vs test data

plt.xlim(0,24)
plt.plot(np.transpose(y_test_pred_deep3), label="Deep GRU_predictions")
plt.plot(np.transpose(Y_test), label = "Test data")
#plt.fill_between(x = np.arange(0,24), y1= np.transpose(lower_pred).reshape(24), y2=np.transpose(upper_pred).reshape(24), color = "b", alpha = 0.10)

# axis labels, title and legend
plt.xlabel('Time steps')
plt.ylabel("EMS demand")
plt.title("Deep GRU predictions (5HL) vs test data", fontweight="bold")
legend = plt.legend(["GRU predictions","Test data","95% PI"],loc='lower right')

#save_results_to = '/Users/Manu/Dropbox/MScThesis-Conor-Manu/Latex/'
#tkz.save(save_results_to + "GRU_predictions.tex")

plt.show()

# Exporting the point forecasts


In [None]:
GRU_ems_preds = pd.Series(y_test_pred2[0])
GRU_ems_preds_deep = pd.Series(y_test_pred_deep2[0])


In [None]:
# export the GRU point forecasts to google drive
from google.colab import drive

drive.mount('/drive')

GRU_ems_preds.to_csv('/drive/My Drive/Colab Notebooks/GRU_ems_preds.csv', index=False)
GRU_ems_preds_deep.to_csv('/drive/My Drive/Colab Notebooks/GRU_ems_preds_deep.csv', index=False)

Mounted at /drive


Plotting the Predictions from the Deep GRU Model

In [None]:
plt.xlim(0,24)
plt.plot(np.transpose(y_test_pred), label="GRU_predictions")
plt.plot(np.transpose(Y_test), label = "Test data")
plt.fill_between(x = np.arange(0,24), y1= np.transpose(lower_pred).reshape(24), y2=np.transpose(upper_pred).reshape(24), color = "b", alpha = 0.10)

# axis labels, title and legend
plt.xlabel('Time steps')
plt.ylabel("EMS demand")
plt.title("GRU predictions vs test data (regularized)", fontweight="bold")
legend = plt.legend(["GRU predictions","Test data","95% PI"],loc='lower right')

#save_results_to = '/Users/Manu/Dropbox/MScThesis-Conor-Manu/Latex/'
#tkz.save(save_results_to + "GRU_predictions.tex")

plt.show()