# LSTM neural network for emergency demand predictions
This notebook contains the code for applying LSTM neural network models to smart city data (hourly emergency dispatch counts), with forecasts 48, 72, and 96 steps ahead.


In [None]:
# GPU check:

import tensorflow as tf
tf.test.gpu_device_name()


'/device:GPU:0'

In [None]:
from tensorflow.python.client import device_lib
print("Show System RAM Memory: \n \n")
!cat /proc/meminfo | egrep "MemTotal"
device_lib.list_local_devices()

Show System RAM Memory: 
 

MemTotal:       26751688 kB


[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 1754890282637008253, name: "/device:XLA_CPU:0"
 device_type: "XLA_CPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 8360702716429381997
 physical_device_desc: "device: XLA_CPU device", name: "/device:XLA_GPU:0"
 device_type: "XLA_GPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 11714794874272407656
 physical_device_desc: "device: XLA_GPU device", name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 15695549568
 locality {
   bus_id: 1
   links {
   }
 }
 incarnation: 10097516626238525608
 physical_device_desc: "device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0"]

In [None]:
# get additional info about the hardware in the cloud
%cat /proc/cpuinfo
%cat /proc/meminfo

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 79
model name	: Intel(R) Xeon(R) CPU @ 2.20GHz
stepping	: 0
microcode	: 0x1
cpu MHz		: 2199.998
cache size	: 56320 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa
bogomips	: 4399.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 b

Tutorial on Google Colab and GPU access: <br>
https://www.youtube.com/watch?v=f1UK8KPt-KU

In [None]:
# this allows for accessing files stored in your google drive using the path "/gdrive/My Drive/"
# mounting google drive locally:

from google.colab import drive
drive.mount('/gdrive')

Mounted at /gdrive


In [None]:
# importing local data to google colab:
from google.colab import files
uploaded = files.upload()

Saving taxi_series_H to taxi_series_H


$\textbf{Background:}$ Tensors are data structures that you can think of as multi-dimensional arrays. They are represented as n-dimensional arrays of base datatypes such as strings or integers, and they generalize vectors and matrices to higher dimensions. The rank of a Tensor is its number of dimensions, and its shape gives the number of elements along each dimension. A scalar creates a 0-d Tensor, a vector or list creates a 1-d Tensor, and a matrix creates a 2-d Tensor; further nesting gives Tensors of higher rank.
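
To make rank and shape concrete, here is a small sketch. It uses NumPy arrays as a stand-in (an assumption made for brevity, not something the original notebook does); passing the same values to `tf.constant` should report the same shapes.

```python
import numpy as np

scalar = np.array(7)                       # 0-d: a single number
vector = np.array([1, 2, 3])               # 1-d: a list of numbers
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # 2-d: rows and columns
cube = np.zeros((2, 3, 4))                 # 3-d: e.g. (samples, time steps, features)

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix), ("cube", cube)]:
    # ndim is the rank (number of dimensions); shape is the size of each dimension
    print(f"{name}: rank={t.ndim}, shape={t.shape}")
```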

$\textbf{Neural Networks in TensorFlow:}$ We can also define neural networks in TensorFlow. TensorFlow uses a high-level API called Keras that provides a powerful, intuitive framework for building and training deep learning models. <br>
Tensors flow through abstract types called $\textit{Layers}$, the building blocks of neural networks. Layers implement common neural network operations and are used to update weights, compute losses, and define inter-layer connectivity. <br>
<br>
Conveniently, TensorFlow defines a number of Layers that are commonly used in neural networks, for example Dense. Instead of calling a single Layer directly, this notebook uses the Sequential model from Keras: an LSTM layer followed by a single Dense output layer. With the Sequential API, you can readily create neural networks by stacking layers together like building blocks.
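
As a rough illustration of what stacking layers means, the sketch below mimics a Dense layer and a sequential container in plain NumPy. The names `dense` and `sequential` are hypothetical helpers, not Keras APIs; a Keras `Dense` layer computes the same affine map `x @ W + b` followed by an optional activation, but it also creates and trains the weights.

```python
import numpy as np

def dense(W, b, activation=None):
    """Return a function computing activation(x @ W + b), as a Dense layer does."""
    def layer(x):
        y = x @ W + b
        return activation(y) if activation is not None else y
    return layer

def sequential(layers):
    """Chain layers so the output of one becomes the input of the next."""
    def model(x):
        for layer in layers:
            x = layer(x)
        return x
    return model

rng = np.random.default_rng(0)
hidden = dense(rng.normal(size=(60, 8)), np.zeros(8), activation=np.tanh)  # 60 inputs -> 8 units
output = dense(rng.normal(size=(8, 48)), np.zeros(48))                     # 8 units -> 48 outputs
model = sequential([hidden, output])

batch = rng.normal(size=(5, 60))   # 5 samples with 60 features each
print(model(batch).shape)          # (5, 48)
```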

# Implementation

In [None]:
## -- Packages  -- ##

# General
import pandas as pd
import numpy as np

# Time formatting
import datetime

# Load and save data
import pickle
# progress bar
from tqdm import tqdm

# Plotting
import matplotlib.pyplot as plt
%matplotlib inline
#import tikzplotlib as tkz

In [None]:
##  NN libraries ##
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow.keras.backend as K

from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from math import sqrt


## ML libraries ##
from keras.wrappers.scikit_learn import KerasRegressor

from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import TimeSeriesSplit
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score

from sklearn.preprocessing import MinMaxScaler

### Load data

In [None]:
# Load the emergency dispatch data (pickled hourly series stored on Google Drive)
filename = '/gdrive/My Drive/Colab Notebooks/emergency_dispatches_bronx_H'
with open(filename, 'rb') as infile:
    emergency_ts = pickle.load(infile)

In [None]:
emergency_ts.shape

(8760,)

## Preprocessing for 48 steps ahead

In [None]:
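## Trim the series: dropping the last 360 - 24 = 336 observations leaves 8424, i.e. 8376 for training plus 48 for testing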
emergency_ts_48 = emergency_ts.iloc[0:-360 + 24]
emergency_ts_48.shape

(8424,)
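
The slice bounds look arbitrary at first, so a quick arithmetic check helps. The sketch below assumes only the series length of 8760 reported above and reproduces the trimmed lengths for the three horizons used in this notebook. In each case the training set that remains after holding out the test window has the same 8376 points.

```python
n_total = 8760  # length of the hourly series, as reported by .shape above

results = []
for k, horizon in enumerate([48, 72, 96]):
    stop = -360 + 24 * (k + 1)        # negative end index used in .iloc[0:stop]
    n_trimmed = n_total + stop        # length of the trimmed series
    n_train = n_trimmed - horizon     # what is left after holding out the test window
    results.append((horizon, n_trimmed, n_train))
    print(horizon, n_trimmed, n_train)
```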

In [None]:
## Set parameters
input_lags = 60 # 2.5 times the seasonal period (24 hours)
output_lags = 48 # we predict 48 hours ahead
n_test = 48 # same as output_lags


In [None]:
## Split data in train and test set
train = emergency_ts_48[0:-n_test]
test = emergency_ts_48[-n_test:]
print(train.shape)
print(test.shape)

(8376,)
(48,)


In [None]:
## Create lagged values for both input and output window (48)
data = train.copy()
n_train = len(data)

##Create lagged values for input
df = pd.DataFrame()
for i in range(input_lags,0,-1):
    df['t-' + str(i)] = data.shift(i)

##Create lagged values for output
for j in range(0,output_lags,1):
    df['t+' + str(j)] = data.shift(-j)
    
df = df[input_lags:(n_train-output_lags+1)]
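
To see what the shifting produces, the sketch below applies the same idea to a toy series of ten values with 3 input lags and 2 output steps. The helper name `make_windows` is hypothetical and not part of the original notebook; it simply wraps the shift-and-trim logic of the cell above.

```python
import pandas as pd

def make_windows(series, input_lags, output_lags):
    """Build one row per forecast origin: input_lags past values, then output_lags future values."""
    cols = {}
    for i in range(input_lags, 0, -1):
        cols['t-' + str(i)] = series.shift(i)      # value i steps before the origin
    for j in range(output_lags):
        cols['t+' + str(j)] = series.shift(-j)     # value j steps after the origin (t+0 is the origin)
    df = pd.DataFrame(cols)
    # keep only rows where every lag and lead exists (no NaN from shifting)
    return df.iloc[input_lags:len(series) - output_lags + 1]

toy = pd.Series(range(10), dtype=float)
windows = make_windows(toy, input_lags=3, output_lags=2)
print(windows)
print(windows.shape)   # (6, 5): 10 - 3 - 2 + 1 rows, 3 + 2 columns
```

With the notebook's values (8376 training points, 60 lags, 48 outputs) the same formula gives 8376 - 60 - 48 + 1 = 8269 rows, which matches the shapes printed further down.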

In [None]:
## splitting the training set into labels and features
X_train = df.iloc[:,:input_lags] # from the beginning to input_lags
Y_train = df.iloc[:,input_lags:] # from input_lags to the end

## Use the last window of the training set as the features for the test set. This requires a combination of
## X_train and Y_train: the last 12 inputs plus the 48 outputs of the final row, i.e. the last 60 training values.
X_test = X_train.iloc[len(X_train) - 1,:][output_lags:]
X_test = pd.concat([X_test, Y_train.iloc[len(Y_train) - 1,:]]).values.reshape(1,input_lags)
Y_test = test[:output_lags].values.reshape(1,output_lags)

X_train = X_train.values # 60 steps back (60 lags)
Y_train = Y_train.values # 48 steps ahead

print("X_train: " + "type: " + str(type(X_train)) + "\tshape: " + str(X_train.shape))
print("Y_train: " + "type: " + str(type(Y_train)) + "\tshape: " + str(Y_train.shape))
print("X_test: " + "type: " + str(type(X_test)) + "\tshape: " + str(X_test.shape))
print("Y_test: " + "type: " + str(type(Y_test)) + "\tshape: " + str(Y_test.shape))

X_train: type: <class 'numpy.ndarray'>	shape: (8269, 60)
Y_train: type: <class 'numpy.ndarray'>	shape: (8269, 48)
X_test: type: <class 'numpy.ndarray'>	shape: (1, 60)
Y_test: type: <class 'numpy.ndarray'>	shape: (1, 48)
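
The test features are simply the most recent `input_lags` observations of the training series. The toy check below (with hypothetical small sizes) rebuilds the final window row and confirms that stitching the tail of the inputs to the outputs gives the same values as slicing the end of the series directly.

```python
import numpy as np

series = np.arange(20.0)            # toy training series
input_lags, output_lags = 5, 3      # hypothetical small sizes, with output_lags < input_lags

# Final window row: the last forecast origin t is len(series) - output_lags
t = len(series) - output_lags
last_inputs = series[t - input_lags:t]       # columns t-5 .. t-1
last_outputs = series[t:t + output_lags]     # columns t+0 .. t+2

# Drop the oldest output_lags inputs and append the outputs, as in the cell above
stitched = np.concatenate([last_inputs[output_lags:], last_outputs])
direct = series[-input_lags:]                # simply the last input_lags observations

print(stitched, direct)
```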


In [None]:
# configuring the inputs for the model
# For Keras, the input has to be in the shape (samples, time steps, features)
# Here: 1 time step with n features, where n = X_train.shape[1] (the 60 lags)

X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])
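
The reshape above treats each sample as a single time step carrying all 60 lags as features. A common alternative, shown here only for comparison and not used in this notebook, is to present the lags as 60 time steps of one feature each. The array sizes below are toy values.

```python
import numpy as np

X = np.arange(12.0).reshape(3, 4)    # toy data: 3 samples, 4 lags each

# Layout used in this notebook: (samples, 1 time step, 4 features)
X_one_step = X.reshape(X.shape[0], 1, X.shape[1])

# Alternative layout: (samples, 4 time steps, 1 feature)
X_many_steps = X.reshape(X.shape[0], X.shape[1], 1)

print(X_one_step.shape)     # (3, 1, 4)
print(X_many_steps.shape)   # (3, 4, 1)
```

With the first layout the LSTM runs its recurrence only once per sample, and the matching `input_shape` is `(1, 60)`, as used in the model below.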

In [None]:
# creating a leaky_relu activation function

def my_leaky_relu(x):
    return tf.nn.leaky_relu(x)
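
For reference, a leaky ReLU passes positive inputs through unchanged and scales negative inputs by a small slope. The NumPy sketch below mirrors that definition; the slope of 0.2 is the default `alpha` of `tf.nn.leaky_relu`, and the function name `leaky_relu_np` is a hypothetical stand-in.

```python
import numpy as np

def leaky_relu_np(x, alpha=0.2):
    """Elementwise leaky ReLU: x for x > 0, alpha * x otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * x)

print(leaky_relu_np([-2.0, -0.5, 0.0, 1.5]))
```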

### Fitting the optimal model with all gridsearched parameters

In [None]:


# fitting the optimal LSTM model with the gridsearched hyperparameters
LSTM_model = tf.keras.Sequential()

# Add an LSTM layer with 1000 units (=dimensionality of the output space = number of neurons)
# return_sequences=True would return the hidden state output for each input time step.
# return_state=True would additionally return the hidden state and cell state for the last input time step.
# With return_sequences=False, the output of the LSTM is a 2D tensor of shape (batch_size, 1000)
LSTM_model.add(tf.keras.layers.LSTM(units=1000, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))


# output layer 
LSTM_model.add(tf.keras.layers.Dense(48))
# the compile() method configures the model for training
LSTM_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.01),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
LSTM_model.fit(X_train, Y_train, epochs=500, batch_size=400, verbose=1)
    


Epoch 1/500
...
Epoch 77/500
Epoch 78

<tensorflow.python.keras.callbacks.History at 0x7f4d72feceb8>
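
To get a feel for the size of the 48-step model, its trainable parameters can be counted by hand. The sketch assumes the standard LSTM parameterisation (four gates, each with an input kernel, a recurrent kernel, and a bias) and a Dense layer with bias; the helper names are hypothetical.

```python
import math

def lstm_params(units, n_features):
    # four gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * n_features + units * units + units)

def dense_params(n_inputs, n_outputs):
    # weight matrix plus one bias per output
    return n_inputs * n_outputs + n_outputs

n_lstm = lstm_params(units=1000, n_features=60)
n_dense = dense_params(n_inputs=1000, n_outputs=48)
steps_per_epoch = math.ceil(8269 / 400)   # training samples / batch_size

print(n_lstm, n_dense, n_lstm + n_dense, steps_per_epoch)
```

Calling `LSTM_model.summary()` in Keras should confirm these counts. With roughly 4.3 million weights and 8269 training windows, the model has far more parameters than samples, so overfitting is a possibility worth keeping in mind.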

In [None]:
# making predictions (LSTM)
y_train_pred = LSTM_model.predict(X_train)
y_test_pred = LSTM_model.predict(X_test)

In [None]:
# RMSE LSTM forecast
mse_LSTM = mean_squared_error(Y_test, y_test_pred)
sqrt(mse_LSTM)

5.577946344050255

In [None]:
# MAE LSTM
mae_LSTM = mean_absolute_error(Y_test, y_test_pred)
mae_LSTM

4.163619955380757
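
Both error measures are simple to compute by hand, which makes their difference easier to see. The sketch below uses made-up numbers (not the notebook's data) and plain NumPy: RMSE squares the errors before averaging, so large misses weigh more heavily than in MAE.

```python
import numpy as np

y_true = np.array([[10.0, 12.0, 9.0, 15.0]])   # made-up observations, shape (1, 4)
y_pred = np.array([[11.0, 10.0, 9.0, 19.0]])   # made-up forecasts

errors = y_true - y_pred                 # [-1, 2, 0, -4]
rmse = np.sqrt(np.mean(errors ** 2))     # root of the mean squared error
mae = np.mean(np.abs(errors))            # mean absolute error

print(rmse, mae)
```

With `Y_test` of shape (1, 48), scikit-learn averages the per-output errors over the 48 outputs, which gives the same number as the flat mean used here.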

## Preprocessing for 72 steps ahead

In [None]:
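## Trim the series: dropping the last 360 - 48 = 312 observations leaves 8448, i.e. 8376 for training plus 72 for testing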
emergency_ts_72 = emergency_ts.iloc[0:-360 + 24*2]
emergency_ts_72.shape

(8448,)

In [None]:
## Set parameters
input_lags = 60 # 2.5 times the seasonal period (24 hours)
output_lags = 72 # we predict 72 hours ahead
n_test = 72 # same as output_lags


In [None]:
## Split data in train and test set
train = emergency_ts_72[0:-n_test]
test = emergency_ts_72[-n_test:]
print(train.shape)
print(test.shape)

(8376,)
(72,)


In [None]:
## Create lagged values for both input and output window (72)
data = train.copy()
n_train = len(data)

##Create lagged values for input
df = pd.DataFrame()
for i in range(input_lags,0,-1):
    df['t-' + str(i)] = data.shift(i)

##Create lagged values for output
for j in range(0,output_lags,1):
    df['t+' + str(j)] = data.shift(-j)
    
df = df[input_lags:(n_train-output_lags+1)]

In [None]:
## splitting the training set into labels and features
X_train = df.iloc[:,:input_lags] # from the beginning to input_lags
Y_train = df.iloc[:,input_lags:] # from input_lags to the end

## Use the last window of the training set as the features for the test set. Since output_lags >= input_lags,
## the last input_lags values of the final Y_train row are the last 60 training values.
X_test = Y_train.iloc[len(Y_train) - 1,:][-input_lags:].values.reshape(1,input_lags)
Y_test = test[:output_lags].values.reshape(1,output_lags)

X_train = X_train.values # 60 steps back (60 lags)
Y_train = Y_train.values # 72 steps ahead

print("X_train: " + "type: " + str(type(X_train)) + "\tshape: " + str(X_train.shape))
print("Y_train: " + "type: " + str(type(Y_train)) + "\tshape: " + str(Y_train.shape))
print("X_test: " + "type: " + str(type(X_test)) + "\tshape: " + str(X_test.shape))
print("Y_test: " + "type: " + str(type(Y_test)) + "\tshape: " + str(Y_test.shape))

X_train: type: <class 'numpy.ndarray'>	shape: (8245, 60)
Y_train: type: <class 'numpy.ndarray'>	shape: (8245, 72)
X_test: type: <class 'numpy.ndarray'>	shape: (1, 60)
Y_test: type: <class 'numpy.ndarray'>	shape: (1, 72)


In [None]:
# configuring the inputs for the model
# For Keras, the input has to be in the shape (samples, time steps, features)
# Here: 1 time step with n features, where n = X_train.shape[1] (the 60 lags)

X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])

### Fitting the optimal model with all gridsearched parameters

In [None]:


# fitting the optimal LSTM model with the gridsearched hyperparameters
LSTM_model = tf.keras.Sequential()

# Add an LSTM layer with 1000 units (=dimensionality of the output space = number of neurons)
# return_sequences=True would return the hidden state output for each input time step.
# return_state=True would additionally return the hidden state and cell state for the last input time step.
# With return_sequences=False, the output of the LSTM is a 2D tensor of shape (batch_size, 1000)
LSTM_model.add(tf.keras.layers.LSTM(units=1000, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))


# output layer 
LSTM_model.add(tf.keras.layers.Dense(72))
# the compile() method configures the model for training
LSTM_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.01),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
LSTM_model.fit(X_train, Y_train, epochs=500, batch_size=400, verbose=1)
    


Epoch 1/500
...
Epoch 77/500
Epoch 78

<tensorflow.python.keras.callbacks.History at 0x7f4d6c461a90>

In [None]:
# making predictions (LSTM)
y_train_pred = LSTM_model.predict(X_train)
y_test_pred = LSTM_model.predict(X_test)

In [None]:
# RMSE LSTM forecast
mse_LSTM = mean_squared_error(Y_test, y_test_pred)
sqrt(mse_LSTM)

5.731430228717213

In [None]:
# MAE LSTM
mae_LSTM = mean_absolute_error(Y_test, y_test_pred)
mae_LSTM

4.556013981501262

## Preprocessing for 96 steps ahead

In [None]:
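## Trim the series: dropping the last 360 - 72 = 288 observations leaves 8472, i.e. 8376 for training plus 96 for testing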
emergency_ts_96 = emergency_ts.iloc[0:-360 + 24*3]
emergency_ts_96.shape

(8472,)

In [None]:
## Set parameters
input_lags = 60 # 2.5 times the seasonal period (24 hours)
output_lags = 96 # we predict 96 hours ahead
n_test = 96 # same as output_lags


In [None]:
## Split data in train and test set
train = emergency_ts_96[0:-n_test]
test = emergency_ts_96[-n_test:]
print(train.shape)
print(test.shape)

(8376,)
(96,)


In [None]:
## Create lagged values for both input and output window (96)
data = train.copy()
n_train = len(data)

##Create lagged values for input
df = pd.DataFrame()
for i in range(input_lags,0,-1):
    df['t-' + str(i)] = data.shift(i)

##Create lagged values for output
for j in range(0,output_lags,1):
    df['t+' + str(j)] = data.shift(-j)
    
df = df[input_lags:(n_train-output_lags+1)]

In [None]:
## splitting the training set into labels and features
X_train = df.iloc[:,:input_lags] # from the beginning to input_lags
Y_train = df.iloc[:,input_lags:] # from input_lags to the end

## Use the last window of the training set as the features for the test set. Since output_lags >= input_lags,
## the last input_lags values of the final Y_train row are the last 60 training values.
X_test = Y_train.iloc[len(Y_train) - 1,:][-input_lags:].values.reshape(1,input_lags)
Y_test = test[:output_lags].values.reshape(1,output_lags)

X_train = X_train.values # 60 steps back (60 lags)
Y_train = Y_train.values # 96 steps ahead

print("X_train: " + "type: " + str(type(X_train)) + "\tshape: " + str(X_train.shape))
print("Y_train: " + "type: " + str(type(Y_train)) + "\tshape: " + str(Y_train.shape))
print("X_test: " + "type: " + str(type(X_test)) + "\tshape: " + str(X_test.shape))
print("Y_test: " + "type: " + str(type(Y_test)) + "\tshape: " + str(Y_test.shape))

X_train: type: <class 'numpy.ndarray'>	shape: (8221, 60)
Y_train: type: <class 'numpy.ndarray'>	shape: (8221, 96)
X_test: type: <class 'numpy.ndarray'>	shape: (1, 60)
Y_test: type: <class 'numpy.ndarray'>	shape: (1, 96)


In [None]:
# configuring the inputs for the model
# For Keras, the input has to be in the shape (samples, time steps, features)
# Here: 1 time step with n features, where n = X_train.shape[1] (the 60 lags)

X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])

### Fitting the optimal model with all gridsearched parameters

In [None]:
# fitting the optimal LSTM model with the gridsearched hyperparameters
LSTM_model = tf.keras.Sequential()

# Add an LSTM layer with 1000 units (=dimensionality of the output space = number of neurons)
# return_sequences=True would return the hidden state output for each input time step.
# return_state=True would additionally return the hidden state and cell state for the last input time step.
# With return_sequences=False, the output of the LSTM is a 2D tensor of shape (batch_size, 1000)
LSTM_model.add(tf.keras.layers.LSTM(units=1000, return_sequences = False, input_shape = (1,60), activation=my_leaky_relu, dropout = 0.0))


# output layer 
LSTM_model.add(tf.keras.layers.Dense(96))
# the compile() method configures the model for training
LSTM_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.01),
              loss=tf.keras.losses.mean_squared_error,
              metrics=['mean_squared_error'])
LSTM_model.fit(X_train, Y_train, epochs=500, batch_size=400, verbose=1)
    


Epoch 1/500
...
Epoch 77/500
Epoch 78

<tensorflow.python.keras.callbacks.History at 0x7f4d6c08edd8>

In [None]:
# making predictions (LSTM)
y_train_pred = LSTM_model.predict(X_train)
y_test_pred = LSTM_model.predict(X_test)

In [None]:
# RMSE LSTM forecast
mse_LSTM = mean_squared_error(Y_test, y_test_pred)
sqrt(mse_LSTM)

5.652206697173143

In [None]:
# MAE LSTM
mae_LSTM = mean_absolute_error(Y_test, y_test_pred)
mae_LSTM

4.584723790486653
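
To compare the three horizons side by side, the reported test errors can be collected in one table. The numbers below are copied from the outputs above; only the layout is new.

```python
results = {
    48: {"rmse": 5.577946344050255, "mae": 4.163619955380757},
    72: {"rmse": 5.731430228717213, "mae": 4.556013981501262},
    96: {"rmse": 5.652206697173143, "mae": 4.584723790486653},
}

print(f"{'horizon':>7}  {'RMSE':>6}  {'MAE':>6}")
for horizon, scores in results.items():
    print(f"{horizon:>7}  {scores['rmse']:>6.3f}  {scores['mae']:>6.3f}")
```

MAE rises with the horizon, while RMSE is highest at 72 steps. Each figure comes from a single test window, so the differences should be read with caution.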