# <center> Recurrent Neural Networks - Long Short Term Memory</center>

# Recurrent Neural Networks (RNN)

As seen previously, neural networks offer an efficient solution to classification problems. But can they also be applied to prediction?

Google Translate, Apple's Siri, ... What do these applications have in common? They rely on a specific family of neural networks called Recurrent Neural Networks (RNNs). The core difference from what we have seen so far is the ability of RNNs to "remember" information from earlier steps of the network.

A graphical representation helps illustrate this idea:

<img src="RNN-rolled.PNG" alt="Drawing" style="width: 100px;"/>

This "rolled" representation of the RNN shows an input $x_t$ entering the network, which then outputs $h_t$. This is the classic scheme for neural networks. However, the network also contains a loop, which can be unrolled (see below) to expose the recurrence mechanism of an RNN:

<img src="RNN-unrolled.PNG" alt="Drawing" style="width: 600px;"/>

We see here that the input is in fact a sequence of inputs, one for each step from $1$ to $t$, and that each step passes information on to the next.
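
The unrolled loop can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a single hidden unit, a `tanh` activation (the default in Keras' `SimpleRNN`), and arbitrary weights `w_x`, `w_h` and `b` that are not taken from any trained model. The function name `rnn_forward` is hypothetical.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit RNN over a sequence and return every hidden state."""
    h = h0
    states = []
    for x_t in inputs:
        # The new state depends on the current input AND the previous state:
        # this is the loop shown in the rolled diagram.
        h = math.tanh(w_x * x_t + w_h * h + b)
        states.append(h)
    return states

# Illustrative values only (assumed, not learned from data).
states = rnn_forward([0.5, -0.1, 0.3], w_x=0.8, w_h=0.5, b=0.0)
print(states)
```

Setting `w_h` to 0 removes the dependence on the previous state, and the network falls back to treating each input independently.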

Now that the core idea of an RNN has been introduced, let's try to implement one using Keras:

<center>Let's predict an air quality value (i.e. a pollutant's concentration in the air)</center>

For this exercise, we will use data provided by the European Environment Agency (https://www.eea.europa.eu/data-and-maps/data/aqereporting-2/be), which records the concentration of SO2, a very common pollutant, in Belgium.

In [1]:
# Importing packages
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import warnings
warnings.simplefilter(action = 'ignore', category = FutureWarning)
from sklearn.preprocessing import MinMaxScaler

from keras.preprocessing.sequence import TimeseriesGenerator
from keras.models import Sequential
from keras.layers import Dense, LSTM, SimpleRNN
from keras.callbacks import ModelCheckpoint, EarlyStopping
from keras.optimizers import RMSprop  # optimizer used when compiling the model


In [None]:
train = pd.read_csv('../data/processed/train.csv', header=0, index_col=0).values.astype('float32')
valid = pd.read_csv('../data/processed/valid.csv', header=0, index_col=0).values.astype('float32')
test = pd.read_csv('../data/processed/test.csv', header=0, index_col=0).values.astype('float32')

def plot_loss(history, title):
    plt.figure(figsize=(10,6))
    plt.plot(history.history['loss'], label='Train')
    plt.plot(history.history['val_loss'], label='Validation')
    plt.title(title)
    plt.xlabel('Nb Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()
    
    # Report the best validation loss; argmin is zero-based, epochs are counted from 1
    val_loss = history.history['val_loss']
    min_idx = np.argmin(val_loss)
    min_val_loss = val_loss[min_idx]
    print('Minimum validation loss of {} reached at epoch {}'.format(min_val_loss, min_idx + 1))

In [None]:
# Number of past time steps used to predict the next value
n_lag = 14

train_data_gen = TimeseriesGenerator(train, train, length=n_lag, sampling_rate=1, stride=1, batch_size = 5)
# The validation generator must read the validation set, not the training set
valid_data_gen = TimeseriesGenerator(valid, valid, length=n_lag, sampling_rate=1, stride=1, batch_size = 1)
test_data_gen = TimeseriesGenerator(test, test, length=n_lag, sampling_rate=1, stride=1, batch_size = 1)
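
To make the windowing concrete, here is a minimal pure-Python sketch of what a generator configured with `sampling_rate=1` and `stride=1` produces: each sample is the `n_lag` previous values, and the target is the value that immediately follows. The helper `make_windows` and the toy series are hypothetical and only serve as an illustration. The real generator additionally groups samples into batches and works on arrays rather than flat lists.

```python
def make_windows(series, n_lag):
    """Pair each window of n_lag consecutive values with the value that follows it."""
    windows, targets = [], []
    for i in range(n_lag, len(series)):
        windows.append(series[i - n_lag:i])  # the n_lag values before position i
        targets.append(series[i])            # the value to predict
    return windows, targets

# Toy series (assumed values, not the SO2 data).
windows, targets = make_windows([10, 11, 12, 13, 14, 15], n_lag=3)
for w, t in zip(windows, targets):
    print(w, '->', t)
```

A series of length `N` therefore yields `N - n_lag` samples, which is why the first `n_lag` values never appear as targets.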

In [None]:
simple_rnn = Sequential()
simple_rnn.add(SimpleRNN(4, input_shape=(n_lag, 1)))
simple_rnn.add(Dense(1))
simple_rnn.compile(loss='mae', optimizer=RMSprop())

checkpointer = ModelCheckpoint(filepath='../model/simple_rnn_weights.hdf5'
                               , verbose=0
                               , save_best_only=True)
earlystopper = EarlyStopping(monitor='val_loss'
                             , patience=10
                             , verbose=0)
with open("../model/simple_rnn.json", "w") as m:
    m.write(simple_rnn.to_json())

simple_rnn_history = simple_rnn.fit_generator(train_data_gen
                                              , epochs=100
                                              , validation_data=valid_data_gen
                                              , verbose=0
                                              , callbacks=[checkpointer, earlystopper])
plot_loss(simple_rnn_history, 'SimpleRNN - Train & Validation Loss')

## Example: Air Quality Prediction in Belgium

As sustainable energy and ecological concerns have arisen in recent years, it has been 

## Limits of RNNs: The vanishing gradient problem

However, as we try to improve our model, we will encounter one of the main issues of 

# Long Short Term Memory

To cope with the shortcomings of RNNs, a new method was designed