# Preparations

Execute the following code blocks to configure the session and import relevant modules.

In [None]:
%config InlineBackend.figure_format = 'retina'
%load_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
import os
import sys
import math
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN, LSTM, GRU
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

There is also a utility module [rnnutils.py](https://raw.githubusercontent.com/NBISweden/workshop-neural-nets-and-deep-learning/master/session_recurrentNeuralNetworks/lecture_RNN/rnnutils.py) that you can use if you want to save time coding. Wherever its functions are used in the cells below, the calls are commented out, leaving you the choice of using them or writing your own solution. In either case, make sure the file is located in the current directory so that it can be imported.

In [None]:
import rnnutils

# Lab session: predicting airline passengers

## Aims

In this lab you will try out different RNN models on the Box & Jenkins monthly airline passengers dataset, a monthly time series of airline passenger counts recorded from 1949 to 1960. Your task is to build a model that predicts the future number of passengers given a number of past observations.

You will download the data and prepare it for later analyses. More specifically, you will partition the data into a training set and a test set. To create input / label pairs (X/Y), the data is split into time slices, where each slice is the input (X) and the time point immediately following it is the (known) output (Y).


To help you along the way, some of the steps have been prepared in advance, but in most cases, your task is to complete missing code. Don't hesitate to change parameter settings and experiment with the model architectures. Also, make sure to examine the contents of variables by printing them. Things to try:

- change the number of time steps
- change the number of epochs
- experiment with the network topology (e.g. number of units in the hidden layer)

See if you can improve on the model presented in the lecture.

# Session 1: Vanilla RNN

## Download data

Start by downloading the data and loading it into a pandas dataframe:

In [None]:
!wget https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv --no-check-certificate

We modify the data somewhat for easier processing. `df.head()` simply shows you the first entries of the data frame.

In [None]:
df = pd.read_csv('airline-passengers.csv')
df = df.rename(columns={'Month': 'time','Passengers': 'passengers'})
df['time'] = pd.to_datetime(df['time'], format='%Y-%m')
df['year'] = pd.DatetimeIndex(df['time']).year
df['month'] = pd.DatetimeIndex(df['time']).month
df.head()

Plot the data for overview:

In [None]:
plt.plot(df.time, df.passengers)

## Create training and test data


Next, we partition the data into training and test data sets. The [MinMaxScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html) class rescales the data linearly to the range [0, 1]. Note that this fixes the minimum and maximum of the data, not its variance.

In [None]:
train_fraction = 2/3
# Reshape data for MinMaxScaler
data = np.array(df['passengers'].values.astype('float32')).reshape(-1, 1)
split = int(len(data) * train_fraction)
# Rescale the data
scaler = MinMaxScaler(feature_range=(0, 1))
data = scaler.fit_transform(data).flatten()
train = data[:split]
test = data[split:]
# The above code is available in rnnutils.make_train_test and called as follows:
# train, test, scaler = rnnutils.make_train_test(data)
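
To make the rescaling concrete, here is a minimal NumPy sketch of the linear mapping that `MinMaxScaler` applies with the default `feature_range=(0, 1)`. The helper names `minmax_scale` and `minmax_invert` and the toy values are illustrative assumptions; they are not part of scikit-learn or `rnnutils`.

```python
import numpy as np

def minmax_scale(values):
    """Rescale a 1D array linearly so its minimum maps to 0 and its maximum to 1."""
    values = np.asarray(values, dtype="float64")
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo), lo, hi

def minmax_invert(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return np.asarray(scaled, dtype="float64") * (hi - lo) + lo

# Toy values for illustration only (not the real passenger counts)
toy = np.array([100.0, 150.0, 200.0, 300.0])
scaled, lo, hi = minmax_scale(toy)
restored = minmax_invert(scaled, lo, hi)

print(scaled)        # values between 0 and 1
print(scaled.var())  # the variance is whatever the data gives, not 1
```

In the lab itself, `scaler.inverse_transform` plays the role of `minmax_invert` when you want to convert predictions back to passenger counts.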

## Transform data to input - output pairs


Now that we have train and test data sets we need to convert the data to input - output (X/Y) pairs. The general idea is to take time slices (e.g. 12 data points) as input vectors and use the subsequent value as the known output. Since the time unit is months and there likely is a recurrent yearly seasonality in the data it makes sense to use 12 time steps, but this is a parameter you could modify to see what effect it has on the end results.

In [None]:
time_steps = 12
# trainX, trainY, trainX_indices, trainY_indices = rnnutils.make_xy(train, time_steps)
# testX, testY, testX_indices, testY_indices = rnnutils.make_xy(test, time_steps) 
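
If you prefer to write the slicing yourself, the idea described above can be sketched as follows. This is one possible approach, not the `rnnutils.make_xy` implementation: the function name `make_windows`, its return values, and the choice of overlapping windows that advance one step at a time are all assumptions, and `rnnutils.make_xy` may slice the series differently.

```python
import numpy as np

def make_windows(series, time_steps):
    """Slice a 1D series into overlapping windows (X) and next-step targets (Y)."""
    series = np.asarray(series, dtype="float32")
    n_samples = len(series) - time_steps
    if n_samples <= 0:
        raise ValueError("series must be longer than time_steps")
    # Row i holds series[i : i + time_steps]
    X = np.stack([series[i:i + time_steps] for i in range(n_samples)])
    # The target for row i is the value immediately after its window
    y_indices = np.arange(time_steps, len(series))
    Y = series[y_indices]
    # Keras recurrent layers expect input shaped (batch, timesteps, features)
    X = X.reshape(n_samples, time_steps, 1)
    return X, Y, y_indices

# Toy series 0..9 so the windows are easy to read
toy = np.arange(10, dtype="float32")
X, Y, y_idx = make_windows(toy, time_steps=3)
print(X.shape, Y.shape)    # (7, 3, 1) (7,)
print(X[0].ravel(), Y[0])  # first window [0. 1. 2.] and its target 3.0
```

Printing a few windows next to their targets, as in the last line, is a quick way to confirm that X and Y are aligned before training.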

## Define the model


Complete the model below to include a [SimpleRNN](https://keras.io/api/layers/recurrent_layers/simple_rnn/) layer and a [Dense](https://keras.io/api/layers/core_layers/dense/) output layer. If you look at the SimpleRNN documentation, you will find that the input is a 3D tensor with shape `[batch, timesteps, feature]`. Since we are using univariate data (i.e. one feature per time step), `feature=1`. Recall also that you don't specify `batch` in the `input_shape` parameter. You can go back to the lecture notes for a complete example, but first try to set up the model without peeking!

In [None]:
# model = Sequential()
# Add layers here
# model.add()
#
# model.compile(loss='mean_squared_error', optimizer='adam')
# model.summary()
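
When you read the output of `model.summary()`, it helps to know where the parameter counts come from. The sketch below computes them by hand, assuming default layer settings with bias terms enabled. The helper names and the example of 3 hidden units are hypothetical choices for illustration.

```python
def simple_rnn_params(units, features):
    """Weights of a SimpleRNN layer: input kernel + recurrent kernel + bias."""
    return features * units + units * units + units

def dense_params(inputs, outputs):
    """Weights of a Dense layer: kernel + bias."""
    return inputs * outputs + outputs

# Hypothetical configuration: 3 hidden units, 1 feature per time step, 1 output
units, features = 3, 1
rnn_count = simple_rnn_params(units, features)
dense_count = dense_params(units, 1)
print(rnn_count, dense_count, rnn_count + dense_count)  # 15 4 19
```

Note that the number of time steps does not appear in the formula: the same weights are reused at every step of the sequence.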

Once you are happy with the configuration, fit the model and evaluate. 

In [None]:
# history = model.fit(trainX, trainY, ...)
# Ytrainpred = model.predict(trainX)
# Ytestpred = model.predict(testX)
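
One way to evaluate the predictions is the root mean squared error (RMSE). The following is a minimal NumPy sketch with toy numbers; the function name `rmse` is an assumption, and you could equally use `mean_squared_error` from scikit-learn together with `math.sqrt`.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two arrays holding the same number of values."""
    # Flatten both arrays: predictions typically have shape (n, 1) while
    # targets have shape (n,), and subtracting them directly would
    # broadcast to an (n, n) matrix.
    y_true = np.asarray(y_true, dtype="float64").ravel()
    y_pred = np.asarray(y_pred, dtype="float64").ravel()
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy values for illustration only
y_true = np.array([0.2, 0.4, 0.6])
y_pred = np.array([[0.1], [0.4], [0.8]])  # column vector, like typical model output
err = rmse(y_true, y_pred)
print(err)
```

With the lab variables you would call `rmse(trainY, Ytrainpred)` and `rmse(testY, Ytestpred)`. These errors are on the rescaled data; to report them in passenger counts, apply `scaler.inverse_transform` to both arrays (reshaped to a column with `.reshape(-1, 1)`) before computing the error.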

You can use the utility plotting functions in `rnnutils` to plot the training history and predictions:

In [None]:
# data = {'train': (Ytrainpred, train, trainY_indices),
#       'test': (Ytestpred, test, testY_indices)}
# rnnutils.plot_pred(...)
# rnnutils.plot_history(...)







# Session 2: LSTM (and optionally GRU) 

Building on session 1, analyse the data set using LSTM layers. Below is a tentative model setup to get you started. You could try using multiple layers, in which case every layer except the last must return its full output sequence (`return_sequences=True`) so that the next recurrent layer receives one input per time step (cf. [Stacked Long Short-Term Memory Networks](https://machinelearningmastery.com/stacked-long-short-term-memory-networks/)). If you have time, you can also try out GRU layers for comparison. Do you notice any difference?

In [None]:
# model = Sequential()
# model.add(LSTM(..., return_sequences=True, input_shape=(..., ...)))
# model.add(LSTM(..., return_sequences=False))
# model.compile(loss='mean_squared_error', optimizer='adam')
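
To see why stacked recurrent layers need the full sequence from the layer below, the following NumPy sketch runs two recurrent layers back to back and prints the output shapes. It uses a plain tanh recurrence rather than an LSTM cell to keep it short, and it is an illustration of the shapes only, not how Keras implements its layers. The function name `rnn_forward`, the random weights and the layer sizes are all assumptions.

```python
import numpy as np

def rnn_forward(x, W, U, b, return_sequences):
    """Run a tanh recurrence over x with shape (batch, timesteps, features)."""
    batch, timesteps, _ = x.shape
    units = U.shape[0]
    h = np.zeros((batch, units))  # initial hidden state
    outputs = []
    for t in range(timesteps):
        # New state depends on the current input and the previous state
        h = np.tanh(x[:, t, :] @ W + h @ U + b)
        outputs.append(h)
    if return_sequences:
        return np.stack(outputs, axis=1)  # (batch, timesteps, units)
    return h  # (batch, units): last state only

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 12, 1))  # 5 samples, 12 time steps, 1 feature

# Layer 1: 1 feature -> 4 units; layer 2: 4 features -> 3 units
W1, U1, b1 = rng.normal(size=(1, 4)), rng.normal(size=(4, 4)), np.zeros(4)
W2, U2, b2 = rng.normal(size=(4, 3)), rng.normal(size=(3, 3)), np.zeros(3)

seq = rnn_forward(x, W1, U1, b1, return_sequences=True)
last = rnn_forward(seq, W2, U2, b2, return_sequences=False)
print(seq.shape, last.shape)  # (5, 12, 4) (5, 3)
```

If the first layer returned only its last state, its output would have shape `(5, 4)` and the second layer would have no time axis to iterate over.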