# Recurrent Neural Networks

Recurrent neural networks (RNNs) are a type of neural network that works especially well on sequential data, where the value at time $t$ depends on earlier values, i.e. values at times $t^- < t$. This includes text data, where a word in a sentence depends on the words that come before it, and time series data, such as stock market data, where the share price of a particular stock depends on its price on previous days.

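In a CNN, each layer only sends information forward to the next layer. In an RNN, certain layers also loop back to themselves, so what the layer produced at one time step is fed back in as an extra input at the next time step.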
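
To make the idea of a layer "looping back to itself" concrete, here is a minimal sketch of a single recurrent unit with a scalar hidden state. The function names `rnn_step` and `run_rnn` and the weight values are made up for illustration; a real RNN layer uses weight matrices that are learned during training.

```python
import math

def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One step of a toy recurrent unit: h_t = tanh(w_x*x_t + w_h*h_prev + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(sequence, h0=0.0):
    """Feed a sequence through the unit, carrying the hidden state forward."""
    h = h0
    states = []
    for x_t in sequence:
        # the loop: the previous hidden state is an input to the current step
        h = rnn_step(x_t, h)
        states.append(h)
    return states

# Only the first input is non-zero, but later states are still non-zero
# because the hidden state carries information forward in time.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The second and third states are non-zero even though their inputs are zero, which is the "memory" that a purely feedforward layer does not have.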

In this notebook we'll predict the number of aeroplane passengers. To see how RNNs can be used on text data, you could also try the following tutorial https://towardsdatascience.com/recurrent-neural-networks-by-example-in-python-ffd204f99470.

In [None]:
import numpy
import matplotlib.pyplot as plt
from pandas import read_csv
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

## Load the data

First you need to download the file `airline-passengers.csv` and upload it to Google Colab. On the far left of the screen there are three symbols; the bottom one is Files. Click it and upload the data so that you can access it from within Google Colab.

In [None]:
# fix random seed for reproducibility
numpy.random.seed(7)
# load the dataset
dataframe = read_csv('airline-passengers.csv', usecols=[1], engine='python')
dataframe.head()

`MinMaxScaler` is a **class** that allows us to scale the data between a certain range.

In [None]:
scaler = MinMaxScaler(feature_range=(0, 1))

This means that `scaler` is now an instance of the class `MinMaxScaler`, and we can call any method of `MinMaxScaler` by using `scaler.method_name`. Let's use this to scale the data to the range [0, 1].

In [None]:
dataset = scaler.fit_transform(dataframe)
dataset
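
As a rough sketch of what the scaling above computes, the standard min-max formula maps each value $x$ to $(x - x_{min}) / (x_{max} - x_{min})$ and then stretches the result to the requested range. The helper `min_max_scale` and the passenger values below are made up for illustration; this is not scikit-learn's implementation.

```python
def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Scale a list of numbers linearly so that min -> lo and max -> hi."""
    lo, hi = feature_range
    v_min, v_max = min(values), max(values)
    return [lo + (v - v_min) * (hi - lo) / (v_max - v_min) for v in values]

passengers = [100.0, 150.0, 200.0, 300.0]  # made-up values
scaled = min_max_scale(passengers)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

The smallest value becomes 0, the largest becomes 1, and everything else keeps its relative position in between.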

Now we will split the data into 67% training and 33% testing.

In [None]:
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]

We want to convert the data so that there are two columns: one holding the value at time step $t$ and one holding the value at time step $t+1$. We will create a function to do this for us.

You can think of this in the same way as the supervision labels we have used in the past: the first column is the input and the second column is what we are trying to predict.

In [None]:
# convert an array of values into input/target pairs
def create_dataset(dataset, time_diff=1):
    dataX, dataY = [], []
    # note: the extra -1 means the final available pair is skipped;
    # the plotting offsets later in the notebook rely on this
    for i in range(len(dataset)-time_diff-1):
        # input window: the time_diff values starting at index i (a single value when time_diff=1)
        a = dataset[i:(i+time_diff), 0]
        dataX.append(a)
        dataY.append(dataset[i + time_diff, 0])
        
    return numpy.array(dataX), numpy.array(dataY)
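
To see what this windowing produces, here is a small pure-Python copy of the same loop applied to a made-up series. The name `make_pairs` and the numbers are only for illustration; it uses plain lists rather than a numpy array.

```python
def make_pairs(series, time_diff=1):
    """Pure-Python copy of the windowing loop in create_dataset."""
    inputs, targets = [], []
    for i in range(len(series) - time_diff - 1):
        inputs.append(series[i:i + time_diff])  # window of time_diff values
        targets.append(series[i + time_diff])   # the value right after the window
    return inputs, targets

series = [10, 20, 30, 40, 50]  # made-up values

X1, y1 = make_pairs(series, time_diff=1)
print(X1, y1)  # [[10], [20], [30]] [20, 30, 40]

X2, y2 = make_pairs(series, time_diff=2)
print(X2, y2)  # [[10, 20], [20, 30]] [30, 40]
```

Each input window is paired with the value that immediately follows it. Because of the `-1` in the range, the last possible pair (for example `[40] -> 50` when `time_diff=1`) is not produced.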

Now call the function we have just created.

In [None]:
time_diff = 1
trainX, trainY = create_dataset(train, time_diff)
testX, testY = create_dataset(test, time_diff)

In [None]:
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
[trainX[:5], trainY[:5]]
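
The reshape above is needed because a Keras LSTM layer expects three-dimensional input of the form (samples, time steps, features). The sketch below shows the same reshape on a tiny made-up array; here each sample is treated as one time step containing `time_diff` features.

```python
import numpy

# Made-up input windows: 3 samples, each a window of 1 value (time_diff = 1).
X = numpy.array([[0.1], [0.2], [0.3]])
print(X.shape)  # (3, 1): (samples, features)

# Insert a "time steps" axis of length 1 in the middle.
X3d = numpy.reshape(X, (X.shape[0], 1, X.shape[1]))
print(X3d.shape)  # (3, 1, 1): (samples, time steps, features)
```

The values are unchanged; only the shape of the array is different.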

## Baseline Implementation
We will use a sequential model similar to the ones we explored in Practical 5. The difference is that we will add an LSTM layer. This makes the network a *Recurrent Neural Network*.

### LSTM
To make the network an RNN, we will use an LSTM layer. LSTM stands for long short-term memory, and it allows the network to remember information from a long time ago if it believes that information is useful. I don't want to go into too many details as the structure is quite complicated. Instead, I want to focus on how we add it to a neural network.

The below model is similar to ones you have seen in the past.

In [None]:
model = Sequential()
model.add(LSTM(10, input_shape=(1, time_diff)))
model.add(Dense(6, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam') #Note that the loss is now mean squared error
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
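
To get a feel for the size of this model, the number of trainable weights can be worked out by hand. This is a sketch that assumes the standard Keras LSTM and Dense layers with bias terms; the helper names `lstm_params` and `dense_params` are made up, and calling `model.summary()` is the way to check the real figures.

```python
def lstm_params(units, input_dim):
    """An LSTM has 4 internal blocks, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """A dense layer has one weight per input per unit, plus one bias per unit."""
    return units * input_dim + units

time_diff = 1  # number of features fed to the LSTM, as in the model above
counts = [
    lstm_params(10, time_diff),  # LSTM(10)
    dense_params(6, 10),         # Dense(6) on the 10 LSTM outputs
    dense_params(1, 6),          # Dense(1) on the 6 hidden outputs
]
total = sum(counts)
print(counts, total)  # [480, 66, 7] 553
```

Most of the weights sit in the LSTM layer, even though it only has 10 units.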

In [None]:
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

### Unscaling the data
During the pre-processing stage, we scaled the data down to be between 0 and 1. To view the results in the original units, we need to scale the predictions back up. We can call the method `inverse_transform` of `scaler` to do this.

In [None]:
trainPredict = scaler.inverse_transform(trainPredict)
# inverse_transform expects a 2D array, so wrap the 1D targets in a list (shape becomes (1, n))
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
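
As a sketch of what undoing the scaling involves, the inverse of the min-max formula multiplies by the original spread and adds back the original minimum. The helper `inverse_min_max` and the minimum and maximum used below are made up for illustration; `scaler` stores the real values when it is fitted.

```python
def inverse_min_max(scaled, v_min, v_max, feature_range=(0.0, 1.0)):
    """Map values from feature_range back to the original [v_min, v_max] range."""
    lo, hi = feature_range
    return [v_min + (s - lo) * (v_max - v_min) / (hi - lo) for s in scaled]

# Made-up example: the original data ranged from 100 to 300.
restored = inverse_min_max([0.0, 0.25, 1.0], v_min=100.0, v_max=300.0)
print(restored)  # [100.0, 150.0, 300.0]
```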

In [None]:
# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
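
The root mean squared error (RMSE) used above is the square root of the average squared difference between the actual and predicted values. Because we unscaled the data first, the score is in the same units as the passenger counts. Here is the calculation by hand on made-up numbers; the helper `rmse` is only for illustration.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length lists."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

actual = [100.0, 200.0, 300.0]     # made-up true values
predicted = [110.0, 190.0, 320.0]  # made-up predictions

# errors are 10, 10 and 20, so the mean squared error is (100 + 100 + 400) / 3 = 200
score = rmse(actual, predicted)
print('RMSE: %.2f' % score)  # RMSE: 14.14
```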

### Plotting
The remaining cells set up our predictions so that we can plot them onto the original graph.

In [None]:
# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[time_diff:len(trainPredict)+time_diff, :] = trainPredict

In [None]:
# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(time_diff*2)+1:len(dataset)-1, :] = testPredict
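
The slice indices used for shifting can look arbitrary, so here is a sketch of the arithmetic behind them. It assumes a made-up dataset of 20 rows and reuses the sample count produced by `create_dataset` (`len(data) - time_diff - 1`); the helper `plot_slots` is hypothetical and only reproduces the index calculations.

```python
def plot_slots(n_rows, train_fraction=0.67, time_diff=1):
    """Return the (start, end) slices where train and test predictions are placed."""
    train_size = int(n_rows * train_fraction)
    test_size = n_rows - train_size
    # create_dataset yields len(data) - time_diff - 1 samples
    n_train_pred = train_size - time_diff - 1
    n_test_pred = test_size - time_diff - 1
    # the first prediction is for row time_diff, the row after the first input window
    train_slot = (time_diff, n_train_pred + time_diff)
    # same expression as in the plotting cell, with len(trainPredict) = n_train_pred
    test_slot = (n_train_pred + time_diff * 2 + 1, n_rows - 1)
    return train_size, n_train_pred, n_test_pred, train_slot, test_slot

train_size, n_train_pred, n_test_pred, train_slot, test_slot = plot_slots(20)
print(train_size, n_train_pred, n_test_pred)  # 13 11 5
print(train_slot, test_slot)                  # (1, 12) (14, 19)
```

Each slot is exactly as long as the number of predictions that goes into it, and the test slot starts at `train_size + time_diff`, which is the first row that a test prediction refers to.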

In [None]:
# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()

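With matplotlib's default colours, the blue line is the actual data, the orange line is our training predictions and the green line is our test predictions. We see that the green line is very close to the actual data.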

### Exercise
From next week, the practical sessions will be purely for project work. Google Colab has a large number of tutorials on how to use Colab for different applications at https://research.google.com/seedbank/. Some of these may be related to the projects you will undertake.

Choose one that you feel is similar to your project and work through it. These are generally slightly more advanced than the material we have been through. You can ask me questions and I will try to help out.