# Time series prediction with RNNs

In this notebook we will implement a simple RNN to make predictions of the future values of a univariate time series.

In [3]:
import numpy as np
import pandas as pd
from tensorflow.keras.layers import Input, Dense, SimpleRNN
from tensorflow.keras.models import Sequential, Model

## Download and display data

Let's deliberately choose a difficult time series with seasonality. By default, RNNs are not equipped to handle seasonality.

In [None]:
import datetime as dt
import pandas_datareader as pdr
import matplotlib.pyplot as plt  ## pyplot provides plt.show() and plt.legend()

start = dt.datetime(1950,1,1)
end = dt.datetime(2019,1,1)
data = pdr.get_data_fred('UNRATENSA', start=start, end=end)
data.plot(figsize=(16, 9))
plt.show()

We test the hypothesis that the time series is not stationary (i.e. that it has a unit root) with the Dickey-Fuller GLS test and the Phillips-Perron test.

In [None]:
from arch.unitroot import DFGLS, PhillipsPerron

dfgls = DFGLS(data['UNRATENSA'])
print(dfgls.summary().as_text())

pp = PhillipsPerron(data['UNRATENSA'])
print(pp.summary().as_text())
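To make the null hypothesis of these tests more concrete, here is a minimal sketch of the plain Dickey-Fuller regression. It is not the GLS-detrended or Phillips-Perron variant computed by `arch`, and the helper name `df_tstat` and the simulated series are illustrative assumptions. The statistic is the t-ratio of `rho` in `diff(y)_t = c + rho * y_{t-1} + e_t`. It has to be compared against Dickey-Fuller critical values (roughly -2.86 at the 5% level when a constant is included), not against Student-t quantiles.

```python
import numpy as np

def df_tstat(y):
    """t-statistic of rho in the regression diff(y)_t = c + rho * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])  # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])  # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)            # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
eps = rng.normal(size=500)

random_walk = np.cumsum(eps)        # unit root: shocks never die out
ar1 = np.zeros(500)                 # stationary AR(1) with coefficient 0.5
for t in range(1, 500):
    ar1[t] = 0.5 * ar1[t - 1] + eps[t]

t_rw = df_tstat(random_walk)
t_ar = df_tstat(ar1)
print("random walk t-stat:", round(t_rw, 2))
print("AR(1) t-stat:      ", round(t_ar, 2))
```

A strongly negative statistic, as for the AR(1) series, is evidence against a unit root. A statistic above the critical value means the unit-root hypothesis cannot be rejected.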

## SARIMA

Let's first see how a standard model would perform.
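As a reminder of what the `order=(p, d, q)` and `seasonal_order=(P, D, Q, s)` arguments of `SARIMAX` stand for, a SARIMA model can be written with the backshift operator $B$ (where $B y_t = y_{t-1}$) as follows. Sign conventions for the moving-average polynomials vary between references, so treat this as a sketch rather than the exact parameterization used by `statsmodels`.

$$
\Phi_P(B^s)\,\phi_p(B)\,(1-B)^d\,(1-B^s)^D\, y_t \;=\; \Theta_Q(B^s)\,\theta_q(B)\,\varepsilon_t
$$

Here $\phi_p$ and $\theta_q$ are the non-seasonal autoregressive and moving-average polynomials of degrees $p$ and $q$, $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts, $d$ and $D$ are the orders of regular and seasonal differencing, and $s$ is the seasonal period. For monthly data such as this series, $s = 12$ is the natural choice.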

In [None]:
import statsmodels.api as sm

## ad-hoc split of the data
y_all = data['UNRATENSA']
y_train0 = y_all[:'2010-01-01']
y_test0 = y_all['2010-01-01':]

## compute test size for later
test_size = 1 - len(y_train0)/len(y_all)
print("test size is {}%".format(int(test_size*100)))

## make and train the model
mod = sm.tsa.statespace.SARIMAX(y_train0,
                                order=None, ## TODO
                                seasonal_order=None, ## TODO
                                enforce_stationarity=False,
                                enforce_invertibility=False)
results = mod.fit()

## display results
print(results.summary().tables[1])
results.plot_diagnostics(figsize=(15, 12))
plt.show()

In [None]:
## in-sample one-step predictions
pred = results.get_prediction(start=pd.to_datetime('2000-01-01'), dynamic=False)
pred_ci = pred.conf_int()

ax = y_all['2000-01-01':].plot(figsize=(16, 9))
pred.predicted_mean.plot(ax=ax, label='one-step ahead Forecast', alpha=.7)
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.2)
plt.legend()
plt.show()

In [None]:
## out-of-sample multiple-step forecast
pred = results.get_forecast(steps=12*9)
pred_ci = pred.conf_int()

ax = y_all['2000-01-01':].plot(figsize=(16, 9))
pred.predicted_mean.plot(ax=ax, label='multiple-step ahead Forecast', alpha=.7)
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.2)
plt.legend()
plt.show()

## RNN

We now build an RNN, as in the warmup notebook. Let's first prepare the data.

In [None]:
from sklearn.model_selection import train_test_split

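## it is good practice to center and normalize your data with ML models
## if not, it may take a long time before the parameters converge to the right level and scale
## here we simply express the series as a relative deviation from its first value

def prepare_univariate_ts(ts):
    ts = np.asarray(ts, dtype=float)  ## positional indexing, independent of the date index
    n = len(ts)
    ts = ts / ts[0] - 1               ## first value maps to 0
    data = ts[0:(n-1)]                ## inputs x_t
    labels = ts[1:n]                  ## targets x_{t+1}
    return data, labels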

X, y = prepare_univariate_ts(data['UNRATENSA'])
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    test_size=test_size, ## get the same split
                                                    random_state=42,
                                                    shuffle=False ## THIS IS IMPORTANT FOR TIME SERIES!!!
                                                    )
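The preparation above can be checked on a toy series. The sketch below uses made-up numbers, not the FRED data, and the helper names `to_relative` and `from_relative` are illustrative assumptions. It shows the rescaling, the one-step shift between inputs and labels, a chronological split, and the inverse transform that is needed later to plot predictions in the original units.

```python
import numpy as np

def to_relative(values):
    """Express a series as relative deviation from its first value."""
    base = values[0]
    return values / base - 1.0, base

def from_relative(scaled, base):
    """Inverse of to_relative: map scaled values back to original units."""
    return (1.0 + scaled) * base

levels = np.array([4.0, 5.0, 6.0, 3.0, 8.0])   # toy values
scaled, base = to_relative(levels)

# inputs are x_t, labels are x_{t+1}
X, y = scaled[:-1], scaled[1:]

# chronological split: the last n_test pairs are held out, no shuffling
n_test = 2
X_train, X_test = X[:-n_test], X[-n_test:]
y_train, y_test = y[:-n_test], y[-n_test:]

recovered = from_relative(y, base)
print(scaled)
print(recovered)
```

Without shuffling, every test pair lies strictly after every training pair in time, which is what makes the later out-of-sample forecast meaningful.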

Let's now build the model and train an RNN. Keep in mind that we want to make out-of-sample predictions later, so we will manually set the initial state.
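To see why exposing the initial state matters, here is a minimal NumPy sketch of the recurrence that a `SimpleRNN` layer computes with its default `tanh` activation: `h_t = tanh(x_t W_x + h_{t-1} W_h + b)`. The function name `simple_rnn` and the random weights are illustrative assumptions, not the notebook's model. It shows that processing a sequence in one pass gives the same states as processing a first chunk, keeping the last state, and resuming from it.

```python
import numpy as np

def simple_rnn(x, h0, W_x, W_h, b):
    """Apply h_t = tanh(x_t W_x + h_{t-1} W_h + b); return all states and the last one."""
    h = h0
    states = []
    for x_t in x:                       # x has shape (timesteps, features)
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    return np.stack(states), h

rng = np.random.default_rng(0)
hdim = 3
W_x = rng.normal(size=(1, hdim))        # input-to-hidden weights
W_h = rng.normal(size=(hdim, hdim))     # hidden-to-hidden weights
b = np.zeros(hdim)

x = rng.normal(size=(6, 1))             # 6 time steps, 1 feature
h0 = np.zeros(hdim)                     # initial hidden state

# one pass over the whole sequence
states_full, h_full = simple_rnn(x, h0, W_x, W_h, b)

# same sequence in two chunks, carrying the state across the cut
_, h_mid = simple_rnn(x[:4], h0, W_x, W_h, b)
states_tail, h_split = simple_rnn(x[4:], h_mid, W_x, W_h, b)

print(np.allclose(h_full, h_split))
```

This is the property used for forecasting: the state returned at the end of the training sequence can be passed as the initial state of a one-step model.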

In [None]:
from tensorflow.keras.layers import Input, Dense, SimpleRNN
from tensorflow.keras.models import Sequential, Model

def build_model(ydim, hdim=1):
    ## hdim is the dimension of the RNN hidden layer
    
    ## 2 inputs:
    input1 = Input(shape=(ydim, 1)) ## x_t time series
    input2 = Input(shape=(hdim,))   ## h_0 initial state (shape must be a tuple)
    
    ## TODO
    ## 2 outputs: the sequence of predictions and the latent state
    
    return model

hdim = 20
model = build_model(ydim=len(y_train), hdim=hdim)

## display model summary
model.summary()

Our model has two outputs, so special care is needed when compiling it: only the prediction sequence should drive the training loss.
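With several outputs, the training objective is a weighted sum of one loss per output. The sketch below illustrates the idea in plain NumPy, with hypothetical helpers `mse` and `total_loss`. If the weight on the latent-state output is zero, the dummy target supplied for that output has no effect on the objective. In Keras, one way to express this (an assumption about how you may choose to fill in the TODO, not the only option) is to pass a list of losses together with `loss_weights` to `compile`.

```python
import numpy as np

def mse(target, pred):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((target - pred) ** 2))

def total_loss(targets, preds, weights):
    """Weighted sum of per-output losses, one term per model output."""
    return sum(w * mse(t, p) for t, p, w in zip(targets, preds, weights))

# output 1: predicted sequence, output 2: last hidden state
y_true = np.array([0.1, 0.2, 0.3])
y_hat = np.array([0.0, 0.2, 0.5])
h_last = np.array([0.7, -0.4])

# two different dummy targets for the hidden state
dummy_a = np.zeros(2)
dummy_b = np.ones(2)

weights = [1.0, 0.0]                    # ignore the hidden-state output
loss_a = total_loss([y_true, dummy_a], [y_hat, h_last], weights)
loss_b = total_loss([y_true, dummy_b], [y_hat, h_last], weights)

print(loss_a, loss_b)                   # identical: the dummy target is irrelevant
```

This is why the training targets can include an arbitrary array of zeros for the state output.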

In [None]:
## specify the optimizer and loss function
model.compile(optimizer='adam', 
              loss=None, ## TODO
             )

## prepare the data for training
x_input = [X_train.reshape(1,-1,1), ## (1 TS, many months, 1 feature)
           np.array([0.0]*hdim).reshape(1,hdim) ## initial hidden states
           ]
y_input = [y_train.reshape(1,-1,1),
           np.zeros((1,hdim)) ## not used
          ]

## train
history = model.fit(x=x_input, 
                    y=y_input, 
                    epochs=200
                    )

In [None]:
## in-sample predictions
y_pred, h_last = model.predict(x_input)

## in-sample one-step predictions
pred = pd.Series((1 + y_pred.reshape(-1))*y_all.iloc[0],  ## need to shift and scale
                 y_all[:'2010-01-01'].index[1:]
                )

ax = y_all['2000-01-01':].plot(figsize=(16, 9))
pred['2000-01-01':].plot(ax=ax, label='one-step ahead Forecast', alpha=.7)
plt.legend()
plt.show()

In [None]:
## out-of-sample multiple-step forecast

## make a model to perform one-step predictions
model_onestep = build_model(ydim=1, hdim=hdim)

## transfer weights from trained model
model_onestep.set_weights(model.get_weights())

## loop to make multiple steps predictions
pred = []
ht = h_last
yt = X_test[0]
for t in range(12*9):
    data = [yt.reshape(1,1,1), ht]
    yt, ht = model_onestep.predict(data)
    pred.append(yt.squeeze())

## reshape into a Series
pred = np.asarray(pred)    
pred = pd.Series((1 + pred.reshape(-1))*y_all.iloc[0],  ## need to shift and scale
                 y_all['2010-01-01':].index[1:]
                )

## make plot
ax = y_all['2000-01-01':].plot(figsize=(16, 9))
pred.plot(ax=ax, label='multiple-step ahead Forecast', alpha=.7)
plt.legend()
plt.show()
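The loop above is a closed-loop (recursive) forecast: after the first step the model no longer sees observations, only its own previous prediction and hidden state. The sketch below isolates that pattern. The function `one_step` is a hypothetical stand-in for `model_onestep.predict`, with made-up dynamics.

```python
import numpy as np

def one_step(y_t, h_t):
    """Hypothetical one-step model: returns (next prediction, next hidden state)."""
    h_next = np.tanh(0.5 * h_t + y_t)
    y_next = 0.9 * h_next
    return y_next, h_next

def forecast(y_last, h_last, steps):
    """Feed each prediction back as the next input, carrying the state along."""
    preds = []
    y_t, h_t = y_last, h_last
    for _ in range(steps):
        y_t, h_t = one_step(y_t, h_t)   # prediction becomes the next input
        preds.append(y_t)
    return np.array(preds)

path = forecast(y_last=0.2, h_last=0.0, steps=5)
print(path)
```

Because no observation ever corrects the trajectory, errors made early in the horizon propagate to all later steps. This is why multiple-step forecasts are usually much less accurate than one-step-ahead predictions.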