# Dilated CNN model

In this notebook, we demonstrate how to:
- prepare time series data for training a Convolutional Neural Network (CNN) forecasting model
- get data in the required shape for the keras API
- implement a CNN model in keras to predict 3 steps ahead (time *t+1* to *t+3*) in the time series
- enable early stopping to reduce the likelihood of model overfitting
- evaluate the model on a test dataset

The data in this example is taken from the GEFCom2014 forecasting competition<sup>1</sup>. It consists of 3 years of hourly electricity load and temperature values between 2012 and 2014. The task is to forecast future values of electricity load. In this example, we show how to forecast three time steps ahead, using historical load and temperature data.

<sup>1</sup>Tao Hong, Pierre Pinson, Shu Fan, Hamidreza Zareipour, Alberto Troccoli and Rob J. Hyndman, "Probabilistic energy forecasting: Global Energy Forecasting Competition 2014 and beyond", International Journal of Forecasting, vol.32, no.3, pp 896-913, July-September, 2016.

In [None]:
import os
import warnings
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import datetime as dt
from collections import UserDict
from IPython.display import Image
from sklearn.preprocessing import MinMaxScaler
%matplotlib inline

from common.utils import load_data, mape, TimeSeriesTensor, create_evaluation_df

pd.options.display.float_format = '{:,.2f}'.format
np.set_printoptions(precision=2)
warnings.filterwarnings("ignore")

Load the data from csv into a Pandas dataframe

In [None]:
data_dir = 'data/'
energy = load_data(data_dir)
energy.head()

## Create train, validation and test sets

We separate our dataset into train, validation and test sets. We train the model on the train set. The validation set is used to evaluate the model after each training epoch and to check that the model is not overfitting the training data. After the model has finished training, we evaluate it on the test set. The validation and test sets must each cover a later period than the training set, so that the model does not benefit from information about future time periods.

We will allocate the period 1st November 2014 to 31st December 2014 to the test set. The period 1st September 2014 to 31st October 2014 is allocated to the validation set. All other time periods are available for the training set.

In [None]:
valid_start_dt = '2014-09-01 00:00:00'
test_start_dt = '2014-11-01 00:00:00'

energy.plot(y=['load', 'temp'], subplots=True, figsize=(15, 8), fontsize=12)
plt.show()
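
As a minimal sketch of the chronological split described above, the snippet below assigns hourly timestamps to the train, validation, or test set using the two boundary strings. `assign_split` is a hypothetical helper written only for illustration, and the timestamp range is made up; the notebook itself slices the `energy` dataframe by its index.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
valid_start = datetime.strptime("2014-09-01 00:00:00", FMT)
test_start = datetime.strptime("2014-11-01 00:00:00", FMT)


def assign_split(timestamp):
    """Return the split a timestamp belongs to (hypothetical helper)."""
    if timestamp < valid_start:
        return "train"
    if timestamp < test_start:
        return "valid"
    return "test"


# Hourly timestamps spanning both boundaries (illustrative range only).
start = datetime(2014, 8, 31, 22)
hours = [start + timedelta(hours=i) for i in range(24 * 62 + 4)]
splits = [assign_split(ts) for ts in hours]

# Count how many hourly timestamps fall into each split.
counts = {name: splits.count(name) for name in ("train", "valid", "test")}
```

Because the boundaries are compared against timestamps, every training sample precedes every validation sample, which in turn precedes every test sample.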

Load and temperature in the first week of July 2014

In [None]:
energy['2014-07-01':'2014-07-07'].plot(y=['load', 'temp'], subplots=True, figsize=(15, 8), fontsize=12)
plt.show()

## Data preparation

For this example, we will set *T=24*. This means that the input for each sample is a sequence of the previous 24 hours of energy load and temperature values.

*HORIZON=3* specifies that we have a forecasting horizon of 3 (*t+1* to *t+3*)

In [None]:
T = 24
HORIZON = 3

### Data preparation - training set

In [None]:
# Create training dataset with load and temp features
train = energy.copy()[energy.index < valid_start_dt][['load', 'temp']]

# Fit a scaler for the y values
y_scaler = MinMaxScaler()
y_scaler.fit(train[['load']])

# Also scale the input features data (load and temp values)
X_scaler = MinMaxScaler()
train[['load', 'temp']] = X_scaler.fit_transform(train)
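
The cell above fits both scalers on the training data only. As a rough sketch of what that implies, assuming the default `MinMaxScaler` range of 0 to 1, the snippet below reproduces the arithmetic with NumPy on made-up load values. The `scale` and `unscale` helpers are hypothetical stand-ins for `transform` and `inverse_transform`.

```python
import numpy as np

# Toy load values standing in for the training and validation sets.
train_load = np.array([2000.0, 3000.0, 4000.0])
valid_load = np.array([2500.0, 4500.0])

# "Fit": record the minimum and the range of the training data only.
train_min = train_load.min()
train_span = train_load.max() - train_min


def scale(x):
    """Map values to the training data's [0, 1] range (hypothetical helper)."""
    return (x - train_min) / train_span


def unscale(x_scaled):
    """Invert scale(), recovering the original units (hypothetical helper)."""
    return x_scaled * train_span + train_min


train_scaled = scale(train_load)
# Values beyond the training range land outside [0, 1].
valid_scaled = scale(valid_load)
restored = unscale(valid_scaled)
```

Keeping a separate scaler for the target makes it possible to map scaled predictions back to load units later on.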

Use the TimeSeriesTensor convenience class to:
1. Shift the values of the time series to create a Pandas dataframe containing all the data for a single training example
2. Discard any samples with missing values
3. Transform this Pandas dataframe into a numpy array of shape (samples, time steps, features) for input into Keras

The class takes the following parameters:

- **dataset**: original time series
- **H**: the forecast horizon
- **tensor_structure**: a dictionary describing the tensor structure in the form { 'tensor_name' : (range(max_backward_shift, max_forward_shift), [feature, feature, ...] ) }
- **freq**: time series frequency
- **drop_incomplete**: (Boolean) whether to drop incomplete samples

In [None]:
tensor_structure = {'X':(range(-T+1, 1), ['load', 'temp'])}
train_inputs = TimeSeriesTensor(dataset=train,
                            target='load',
                            H=HORIZON,
                            tensor_structure=tensor_structure,
                            freq='H',
                            drop_incomplete=True)
train_inputs.dataframe.head(5)
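
To make the resulting shapes concrete, here is a minimal NumPy sketch of the windowing idea. It is not the `TimeSeriesTensor` implementation; `make_windows` is a hypothetical helper, and the toy array stands in for the scaled load and temperature columns.

```python
import numpy as np


def make_windows(values, target_col, T, horizon):
    """Build (samples, T, features) inputs and (samples, horizon) targets."""
    n_rows = values.shape[0]
    X, y = [], []
    # t is the index of the last observed time step in each window.
    for t in range(T - 1, n_rows - horizon):
        X.append(values[t - T + 1 : t + 1])                      # steps t-T+1 .. t
        y.append(values[t + 1 : t + 1 + horizon, target_col])    # steps t+1 .. t+horizon
    return np.array(X), np.array(y)


# Toy data: 10 time steps, 2 features (column 0 plays the role of 'load').
values = np.arange(20, dtype=float).reshape(10, 2)
X, y = make_windows(values, target_col=0, T=4, horizon=3)
```

With 10 rows, `T=4` and a horizon of 3, only 10 - 4 - 3 + 1 = 4 complete samples exist, which mirrors the effect of dropping incomplete samples.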

In [None]:
X_train = train_inputs['X']
y_train = train_inputs['target']

In [None]:
y_train.shape

In [None]:
y_train[:3]

In [None]:
X_train.shape

In [None]:
X_train[:3]

### Data preparation - validation set

In [None]:
# Start T-1 hours before the validation period so the first sample has a full input window
look_back_dt = dt.datetime.strptime(valid_start_dt, '%Y-%m-%d %H:%M:%S') - dt.timedelta(hours=T-1)
valid = energy.copy()[(energy.index >= look_back_dt) & (energy.index < test_start_dt)][['load', 'temp']]
valid[['load', 'temp']] = X_scaler.transform(valid)
valid_inputs = TimeSeriesTensor(valid, 'load', HORIZON, tensor_structure)
y_valid = valid_inputs['target']
X_valid = valid_inputs['X']

In [None]:
y_valid.shape

In [None]:
X_valid.shape
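
The validation cell above begins its slice `T-1` hours before `valid_start_dt`. As a small sketch of why, the snippet below lists the timestamps of the first complete input window under that choice. It uses only the standard library and the values of `T` and `valid_start_dt` from this notebook.

```python
from datetime import datetime, timedelta

T = 24
valid_start = datetime.strptime("2014-09-01 00:00:00", "%Y-%m-%d %H:%M:%S")
look_back = valid_start - timedelta(hours=T - 1)

# Timestamps of the first complete input window: T hourly steps ending at valid_start.
first_window = [look_back + timedelta(hours=i) for i in range(T)]
```

The window holds exactly `T` hourly steps and its last step is the start of the validation period, so the first forecast is made from that point.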

## Quiz: Implement multivariate CNN

In [None]:
from keras.models import Model, Sequential
from keras.layers import Conv1D, Dense, Flatten
from keras.callbacks import EarlyStopping, ModelCheckpoint

#### Fill in your code below and replace the question marks

Implement your CNN model with the data prepared above and the following requirements:
1. Use 2 features: past load and temperature
2. Stack 5 convolutional layers of kernel width 2 with dilation rates 1, 2, 4, 8, 16
3. Use 5 filters in each layer
4. Train for 10 epochs
5. Batch size 32
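
As a side calculation for the dilation rates listed above, the sketch below computes how many input time steps a single output of the stacked layers can see, assuming stride 1 throughout. It is arithmetic only and does not build the model; `receptive_field` is a hypothetical helper.

```python
def receptive_field(kernel_size, dilation_rates):
    """Input time steps seen by one output of a stack of dilated convolutions (stride 1)."""
    return 1 + sum((kernel_size - 1) * d for d in dilation_rates)


# Kernel width 2 with doubling dilation rates, as in the requirements above.
dilated = receptive_field(2, [1, 2, 4, 8, 16])

# The same five layers without dilation, for comparison.
undilated = receptive_field(2, [1, 1, 1, 1, 1])
```

With doubling dilation rates the five layers cover 32 time steps, more than the 24-hour input window, whereas five undilated layers of the same width would cover only 6.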

In [None]:
LATENT_DIM = ?
KERNEL_SIZE = ?
BATCH_SIZE = ?
EPOCHS = ?

In [None]:
# Fill in your code to replace the question marks
# Hint: each stacked Conv1D layer takes its own dilation_rate argument
model = Sequential()
?
?
?
?
?
?
?

Once you are done, run the rest of the notebook to check that your model works.

In [None]:
model.compile(optimizer='RMSprop', loss='mse')
model.summary()

Specify the early stopping criteria. We **monitor** the validation loss (in this case the mean squared error on the validation set) after each training epoch. If the validation loss has not improved by at least **min_delta** for **patience** consecutive epochs, training stops.

In [None]:
earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5)

history = model.fit(X_train,
                    y_train,
                    batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    validation_data=(X_valid, y_valid),
                    callbacks=[earlystop],
                    verbose=1)
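
The stopping rule can be sketched in plain Python as follows. This is a simplified approximation of the behaviour described above, not the Keras source; `early_stopping_epoch` is a hypothetical helper and the loss values are made up.

```python
def early_stopping_epoch(val_losses, min_delta=0.0, patience=5):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement: remember the new best loss and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Illustrative validation losses: the best value is reached at epoch 2.
losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.36, 0.38, 0.39, 0.30]
stop_patience_5 = early_stopping_epoch(losses, patience=5)
stop_patience_2 = early_stopping_epoch(losses, patience=2)
never_stops = early_stopping_epoch([0.5, 0.4, 0.3, 0.2], patience=2)
```

A larger `patience` tolerates longer plateaus before giving up, at the cost of more training epochs.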

In [None]:
plot_df = pd.DataFrame.from_dict({'train_loss':history.history['loss'], 'val_loss':history.history['val_loss']})
plot_df.plot(logy=True, figsize=(10,10), fontsize=12)
plt.xlabel('epoch', fontsize=12)
plt.ylabel('loss', fontsize=12)
plt.show()

## Evaluate the model

Create the test set

In [None]:
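# Start T-1 hours before the test period so the first test sample has a full input window
look_back_dt = dt.datetime.strptime(test_start_dt, '%Y-%m-%d %H:%M:%S') - dt.timedelta(hours=T-1)
test = energy.copy()[energy.index >= look_back_dt][['load', 'temp']]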
test[['load', 'temp']] = X_scaler.transform(test)
test_inputs = TimeSeriesTensor(test, 'load', HORIZON, tensor_structure)
X_test = test_inputs['X']
y_test = test_inputs['target']

In [None]:
predictions = model.predict(X_test)
eval_df = create_evaluation_df(predictions, test_inputs, HORIZON, y_scaler)
eval_df.head()

In [None]:
eval_df['APE'] = (eval_df['prediction'] - eval_df['actual']).abs() / eval_df['actual']
eval_df.groupby('h')['APE'].mean()
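
The cell above computes the absolute percentage error (APE) per row and averages it per horizon. The same calculation is sketched below in plain Python on made-up numbers; the rows are illustrative and are not results from this model.

```python
from collections import defaultdict

# Toy evaluation rows: (horizon label, prediction, actual). Values are illustrative only.
rows = [
    ("t+1", 110.0, 100.0),
    ("t+1", 190.0, 200.0),
    ("t+2", 120.0, 100.0),
    ("t+2", 180.0, 200.0),
]

# Collect the absolute percentage error of each row under its horizon.
ape_by_horizon = defaultdict(list)
for h, prediction, actual in rows:
    ape_by_horizon[h].append(abs(prediction - actual) / actual)

# Mean APE (MAPE) per horizon, as a fraction rather than a percentage.
mape_by_horizon = {h: sum(v) / len(v) for h, v in ape_by_horizon.items()}
```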

In [None]:
plot_df = eval_df[(eval_df.timestamp<'2014-11-08') & (eval_df.h=='t+1')][['timestamp', 'actual']]
for t in range(1, HORIZON+1):
    plot_df['t+'+str(t)] = eval_df[(eval_df.timestamp<'2014-11-08') & (eval_df.h=='t+'+str(t))]['prediction'].values

fig = plt.figure(figsize=(15, 8))
ax = fig.add_subplot(111)
# Plot actuals and each forecast horizon on the same axes, with labels for the legend
ax.plot(plot_df['timestamp'], plot_df['actual'], color='red', linewidth=4.0, label='actual')
ax.plot(plot_df['timestamp'], plot_df['t+1'], color='blue', linewidth=4.0, alpha=0.75, label='t+1')
ax.plot(plot_df['timestamp'], plot_df['t+2'], color='blue', linewidth=3.0, alpha=0.5, label='t+2')
ax.plot(plot_df['timestamp'], plot_df['t+3'], color='blue', linewidth=2.0, alpha=0.25, label='t+3')
plt.xlabel('timestamp', fontsize=12)
plt.ylabel('load', fontsize=12)
ax.legend(loc='best')
plt.show()