# Time Series Analysis and Forecasting with CNNs

Convolutional neural networks (CNNs) can be applied to time series forecasting. I will present the different scenarios we usually come across when solving time series problems, and the CNN architectures we can use to tackle them.

Types of time series problems:
- CNN Models for Univariate time series
- CNN Models for Multivariate time series
- CNN Models for Multistep time series
- CNN Models for Multivariate and Multistep time series


## Univariate CNN Models

Although traditionally developed for two-dimensional image data, CNNs can be used to model univariate time series forecasting problems. A univariate time series is a dataset comprising a single series of observations with a temporal ordering. The model must learn from the series of past observations to predict the next value in the sequence.

This section is divided into two parts:
1. Data Preparation 
2. CNN Model


### Data Preparation

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [7]:
df = pd.read_csv("data/Alcohol_Sales.csv", parse_dates=["DATE"], index_col="DATE")

In [9]:
df.head(2)

DATE,S4248SM144NCEN
1992-01-01,3459
1992-02-01,3458


In [15]:
df.rename(columns={"S4248SM144NCEN": "Sales"}, inplace=True)

In [16]:
df.head()

DATE,Sales
1992-01-01,3459
1992-02-01,3458
1992-03-01,4002
1992-04-01,4564
1992-05-01,4221


In [17]:
df.tail()

DATE,Sales
2018-09-01,12396
2018-10-01,13914
2018-11-01,14174
2018-12-01,15504
2019-01-01,10718


In [19]:
df.count()

Sales    325
dtype: int64

So here the goal is to monitor the Sales data over time, which means we have to model the Sales series.

In [22]:
df["Sales"].values

array([ 3459,  3458,  4002,  4564,  4221,  4529,  4466,  4137,  4126,
        4259,  4240,  4936,  3031,  3261,  4160,  4377,  4307,  4696,
        4458,  4457,  4364,  4236,  4500,  4974,  3075,  3377,  4443,
        4261,  4460,  4985,  4324,  4719,  4374,  4248,  4784,  4971,
        3370,  3484,  4269,  3994,  4715,  4974,  4223,  5000,  4235,
        4554,  4851,  4826,  3699,  3983,  4262,  4619,  5219,  4836,
        4941,  5062,  4365,  5012,  4850,  5097,  3758,  3825,  4454,
        4635,  5210,  5057,  5231,  5034,  4970,  5342,  4831,  5965,
        3796,  4019,  4898,  5090,  5237,  5447,  5435,  5107,  5515,
        5583,  5346,  6286,  4032,  4435,  5479,  5483,  5587,  6176,
        5621,  5889,  5828,  5849,  6180,  6771,  4243,  4952,  6008,
        5353,  6435,  6673,  5636,  6630,  5887,  6322,  6520,  6678,
        5082,  5216,  5893,  5894,  6799,  6667,  6374,  6840,  5575,
        6545,  6789,  7180,  5117,  5442,  6337,  6525,  7216,  6761,
        6958,  7070,

So we need to model the series above. To do that, we first transform the univariate series into a supervised learning problem: each window of past observations becomes an input, and the value that follows it becomes the target.

In [41]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from numpy import array

In [66]:
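# Function for data transformation

def split_sequence(sequences, n_steps):
    """Split a 1D series into input windows of n_steps values and next-step targets."""
    X, y = [], []
    for i in range(len(sequences)):
        end_ix = i + n_steps
        # Stop once the window would run past the last available target
        if end_ix > len(sequences) - 1:
            break
        seq_x, seq_y = sequences[i:end_ix], sequences[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    # Note: integer tensors must be cast with .float() before being passed to nn.Conv1d
    return torch.tensor(array(X), dtype=torch.long), torch.tensor(array(y), dtype=torch.long)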

Running the function splits the univariate series into X and y, where each row of X is a window of `n_steps` consecutive values and y holds the single value that follows each window.
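To make the windowing concrete, here is a minimal sketch on a toy series. It uses NumPy's `sliding_window_view` as a vectorised alternative to the loop; `make_windows` and the toy values are hypothetical names and data introduced only for this illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, n_steps):
    """Hypothetical helper: vectorised equivalent of the loop-based split."""
    series = np.asarray(series)
    # Each window holds n_steps inputs followed by the one value to predict.
    windows = sliding_window_view(series, n_steps + 1)
    return windows[:, :-1], windows[:, -1]

toy = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
X_toy, y_toy = make_windows(toy, n_steps=3)

print(X_toy.shape, y_toy.shape)  # (6, 3) (6,)
print(X_toy[0], y_toy[0])        # [10 20 30] 40
```

A series of length N yields N - `n_steps` samples, which is why the 325 monthly observations give 322 rows with `n_steps = 3`.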

In [67]:
#create feature and target from the above function

sequences = df.Sales.values

X, y = split_sequence(sequences, 3)

In [68]:
X[:2], y[:2]

(tensor([[3459, 3458, 4002],
         [3458, 4002, 4564]]),
 tensor([4564, 4221]))

Now that we know how to prepare a univariate series for modeling, let’s look at developing a CNN model that can learn the mapping of inputs to outputs.

### CNN Model

A one-dimensional CNN is a CNN model that has a convolutional hidden layer that operates over a 1D sequence. This is followed by perhaps a second convolutional layer in some cases, such as very long input sequences, and then a pooling layer whose job it is to distill the output of the convolutional layer to the most salient elements. 

In [69]:
n_features = 1
X = X.reshape(X.shape[0], n_features, X.shape[1])


In [70]:
X.shape

torch.Size([322, 1, 3])

The CNN does not actually view the data as having time steps. Instead, the data is treated as a sequence over which convolutional read operations can be performed, like a one-dimensional image.
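The sketch below traces how the shape changes as a batch laid out as (samples, channels, time steps) passes through a convolution and a pooling layer. It uses random float data as a stand-in for the sales windows, and the layer sizes are assumptions chosen to match the architecture described in this section.

```python
import torch
import torch.nn as nn

# Stand-in batch with the same layout as X above: (samples, channels, time steps).
x = torch.randn(322, 1, 3)

conv = nn.Conv1d(in_channels=1, out_channels=64, kernel_size=2)
pool = nn.MaxPool1d(kernel_size=2)

h = conv(x)                    # the kernel slides along the last axis: length 3 -> 2
p = pool(h)                    # pooling halves the length: 2 -> 1
flat = p.flatten(start_dim=1)  # (samples, 64), ready for a dense layer

print(h.shape, p.shape, flat.shape)
```

The channel axis plays the role of the number of features per time step, so a univariate series has exactly one input channel.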

Model Architecture:

- A convolutional layer with 64 filters (feature maps) and a kernel size of 2.
- A max pooling layer, followed by a dense layer to interpret the extracted features
- An output layer that predicts a single numerical value
- Optimiser - Adam
- Loss function - MSE
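As a rough illustration of how the optimiser and loss listed above fit together, here is a minimal training sketch. It is not the notebook's final model: the data is a synthetic sine wave rather than the sales series, and the dense width of 50, the learning rate, and the epoch count are all assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data: windows of 3 steps from a sine wave (not the sales series).
series = torch.sin(torch.linspace(0, 20, 200))
n_steps = 3
X = torch.stack([series[i:i + n_steps] for i in range(len(series) - n_steps)])
y = series[n_steps:]
X = X.unsqueeze(1)  # (samples, 1 channel, n_steps)

model = nn.Sequential(
    nn.Conv1d(1, 64, kernel_size=2),  # 64 filters, kernel size 2
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2),
    nn.Flatten(),
    nn.Linear(64, 50),                # dense width of 50 is an assumption
    nn.ReLU(),
    nn.Linear(50, 1),                 # single-value output
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed learning rate
loss_fn = nn.MSELoss()

losses = []
for epoch in range(200):              # assumed epoch count
    optimizer.zero_grad()
    pred = model(X).squeeze(-1)       # (samples,)
    loss = loss_fn(pred, y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

The loop trains on the full batch at every step to keep the sketch short; a `DataLoader` with mini-batches would be the usual choice for larger datasets.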

In [71]:
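class CNNTimeSeriesModel(nn.Module):

    def __init__(self, n_steps, n_features, n_filters=64, n_hidden=50):
        # n_hidden=50 is an assumed dense-layer width; the architecture list does not fix it
        super().__init__()
        # Conv1d expects (batch, channels, length): channels = n_features, length = n_steps
        self.cnn_layer = nn.Conv1d(n_features, n_filters, kernel_size=2)
        self.mp_layer = nn.MaxPool1d(kernel_size=2)
        # Length after the convolution is n_steps - 1; after pooling it is (n_steps - 1) // 2
        self.ln_layer = nn.Linear(n_filters * ((n_steps - 1) // 2), n_hidden)
        self.out_layer = nn.Linear(n_hidden, 1)

    def forward(self, x):
        x = self.mp_layer(torch.relu(self.cnn_layer(x)))
        x = torch.relu(self.ln_layer(x.flatten(start_dim=1)))
        return self.out_layer(x)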


In [79]:
cnn_layer = nn.Conv1d(in_channels= 1, out_channels= 64, kernel_size=1)

In [75]:
X

tensor([[[ 3459,  3458,  4002]],

        [[ 3458,  4002,  4564]],

        [[ 4002,  4564,  4221]],

        [[ 4564,  4221,  4529]],

        [[ 4221,  4529,  4466]],

        [[ 4529,  4466,  4137]],

        [[ 4466,  4137,  4126]],

        [[ 4137,  4126,  4259]],

        [[ 4126,  4259,  4240]],

        [[ 4259,  4240,  4936]],

        [[ 4240,  4936,  3031]],

        [[ 4936,  3031,  3261]],

        [[ 3031,  3261,  4160]],

        [[ 3261,  4160,  4377]],

        [[ 4160,  4377,  4307]],

        [[ 4377,  4307,  4696]],

        [[ 4307,  4696,  4458]],

        [[ 4696,  4458,  4457]],

        [[ 4458,  4457,  4364]],

        [[ 4457,  4364,  4236]],

        [[ 4364,  4236,  4500]],

        [[ 4236,  4500,  4974]],

        [[ 4500,  4974,  3075]],

        [[ 4974,  3075,  3377]],

        [[ 3075,  3377,  4443]],

        [[ 3377,  4443,  4261]],

        [[ 4443,  4261,  4460]],

        [[ 4261,  4460,  4985]],

        [[ 4460,  4985,  4324]],

        [[ 498

In [82]:
cnn_layer(torch.tensor([[12, 23]]))

RuntimeError: expected scalar type Long but found Float
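The error above is a dtype mismatch: the integer literals produce an int64 (Long) tensor, while the weights of `nn.Conv1d` are float32 by default. A minimal sketch of the fix, assuming the same layer as above and an explicit (batch, channels, length) input shape:

```python
import torch
import torch.nn as nn

cnn_layer = nn.Conv1d(in_channels=1, out_channels=64, kernel_size=1)

# Integer literals create an int64 (Long) tensor, which the float32 weights reject.
x_long = torch.tensor([[[12, 23]]])  # (batch=1, channels=1, length=2)
x_float = x_long.float()             # cast to float32 before the forward pass

out = cnn_layer(x_float)
print(x_long.dtype, x_float.dtype, out.shape)
```

The same cast applies to the `X` tensor built earlier, since it was created with `dtype=torch.long`.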

In [59]:
m = nn.Conv1d(16, 33, 3, stride=2)
input = torch.randn(20, 16, 50)
output = m(input)

In [61]:
output.shape

torch.Size([20, 33, 24])
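The last dimension of 24 follows from the output-length formula given in the `nn.Conv1d` documentation: L_out = floor((L_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1). The sketch below checks this against the layer; `conv1d_out_len` is a hypothetical helper written for this illustration.

```python
import torch
import torch.nn as nn

def conv1d_out_len(l_in, kernel_size, stride=1, padding=0, dilation=1):
    """Hypothetical helper implementing the nn.Conv1d output-length formula."""
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

m = nn.Conv1d(16, 33, 3, stride=2)
output = m(torch.randn(20, 16, 50))

expected = conv1d_out_len(50, kernel_size=3, stride=2)
print(expected, output.shape[-1])  # 24 24
```

With `n_steps = 3`, `kernel_size = 2`, and the default stride of 1, the same formula gives an output length of 2 for the sales windows.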