# 1d CNN for Timeseries

**Prerequisites**

- TensorFlow + Keras
- CNN

**Outcomes**

- Recall main features of CNN
- Understand temporal correlation in time series
- Understand the 1d CNN model for time series analysis
- Build and apply 1d CNN to various time series

In [1]:
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import matplotlib.pyplot as plt
import numpy as np

%matplotlib inline

## Review: CNN for Image Classification

- Convolutional Neural networks are 
    - Feed forward networks
    - Built from convolutional, pooling, and dense layers
    - Take form: 
    
    $$y = (f_{\text{out}} \circ D_{\text{out}} \circ P_L \circ \cdots \circ P_6 \circ f_5 \circ C_{5} \circ f_4 \circ C_4 \underbrace{\circ P_3 \circ f_2 \circ C_{2} \circ f_1 \circ C_1}_{\text{key pattern}})(x)$$



### Convolutional Layers

- Connect output to local sub-region of input
- Identify patterns with sense of locality
- Built on convolution operation: `sum(sub_region * filter) + bias`
- Same filter *slides* across image from left to right, and top to bottom
- Parameters are shared for all sub-regions -- in practice find same pattern in multiple parts of image
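
As a rough illustration of the `sum(sub_region * filter) + bias` operation, here is a minimal NumPy sketch for a single filter with no padding and stride 1. The helper name `conv2d_single` and the toy image are assumptions made for this example; they are not part of keras.

```python
import numpy as np

def conv2d_single(image, filt, bias=0.0):
    """Slide one filter over a 2d image (no padding, stride 1)."""
    fh, fw = filt.shape
    H, W = image.shape
    out = np.empty((H - fh + 1, W - fw + 1))
    for r in range(out.shape[0]):          # top to bottom
        for c in range(out.shape[1]):      # left to right
            sub_region = image[r:r + fh, c:c + fw]
            # the same filter weights are reused for every sub-region
            out[r, c] = np.sum(sub_region * filt) + bias
    return out

# toy 4x4 image: dark left half, bright right half
image = np.array([[0, 0, 1, 1]] * 4, dtype=float)
edge_filter = np.array([[-1.0, 1.0]])      # responds to a dark-to-bright step
response = conv2d_single(image, edge_filter)
print(response)
```

The response is nonzero only in the column where the dark-to-bright edge sits, in every row, which is the "same pattern in multiple parts of image" idea.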

### Pooling Layers

- Reduce local sub-region into single number
- Used to reduce the width and height of feature map
- Computed as `max(sub_region)`
- Require no parameters to be learned
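
A small sketch of max pooling over non-overlapping 2x2 sub-regions, assuming a single 2d feature map. The helper name `max_pool2d` is hypothetical and only meant to show the `max(sub_region)` idea.

```python
import numpy as np

def max_pool2d(fmap, size=2):
    """Max over non-overlapping size-by-size blocks (extra rows/cols are dropped)."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

fmap = np.arange(16).reshape(4, 4)
pooled = max_pool2d(fmap)
print(pooled)  # [[ 5  7] [13 15]]
```

Width and height are halved, and there is nothing to learn: the function has no weights.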

### Dense Layers

- Appear at end of CNN
- Used to build a classifier taking final feature maps as input
- Often one or two dense layers at end of stack
- Connects every point from each feature map to every output of dense layer
- Requires *many* parameters: (`w*h*K*N + N`)
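
To make the `w*h*K*N + N` count concrete, here is a tiny sketch. The sizes (4x4 feature maps, 64 of them, 10 outputs) are made-up numbers for illustration.

```python
def dense_param_count(w, h, K, N):
    """Weights plus biases for a dense layer fed by K flattened w-by-h feature maps."""
    return w * h * K * N + N

n_params = dense_param_count(w=4, h=4, K=64, N=10)
print(n_params)  # 10250
```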

### How CNN learns

- Early layers apply filters to identify simple patterns
- Later layers combine multiple simple patterns to identify more complex patterns
- Demo: https://www.cs.ryerson.ca/~aharley/vis/conv/flat.html

### CNN's Secret Weapon

- The CNN model is powerful because it forces the network to take *local context* into account
- By analyzing chunks of data that are *spatially correlated*, CNN can recognize repeated patterns that are common in images (e.g. 2 eyes + nose + mouth + 2 ears on human head or shape of airplane)
- What if the CNN could also be applied to settings where data is locally correlated in another way ...

- IT CAN!
- Today we will learn about how to use a 1 dimensional CNN to analyze univariate time series data
- These data are *temporally correlated* ==> CNN can learn to recognize repeating patterns in time series

## Univariate Time Series

- We will consider a univariate time series to be a sequence of random variables $\{x_t \}$ indexed by $t \in \mathbb{Z}$ and taking values in $\mathbb{R}$
- We will assume that the time series begins at $t=0$ and that $t$ increases by 1 with each observation
- Examples: financial data, weather data, heart rates, vehicle position/velocity, videos (time series of images)
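
One way to see what "temporally correlated" means is to compare the lag-1 sample autocorrelation of a smooth series with that of pure noise. This is only a sketch on synthetic data; the series, the seed, and the helper `lag1_autocorr` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(500)

# a periodic signal with a little noise vs. independent noise
seasonal = np.sin(2 * np.pi * t / 50) + 0.1 * rng.normal(size=t.size)
noise = rng.normal(size=t.size)

def lag1_autocorr(x):
    """Sample correlation between x_t and x_{t+1}."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

print("seasonal:", lag1_autocorr(seasonal))
print("noise:   ", lag1_autocorr(noise))
```

The seasonal series should show a lag-1 correlation close to 1, while the noise series should sit near 0: neighboring observations carry information about each other only in the first case.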

## 1D Convolution

- Let's learn how to predict our time series using a 1d CNN
- We'll again explore the 1d CNN model from three perspectives
    1. Visual
    2. Mathematical
    3. Code
- Having already learned the 2d CNN, this will be relatively simple

### Visual Perspective

- Convolution uses fixed weights/filter
- Slides filter over windows of data 
- Moves from left (t=0) to right (t++)

![1d_cnn_visual.jpg](1d_cnn_visual.jpg)

### Mathematical Perspective

- Hyperparameters
    - $K$: number of filters
    - $F$: width of window
    - $S$: stride
    - $P$: padding 

#### Mathematical Formula

- Let $w \in \mathbb{R}^F$ be filter weights and $b \in \mathbb{R}$ be bias
- Output $i$ computed as: $$z_i = \left(\sum_{j=0}^{F-1} w_j x_{i + j - \lfloor F/2 \rfloor} \right) + b$$
- Special care taken around edges (that's why we have padding)
- This is repeated for each of the $K$ filters
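
A consequence of these hyperparameters is the length of the output: for an input of length $N$, each filter produces $\lfloor (N + 2P - F) / S \rfloor + 1$ values. A quick sketch (the function name `conv_output_length` is made up for this example):

```python
def conv_output_length(N, F, S=1, P=0):
    """Number of outputs of one 1d filter of width F, stride S, padding P."""
    return (N + 2 * P - F) // S + 1

print(conv_output_length(60, 7))            # 54
print(conv_output_length(8, 3))             # 6
print(conv_output_length(8, 3, S=2, P=1))   # 4
```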

### Code Perspective

- For the 1d CNN it is instructive to write code by hand

> When actually building networks and training we'll still use keras

In [2]:
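def conv1d(x, w, b, S: int = 1, P: int = 0):
    """
    Apply single filter of 1d Convolution to x given
    filter weights (w), bias (b), stride (S), and padding (P)

    Windows are centered on index i, so len(w) must be odd
    """
    assert P >= 0
    F = len(w)
    assert F % 2 == 1, "centered windows require an odd filter width"
    if P == 0:
        x_pad = x
    else:
        # zero-pad both ends of the series
        x_pad = np.concatenate([np.zeros(P), x, np.zeros(P)])
    half_F = F // 2
    out = []
    # i is the center of each window; the stride S sets the step between centers
    for i in range(half_F, len(x_pad) - half_F, S):
        window = x_pad[(i-half_F):(i+half_F + 1)]
        out.append((w @ window) + b)

    return out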
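
As a sanity check on the idea, here is a vectorized sketch of the same operation compared against NumPy's `np.correlate`. The helper `conv1d_vectorized` is a hypothetical name for this example, and it needs NumPy 1.20 or newer for `sliding_window_view`. Note that the "convolution" used in CNNs does not flip the filter, so the matching NumPy routine is cross-correlation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv1d_vectorized(x, w, b, S=1, P=0):
    """One 1d filter applied to every window at once."""
    x_pad = np.pad(np.asarray(x, dtype=float), P)       # zero padding on both ends
    windows = sliding_window_view(x_pad, len(w))[::S]   # shape (n_windows, F)
    return windows @ np.asarray(w, dtype=float) + b

x = np.arange(1.0, 9.0)            # [1, 2, ..., 8]
w = np.array([1.0, 0.0, -1.0])     # x[i-1] - x[i+1]
out = conv1d_vectorized(x, w, b=0.5)
reference = np.correlate(x, w, mode="valid") + 0.5
print(out)
print(np.allclose(out, reference))
```

On this linearly increasing series every window gives `-2 + 0.5 = -1.5`, and the result agrees with `np.correlate`.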

## 1d CNN in Keras

- Keras has a `keras.layers.Conv1D`, which is the 1d counterpart to the 2d conv layer we met for image analysis
- We build up a network in a similar way: `((conv)+(pool))+(dense)+`
- Let's try it out

In [3]:
T_input = 60  # see below -- just a number for now

In [4]:
model = keras.Sequential([
    layers.Conv1D(32, 7, activation="relu", input_shape=(T_input, 1)),
    layers.Conv1D(32, 7, activation="relu"),
    layers.MaxPooling1D(),
    layers.Flatten(),
    layers.Dense(30, activation="relu"),
    layers.Dense(1),
])

model.compile(optimizer='adam', loss='mse')
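
To see what this stack does to the data, the shapes and parameter counts can be worked out by hand. The sketch below assumes the keras defaults (no padding and stride 1 for `Conv1D`, pool size 2 for `MaxPooling1D`); the helper name is made up, and the totals should agree with what `model.summary()` reports.

```python
T_input, n_channels = 60, 1

def conv1d_shape_and_params(length, channels_in, filters, width):
    """Output (length, channels) and parameter count for a stride-1, unpadded Conv1D."""
    out_length = length - width + 1
    params = width * channels_in * filters + filters
    return (out_length, filters), params

shape1, p1 = conv1d_shape_and_params(T_input, n_channels, 32, 7)    # (54, 32), 256
shape2, p2 = conv1d_shape_and_params(shape1[0], shape1[1], 32, 7)   # (48, 32), 7200
pooled = (shape2[0] // 2, shape2[1])    # MaxPooling1D halves the length -> (24, 32)
flat = pooled[0] * pooled[1]            # Flatten -> 768
p3 = flat * 30 + 30                     # Dense(30) -> 23070
p4 = 30 * 1 + 1                         # Dense(1)  -> 31
total = p1 + p2 + p3 + p4
print(shape1, shape2, pooled, flat, total)   # total is 30557
```

Most of the parameters live in the first dense layer, not in the convolutions.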

### Preparing Data

- We have a 1d time series of data
- We need to feed our CNN (X, y) pairs
- Each X should be its own time series and each y should be the subsequent observation to predict


### Example

- For example, suppose our input data were `x = [1, 2, 3, 4, 5, 6, 7, 8]`
- Let's split this data into subsequences of length 3
- The (X, y) data we would end up passing to keras is

```python
X = np.array([
    [1, 2, 3],  # y = 4
    [2, 3, 4],  # y = 5
    [3, 4, 5],  # y = 6
    [4, 5, 6],  # y = 7
    [5, 6, 7],  # y = 8
])

y = np.array([4, 5, 6, 7, 8])
```
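
The same split can be sketched by hand in NumPy. The helper `make_windows` is a hypothetical name for this example, written for clarity and not for speed.

```python
import numpy as np

def make_windows(series, T):
    """Split a 1d series into length-T inputs and next-step targets."""
    series = np.asarray(series)
    n = len(series) - T                      # number of (X, y) pairs
    X = np.stack([series[i:i + T] for i in range(n)])
    y = series[T:]                           # observation right after each window
    return X, y

X, y = make_windows([1, 2, 3, 4, 5, 6, 7, 8], T=3)
print(X)
print(y)  # [4 5 6 7 8]
```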

### Data Prep with keras

- The accounting work needed to translate the 1d timeseries into multiple subsequences is tedious at best (and very difficult to implement efficiently!)
- Thankfully, `keras.preprocessing.timeseries_dataset_from_array` will do it for us
- This function expects three inputs:
    - `data`: features to be used as X
    - `targets`: targets or labels (y); `targets[i]` is paired with the window that starts at index `i`
    - `sequence_length`: length of each sub-sequence

#### Keras data prep example

- Let's see how to use keras to repeat our example from above

In [5]:
# must be two dimensional
x = np.arange(1, 9)[:, None]
x

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]])

In [6]:
test_dataset = keras.preprocessing.timeseries_dataset_from_array(
    x,      # raw dataset
    x[3:],  # shifted dataset for predictions
    3       # subsequence length -- matches shift on `y`
)
print("test_dataset is a:", type(test_dataset))
test_dataset

test_dataset is a: <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>


<BatchDataset element_spec=(TensorSpec(shape=(None, None, 1), dtype=tf.int64, name=None), TensorSpec(shape=(None, 1), dtype=tf.int64, name=None))>

In [7]:
list(test_dataset) 

[(<tf.Tensor: shape=(5, 3, 1), dtype=int64, numpy=
  array([[[1],
          [2],
          [3]],
  
         [[2],
          [3],
          [4]],
  
         [[3],
          [4],
          [5]],
  
         [[4],
          [5],
          [6]],
  
         [[5],
          [6],
          [7]]])>,
  <tf.Tensor: shape=(5, 1), dtype=int64, numpy=
  array([[4],
         [5],
         [6],
         [7],
         [8]])>)]

In [8]:
#                  X: batch number
#                     X: x or y (x)
#                                      X: drop "extra" empty dimension
list(test_dataset)[0][0].numpy()[:, :, 0]

array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7]])

In [9]:
#                  X: batch number
#                     X: x or y (y)
#                                   X: drop "extra" empty dimension
list(test_dataset)[0][1].numpy()[:, 0]

array([4, 5, 6, 7, 8])

### Training and Fitting

- Once the data is prepared, we can train the CNN using the `fit` method
- We'll open up an official tensorflow guide/tutorial to see this in action
- You can access it here: https://www.tensorflow.org/tutorials/structured_data/time_series

## Extensions

- The 1d CNN model can be altered and adapted beyond the univariate time series case
- We'll talk through a few of these extensions

### Multiple Outputs

- The 1d CNN model can easily be extended to handle cases with more than one output variable
- For example, suppose we are given a time series of macroeconomic data and asked to forecast both the price of the S&P 500 and the price of Bitcoin
- To do this we will modify the output layer (final dense layer) to have two outputs instead of one

```python
model = keras.Sequential([
    # ... stays the same
    keras.layers.Dense(2)   # change `1` to `2`
])
```

### Multivariate Input

- The model can also be extended to have multiple time series as part of the input data
- Example: accelerometer data
    - Comes in (x, y, z, t) tuples
    - Tracks acceleration in 3 dimensions over time
    - Output of model might be classifying what activity is happening (sleep, sit, jump, stairs, walk, swim, etc.)
- To do this, we modify the first Conv1D layer to have >1 on last element of `input_shape`

```python
model = keras.Sequential([
    keras.layers.Conv1D(*same_args, input_shape=(T_input, 3)),  # each timestep has (x,y,z) data
    # ... stays the same
])
```
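
On the data side, the only change is that each window keeps all three channels, so the input array has shape `(n_windows, T, 3)`. A minimal sketch with random numbers standing in for accelerometer readings (the data and the window length are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
accel = rng.normal(size=(100, 3))     # 100 timesteps of fake (x, y, z) readings
T = 20                                # window length, chosen arbitrarily

# every window is a (T, 3) slice; stacking gives (n_windows, T, 3)
X = np.stack([accel[i:i + T] for i in range(len(accel) - T + 1)])
print(X.shape)  # (81, 20, 3)
```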

### Multi-in Multi-out

- You can combine both ideas at the same time
- Example: Given price data on top-5 cryptocurrencies by market cap, predict prices for the next 5 coins
- To do this we have both the updated input_shape for first layer and number of outputs in last layer:

```python
model = keras.Sequential([
    keras.layers.Conv1D(*same_args, input_shape=(T_input, 5)),  # each timestep has prices for top 5 coins
    # ... stays the same,
    keras.layers.Dense(5)  # predict prices for next 5 coins
])
```

### Multi-shot prediction

- Models so far have sought to predict either contemporaneous or one time period ahead
- Perhaps we want to predict multiple time periods ahead
- Example: Given daily data, predict prices for next 7 days
- How to:
    - Requires more processing of training data
    - Need to correctly "align" data to match inputs for timestep $t$ into output at $t+i$ where $i = 1, \dots, 7$
- Model will have `7` outputs instead of one

```python
model = keras.Sequential([
    # ... stays the same
    keras.layers.Dense(7)   # change `1` to `7`
])
```
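
The alignment step can be sketched in NumPy as follows. The helper `make_multistep_windows` and its arguments `T` (input length) and `H` (forecast horizon) are hypothetical names for this example.

```python
import numpy as np

def make_multistep_windows(series, T, H):
    """Pair each length-T input window with the H observations that follow it."""
    series = np.asarray(series)
    n = len(series) - T - H + 1              # windows that have a full horizon
    X = np.stack([series[i:i + T] for i in range(n)])
    Y = np.stack([series[i + T:i + T + H] for i in range(n)])
    return X, Y

X, Y = make_multistep_windows(np.arange(20), T=5, H=7)
print(X.shape, Y.shape)   # (9, 5) (9, 7)
print(X[0], Y[0])         # [0 1 2 3 4] [ 5  6  7  8  9 10 11]
```

Each row of `Y` has 7 entries, which is why the final dense layer needs 7 outputs.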

## Conclusion

- We learned about how we can use the 1d CNN to analyze time-series data
- The CNN is looking for *repeated patterns* or *periodic behavior*
- We talked about a few extensions -- hopefully sparked some creativity as to what is possible with an understanding of the key building blocks:
    - Layers
    - Shapes
    - Losses
    - SGD