# Chapter 8: How to Develop CNNs for Time Series Forecasting

## 8.1 Tutorial Overview

This tutorial is divided into four parts; they are:
1. Univariate CNN Models
2. Multivariate CNN Models
3. Multi-step CNN Models
4. Multivariate Multi-step CNN Models

## 8.2 Univariate CNN Models

### 8.2.1 Data Preparation

In [8]:
# univariate data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # stop once the target index falls outside the sequence
        if end_ix > len(sequence) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

In [9]:
# define input sequence
raw_seq = array([i for i in range(10, 100, 10)])

In [12]:
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)   

In [11]:
# summarize the data
for i in range(len(X)):
    print(X[i],y[i])

[10 20 30] 40
[20 30 40] 50
[30 40 50] 60
[40 50 60] 70
[50 60 70] 80
[60 70 80] 90
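
As a cross-check on the loop above, the same windows can be built with NumPy's `sliding_window_view` (available in NumPy 1.20 and later). This is only a sketch of an alternative; the name `split_sequence_vectorized` is hypothetical and not part of the tutorial.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def split_sequence_vectorized(sequence, n_steps):
    """Hypothetical vectorized equivalent of split_sequence()."""
    sequence = np.asarray(sequence)
    # each window holds n_steps + 1 values: n_steps inputs followed by one target
    windows = sliding_window_view(sequence, n_steps + 1)
    return windows[:, :-1].copy(), windows[:, -1].copy()

raw_seq = np.arange(10, 100, 10)
X, y = split_sequence_vectorized(raw_seq, 3)
print(X.shape, y.shape)  # (6, 3) (6,)
```

A sequence of length `n` yields `n - n_steps` samples, which is why nine observations give six samples here.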


### 8.2.2 CNN Model

A one-dimensional CNN is a CNN model that has a convolutional hidden layer that operates over a 1D sequence. In some cases, such as very long input sequences, this is followed by a second convolutional layer, and then by a pooling layer whose job is to distill the output of the convolutional layer to the most salient elements. The convolutional and pooling layers are followed by a dense fully connected layer that interprets the features extracted by the convolutional part of the model. A flatten layer is used between the convolutional layers and the dense layer to reduce the feature maps to a single one-dimensional vector. We can define a 1D CNN model for univariate time series forecasting as follows.

```python
# define model
model = Sequential()
model.add(Conv1D(64, 2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
```

The input shape for each sample is specified in the `input_shape` argument on the definition of the first hidden layer. We almost always have multiple samples; therefore, the model will expect the input component of the training data to have the shape `[samples, timesteps, features]`. Our `split_sequence()` function in the previous section outputs `X` with the shape `[samples, timesteps]`, so we can easily reshape it to have an additional dimension for the one feature.

```python
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
```

The CNN does not actually view the data as having time steps; instead, it is treated as a sequence over which convolutional read operations can be performed, like a one-dimensional image. In this example, we define a convolutional layer with 64 filter maps and a kernel size of 2. This is followed by a max pooling layer and a dense layer to interpret the input features. An output layer is specified that predicts a single numerical value. The model is fit using the efficient Adam version of stochastic gradient descent and optimized using the mean squared error, or `mse`, loss function. Once the model is defined, we can fit it on the training dataset.

```python
# fit the model
model.fit(X, y, epochs=1000, verbose=0)
```
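
To make the convolution and pooling steps described above more concrete, here is a minimal NumPy sketch of what a single filter does to one sample. It assumes no padding and a pool size of 2; the kernel weights are made up for illustration, whereas the real layer learns 64 such kernels during training.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide `kernel` over a 1D sequence without padding and take dot products."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def max_pool1d(x, pool_size=2):
    """Non-overlapping max pooling; values that do not fill a window are dropped."""
    n = len(x) // pool_size
    return x[:n * pool_size].reshape(n, pool_size).max(axis=1)

x = np.array([10.0, 20.0, 30.0])   # one sample with n_steps = 3
kernel = np.array([0.5, 0.5])      # hypothetical weights for a kernel of size 2

feature_map = np.maximum(conv1d_valid(x, kernel), 0.0)  # ReLU activation
pooled = max_pool1d(feature_map)
print(feature_map)  # [15. 25.]
print(pooled)       # [25.]
```

With three time steps and a kernel of size 2, each filter produces two values, and pooling reduces these to one. With 64 filters, the flatten layer would therefore hand 64 values to the dense layer.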

After the model is fit, we can use it to make a prediction. The model expects the input shape to be three-dimensional with `[samples, timesteps, features]`; therefore, we must reshape the single input sample before making the prediction.

```python
# demonstrate prediction
x_input = array([70, 80, 90])
# reshape the single sample to [1, n_steps, n_features] before predicting
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
```

**Full example**

In [14]:
# univariate cnn example
from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Conv1D
from keras.layers import MaxPooling1D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # stop once the target index falls outside the sequence
        if end_ix > len(sequence) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = array([i for i in range(10, 100, 10)])
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)   
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

In [None]:
# define model
model = Sequential()
model.add(Conv1D(64, 2, activation = 'relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')



In [16]:
# fit the model
model.fit(X, y, epochs=1000, verbose=1)

Epoch 1/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step - loss: 4346.0005
Epoch 2/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 43ms/step - loss: 4124.9727
Epoch 3/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step - loss: 3908.4114
Epoch 4/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 40ms/step - loss: 3696.7527
Epoch 5/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step - loss: 3492.2551
Epoch 6/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step - loss: 3294.2471
Epoch 7/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 43ms/step - loss: 3113.2717
Epoch 8/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step - loss: 2941.0144
Epoch 9/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 46ms/step - loss: 2772.7393
Epoch 10/1000
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s

<keras.src.callbacks.history.History at 0x75b897c71430>

In [18]:
# demonstrate prediction
x_input = array([70,80,90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

In [19]:
print(yhat)

[[100.61856]]
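
For a sense of how small this model is, the layer sizes can be worked out by hand. The sketch below assumes the layer defaults used in this section (no padding in the convolution, a pool size of 2); the helper names are hypothetical. The result can be compared against the output of `model.summary()`.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # one weight per (kernel position, input channel, filter) plus one bias per filter
    return kernel_size * in_channels * filters + filters

def dense_params(n_inputs, n_units):
    # one weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

n_steps, n_features = 3, 1
filters, kernel_size, pool_size = 64, 2, 2

conv_steps = n_steps - kernel_size + 1   # 2 positions without padding
pooled_steps = conv_steps // pool_size   # 1 position after max pooling
flat_size = pooled_steps * filters       # 64 values after Flatten

total = (conv1d_params(kernel_size, n_features, filters)
         + dense_params(flat_size, 50)
         + dense_params(50, 1))
print(flat_size, total)  # 64 3493
```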


For an example of a CNN applied to a real-world univariate time series forecasting problem, see Chapter 14. For an example of grid searching CNN hyperparameters on a univariate time series forecasting problem, see Chapter 15.

## 8.3 Multivariate CNN Models

Multivariate time series data is data in which there is more than one observation for each time step. There are two main models that we may require with multivariate time series data; they are:

1. Multiple Input Series
2. Multiple Parallel Series

### 8.3.1 Multiple Input Series

A problem may have two or more parallel input time series and an output time series that is dependent on the input series. The input time series are parallel because each series has observations at the same time steps. We can demonstrate this with a simple example of two parallel input time series where the output series is the simple addition of the input series.

In [32]:
from numpy import array
# define input sequence
in_seq1 = array([i for i in range(10, 100, 10)])
in_seq2 = array([i for i in range(15, 100, 10)])
out_seq = array([in_seq1[i] + in_seq2[i] for i in range(len(in_seq1))])

In [33]:
in_seq1

array([10, 20, 30, 40, 50, 60, 70, 80, 90])

We can reshape these three arrays of data as a single dataset where each row is a time step and each column is a separate time series.

In [34]:
from numpy import hstack
# convert to [row, columns] structure

in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))

#horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

In [35]:
in_seq1

array([[10],
       [20],
       [30],
       [40],
       [50],
       [60],
       [70],
       [80],
       [90]])

In [36]:
dataset

array([[ 10,  15,  25],
       [ 20,  25,  45],
       [ 30,  35,  65],
       [ 40,  45,  85],
       [ 50,  55, 105],
       [ 60,  65, 125],
       [ 70,  75, 145],
       [ 80,  85, 165],
       [ 90,  95, 185]])

As with univariate time series, we must structure these data into samples with input and output components. A 1D CNN model needs sufficient context to learn a mapping from an input sequence to an output value. CNNs can support parallel input time series as separate channels, like the red, green, and blue components of an image. Therefore, we need to split the data into samples while maintaining the order of observations across the two input sequences.

In [39]:
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        end_ix = i + n_steps
        #check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

In [40]:
# choose a number of time steps
n_steps = 3 
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

(7, 3, 2) (7,)
[[10 15]
 [20 25]
 [30 35]] 65
[[20 25]
 [30 35]
 [40 45]] 85
[[30 35]
 [40 45]
 [50 55]] 105
[[40 45]
 [50 55]
 [60 65]] 125
[[50 55]
 [60 65]
 [70 75]] 145
[[60 65]
 [70 75]
 [80 85]] 165
[[70 75]
 [80 85]
 [90 95]] 185


Running the example first prints the shape of the X and y components. We can see that the X component has a three-dimensional structure. The first dimension is the number of samples, in this case 7. The second dimension is the number of time steps per sample, in this case 3, the value specified to the function. Finally, the last dimension specifies the number of parallel time series or the number of variables, in this case 2 for the two parallel series. This is the exact three-dimensional structure expected by a 1D CNN as input.
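
The following sketch rebuilds the same samples in a compact form and checks how inputs and outputs line up. It is an illustration rather than a replacement for `split_sequences()`; `column_stack` is used here only as a shorthand for the reshape-and-`hstack` steps above.

```python
import numpy as np

in_seq1 = np.arange(10, 100, 10)
in_seq2 = np.arange(15, 100, 10)
dataset = np.column_stack((in_seq1, in_seq2, in_seq1 + in_seq2))

n_steps = 3
n_samples = len(dataset) - n_steps + 1
# inputs: every column except the last, over each window of n_steps rows
X = np.array([dataset[i:i + n_steps, :-1] for i in range(n_samples)])
# output: the last column at the final row of each window
y = dataset[n_steps - 1:, -1]

print(X.shape, y.shape)  # (7, 3, 2) (7,)
# each target equals the sum of the two inputs at the window's last time step
print(np.array_equal(y, X[:, -1, :].sum(axis=1)))  # True
```

Because the output is taken from the same row as the last input time step, nine rows give seven samples here, one more than in the univariate case where the target is the following observation.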

**CNN Model**

We are now ready to fit a 1D CNN model on this data, specifying the number of time steps and features to expect for each input sample, in this case three and two respectively.

In [41]:
# the dataset knows the number of features, e.g. 2
n_features= X.shape[2]

In [42]:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten 
from keras.layers import Conv1D
from keras.layers import MaxPooling1D

In [43]:
# define model
model = Sequential()
model.add(Conv1D(64, 2, activation = 'relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')



In [44]:
# fit model
model.fit(X, y, epochs = 1000, verbose = 1)

Epoch 1/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 800ms/step - loss: 21032.2793
Epoch 2/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 43ms/step - loss: 20420.4531
Epoch 3/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 40ms/step - loss: 19825.5938
Epoch 4/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 41ms/step - loss: 19262.5156
Epoch 5/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 40ms/step - loss: 18729.0801
Epoch 6/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 47ms/step - loss: 18227.7305
Epoch 7/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 50ms/step - loss: 17776.9414
Epoch 8/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 39ms/step - loss: 17326.5684
Epoch 9/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 40ms/step - loss: 16874.2598
Epoch 10/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0

<keras.src.callbacks.history.History at 0x75b86065c2c0>

When making a prediction, the model expects three time steps for two input time series.

The shape of the one sample with three time steps and two variables must be [1,3,2].
We would expect the next value in the sequence to be 100 + 105 or 205.

In [45]:
# demonstrate the prediction
x_input = array([[80,85], [90,95], [100,105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

In [46]:
print(yhat)

[[209.71591]]


**Multi-headed CNN Model**

There is another, more elaborate way to model the problem. Each input series can be handled by a separate CNN, and the output of each of these submodels can be combined before a prediction is made for the output sequence. We can refer to this as a multi-headed CNN model. It may allow more flexibility, because it lets you configure each submodel differently for each input series, such as the number of filter maps and the kernel size.

First, we can define the first input model as a 1D CNN with an input layer that expects vectors with ``n_steps`` time steps and 1 feature.

```python
# first input model (n_features is 1 here: one series per head)
visible1 = Input(shape=(n_steps, n_features))
cnn1 = Conv1D(64, 2, activation='relu')(visible1)
cnn1 = MaxPooling1D()(cnn1)
cnn1 = Flatten()(cnn1)
```

We can define the second input submodel in the same way.

```python
# second input model
visible2 = Input(shape=(n_steps, n_features))
cnn2 = Conv1D(64, 2, activation='relu')(visible2)
cnn2 = MaxPooling1D()(cnn2)
cnn2 = Flatten()(cnn2)
```

Now that both input submodels have been defined, we can merge the output from each model into one long vector, which can be interpreted before making a prediction for the output sequence.

```python
# merge input models
merge = concatenate([cnn1, cnn2])
dense = Dense(50, activation='relu')(merge)
output = Dense(1)(dense)
```

We can then tie the inputs and outputs together.

```python
# connect input and output models
model = Model(inputs=[visible1, visible2], outputs=output)
```

This model requires input to be provided as a list of two elements, where each element in the list contains data for one of the submodels. To achieve this, we can split the 3D input data into two separate arrays of input data; that is, from one array with shape ``[7,3,2]`` to two 3D arrays with shape ``[7,3,1]``.

```python
# one time series per head
n_features = 1
# separate input data
X1 = X[:, :, 0].reshape(X.shape[0], X.shape[1], n_features)
X2 = X[:, :, 1].reshape(X.shape[0], X.shape[1], n_features)
```

These data can then be provided in order to fit the model.

```python
# fit model
model.fit([X1, X2], y, epochs=1000, verbose=0)
```

Similarly, we must prepare the data for a single sample as two separate arrays, each with shape ``[1,3,1]``, when making a single one-step prediction.

```python
# reshape one sample for making a prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x1 = x_input[:, 0].reshape((1, n_steps, n_features))
x2 = x_input[:, 1].reshape((1, n_steps, n_features))
```

**Together we get**

In [49]:
from numpy import array
from numpy import hstack
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten 
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
from keras.layers import concatenate

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        end_ix = i + n_steps
        #check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)


# define input sequence
in_seq1 = array([i for i in range(10, 100, 10)])
in_seq2 = array([i for i in range(15, 100, 10)])
out_seq = array([in_seq1[i] + in_seq2[i] for i in range(len(in_seq1))])
# convert to [row, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
#horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

# choose a number of time steps
n_steps = 3 
# convert into input/output
X, y = split_sequences(dataset, n_steps)

# one time series per head
n_features = 1
# separate input data
X1 = X[:, :, 0].reshape(X.shape[0], X.shape[1], n_features)
X2 = X[:, :, 1].reshape(X.shape[0], X.shape[1], n_features)

In [50]:
#first input model
visible1 = Input(shape=(n_steps, n_features))
cnn1 = Conv1D(64, 2, activation='relu')(visible1)
cnn1 = MaxPooling1D()(cnn1)
cnn1 = Flatten()(cnn1)

#second input model
visible2 = Input(shape=(n_steps, n_features))
cnn2 = Conv1D(64, 2, activation='relu')(visible2)
cnn2 = MaxPooling1D()(cnn2)
cnn2 = Flatten()(cnn2)

# merge input models
merge = concatenate([cnn1, cnn2])
dense = Dense(50, activation='relu')(merge)
output = Dense(1)(dense)

#connect input and output models
model = Model(inputs=[visible1, visible2], outputs=output)

model.compile(optimizer='adam', loss='mse')

In [51]:
# fit model
model.fit([X1,X2], y, epochs=1000, verbose=1)

Epoch 1/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 1s/step - loss: 16544.3184
Epoch 2/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 46ms/step - loss: 15804.3994
Epoch 3/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 45ms/step - loss: 15151.2969
Epoch 4/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 46ms/step - loss: 14546.3770
Epoch 5/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 47ms/step - loss: 13944.8262
Epoch 6/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 48ms/step - loss: 13350.3262
Epoch 7/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 47ms/step - loss: 12763.8623
Epoch 8/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 47ms/step - loss: 12187.3115
Epoch 9/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 48ms/step - loss: 11621.7012
Epoch 10/1000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m 

<keras.src.callbacks.history.History at 0x75b8605d03e0>

In [52]:
# demonstrate prediction
x_input = array([[80,85], [90,95], [100,105]])
x1 = x_input[:, 0].reshape((1, n_steps, n_features))
x2 = x_input[:, 1].reshape((1, n_steps, n_features))

In [53]:
yhat = model.predict([x1,x2], verbose=0)
print(yhat)

[[205.8418]]
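
The per-head split used in this example can also be written with slices that keep the feature axis, which avoids the explicit reshape. This is a sketch of an equivalent form; the array below is a stand-in with the same ``[7, 3, 2]`` shape rather than the tutorial's data.

```python
import numpy as np

X = np.arange(7 * 3 * 2).reshape(7, 3, 2)   # stand-in for the [7, 3, 2] input array

# slicing with 0:1 and 1:2 keeps the last axis, so each head gets [7, 3, 1]
X1, X2 = X[:, :, 0:1], X[:, :, 1:2]

# reshape-based version, as used in the tutorial, for comparison
X1_ref = X[:, :, 0].reshape(X.shape[0], X.shape[1], 1)
print(X1.shape, X2.shape, np.array_equal(X1, X1_ref))  # (7, 3, 1) (7, 3, 1) True
```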


### 8.3.2 Multiple Parallel Series

An alternate time series problem is the case where there are multiple parallel time series and a value must be predicted for each.

In [62]:
# define input sequence
in_seq1 = array([i for i in range(10, 100, 10)])
in_seq2 = array([i for i in range(15, 100, 10)])
out_seq = array([in_seq1[i] + in_seq2[i] for i in range(len(in_seq1))])
# convert to [row, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
#horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

In [63]:
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        end_ix = i + n_steps
        #check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

In [64]:
# choose a number of time steps
n_steps = 3 
# convert into input/output
X, y = split_sequences(dataset, n_steps)

In [65]:
print(X.shape, y.shape)

(6, 3, 3) (6, 3)


In [66]:
#summarize the data
for i in range(len(X)):
    print(X[i], y[i])

[[10 15 25]
 [20 25 45]
 [30 35 65]] [40 45 85]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [ 50  55 105]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [ 60  65 125]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [ 70  75 145]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [ 80  85 165]
[[ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]] [ 90  95 185]


**Vector-Output CNN Model**

We are now ready to fit a 1D CNN model on this data. In this model, the number of time steps and parallel series features are specified for the input layer via the ``input_shape`` argument.

The number of parallel series is also used in the specification of the number of values to predict by the model in the output layer; again, this is three.

In [67]:
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten 
from keras.layers import Conv1D
from keras.layers import MaxPooling1D

In [68]:
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]

In [69]:
# define the model
model = Sequential()
model.add(Conv1D(64, 2, activation = 'relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')



In [73]:
# fit model 
model.fit(X, y, epochs=3000, verbose = 1)

Epoch 1/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 818ms/step - loss: 10448.5244
Epoch 2/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 35ms/step - loss: 9837.8643
Epoch 3/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 37ms/step - loss: 9267.9092
Epoch 4/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 41ms/step - loss: 8717.2568
Epoch 5/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 41ms/step - loss: 8184.9517
Epoch 6/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 43ms/step - loss: 7669.6816
Epoch 7/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 48ms/step - loss: 7186.9883
Epoch 8/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 76ms/step - loss: 6734.5156
Epoch 9/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 50ms/step - loss: 6298.4927
Epoch 10/3000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s

<keras.src.callbacks.history.History at 0x75b8605ecbc0>

We can predict the next value in each of the three parallel series by providing an input of three time steps for each series.

The shape of the input for making a single prediction must be 1 sample, 3 time steps, and 3 features, or ``[1, 3, 3]``.

In [74]:
# demonstrate prediction
x_input = array([[70,75, 145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose = 0)

We would expect the vector output to be: ``[100, 105, 205]``.

In [75]:
print(yhat)

[[101.29838  107.088844 208.42198 ]]


**Multi-output CNN Model**

As with multiple input series, there is another, more elaborate way to model the problem.

Each output series can be handled by a separate output CNN Model.

When training the model, it will require three separate output arrays per sample. We can achieve this by converting the output training data that has the shape ``[6,3]`` to three arrays with the shape ``[6,1]``.

In [85]:
X.shape

(6, 3, 3)

In [86]:
X

array([[[ 10,  15,  25],
        [ 20,  25,  45],
        [ 30,  35,  65]],

       [[ 20,  25,  45],
        [ 30,  35,  65],
        [ 40,  45,  85]],

       [[ 30,  35,  65],
        [ 40,  45,  85],
        [ 50,  55, 105]],

       [[ 40,  45,  85],
        [ 50,  55, 105],
        [ 60,  65, 125]],

       [[ 50,  55, 105],
        [ 60,  65, 125],
        [ 70,  75, 145]],

       [[ 60,  65, 125],
        [ 70,  75, 145],
        [ 80,  85, 165]]])

In [76]:
y.shape

(6, 3)

In [87]:
y

array([[ 40,  45,  85],
       [ 50,  55, 105],
       [ 60,  65, 125],
       [ 70,  75, 145],
       [ 80,  85, 165],
       [ 90,  95, 185]])

In [101]:
# separate output
y1 = y[:, 0].reshape((y.shape[0], 1))
y2 = y[:, 1].reshape((y.shape[0], 1))
y3 = y[:, 2].reshape((y.shape[0], 1))

In [102]:
y3

array([[ 85],
       [105],
       [125],
       [145],
       [165],
       [185]])
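
The column-by-column split above can also be done in one call. The sketch below uses NumPy's `hsplit` as an alternative; it is not what the tutorial code uses, and the `y` array is retyped from the output shown earlier.

```python
import numpy as np

y = np.array([[40, 45, 85],
              [50, 55, 105],
              [60, 65, 125],
              [70, 75, 145],
              [80, 85, 165],
              [90, 95, 185]])

# split into one [6, 1] array per output series
y1, y2, y3 = np.hsplit(y, y.shape[1])
print(y1.shape, y2.shape, y3.shape)  # (6, 1) (6, 1) (6, 1)
print(y3.ravel().tolist())           # [85, 105, 125, 145, 165, 185]
```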

In [103]:
#define model
visible = Input(shape=(n_steps, n_features))
cnn = Conv1D(64, 2, activation='relu')(visible)
cnn = MaxPooling1D()(cnn)
cnn = Flatten()(cnn)
cnn = Dense(50, activation='relu')(cnn)
#define output 1 
output1 = Dense(1)(cnn)
#define output 2
output2 = Dense(1)(cnn)
#define output 3 
output3 = Dense(1)(cnn)
# tie together
model = Model(inputs=visible, outputs=[output1, output2, output3])
model.compile(optimizer='adam', loss='mse')

In [104]:
# fit model
model.fit(X, [y1, y2, y3], epochs=2000, verbose=1)

Epoch 1/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 2s/step - dense_26_loss: 6147.8335 - dense_27_loss: 5601.7114 - dense_28_loss: 23981.7676 - loss: 35731.3125
Epoch 2/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 42ms/step - dense_26_loss: 6048.8218 - dense_27_loss: 5491.6992 - dense_28_loss: 23332.5332 - loss: 34873.0547
Epoch 3/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 44ms/step - dense_26_loss: 5951.2754 - dense_27_loss: 5392.2866 - dense_28_loss: 22704.1113 - loss: 34047.6719
Epoch 4/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 45ms/step - dense_26_loss: 5855.0884 - dense_27_loss: 5307.0635 - dense_28_loss: 22088.8770 - loss: 33251.0312
Epoch 5/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 44ms/step - dense_26_loss: 5760.7983 - dense_27_loss: 5231.4258 - dense_28_loss: 21482.3184 - loss: 32474.5410
Epoch 6/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[

<keras.src.callbacks.history.History at 0x75b8402cb680>

In [105]:
# demonstrate prediction
x_input = array([[70,75, 145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose = 0)

In [95]:
x_input

array([[[ 70,  75, 145],
        [ 80,  85, 165],
        [ 90,  95, 185]]])

In [106]:
print(yhat)

[array([[101.59847]], dtype=float32), array([[106.757904]], dtype=float32), array([[208.35326]], dtype=float32)]
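
Because the multi-output model returns a list with one array per output, it can be convenient to stack the list into a single row for comparison with the expected vector ``[100, 105, 205]``. A small sketch, with the prediction values copied from the output above:

```python
import numpy as np

# values copied from the printed prediction above
yhat = [np.array([[101.59847]], dtype=np.float32),
        np.array([[106.757904]], dtype=np.float32),
        np.array([[208.35326]], dtype=np.float32)]

# stack the three [1, 1] arrays side by side into one [1, 3] row
row = np.hstack(yhat)
expected = np.array([100, 105, 205])
abs_error = np.abs(row[0] - expected)

print(row.shape)  # (1, 3)
print(abs_error)
```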


## 8.4 Multi-step CNN Models

In practice, there is little difference to the 1D CNN model between predicting a vector output that represents different output variables (as in the previous example) and a vector output that represents multiple time steps of one variable. Nevertheless, there are subtle and important differences in the way the training data is prepared.

### 8.4.1 Data Preparation

As with one-step forecasting, a time series used for multi-step time series forecasting must be split into samples with input and output components. Both the input and output components will comprise multiple time steps and may or may not have the same number of steps.
For example, given the univariate time series:

``[10 20 30 40 50 60 70 80 90]``

We could use the last three time steps as input and forecast the next two time steps.

In [108]:
ejempli = array([i for i in range(10, 100, 10)])
print(ejempli)

[10 20 30 40 50 60 70 80 90]


In [109]:
# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

In [110]:
raw_seq = array([i for i in range(10, 100, 10)])
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples 
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

[10 20 30] [40 50]
[20 30 40] [50 60]
[30 40 50] [60 70]
[40 50 60] [70 80]
[50 60 70] [80 90]


### 8.4.2 Vector Output Model

The 1D CNN can output a vector directly that can be interpreted as a multi-step forecast. This approach was seen in the previous section, where one time step of each output time series was forecast as a vector. As with the 1D CNN models for univariate data in a prior section, the prepared samples must first be reshaped. The CNN expects data to have a three-dimensional structure of ``[samples, timesteps, features]``, and in this case, we only have one feature so the reshape is straightforward.

In [111]:
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

With the number of input and output steps specified in the ``n_steps_in`` and ``n_steps_out`` variables, we can define a multi-step time series forecasting model.

In [112]:
# define the model
model = Sequential()
model.add(Conv1D(64, 2, activation = 'relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')



In [113]:
#fit model
model.fit(X, y, epochs=2000, verbose=1)

Epoch 1/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 837ms/step - loss: 4288.6167
Epoch 2/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 40ms/step - loss: 4156.1895
Epoch 3/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 45ms/step - loss: 4029.2937
Epoch 4/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 48ms/step - loss: 3909.3125
Epoch 5/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 44ms/step - loss: 3790.9575
Epoch 6/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 46ms/step - loss: 3674.0103
Epoch 7/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 49ms/step - loss: 3558.3921
Epoch 8/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 44ms/step - loss: 3443.8179
Epoch 9/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 42ms/step - loss: 3330.4058
Epoch 10/2000
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s

<keras.src.callbacks.history.History at 0x75b862276330>

The model can make a prediction for a single sample. As expected by the model, the shape of the single sample of input data when making the prediction must be ``[1, 3, 1]`` for the 1 sample, 3 time steps of the input, and the single feature.

In [116]:
# demonstrate prediction
x_input = array([70,80,90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose = 0)

In [117]:
print(yhat)

[[100.32348 110.55093]]


## 8.5 Multivariate Multi-Step CNN Models