<a href="https://colab.research.google.com/github/cagBRT/timeSeries/blob/main/9b_Multivariate_MLPs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The `split_sequences` function from the previous notebooks, reused here to turn a multivariate dataset into input/output samples.

In [None]:
from numpy import array

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
  X, y = list(), list()
  for i in range(len(sequences)):
    # find the end of this pattern
    end_ix = i + n_steps
    # stop once the window would run past the end of the dataset
    if end_ix > len(sequences):
      break
    # inputs: every column except the last, over the whole window;
    # output: the last column at the window's final time step
    seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
    X.append(seq_x)
    y.append(seq_y)
  return array(X), array(y)

# **Multivariate MLP Models**

Multivariate time series data is data with more than one observation at each time step.

There are two main ways to frame a multivariate time series problem:<br>
- Multiple Input Series: two or more input series and one output series that depends on them<br>
- Multiple Parallel Series: several series observed together, with a value to predict for each one<br>
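
The difference between the two framings shows up in what is used as the input window and what is used as the target. The following is a minimal sketch on a toy array, assuming a window of 3 time steps; the variable names are illustrative and are not used elsewhere in this notebook.

```python
import numpy as np

# Toy dataset: two input series and their sum, one row per time step.
data = np.array([[10, 15, 25],
                 [20, 25, 45],
                 [30, 35, 65],
                 [40, 45, 85]])
n_steps_demo = 3  # assumed window length

# Multiple input series: the window covers the input columns only,
# and the target is the output column at the window's last time step.
x_input_series = data[0:n_steps_demo, :-1]
y_input_series = data[n_steps_demo - 1, -1]

# Multiple parallel series: the window covers every column,
# and the target is the full row at the next time step.
x_parallel = data[0:n_steps_demo, :]
y_parallel = data[n_steps_demo, :]

print(x_input_series.shape, y_input_series)  # (3, 2) 65
print(x_parallel.shape, y_parallel)          # (3, 3) [40 45 85]
```

The rest of this notebook works through the multiple input series case.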


In [None]:
from numpy import array
from numpy import hstack
from keras.models import Sequential 
from keras.layers import Dense

An example of a multiple input series: two input series and an output series that is their element-wise sum.

In [None]:
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])

In [None]:
out_seq

We can reshape these three arrays into a single dataset where each row is a time step and each column is a separate time series. <br>

This is a standard way of storing parallel time series in a CSV file.

In [None]:
in_seq1 = in_seq1.reshape((len(in_seq1), 1)) 
in_seq2 = in_seq2.reshape((len(in_seq2), 1)) 
out_seq = out_seq.reshape((len(out_seq), 1))

`hstack` stacks the arrays horizontally (column-wise), so each series becomes one column of the dataset.

In [None]:
dataset = hstack((in_seq1, in_seq2, out_seq))
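
The reshape into column vectors matters here. As a small sketch with two made-up arrays (`a` and `b` are illustrative names), `hstack` on 1-D arrays simply concatenates them end to end, while `hstack` on column vectors places them side by side:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1-D arrays are joined end to end into one longer 1-D array.
flat_stack = np.hstack((a, b))
print(flat_stack.shape)  # (6,)

# Column vectors of shape (3, 1) are joined side by side,
# giving one row per time step and one column per series.
col_stack = np.hstack((a.reshape((3, 1)), b.reshape((3, 1))))
print(col_stack.shape)  # (3, 2)
print(col_stack)
```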

We get one row per time step, one column for each input series, and a final column for the output series.

In [None]:
dataset

Select the number of time steps in each input sample, then split the dataset into samples.

In [None]:
num_steps = 3
X,y = split_sequences(dataset,num_steps)
print(X.shape, y.shape)

In [None]:
for i in range(len(X)):
  print(X[i], y[i])
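
As a rough check on the shapes, a dataset with `n` rows and a window of `n_steps` yields `n - n_steps + 1` samples. The sketch below rebuilds the same data and restates the splitting logic in a helper named `split_windows` (a name chosen here for illustration) so that it runs on its own without touching `X` and `y`.

```python
import numpy as np

def split_windows(sequences, n_steps):
    # Same logic as split_sequences above, restated so this cell is self-contained.
    X_out, y_out = [], []
    for i in range(len(sequences) - n_steps + 1):
        end_ix = i + n_steps
        X_out.append(sequences[i:end_ix, :-1])   # input columns over the window
        y_out.append(sequences[end_ix - 1, -1])  # output at the last step
    return np.array(X_out), np.array(y_out)

in1 = np.arange(10, 100, 10)  # 10, 20, ..., 90
in2 = in1 + 5                 # 15, 25, ..., 95
data_demo = np.column_stack((in1, in2, in1 + in2))

X_demo, y_demo = split_windows(data_demo, 3)
print(X_demo.shape, y_demo.shape)  # (7, 3, 2) (7,)
print(X_demo[0], y_demo[0])        # first window, target 65
```

With 9 rows and 3 time steps there are 9 - 3 + 1 = 7 samples, each holding 3 time steps of 2 features.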

**Flatten the input for the MLP**

An MLP takes each sample as a single one-dimensional vector, so the window of time steps and features that belongs to each label must be flattened.

For example:<br>
[70,75]<br>
[80,85]<br>
[90,95]<br>
<br>
Needs to be flattened to:<br>
[70,75,80,85,90,95]

Calculate the length of the flattened vector (time steps times features), <br>
then reshape each sample to that length.

In [None]:
n_input = X.shape[1] * X.shape[2]
X = X.reshape((X.shape[0], n_input))
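
The worked example above can be reproduced directly. This is a small sketch using the single window shown in the text; `window` and `flat` are illustrative names. NumPy reshapes in row-major order, so the time steps stay in sequence.

```python
import numpy as np

window = np.array([[70, 75],
                   [80, 85],
                   [90, 95]])

# time steps * features = 3 * 2 = 6 values per sample
n_flat = window.shape[0] * window.shape[1]
flat = window.reshape(n_flat)
print(flat)  # [70 75 80 85 90 95]
```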

**Define and train the model**

This model is an MLP that accepts flattened time series data.

In [None]:
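# one hidden layer of 100 ReLU units, then a single linear output
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=n_input))
model.add(Dense(1))
# Adam optimizer minimizing mean squared error
model.compile(optimizer='adam', loss='mse')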
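
To get a feel for the size of this network, the parameter count can be worked out by hand. The sketch below assumes the input length of 6 used in this notebook (3 time steps times 2 features); calling `model.summary()` should report the same total.

```python
# Assumed sizes, taken from the cells above: 3 time steps x 2 features.
n_input_demo = 3 * 2
hidden_units = 100

# Each Dense layer has (inputs * units) weights plus one bias per unit.
hidden_params = n_input_demo * hidden_units + hidden_units
output_params = hidden_units * 1 + 1
total_params = hidden_params + output_params
print(hidden_params, output_params, total_params)  # 700 101 801
```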

In [None]:
model.fit(X, y, epochs=2000, verbose=0)

**The prediction**

In [None]:
x_input = array([[80, 85], [90, 95], [100, 105]]) 
x_input = x_input.reshape((1, n_input))

In [None]:
yhat = model.predict(x_input, verbose=0)
print(yhat)
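
Because the output series is the sum of the two inputs at the last time step of each window, the value to compare `yhat` against can be computed directly. A minimal sketch, with `x_new` and `expected` as illustrative names:

```python
import numpy as np

x_new = np.array([[80, 85], [90, 95], [100, 105]])

# The target pairs with the last row of the window: 100 + 105.
expected = x_new[-1].sum()
print(expected)  # 205

# Shape the model expects: one sample of 6 flattened values.
x_new_flat = x_new.reshape((1, x_new.size))
print(x_new_flat.shape)  # (1, 6)
```

A prediction close to 205 suggests the model has picked up the pattern; the exact value varies between training runs.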

**Assignment 1**<br>
Add more time steps to the dataset, retrain the model, and check the prediction.

**Assignment 2**<br>
Change the number of time steps. <br>
How does the model perform?