## LSTM model

We will be working with **Multivariate time series data** which means that the data has more than one feature (input) for each time step

In [1]:
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

## Dataset

Simple example of two input time series (**X1** and **X2**) where the output series (**y**) is the simple addition of the input series (**y=X1+X2**)

In [2]:
N = 50 # Number of samples
in_1 = [i*10 for i in range(1,N)]
in_2 = [i*10+5 for i in range(1,N)]
out = [in_1[i]+in_2[i] for i in range(len(in_1))]

In [3]:
df = pd.DataFrame()
df["X1"] = in_1
df["X2"] = in_2
df["y"] = out
df.head()

Unnamed: 0,X1,X2,y
0,10,15,25
1,20,25,45
2,30,35,65
3,40,45,85
4,50,55,105


## Made data as 3D

This is the main step where we make our dataset as 3D shape (because LSTM requires it in this way) 

*(number_samples, number_timesteps, number_features)*

Obviously `number_features = 2` here (because **X1, X2** are the 2 input features)

Suppose if `number_timesteps = 3` then, our first sample input will be:

**Input**:

10, 15

20, 25

30, 35

and our output will be (corresponding to last sample):

**Output**:

65

We can define a function named **split_sequences()** that will make the dataset into the form as we have discussed above (through some very simple manipulations)

In this function, the arguement `sequences` denote the entire dataset and `n_steps` denote the **number_timesteps**

In [4]:
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        end_ix = i + n_steps
        if end_ix > len(sequences):
          break
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)

In [5]:
X, y = split_sequences(df.values, 3)

In [6]:
X.shape, y.shape

((47, 3, 2), (47,))

### First sample

In [7]:
X[0], y[0]

(array([[10, 15],
        [20, 25],
        [30, 35]], dtype=int64),
 65)

## Split to train and test

Here number of samples is $N=50$

So, we will split train and test as 40 and 10 samples respectively

In [8]:
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

In [9]:
class Ridge_R():
      
    def __init__(self, λ=1e-1):
        self.λ = λ # Ridge regression parameter
          
    def fit(self, X, y):
        self.m, self.n = X.shape
        z1 = (X.T @ X + self.λ * np.eye(self.n))
        z2 = X.T @ y
        self.W = np.linalg.inv(z1) @ z2

    def predict(self, X):    
        return X.dot(self.W)

In [10]:
row_sums = X.sum(axis=1)
X = X / row_sums[:, np.newaxis]

In [11]:
from sklearn.model_selection import train_test_split
X_train1, X_test, y_train1, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train1, y_train1, test_size = 0.2, random_state=42)

In [12]:
X_train = np.c_[np.ones(X_train.shape), X_train]
X_train1 = np.c_[np.ones(X_train1.shape), X_train1]
X_val = np.c_[np.ones(X_val.shape), X_val]
X_test = np.c_[np.ones(X_test.shape), X_test]

## Training the model

We are now ready to fit an LSTM model on this data

In [13]:
# define model
model = Sequential()

n_steps, n_features = 3, 2

# Each of our sample has input of shape (number_timesteps, number_features) 
# This is provided to the LSTM model via input_shape argument
# Activation function is set as 'relu' and there are 50 nodes in LSTM model

model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))

# Since each sample has an output of shape=1 only (remember first sample output was 65)
# Thus we add a Dense layer with only 1 node 

model.add(Dense(1))

# Our optimizer is "adam" and loss that we use for regression is "mse" (mean square)

model.compile(optimizer='adam', loss='mape')

## Summary of the model parameters

In [14]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm (LSTM)                  (None, 50)                10600     
_________________________________________________________________
dense (Dense)                (None, 1)                 51        
Total params: 10,651
Trainable params: 10,651
Non-trainable params: 0
_________________________________________________________________


## Fit the model

Finally we fit the model on train dataset and predict it on test dataset

In [15]:
model.fit(X_train, y_train, batch_size=16,epochs=200, verbose=1)

Epoch 1/200


ValueError: in user code:

    C:\Users\porus\anaconda3\lib\site-packages\keras\engine\training.py:830 train_function  *
        return step_function(self, iterator)
    C:\Users\porus\anaconda3\lib\site-packages\keras\engine\training.py:813 run_step  *
        outputs = model.train_step(data)
    C:\Users\porus\anaconda3\lib\site-packages\keras\engine\training.py:770 train_step  *
        y_pred = self(x, training=True)
    C:\Users\porus\anaconda3\lib\site-packages\keras\engine\base_layer.py:989 __call__  *
        input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
    C:\Users\porus\anaconda3\lib\site-packages\keras\engine\input_spec.py:264 assert_input_compatibility  *
        raise ValueError('Input ' + str(input_index) +

    ValueError: Input 0 is incompatible with layer sequential: expected shape=(None, None, 2), found shape=(None, 3, 4)


# Make prediction
We predict the values on test data

In [None]:
yhat = model.predict(X_test, verbose=0)
yhat = yhat.flatten()

## First test sample

In [None]:
X_test[0]

In [None]:
y_test[0], yhat[0]

### So we observe such close prediction!



## Second Test sample

In [None]:
X_test[1]

In [None]:
y_test[1], yhat[1]

### Again the prediction is very close!