**Univariate Multi-Step LSTM Models**

A time series forecasting problem that requires a prediction of multiple time steps into the future can be referred to as multi-step time series forecasting.

Specifically, these are problems where the forecast horizon or interval is more than one time step.

There are two main types of LSTM models that can be used for multi-step forecasting; they are:

1- Vector Output Model

2- Encoder-Decoder Model


In [1]:
conda install numpy

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Retrieving notices: ...working... done

Note: you may need to restart the kernel to use updated packages.


In [2]:
# conda install -c intel mkl

In [2]:
pip install keras

Collecting keras
  Using cached keras-2.10.0-py2.py3-none-any.whl (1.7 MB)
Installing collected packages: keras
Successfully installed keras-2.10.0
Note: you may need to restart the kernel to use updated packages.


In [4]:
# conda install tensorflow-gpu=2.3 tensorflow=2.3=mkl_py37h936c3e2_0

In [3]:
pip install tensorflow

Collecting tensorflow
  Using cached tensorflow-2.10.0-cp39-cp39-macosx_10_14_x86_64.whl (241.2 MB)
Collecting opt-einsum>=2.3.2
  Using cached opt_einsum-3.3.0-py3-none-any.whl (65 kB)
Collecting absl-py>=1.0.0
  Using cached absl_py-1.3.0-py3-none-any.whl (124 kB)
Collecting tensorflow-io-gcs-filesystem>=0.23.1
  Using cached tensorflow_io_gcs_filesystem-0.27.0-cp39-cp39-macosx_10_14_x86_64.whl (1.6 MB)
Collecting tensorboard<2.11,>=2.10
  Using cached tensorboard-2.10.1-py3-none-any.whl (5.9 MB)
Collecting grpcio<2.0,>=1.24.3
  Using cached grpcio-1.50.0-cp39-cp39-macosx_10_10_x86_64.whl (4.5 MB)
Collecting astunparse>=1.6.0
  Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Collecting gast<=0.4.0,>=0.2.1
  Using cached gast-0.4.0-py3-none-any.whl (9.8 kB)
Collecting flatbuffers>=2.0
  Using cached flatbuffers-22.10.26-py2.py3-none-any.whl (26 kB)
Collecting termcolor>=1.1.0
  Using cached termcolor-2.1.0-py3-none-any.whl (5.8 kB)
Collecting libclang>=13.0.0
  Using cached

In [None]:
# univariate multi-step lstm example
import pandas as pd
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Bidirectional
from keras.layers import RepeatVector

#for CNN and Conv LSTM
from keras.layers import Flatten
from keras.layers import TimeDistributed
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
from keras.layers import ConvLSTM2D

import warnings
warnings.filterwarnings('ignore')

In [None]:
# squeeze("columns") replaces the deprecated squeeze=True argument of read_csv
series = pd.read_csv('dma.csv', header=0, parse_dates=[0], index_col=0).squeeze('columns')
dma_df = pd.DataFrame(series)

In [None]:
# reading CSV file
data = pd.read_csv('dma.csv')

 
# converting column data to list
date_week = data['date_week'].tolist()
dma = data['DMA_CODE'].tolist()
TOTAL_NURTEC_NBRX = data['TOTAL_NURTEC_NBRX'].tolist()
df= list(zip(dma, date_week, TOTAL_NURTEC_NBRX))
dma_df = pd.DataFrame(df, columns=['dma', 'date_week','TOTAL_NURTEC_NBRX'])

In [None]:
dma_df.shape

In [None]:
dma_df.dtypes

In [None]:
dma_df.head(3)


Before we look at these models, let’s first look at the preparation of data for multi-step forecasting.

In [None]:
# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
	X, y = list(), list()
	for i in range(len(sequence)):
		# find the end of this pattern
		end_ix = i + n_steps_in
		out_end_ix = end_ix + n_steps_out
		# check if we are beyond the sequence
		if out_end_ix > len(sequence):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)
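
As a quick check of what `split_sequence` produces, the sketch below runs the same windowing logic on a toy sequence. The sequence values and the 3-in/2-out setting are illustrative assumptions, not the DMA data; the function is repeated so that the snippet runs on its own.

```python
from numpy import array

def split_sequence(sequence, n_steps_in, n_steps_out):
    """Slide a window over the sequence and return (inputs, targets)."""
    X, y = [], []
    for i in range(len(sequence)):
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # stop once the output window would run past the end
        if out_end_ix > len(sequence):
            break
        X.append(sequence[i:end_ix])
        y.append(sequence[end_ix:out_end_ix])
    return array(X), array(y)

# toy sequence (illustrative values only)
toy_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
X_toy, y_toy = split_sequence(toy_seq, n_steps_in=3, n_steps_out=2)

print(X_toy.shape, y_toy.shape)  # (5, 3) (5, 2)
print(X_toy[0], y_toy[0])        # [10 20 30] [40 50]
```

Each row of `X_toy` holds three consecutive observations, and the matching row of `y_toy` holds the two values that follow them.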


**Vector Output Model**

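The LSTM expects input data to have a three-dimensional structure of [samples, timesteps, features]. In the vector output model, the final Dense layer emits all n_steps_out forecast values at once as a single vector.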
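
A windowed array from `split_sequence` is two-dimensional, so it needs one extra axis before it is passed to the LSTM. A minimal sketch with NumPy, using a small hand-written array as a stand-in for real windows:

```python
import numpy as np

# stand-in for the 2-D output of a windowing step: 4 samples, 3 time steps each
X_2d = np.array([[10, 20, 30],
                 [20, 30, 40],
                 [30, 40, 50],
                 [40, 50, 60]])

n_features = 1  # univariate series: one value per time step
X_3d = X_2d.reshape((X_2d.shape[0], X_2d.shape[1], n_features))

print(X_2d.shape)  # (4, 3)
print(X_3d.shape)  # (4, 3, 1)
```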

In [None]:
# define model
def vector_model(n_steps_in, n_steps_out, n_features, ep):
  # NOTE: reads X, y and raw_seq from the enclosing notebook scope
  # stacked LSTM whose Dense layer emits all n_steps_out values at once
  model = Sequential()
  model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
  model.add(LSTM(100, activation='relu'))
  model.add(Dense(n_steps_out))
  model.compile(optimizer='adam', loss='mse')

  # fit model; the Dense output is 2-D, so drop any trailing feature axis from y
  model.fit(X, y.reshape((y.shape[0], n_steps_out)), epochs=ep, verbose=0)

  # demonstrate prediction on the last n_steps_in observations of the series
  x_input = array(raw_seq)[-n_steps_in:]
  x_input = x_input.reshape((1, n_steps_in, n_features))
  yhat = model.predict(x_input, verbose=0)
  return yhat

**Encoder-Decoder Model**

A model specifically developed for forecasting variable length output sequences is called the Encoder-Decoder LSTM.

The model was designed for prediction problems where there are both input and output sequences, so-called sequence-to-sequence, or seq2seq problems, such as translating text from one language to another.

This model can be used for multi-step time series forecasting.
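
One practical consequence for data preparation: a decoder that emits one value per output time step (as with a `TimeDistributed(Dense(1))` head) produces a three-dimensional output, so the targets need the shape [samples, n_steps_out, features], whereas a vector-output model takes two-dimensional targets. A small NumPy sketch with placeholder values:

```python
import numpy as np

# placeholder targets: 4 samples, 2 future steps each (illustrative values)
y_2d = np.array([[40, 50],
                 [50, 60],
                 [60, 70],
                 [70, 80]])

n_features = 1

# vector-output model: targets stay as [samples, n_steps_out]
y_vector = y_2d

# encoder-decoder model: targets become [samples, n_steps_out, features]
y_seq = y_2d.reshape((y_2d.shape[0], y_2d.shape[1], n_features))

print(y_vector.shape)  # (4, 2)
print(y_seq.shape)     # (4, 2, 1)
```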

In [None]:
# define model
def encoder_decoder_model(n_steps_in, n_steps_out, n_features, ep):
  # NOTE: reads X, y and raw_seq from the enclosing notebook scope;
  # y must be shaped [samples, n_steps_out, features]
  model = Sequential()
  # encoder: compress the input window into a fixed-length vector
  model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
  # repeat that vector once per output time step
  model.add(RepeatVector(n_steps_out))
  # decoder: produce one hidden state per output time step
  model.add(LSTM(100, activation='relu', return_sequences=True))
  model.add(TimeDistributed(Dense(1)))
  model.compile(optimizer='adam', loss='mse')

  # fit model
  model.fit(X, y, epochs=ep, verbose=0)

  # demonstrate prediction on the last n_steps_in observations of the series
  x_input = array(raw_seq)[-n_steps_in:]
  x_input = x_input.reshape((1, n_steps_in, n_features))
  yhat = model.predict(x_input, verbose=0)

  return yhat


**Running models on DMA data:**

In [None]:
dma = []
v_model = []
e_model = []

n_steps_in, n_steps_out = 3, 2
n_features = 1
ep = 200

for item in dma_df['dma'].unique():
  # use only the rows of the current DMA (assumes they are in chronological order)
  raw_seq = dma_df.loc[dma_df['dma'] == item, 'TOTAL_NURTEC_NBRX'].to_numpy()
  print(item)
  # skip DMAs with too few observations to form a single sample
  if len(raw_seq) < n_steps_in + n_steps_out:
    continue
  X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
  X = X.reshape((X.shape[0], X.shape[1], n_features))
  y = y.reshape((y.shape[0], y.shape[1], n_features))
  yhat_vector = vector_model(n_steps_in, n_steps_out, n_features, ep)
  yhat_encoder = encoder_decoder_model(n_steps_in, n_steps_out, n_features, ep)

  dma.append(item)
  v_model.append(yhat_vector)
  e_model.append(yhat_encoder)

In [None]:
dma_prediction = pd.DataFrame(
    {'DMA': dma,
     'vector_model': v_model,
     'encoder_model': e_model
    })

In [None]:
dma_prediction
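
Because each prediction is stored as an array (shape `(1, n_steps_out)` for the vector-output model and `(1, n_steps_out, 1)` for the encoder-decoder), the table is easier to read if every forecast step gets its own column. A minimal sketch, assuming two-step forecasts; the DMA codes and numbers below are placeholders, not model output, and `flatten_predictions` is a hypothetical helper that is not part of the notebook:

```python
import numpy as np
import pandas as pd

def flatten_predictions(labels, preds, prefix):
    """Build a table with one row per label and one column per forecast step."""
    rows = [np.asarray(p).reshape(-1) for p in preds]
    cols = [f"{prefix}_t+{k + 1}" for k in range(len(rows[0]))]
    return pd.DataFrame(rows, index=pd.Index(labels, name="DMA"), columns=cols)

# placeholder inputs (not real forecasts)
labels = [500, 501]
vector_preds = [np.array([[1.0, 2.0]]), np.array([[3.0, 4.0]])]           # each (1, 2)
encoder_preds = [np.array([[[1.5], [2.5]]]), np.array([[[3.5], [4.5]]])]  # each (1, 2, 1)

table = flatten_predictions(labels, vector_preds, "vector").join(
    flatten_predictions(labels, encoder_preds, "encoder"))
print(table)
```

In the notebook, the same helper could be called with the `dma`, `v_model`, and `e_model` lists built in the loop above.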