# AIDI 1002 Final Term Project Report

#### Hrim Mehta (200567452)

#### Email: 200567452@student.georgianc.on.ca

#### Introduction:

The paper I am replicating, "Block Hankel Tensor ARIMA for Multiple Short Time Series Forecasting" [1], proposes an approach to multivariate time series forecasting that first transforms the time series into block Hankel tensors (BHT) to capture mutual correlations between the series and then applies an ARIMA model to predict future samples. The authors evaluated their approach on three public datasets (traffic, electricity, and smoke video) and compared its accuracy, measured by NRMSE, with competing methods such as ARIMA, VAR, and XGBoost. In this project, I test the BHT-ARIMA model on a currency exchange rate dataset and compare its performance with that of VARMAX and LSTM.
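To make the Hankelization idea more concrete, the sketch below delay-embeds the time axis of a small matrix of series so that each series becomes a Hankel matrix. It is only an illustration under my own assumptions: the helper name `hankelize`, the toy data, and the window length `tau` are hypothetical, and this is not the MDT implementation used by the paper's code.

```python
import numpy as np

def hankelize(series, tau):
    """Delay-embed the time axis: (n_series, T) -> (n_series, tau, T - tau + 1).

    Hypothetical helper for illustration only.
    """
    # sliding_window_view returns shape (n_series, T - tau + 1, tau)
    windows = np.lib.stride_tricks.sliding_window_view(series, tau, axis=1)
    # Move the window axis forward so each slice is a tau x (T - tau + 1) Hankel matrix
    return np.transpose(windows, (0, 2, 1))

toy = np.arange(12, dtype=float).reshape(2, 6)  # 2 toy series, 6 time steps each
hankel = hankelize(toy, tau=3)
print(hankel.shape)
print(hankel[0])
```

Each slice `hankel[i]` has constant anti-diagonals, that is, `hankel[i][r][c] == toy[i][r + c]`, which is what lets the embedded tensor expose repeated local structure in short series.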

#### Background

| Reference      | Explanation                                                                                                                         | Dataset/Input                         | Weakness                                                                                                                                                      |
|----------------|-------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Shi et al. [1] | Proposes applying the multi-way delay embedding transform (MDT) Hankelization before ARIMA to forecast multivariate time series | Electricity, Traffic, and Smoke video | Evaluated on only three public datasets, where it outperformed the competing models; replication on other datasets is needed to check whether this result holds. |

In [1]:
import numpy as np
import pandas as pd
import pickle as pkl

In [2]:
# https://github.com/laiguokun/multivariate-time-series-data - source for new dataset to test models on

exchange_rate = pd.read_csv("datasets/exchange_rate.txt", sep = ',', header = None)
exchange_rate.shape

(7588, 8)

In [3]:
df = exchange_rate
df_diff = df.diff()

In [4]:
test_size = df_diff.shape[0] - round(df_diff.shape[0] * 90 / 100)

In [40]:
train = df[0:-test_size]
test = df[-test_size:]
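As a quick check of the split arithmetic above, the sketch below recomputes the sizes for the 7588 rows reported by `exchange_rate.shape`. The names `n_rows` and `train_size` are mine and are used only for illustration.

```python
n_rows = 7588  # number of rows reported by exchange_rate.shape

train_size = round(n_rows * 90 / 100)  # first 90% of the rows, in chronological order
test_size = n_rows - train_size        # the remaining rows form the test set

print(train_size, test_size)
```

This gives 6829 training rows and 759 test rows. Because the split is chronological rather than shuffled, the test set is always the most recent part of the series.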

##### Model 1: VARMAX

In [5]:
# Load the previously fitted model; the context manager closes the file afterwards
with open('var_model.pickle', 'rb') as f:
    var_pickle = pkl.load(f)

In [10]:
test_predicted = var_pickle.forecast(steps=1)

In [7]:
# https://analyticsindiamag.com/complete-guide-to-dickey-fuller-test-in-time-series-analysis/

def inverse_diff(actual_df, pred_df):
    df_res = pred_df.copy()
    columns = actual_df.columns
    for col in columns:
        df_res[str(col)+'_1st_inv_diff'] = actual_df[col].iloc[-1] + df_res[col].cumsum()
    return df_res
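The idea behind `inverse_diff` is that a first-order difference can be undone by adding the cumulative sum of the differences to the last observed level. The toy round trip below is a minimal sketch of that idea with made-up numbers (not exchange rates); it assumes the anchor is the last value observed before the forecast horizon, here the last training value.

```python
import numpy as np

series = np.array([1.0, 1.5, 1.2, 1.8, 2.0])  # toy data, not exchange rates
train, test = series[:3], series[3:]

# First-order differences that cover the held-out part of the series
test_diff = np.diff(series)[len(train) - 1:]

# Inverse differencing: start from the last training value and accumulate the differences
recovered = train[-1] + np.cumsum(test_diff)

print(recovered)
```

If the anchor is taken from a different row (for example the last row of the full dataset), the recovered levels are shifted by the gap between the two rows, so the choice of anchor matters for the reported error.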

In [14]:
inverted_diff_predicted = inverse_diff(df, test_predicted)

In [35]:
from sklearn.metrics import mean_squared_error

test_true = df.iloc[[round(df_diff.shape[0] * 90 / 100) - 1]] # df[-test_size:]

column_to_predict = 2
rmse = np.sqrt(mean_squared_error(test_true[column_to_predict], inverted_diff_predicted[str(column_to_predict) + '_1st_inv_diff']))
nrmse_mean = rmse / (test_true[column_to_predict].mean())
print('Model: VARMAX')
print('NRMSE: {:.3f}'.format(nrmse_mean))

Model: VARMAX
NRMSE: 2391.615
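The NRMSE reported above is the RMSE divided by the mean of the true values. The helper below restates that calculation in plain NumPy as a sketch; the name `nrmse_by_mean` and the toy inputs are hypothetical, and other NRMSE definitions normalise by the range or standard deviation instead, so values are only comparable when the same normalisation is used.

```python
import numpy as np

def nrmse_by_mean(y_true, y_pred):
    """RMSE normalised by the mean of the true values (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / np.mean(y_true)

# Toy check: both errors are 1, so RMSE is 1, and the mean of the true values is 3
score = nrmse_by_mean([2.0, 4.0], [1.0, 5.0])
print(round(score, 3))
```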


##### Model 2: LSTM

In [38]:
from tensorflow import keras

model = keras.models.load_model('./lstm_model')

In [42]:
# https://www.analyticsvidhya.com/blog/2020/10/multivariate-multi-step-time-series-forecasting-using-stacked-lstm-sequence-to-sequence-autoencoder-in-tensorflow-2-0-keras/
# https://analyticsindiamag.com/how-to-do-multivariate-time-series-forecasting-using-lstm/

def split_series(series, n_past, n_future):
    X, y = list(), list()
    for window_start in range(len(series)):
        past_end = window_start + n_past
        future_end = past_end + n_future
        if future_end > len(series):
            break
        past, future = series[window_start:past_end, :], series[past_end:future_end, :]
        X.append(past)
        y.append(future)
    return np.array(X), np.array(y)

n_past = 10
n_future = 1
n_features = df.shape[1]

X_test, y_test = split_series(test.values, n_past, n_future)
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], n_features))
y_test = y_test.reshape((y_test.shape[0], y_test.shape[1], n_features))
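To show what the windowing produces, the sketch below applies a compact restatement of `split_series` to a small toy array. The toy data and the `X_toy`/`y_toy` names are assumptions for illustration; only the windowing logic mirrors the function above.

```python
import numpy as np

def split_series(series, n_past, n_future):
    """Slide a window over the rows: n_past inputs followed by n_future targets."""
    X, y = [], []
    for start in range(len(series) - n_past - n_future + 1):
        past_end = start + n_past
        X.append(series[start:past_end, :])
        y.append(series[past_end:past_end + n_future, :])
    return np.array(X), np.array(y)

toy = np.arange(24, dtype=float).reshape(12, 2)  # 12 time steps, 2 features
X_toy, y_toy = split_series(toy, n_past=10, n_future=1)

print(X_toy.shape, y_toy.shape)
```

With 12 time steps, 10 past steps, and 1 future step, only two windows fit, so `X_toy` has shape `(2, 10, 2)` and `y_toy` has shape `(2, 1, 2)`. Each target row is the time step immediately after its input window.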

In [43]:
predicted = model.predict(X_test)



In [44]:
column_to_predict = 2
rmse = np.sqrt(mean_squared_error(y_test[:,0,column_to_predict], predicted[:,0,column_to_predict]))
nrmse_mean = rmse / (y_test[:,0,column_to_predict].mean())
print('Model: LSTM')
print('NRMSE: {:.3f}'.format(nrmse_mean))

Model: LSTM
NRMSE: 0.006


##### Model 3: BHT-ARIMA

In [45]:
ori_ts = np.load('exchange_rate.npy').T

In [46]:
# based on original main.py of BHT-ARIMA - modified for my dataset

from BHT_ARIMA import BHTARIMA
from BHT_ARIMA.util.utility import get_index

ts = ori_ts[..., :-1]  # training data: every time step except the last
label = ori_ts[..., -1]  # label: the last time step
p = 2  # p-order
d = 1  # d-order
q = 2  # q-order
taus = [8, 5]  # MDT-rank
Rs = [5, 5]  # Tucker decomposition ranks
k = 10  # iterations
tol = 0.001  # stop criterion
Us_mode = 4  # orthogonality mode
model = BHTARIMA(ts, p, d, q, taus, Rs, k, tol, verbose=0, Us_mode=Us_mode)

In [47]:
result, _ = model.run()

In [52]:
predicted = result[..., -1]

In [53]:
nrmse = get_index(predicted, label)['nrmse']
print('Model: BHT-ARIMA')
print('NRMSE: {:.3f}'.format(nrmse))

Model: BHT-ARIMA
NRMSE: 37120.933


#### Conclusion and Future Direction

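Based on the NRMSE metric, of the three models tested in this project for predicting currency exchange rates, LSTM performs best (0.006), followed by VARMAX (2391.615) and lastly BHT-ARIMA (37120.933). The NRMSE values for VARMAX and BHT-ARIMA are very high, which suggests either that these models predict the exchange rates very poorly or that there is a problem in my evaluation pipeline, so both need to be looked into further. Two things to check first: the VARMAX forecast is a single step whose differencing is inverted from the last row of the full dataset (`df`) rather than from the last training row, and the three models are not scored on identical targets (VARMAX and LSTM are scored on column 2 only, LSTM over all test windows, and BHT-ARIMA on the final time step of every series).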

#### References:

[1] Shi, Qiquan, et al. "Block Hankel tensor ARIMA for multiple short time series forecasting." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. No. 04. 2020, pp. 5758-5766.