<a href="https://colab.research.google.com/github/TashreefMuhammad/Transformer-to-predict-stock-market-using-time2vec/blob/main/Code.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Transformer-Based Deep Learning Model for Stock Price Prediction: A Case Study on Bangladesh Stock Market

This notebook contains some of the initial components of the research named in the title above.

It is not the final version, but it includes almost all of the basic functionality required and should help readers understand how the code is structured.

This code is provided as a courtesy so that people who find the topic interesting can get ideas for their own implementation. Certain parts are withheld because further studies in this area are still in progress.

Nevertheless, I hope the code will run if the correct environment is set up.

All research data associated with this topic is shared in the sections that follow.

## Pre-print

The pre-print of the research was published on [arXiv](https://arxiv.org/) and can be found through the identifier [2208.08300](https://arxiv.org/abs/2208.08300). The pre-print does not contain all the details that are available in the published version. The published version is peer-reviewed and considerably enhanced and polished, with further investigation and assessment.

I strongly encourage you to read the published version if you want to get the most out of the study.

## Research Article
This paper was peer-reviewed and published in the [International Journal of Computational Intelligence and Applications](https://www.worldscientific.com/worldscinet/ijcia), also known as IJCIA. You can find the article [here](https://www.worldscientific.com/doi/10.1142/S146902682350013X).

## Dataset
The dataset associated with this research, and with related research on this topic, is published through [Mendeley Data](https://data.mendeley.com/). The dataset can be found using the DOI: [10.17632/23553sm4tn](https://data.mendeley.com/datasets/23553sm4tn)

# Code

The remainder of this notebook contains code that should work. It can be considered the official code: it is largely the same as the code used to produce the results of this study, but it has been cleaned up and personal information has been removed so that it can be shared in a public repository.

# Importing Libraries

Import the libraries that will be necessary for the experiment

In [None]:
import numpy as np
import pandas as pd
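from google.colab import drive
# num2words is called in the plot titles further down (install with: pip install num2words)
from num2words import num2words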

import tensorflow as tf
from keras import backend as K
from tensorflow.keras.models import *
from tensorflow.keras.layers import *
print('Tensorflow version: {}'.format(tf.__version__))

import matplotlib.pyplot as plt
plt.style.use('seaborn')

import warnings
warnings.filterwarnings('ignore')

# Declaring Variables and Values

Declaring some variables and values that will be used throughout the experiment

In [None]:
batch_size = 32
seq_len = 8
features = 5
# I created a specific folder in my Google Drive to read data from and write results to.
# How you set this up is up to you; I did it this way for easier coding.
PATHRead = ''
PATHSave = ''

# The research concentrated on Daily and Weekly data. The paper explains
# how to derive Weekly data from Daily data
chartType = ['Daily', 'Weekly']
# In the conducted study, we worked on eight companies. These are their
# Trading Codes
companyNames = ['1JANATAMF', 'AAMRANET', 'ABBANK', 'ACI', 'ACIFORMULA', 'AGRANINS', 'ALLTEX', 'DELTALIFE']

drive.mount('/content/gdrive')

# Building Transformer Model

Build the necessary classes and instances for the ***Transformer Model***. Ideas for this part were taken from reviewing the code in [JanSchm/CapMarket](https://github.com/JanSchm/CapMarket), where the author used *time2vec* and developed the backbone of the model defined here.

## Defining Hyperparameters

Define the hyperparameters for later use

In [None]:
d_k = 256
d_v = 256
n_heads = 12
ff_dim = 256

## Implementing Time2Vec

Implementation of time2vec

In [None]:
class Time2Vector(Layer):
  def __init__(self, seq_len, **kwargs):
    super(Time2Vector, self).__init__(**kwargs)
    self.seq_len = seq_len

  def build(self, input_shape):
    '''Initialize weights and biases with shape (seq_len,)'''
    self.weights_linear = self.add_weight(name='weight_linear',
                                shape=(int(self.seq_len),),
                                initializer='uniform',
                                trainable=True)

    self.bias_linear = self.add_weight(name='bias_linear',
                                shape=(int(self.seq_len),),
                                initializer='uniform',
                                trainable=True)

    self.weights_periodic = self.add_weight(name='weight_periodic',
                                shape=(int(self.seq_len),),
                                initializer='uniform',
                                trainable=True)

    self.bias_periodic = self.add_weight(name='bias_periodic',
                                shape=(int(self.seq_len),),
                                initializer='uniform',
                                trainable=True)

  def call(self, x):
    '''Calculate linear and periodic time features'''
    x = tf.math.reduce_mean(x[:,:,:4], axis=-1)
    time_linear = self.weights_linear * x + self.bias_linear
    time_linear = tf.expand_dims(time_linear, axis=-1)

    time_periodic = tf.math.sin(tf.multiply(x, self.weights_periodic) + self.bias_periodic)
    time_periodic = tf.expand_dims(time_periodic, axis=-1)
    return tf.concat([time_linear, time_periodic], axis=-1)

  def get_config(self):
    config = super().get_config().copy()
    config.update({'seq_len': self.seq_len})
    return config
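
To make the tensor shapes in `Time2Vector.call` easier to follow, here is a minimal NumPy sketch of the same arithmetic. It is only an illustration: `time2vec_numpy`, the toy batch and the random weights are hypothetical stand-ins for the trainable Keras weights above, and the `_demo` names are not part of the notebook.

```python
import numpy as np

def time2vec_numpy(x, w_lin, b_lin, w_per, b_per):
    """NumPy mirror of Time2Vector.call, for shape checking only (nothing is trained)."""
    # Average the first four features (Open, High, Low, Close) at each time step
    x_mean = x[:, :, :4].mean(axis=-1)             # (batch, seq_len)
    linear = w_lin * x_mean + b_lin                # linear time feature, (batch, seq_len)
    periodic = np.sin(x_mean * w_per + b_per)      # periodic time feature, (batch, seq_len)
    # Stack both features on a new last axis
    return np.stack([linear, periodic], axis=-1)   # (batch, seq_len, 2)

rng = np.random.default_rng(0)
batch_demo, seq_len_demo, features_demo = 3, 8, 5
x_demo = rng.random((batch_demo, seq_len_demo, features_demo))
# One weight and one bias per time step, as in build() above
w_lin, b_lin, w_per, b_per = (rng.uniform(-0.05, 0.05, seq_len_demo) for _ in range(4))

t2v = time2vec_numpy(x_demo, w_lin, b_lin, w_per, b_per)
# The model later concatenates these two features onto the five input features
model_input = np.concatenate([x_demo, t2v], axis=-1)
print(t2v.shape, model_input.shape)
```

With `seq_len = 8` and `features = 5`, the encoder layers therefore receive sequences with seven channels per time step.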

## Attention Layers

Make attention layers

In [None]:
class SingleAttention(Layer):
  def __init__(self, d_k, d_v):
    super(SingleAttention, self).__init__()
    self.d_k = d_k
    self.d_v = d_v

  def build(self, input_shape):
    self.query = Dense(self.d_k,
                       input_shape=input_shape,
                       kernel_initializer='glorot_uniform',
                       bias_initializer='glorot_uniform')

    self.key = Dense(self.d_k,
                     input_shape=input_shape,
                     kernel_initializer='glorot_uniform',
                     bias_initializer='glorot_uniform')

    self.value = Dense(self.d_v,
                       input_shape=input_shape,
                       kernel_initializer='glorot_uniform',
                       bias_initializer='glorot_uniform')

  def call(self, inputs):
    q = self.query(inputs[0])
    k = self.key(inputs[1])

    attn_weights = tf.matmul(q, k, transpose_b=True)
    attn_weights = attn_weights / np.sqrt(self.d_k)  # scale the scores by sqrt(d_k)
    attn_weights = tf.nn.softmax(attn_weights, axis=-1)

    v = self.value(inputs[2])
    attn_out = tf.matmul(attn_weights, v)
    return attn_out

#############################################################################

class MultiAttention(Layer):
  def __init__(self, d_k, d_v, n_heads):
    super(MultiAttention, self).__init__()
    self.d_k = d_k
    self.d_v = d_v
    self.n_heads = n_heads
    self.attn_heads = list()

  def build(self, input_shape):
    for n in range(self.n_heads):
      self.attn_heads.append(SingleAttention(self.d_k, self.d_v))

    self.linear = Dense(input_shape[0][-1],
                        input_shape=input_shape,
                        kernel_initializer='glorot_uniform',
                        bias_initializer='glorot_uniform')

  def call(self, inputs):
    attn = [self.attn_heads[i](inputs) for i in range(self.n_heads)]
    concat_attn = tf.concat(attn, axis=-1)
    multi_linear = self.linear(concat_attn)
    return multi_linear

#############################################################################

class TransformerEncoder(Layer):
  def __init__(self, d_k, d_v, n_heads, ff_dim, dropout=0.1, **kwargs):
    super(TransformerEncoder, self).__init__()
    self.d_k = d_k
    self.d_v = d_v
    self.n_heads = n_heads
    self.ff_dim = ff_dim
    self.attn_heads = list()
    self.dropout_rate = dropout

  def build(self, input_shape):
    self.attn_multi = MultiAttention(self.d_k, self.d_v, self.n_heads)
    self.attn_dropout = Dropout(self.dropout_rate)
    self.attn_normalize = LayerNormalization(input_shape=input_shape, epsilon=1e-6)

    self.ff_conv1D_1 = Conv1D(filters=self.ff_dim, kernel_size=1, activation='relu')
    self.ff_conv1D_2 = Conv1D(filters=input_shape[0][-1], kernel_size=1)
    self.ff_dropout = Dropout(self.dropout_rate)
    self.ff_normalize = LayerNormalization(input_shape=input_shape, epsilon=1e-6)

  def call(self, inputs):
    attn_layer = self.attn_multi(inputs)
    attn_layer = self.attn_dropout(attn_layer)
    attn_layer = self.attn_normalize(inputs[0] + attn_layer)

    ff_layer = self.ff_conv1D_1(attn_layer)
    ff_layer = self.ff_conv1D_2(ff_layer)
    ff_layer = self.ff_dropout(ff_layer)
    ff_layer = self.ff_normalize(inputs[0] + ff_layer)
    return ff_layer

  def get_config(self):
    config = super().get_config().copy()
    # Keys must match the __init__ argument names so that a saved layer reloads
    # with the same settings ('dropout', not 'dropout_rate')
    config.update({'d_k': self.d_k,
                   'd_v': self.d_v,
                   'n_heads': self.n_heads,
                   'ff_dim': self.ff_dim,
                   'dropout': self.dropout_rate})
    return config
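
The core of `SingleAttention.call` is scaled dot-product attention. The sketch below reproduces that step in plain NumPy, assuming the query, key and value projections have already been applied; `softmax`, `scaled_dot_product_attention` and the toy sizes are illustrative names and values, not part of the model code.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    """Return the attention output and the attention weights."""
    d_k = q.shape[-1]
    # Similarity of every query step with every key step, scaled by sqrt(d_k)
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)   # (batch, seq_len, seq_len)
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    return weights @ v, weights                          # (batch, seq_len, d_v)

rng = np.random.default_rng(1)
q = rng.normal(size=(2, 8, 16))   # toy sizes: batch=2, seq_len=8, d_k=16
k = rng.normal(size=(2, 8, 16))
v = rng.normal(size=(2, 8, 16))
attn_out, attn_weights = scaled_dot_product_attention(q, k, v)
print(attn_out.shape, attn_weights.shape)
```

In `MultiAttention`, the outputs of all `n_heads` heads are concatenated on the last axis and projected back to the input width by the final `Dense` layer, which is what allows the residual additions in `TransformerEncoder`.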

## Create Transformer Model

Create an instance of the ***Transformer Model*** using the following function.

One thing I would like to point out is that the research used RMSE. TensorFlow and Keras, which we use here, do not provide an easy built-in option for using RMSE as a loss. Hence, I took the idea from the answers to a [StackOverflow](https://stackoverflow.com/questions/43855162/rmse-rmsle-loss-function-in-keras) question and implemented RMSE as the loss function.

In [None]:
def loss_fn(y_true, y_pred):
  # RMSE written with TensorFlow ops, which avoids relying on the keras.backend helpers
  return tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true)))

def createTransformerModel():
  # Initialize time and transformer layers
  time_embedding = Time2Vector(seq_len)
  attn_layer1 = TransformerEncoder(d_k, d_v, n_heads, ff_dim)
  attn_layer2 = TransformerEncoder(d_k, d_v, n_heads, ff_dim)
  attn_layer3 = TransformerEncoder(d_k, d_v, n_heads, ff_dim)

  # Construct model
  in_seq = Input(shape=(seq_len, features))
  x = time_embedding(in_seq)
  x = Concatenate(axis=-1)([in_seq, x])
  x = attn_layer1((x, x, x))
  x = attn_layer2((x, x, x))
  x = attn_layer3((x, x, x))
  x = GlobalAveragePooling1D(data_format='channels_first')(x)
  x = Dropout(0.1)(x)
  x = Dense(64, activation='relu')(x)
  x = Dropout(0.1)(x)
  out = Dense(1, activation='linear')(x)

  model = Model(inputs=in_seq, outputs=out)
  model.compile(loss=loss_fn, optimizer='adam', metrics=['mae'])
  return model
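
As a quick sanity check of the RMSE formula used in `loss_fn`, the following NumPy sketch computes it on a tiny hand-made example. The helper `rmse` and the numbers are illustrative only. It also compares the overall RMSE with the average of per-batch RMSE values, since, as far as I understand, Keras reports the loss averaged over batches, so the logged value may differ slightly from the RMSE over the whole dataset.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y_pred - y_true)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

# Toy normalized returns (hypothetical values)
y_true = [0.50, 0.52, 0.48, 0.51]
y_pred = [0.50, 0.50, 0.50, 0.50]

overall = rmse(y_true, y_pred)
# Average of the RMSE of two batches of size 2
per_batch = np.mean([rmse(y_true[:2], y_pred[:2]), rmse(y_true[2:], y_pred[2:])])
print(overall, per_batch)
```

Here the squared errors are 0, 0.0004, 0.0004 and 0.0001, so the overall RMSE is the square root of 0.000225, which is 0.015, while the per-batch average is slightly smaller.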

## Print Developed Model

Print the developed model

In [None]:
def printModelStructure(model, path):
  tf.keras.utils.plot_model(
      model,
      to_file = path,
      show_shapes = True,
      show_layer_names = True,
      expand_nested = True,
      dpi = 96,)

printModelStructure(createTransformerModel(), '{}TransformerModelArchitecture.png'.format(PATHSave))

# Data Manipulation

Converting Data to a useful format

## Extract CSV File

Extract data from CSV file

In [None]:
def prepareData(path_data):
  df = pd.read_csv(path_data)

  # Parse the Date column explicitly, because the day-first (d-m-y) format used in Bangladesh is not inferred by default
  df['Date'] = pd.to_datetime(df['Date'], dayfirst = True)

  # Replace 0 volumes with the previous non-zero value to avoid dividing by 0 later on
  df['Volume'] = df['Volume'].replace(0, np.nan).ffill()

  # Sorting by Date just to ensure dataframe is sorted by date
  df.sort_values(by = 'Date', inplace = True, ignore_index = True)

  # Apply moving average with a window of 10 days to all columns
  df[['Open', 'High', 'Low', 'Close', 'Volume']] = df[['Open', 'High', 'Low', 'Close', 'Volume']].rolling(10).mean()


  # Drop all rows with NaN values
  df.dropna(how='any', axis=0, inplace=True)
  df = df.reset_index(drop=True)  # reset_index returns a new frame, so keep the result

  return df
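
The moving-average step in `prepareData` shortens the series, which is easy to overlook. A small sketch on a toy column (the 15-row frame `toy_prices` is made up for illustration) shows that `rolling(10).mean()` leaves the first nine rows as `NaN`, and that these rows are then removed by `dropna`.

```python
import pandas as pd

# Toy Close column with the values 1.0 .. 15.0 (hypothetical data)
toy_prices = pd.DataFrame({'Close': [float(v) for v in range(1, 16)]})

smoothed = toy_prices['Close'].rolling(10).mean()
n_nan = int(smoothed.isna().sum())                 # rows without a full 10-row window
clean = smoothed.dropna().reset_index(drop=True)   # what remains after dropna

print(n_nan, len(clean))
print(clean.iloc[0], clean.iloc[-1])
```

The first surviving value is the mean of rows 1 to 10 and the last is the mean of rows 6 to 15, so each CSV loses its first nine rows before normalization.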

## Stationize and Normalize Data

Making data stationary and normalizing it

In [None]:
def normalizeData(df):
  ###################################
  #== Calculate percentage change ==#
  ###################################

  df['Open'] = df['Open'].pct_change() # Create arithmetic returns column
  df['High'] = df['High'].pct_change() # Create arithmetic returns column
  df['Low'] = df['Low'].pct_change() # Create arithmetic returns column
  df['Close'] = df['Close'].pct_change() # Create arithmetic returns column
  df['Volume'] = df['Volume'].pct_change()

  df.dropna(how='any', axis=0, inplace=True) # Drop all rows with NaN values
  # print(df)

  ##################################
  #== Find Indices for Splitting ==#
  ##################################

  times = sorted(df.index.values)
  last_10pct = sorted(df.index.values)[-int(0.1*len(times))] # Last 10% of series
  last_20pct = sorted(df.index.values)[-int(0.2*len(times))] # Last 20% of series

  ###############################################################################

  ###############################
  #== Normalize price columns ==#
  ###############################

  min_return = min(df[(df.index < last_20pct)][['Open', 'High', 'Low', 'Close']].min(axis=0))
  max_return = max(df[(df.index < last_20pct)][['Open', 'High', 'Low', 'Close']].max(axis=0))

  # Min-max normalize price columns (0-1 range)
  df['Open'] = (df['Open'] - min_return) / (max_return - min_return)
  df['High'] = (df['High'] - min_return) / (max_return - min_return)
  df['Low'] = (df['Low'] - min_return) / (max_return - min_return)
  df['Close'] = (df['Close'] - min_return) / (max_return - min_return)

  ###############################################################################

  ###############################
  #== Normalize volume column ==#
  ###############################

  min_volume = df[(df.index < last_20pct)]['Volume'].min(axis=0)
  max_volume = df[(df.index < last_20pct)]['Volume'].max(axis=0)

  # Min-max normalize volume columns (0-1 range)
  df['Volume'] = (df['Volume'] - min_volume) / (max_volume - min_volume)

  ###############################################################################

  ##################################################
  #== Create training, validation and test split ==#
  ##################################################

  df_train = df[(df.index < last_20pct)]  # Training data are 80% of total data
  df_val = df[(df.index >= last_20pct) & (df.index < last_10pct)]
  df_test = df[(df.index >= last_10pct)]

  # Remove date column (non-inplace, because these frames are slices of df)
  df_train = df_train.drop(columns=['Date'])
  df_val = df_val.drop(columns=['Date'])
  df_test = df_test.drop(columns=['Date'])

  # Convert pandas columns into arrays
  train_data = df_train.values
  val_data = df_val.values
  test_data = df_test.values

  print('Training data shape: {}'.format(train_data.shape))
  print('Validation data shape: {}'.format(val_data.shape))
  print('Test data shape: {}'.format(test_data.shape))

  # df_train.head()
  return df, train_data, val_data, test_data, df_train, df_val, df_test

def unSplittedNormalize(df):
  tmp_data = df.copy()
  ###################################
  #== Calculate percentage change ==#
  ###################################

  df['Open'] = df['Open'].pct_change() # Create arithmetic returns column
  df['High'] = df['High'].pct_change() # Create arithmetic returns column
  df['Low'] = df['Low'].pct_change() # Create arithmetic returns column
  df['Close'] = df['Close'].pct_change() # Create arithmetic returns column
  df['Volume'] = df['Volume'].pct_change()

  df.dropna(how='any', axis=0, inplace=True) # Drop all rows with NaN values
  # print(df)
  raw_data = tmp_data[tmp_data['Date'].isin(df['Date'])]


  ##################################
  #== Find Indices for Splitting ==#
  ##################################

  times = sorted(df.index.values)
  last_10pct = sorted(df.index.values)[-int(0.1*len(times))] # Last 10% of series
  last_20pct = sorted(df.index.values)[-int(0.2*len(times))] # Last 20% of series

  ###############################################################################

  ###############################
  #== Normalize price columns ==#
  ###############################

  min_return = min(df[(df.index < last_20pct)][['Open', 'High', 'Low', 'Close']].min(axis=0))
  max_return = max(df[(df.index < last_20pct)][['Open', 'High', 'Low', 'Close']].max(axis=0))

  # Min-max normalize price columns (0-1 range)
  df['Open'] = (df['Open'] - min_return) / (max_return - min_return)
  df['High'] = (df['High'] - min_return) / (max_return - min_return)
  df['Low'] = (df['Low'] - min_return) / (max_return - min_return)
  df['Close'] = (df['Close'] - min_return) / (max_return - min_return)

  ###############################################################################

  ###############################
  #== Normalize volume column ==#
  ###############################

  min_volume = df[(df.index < last_20pct)]['Volume'].min(axis=0)
  max_volume = df[(df.index < last_20pct)]['Volume'].max(axis=0)

  # Min-max normalize volume columns (0-1 range)
  df['Volume'] = (df['Volume'] - min_volume) / (max_volume - min_volume)

  ###############################################################################

  ##############################################
  #== No split here: keep the full series    ==#
  ##############################################


  # Remove date column
  df.drop(columns=['Date'], inplace=True)
  raw_data = raw_data.drop(columns=['Date'])

  # Convert pandas columns into arrays
  data = df.values
  raw_data = raw_data.values

  print('Data shape: {}'.format(data.shape))

  # df_train.head()
  return df, raw_data, data
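
The two functions above follow the same recipe: turn prices into percentage changes, find the split positions, then min-max scale with statistics taken from the training part only. The NumPy sketch below walks through that recipe on one made-up price series; the names `prices`, `val_start` and `test_start` are illustrative, and only a single column is used, whereas the notebook shares one minimum and maximum across the four price columns and scales `Volume` separately.

```python
import numpy as np

# Hypothetical closing prices (11 rows give 10 returns)
prices = np.array([100., 102., 101., 105., 110., 108., 112., 115., 111., 120., 126.])

# Same arithmetic as pct_change(), with the leading NaN row already dropped
returns = prices[1:] / prices[:-1] - 1.0

n = len(returns)
val_start = n - int(0.2 * n)    # first row of the last 20 percent
test_start = n - int(0.1 * n)   # first row of the last 10 percent

# Min-max statistics come from the training rows only
train_min = returns[:val_start].min()
train_max = returns[:val_start].max()
scaled = (returns - train_min) / (train_max - train_min)

train, val, test = scaled[:val_start], scaled[val_start:test_start], scaled[test_start:]
print(len(train), len(val), len(test))
print(val, test)
```

Because the scaling constants are fitted on the training rows, validation and test values are not guaranteed to stay inside the 0 to 1 range. In this toy series both the validation and the test value exceed 1.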

## Splitting Data

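Build the input windows and targets for the training, validation and test sets. The 8:1:1 split itself is made in `normalizeData`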

In [None]:
def splitData(train_data, val_data, test_data, day_to_predict):
  # Training data
  day_to_predict -= 1
  X_train, y_train = [], []
  for i in range(seq_len, len(train_data) - day_to_predict):
    X_train.append(train_data[i-seq_len:i])               # Window of the seq_len rows that end just before row i
    y_train.append(train_data[:, 3][i + day_to_predict])  # Value of 4th column (Close) at row i + day_to_predict
  X_train, y_train = np.array(X_train), np.array(y_train)

  ###############################################################################

  # Validation data
  X_val, y_val = [], []
  for i in range(seq_len, len(val_data) - day_to_predict):
    X_val.append(val_data[i-seq_len:i])
    y_val.append(val_data[:, 3][i + day_to_predict])
  X_val, y_val = np.array(X_val), np.array(y_val)

  ###############################################################################

  # Test data
  X_test, y_test = [], []
  for i in range(seq_len, len(test_data) - day_to_predict):
    X_test.append(test_data[i-seq_len:i])
    y_test.append(test_data[:, 3][i + day_to_predict])
  X_test, y_test = np.array(X_test), np.array(y_test)

  return X_train, y_train, X_val, y_val, X_test, y_test
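
The windowing logic in `splitData` can be checked on a small array. The sketch below is a generic version of one of the three loops; `make_windows`, `toy_rows` and the parameter names `window`, `target_col` and `horizon` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def make_windows(data, window, target_col=3, horizon=1):
    """Build (X, y) pairs: `window` past rows as input, one future value as target."""
    offset = horizon - 1                      # same role as day_to_predict -= 1
    X, y = [], []
    for i in range(window, len(data) - offset):
        X.append(data[i - window:i])          # rows i-window .. i-1
        y.append(data[i + offset, target_col])  # target column, `horizon` rows after the window
    return np.array(X), np.array(y)

# Toy data: 12 rows, 5 features, filled with 0..59 so positions are easy to read
toy_rows = np.arange(12 * 5, dtype=float).reshape(12, 5)

X, y = make_windows(toy_rows, window=8)
X2, y2 = make_windows(toy_rows, window=8, horizon=2)
print(X.shape, y)
print(X2.shape, y2)
```

With 12 rows and a window of 8, a horizon of 1 yields four samples whose targets are column 3 of rows 8 to 11. A horizon of 2 yields three samples, because the last window no longer has a target inside the array.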

## Plotting Data

Plot data at different stages of training and evaluating the model

In [None]:
def plotRawData(df, name, path):
  fig = plt.figure(figsize=(15,5))
  st = fig.suptitle('Real Closing Price of {}'.format(name), fontsize=20)

  ax1 = fig.add_subplot(111)
  ax1.plot(df['Close'], label = name + ' Close Price')
  ax1.set_xticks(range(0, df.shape[0], df.shape[0] // 5))
  ax1.set_xticklabels(df['Date'].loc[::(df.shape[0] // 5)])
  ax1.set_ylabel('Close Price', fontsize = 18)
  ax1.legend(loc = 'upper left', fontsize = 12)

  plt.savefig(path)
  plt.show()


def plotNormalizedData(df, name, train_data, val_data, test_data, df_train, df_val, df_test, path):
  fig = plt.figure(figsize = (15,6))
  st = fig.suptitle('Stationization, Normalization and Data Separation for ' + name, fontsize=20)

  ax1 = fig.add_subplot(111)
  ax1.plot(np.arange(train_data.shape[0]), df_train['Close'], label = 'Training data')

  ax1.plot(np.arange(train_data.shape[0],
                     train_data.shape[0] + val_data.shape[0]), df_val['Close'], label = 'Validation data')

  ax1.plot(np.arange(train_data.shape[0] + val_data.shape[0],
                     train_data.shape[0] + val_data.shape[0] + test_data.shape[0]), df_test['Close'], label = 'Test data')
  ax1.set_xlabel('Date')
  ax1.set_ylabel('Normalized Closing Returns')
  ax1.legend(loc = 'best', fontsize = 12)

  plt.savefig(path)
  plt.show()

def plotPrediction(train_data, val_data, test_data, train_pred, val_pred, test_pred, name, model_Type, path):
  #######################
  #== Display results ==#
  #######################

  fig = plt.figure(figsize = (15,20))

  # Plot training data results
  ax11 = fig.add_subplot(311)
  ax11.plot(train_data[:, 3], label = name + ' Closing Returns')
  ax11.plot(np.arange(seq_len, train_pred.shape[0]+seq_len), train_pred, linewidth=3, label = 'Predicted ' + name + ' Closing Returns')
  ax11.set_title('Training Data', fontsize = 18)
  ax11.set_xlabel('Date')
  ax11.set_ylabel(name + ' Closing Returns')
  ax11.legend(loc = 'best', fontsize = 12)

  # Plot validation data results
  ax21 = fig.add_subplot(312)
  ax21.plot(val_data[:, 3], label = name + ' Closing Returns')
  ax21.plot(np.arange(seq_len, val_pred.shape[0]+seq_len), val_pred, linewidth=3, label = 'Predicted ' + name + ' Closing Returns')
  ax21.set_title('Validation Data', fontsize = 18)
  ax21.set_xlabel('Date')
  ax21.set_ylabel(name + ' Closing Returns')
  ax21.legend(loc = 'best', fontsize = 12)

  # Plot test data results
  ax31 = fig.add_subplot(313)
  ax31.plot(test_data[:, 3], label = name + ' Closing Returns')
  ax31.plot(np.arange(seq_len, test_pred.shape[0]+seq_len), test_pred, linewidth=3, label = 'Predicted ' + name + ' Closing Returns')
  ax31.set_title('Test Data', fontsize = 18)
  ax31.set_xlabel('Date')
  ax31.set_ylabel(name + ' Closing Returns')
  ax31.legend(loc = 'best', fontsize = 12)

  plt.savefig(path)
  plt.show()

def plotErrorTraining(history, model_type, name, path):
  #############################
  #== Display model metrics ==#
  #############################

  fig = plt.figure(figsize=(15,15))
  st = fig.suptitle('Performance Metrics for ' + name + ' Using ' + num2words(features) + ' features', fontsize = 22)
  st.set_y(0.92)

  # Plot model loss
  ax1 = fig.add_subplot(211)
  ax1.plot(history.history['loss'], label = 'Training loss (RMSE)')
  ax1.plot(history.history['val_loss'], label = 'Validation loss (RMSE)')
  ax1.set_title('Root Mean Squared Error (RMSE)', fontsize = 18)
  ax1.set_xlabel('Epoch')
  ax1.set_ylabel('Loss (RMSE)')
  ax1.legend(loc = 'best', fontsize = 12)

  # Plot MAE
  ax2 = fig.add_subplot(212)
  ax2.plot(history.history['mae'], label = 'Training MAE')
  ax2.plot(history.history['val_mae'], label = 'Validation MAE')
  ax2.set_title('Mean Absolute Error (MAE)', fontsize = 18)
  ax2.set_xlabel('Epoch')
  ax2.set_ylabel('Mean absolute error (MAE)')
  ax2.legend(loc = 'best', fontsize = 12)

  plt.savefig(path)
  plt.show()

# Training Model

Training the Transformer Model

In [None]:
for name in companyNames:
  for cType in chartType:
    PATHExtension = '{}/{}/'.format(cType, name)

    # Path to main data
    data_path = PATHRead + cType + '_' + name + '.csv'

    # Retrieve data to DataFrame
    data = prepareData(data_path)
    plotRawData(data, name, PATHSave + PATHExtension + '/Raw_Data.png')

    # Normalize Data
    data, train_data, val_data, test_data, data_train, data_val, data_test = normalizeData(data)
    plotNormalizedData(data, name, train_data, val_data, test_data, data_train, data_val, data_test, PATHSave +  PATHExtension + '/Processed.png')

    # Split Data
    X_train, y_train, X_val, y_val, X_test, y_test = splitData(train_data, val_data, test_data, 1)

    ##############################
    # Create the Transformer Model
    ##############################
    modelTransformer = createTransformerModel()
    modelTransformer.summary()
    callbackTransformer = tf.keras.callbacks.ModelCheckpoint(PATHSave +  PATHExtension + '/' + name + '_' + str(features) + '.hdf5',
                                                            monitor = 'val_loss',
                                                            save_best_only = True,
                                                            verbose = 1)
    historyTransformer = modelTransformer.fit(X_train, y_train,
                                              batch_size = batch_size,
                                              epochs = 50,
                                              callbacks = [callbackTransformer],
                                              validation_data = (X_val, y_val))
    # Calculate predictions for training, validation and test data and display them
    train_pred = modelTransformer.predict(X_train)
    val_pred = modelTransformer.predict(X_val)
    test_pred = modelTransformer.predict(X_test)
    plotPrediction(train_data, val_data, test_data, train_pred, val_pred, test_pred, name, 'Transformer',
                   PATHSave +  PATHExtension + '/StationaryPrediction_' + str(features) + '.png')

    # Print evaluation metrics for train, val, test
    train_eval = modelTransformer.evaluate(X_train, y_train, verbose=0)
    val_eval = modelTransformer.evaluate(X_val, y_val, verbose=0)
    test_eval = modelTransformer.evaluate(X_test, y_test, verbose=0)
    print()
    print('Evaluation metrics for ' + name + ' using ' + str(features) + ' features After Training')
    print('Training Data - Loss (RMSE): {:.4f}, MAE: {:.4f}'.format(train_eval[0], train_eval[1]))
    print('Validation Data - Loss (RMSE): {:.4f}, MAE: {:.4f}'.format(val_eval[0], val_eval[1]))
    print('Test Data - Loss (RMSE): {:.4f}, MAE: {:.4f}'.format(test_eval[0], test_eval[1]))
    dataDict = {'Error': ['Training RMSE', 'Training MAE', 'Validation RMSE', 'Validation MAE', 'Testing RMSE', 'Testing MAE'],
                'Value': [train_eval[0], train_eval[1], val_eval[0], val_eval[1], test_eval[0], test_eval[1]]}
    dFrame = pd.DataFrame(dataDict)
    dFrame.to_csv('{}{}ErrorValues.csv'.format(PATHSave, PATHExtension), index = False)
    plotErrorTraining(historyTransformer, 'Transformer', name, PATHSave +  PATHExtension + '/Error_' + str(features) + '.png')

# Predicting Real Closing Price

Predicting the real closing price using the developed model

In [None]:
errorData = pd.DataFrame(columns = ['TradingCode', 'ChartType', 'RMSE', 'MAE'])

for name in companyNames:
  for cType in chartType:

    PATHExtension = '{}/{}/'.format(cType, name)
    data_list = [name, cType]

    # Path to main data
    data_path = PATHRead + cType + '_' + name + '.csv'
    # Retrieve data to DataFrame
    data = prepareData(data_path)
    # Normalize
    data, raw_data, val = unSplittedNormalize(data)

    X, y = [], []
    for i1 in range(seq_len, len(val)):
      X.append(val[i1-seq_len:i1])
      y.append(val[:, 3][i1])
    X, y = np.array(X), np.array(y)

    model = tf.keras.models.load_model(PATHSave +  PATHExtension + name + '_' + str(features) + '.hdf5',
                                        custom_objects={'Time2Vector': Time2Vector,
                                                        'SingleAttention': SingleAttention,
                                                        'MultiAttention': MultiAttention,
                                                        'TransformerEncoder': TransformerEncoder,
                                                        'loss_fn': loss_fn})
    raw_pred = model.predict(X)
    raw_eval = model.evaluate(X, y, verbose = 0)

    print('Evaluation metrics for ' + name + ' using ' + str(features) + ' features On Full Data')
    print('Loss (RMSE): {:.4f}, MAE: {:.4f}'.format(raw_eval[0], raw_eval[1]))
    data_list.append(raw_eval[0])
    data_list.append(raw_eval[1])
    for i1 in range(len(raw_pred)):
      raw_pred[i1][0] += raw_data[seq_len + i1][3]

    #######################
    #== Display results ==#
    #######################

    plt.figure(figsize = (15,5))
    plt.title('Actual Closing Price Prediction for ' + name + ' Using Transformer Model with ' + num2words(features) + ' features', fontsize = 22)
    plt.plot(raw_data[:, 3], linewidth = 1, label = name + ' Real Closing Price')
    plt.plot(np.arange(seq_len, raw_pred.shape[0]+seq_len), raw_pred, linewidth = 2, color = 'red', label = 'Predicted ' + name + ' Closing Price')
    plt.xlabel('Date')
    plt.ylabel(name + ' Closing Price')
    plt.legend(loc = 'best', fontsize = 12)

    plt.savefig(PATHSave +  PATHExtension + name + '.png')
    plt.show()
    errorData.loc[len(errorData)] = data_list
errorData.to_excel(PATHSave + 'Errors.xlsx', index = False)

<center>==The End==</center>