# Load and Evaluate Trained Model:

## Introduction:
This notebook walks through the evaluation process. It loads the trained models ($5$ by default) and applies them to the testing images, which the models did not see during training. In this project, the testing dataset consists of images from the start of $2007$ to the end of $2019$. Although data after $2019$ is available, it was purposefully excluded because of unexpected trends resulting from the COVID pandemic.

Furthermore, this notebook only covers the $I5$ base images (each image spans a $5$-day period), since the computer used to train the models could only compile the $I5$ architectures. The $I20$ and $I60$ architectures are available in this project, but they may require substantial computing resources because of their more complex structures.

The evaluation process consists of loading the models, applying them to the images to obtain predicted probabilities, sorting the returns by predicted probability, and grouping the sorted results into decile portfolios to view the models' overall performance.

**NOTE**: This notebook documents results previously compiled on my personal computer. Please run `Data_Acquisition.ipynb` and `Model_Training.ipynb` first to generate all necessary files if you want to run this notebook with your own dataset. Furthermore, please make sure the $I5$ OHLC base images and their respective models are in the `data` directory before running this notebook. You can download these datasets from the README file.

## Import Necessary Packages:
In order to correctly evaluate the model, we'll use the following packages.

In [1]:
## Import all necessary packages.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import sys

sys.path.append(os.path.abspath(os.path.join('..')))
import src.training.evaluate as stock_eval

## Initialize Working Directories:
Please use the following code to establish the base directory and the working directories containing all the datasets.

**Note:** The directories below assume that you used the notebooks/scripts to create the following datasets and models. If you saved the datasets or models to other directories, please revise the code below.

In [2]:
## Initialize the base directory to be in .\stock_code.
base_dir = os.path.abspath(os.path.join('.', '..'))

## Uncomment the code below to view your current working directory to check if it's correct.
# print(base_dir)

In [3]:
## Establish the directories that are needed to compile this notebook.

## The directory that stores the testing images.
test_dir = os.path.join(base_dir, 'data', 'processed_img', 'stock_img', 'I5R5', 'base_img', 'testing_dataset')

## The directory that stores the true weekly returns for each testing image.
ret_dir = os.path.join(base_dir, 'data', 'processed_img', 'stock_img', 'I5R5', 'returns')

## The directory that stores the trained model's weights.
models_dir = os.path.join(base_dir, 'data', 'saved_models', 'I5R5', 'base_img')

## The directory that stores the predicted probabilities for each testing image.
pred_dir = os.path.join(base_dir, 'data', 'predictions', 'I5R5', 'base')

## Image Dictionary:
To keep track of which images belong to each firm, we create a dictionary whose keys are the firm names and whose values are each firm's OHLC images from 2007 to 2019. Given the path to the testing images, a `defaultdict` is created and each firm's images are appended to its entry. The images are loaded as arrays using the functions `tf.keras.preprocessing.image.load_img` and `tf.keras.preprocessing.image.img_to_array`. Finally, each firm's images are sorted by week number as they are added to the dictionary. For more detail, please see the `evaluate.py` module in `src.training`.

In [4]:
## Creates an image dictionary storing all the images for each firm.
img_dict = stock_eval.create_img_dict(test_dir)

Image Dictionary Created!
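
The grouping and sorting steps described above can be sketched as follows. This is only an illustration, not the code in `evaluate.py`: the helper name `build_img_dict_sketch`, the `load_fn` parameter, and the file-name pattern `<firm>_<week>.png` are all assumptions. The default loader simply returns the file path so the sketch runs without TensorFlow; in practice one would presumably pass a function that wraps `load_img` and `img_to_array`.

```python
import os
import tempfile
from collections import defaultdict

def build_img_dict_sketch(img_dir, load_fn=lambda path: path):
    """Group image files by firm and order each firm's images by week number.

    Assumes (hypothetically) that files are named '<firm>_<week>.png'.
    """
    entries = defaultdict(list)
    for fname in os.listdir(img_dir):
        stem, ext = os.path.splitext(fname)
        if ext.lower() != '.png':
            continue
        firm, week = stem.rsplit('_', 1)
        entries[firm].append((int(week), load_fn(os.path.join(img_dir, fname))))

    # Sort numerically by week so that week 10 comes after week 2.
    img_dict = defaultdict(list)
    for firm, items in entries.items():
        img_dict[firm] = [img for _, img in sorted(items, key=lambda pair: pair[0])]
    return img_dict

# Tiny demonstration with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ['AAPL_10.png', 'AAPL_2.png', 'MSFT_1.png']:
        open(os.path.join(tmp, name), 'w').close()
    demo_dict = build_img_dict_sketch(tmp, load_fn=os.path.basename)

print(dict(demo_dict))
```

Sorting on the integer week number, rather than on the file name, avoids the lexicographic ordering in which `AAPL_10` would come before `AAPL_2`.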


## Apply Model to Images:
Now that the image dictionary has been created, we can apply the models to each firm's images to obtain the probability that the stock will go up in the future. Given the path to the saved models, the `get_avg_prediction` function in the `evaluate.py` module (imported here as `stock_eval`) performs an ensemble prediction that averages the models' outputs. Each model produces a predicted probability for every image, and the probabilities for each firm are stored as an `np.array`. The predictions from each model are appended to a list, and `np.mean` is then used to average them across models for each image. The output is indexed by firm: each firm's entry holds one predicted probability per week that the firm's stock price will go up. These values are the output used in the later analysis.

In [5]:
## Apply the models to images and get average predicted probabilities.

## The average predicted probabilities are stored as an np.array, where the first
## index stores the probabilities for the first firm, the second index stores the
## probabilities for the second firm, and so forth.
pred_avg = stock_eval.get_avg_prediction(img_dict, models_dir)

Model Prediction Finished!


In [6]:
## View the predicted probabilities for the first five weeks of the first firm.
print(pred_avg[0][:5])

[[0.5173253 ]
 [0.5598717 ]
 [0.58961904]
 [0.549301  ]
 [0.46375316]]
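
The averaging step can be sketched with stand-in models. This is a minimal illustration under stated assumptions, not the `get_avg_prediction` implementation: `average_predictions_sketch` and `constant_model` are hypothetical names, and each "model" is any callable returning an array of shape (number of weeks, 1), where a real Keras model would be called through `model.predict`.

```python
import numpy as np

def average_predictions_sketch(images, models):
    """Average the 'up' probabilities produced by several models.

    `images` is an array of shape (num_weeks, ...) for one firm, and each
    element of `models` is a callable returning an array of shape (num_weeks, 1).
    """
    per_model = [np.asarray(model(images)) for model in models]
    # np.mean stacks the list to (num_models, num_weeks, 1) and averages over models.
    return np.mean(per_model, axis=0)

# Stand-in "models" that return a constant probability (hypothetical).
def constant_model(p):
    return lambda imgs: np.full((len(imgs), 1), p)

fake_images = np.zeros((3, 8, 8))  # three weeks of placeholder images
avg = average_predictions_sketch(
    fake_images, [constant_model(0.4), constant_model(0.6)]
)
print(avg.shape)
```

Averaging over `axis=0` keeps the (number of weeks, 1) layout seen in the printed output above, with each cell being the mean of the individual models' probabilities.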


We can now save the predicted probabilities to a directory storing the predictions. By using the function `save_predictions`, the predicted values will be saved as a `.csv` file for each firm in the image dictionary. 

In [7]:
## Save all the respective predictions of each firm to designated directory.
stock_eval.save_predictions(img_dict, pred_avg, pred_dir)

All Predictions Saved Successfully!
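
One way the per-firm saving could look is sketched here. The helper name `save_predictions_sketch`, the column name `Prediction`, and the `<firm>.csv` naming are assumptions for illustration; the actual file layout is defined in `evaluate.py`.

```python
import os
import tempfile
import numpy as np
import pandas as pd

def save_predictions_sketch(firm_names, predictions, out_dir):
    """Write one CSV per firm; `predictions[i]` has shape (num_weeks, 1)."""
    os.makedirs(out_dir, exist_ok=True)
    for firm, pred in zip(firm_names, predictions):
        # Flatten (num_weeks, 1) to a single column of weekly probabilities.
        df = pd.DataFrame({'Prediction': np.asarray(pred).ravel()})
        df.to_csv(os.path.join(out_dir, f'{firm}.csv'), index=False)

# Demonstration in a temporary directory with two made-up firms.
with tempfile.TemporaryDirectory() as tmp:
    save_predictions_sketch(
        ['AAA', 'BBB'],
        [np.array([[0.4], [0.6]]), np.array([[0.5], [0.7]])],
        tmp,
    )
    saved_files = sorted(os.listdir(tmp))
    reloaded = pd.read_csv(os.path.join(tmp, 'AAA.csv'))

print(saved_files)
```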


## Analyze Results:
Now that we've obtained predicted values for each firm, we can analyze and evaluate the results. First, we load the actual returns from the return directory and the corresponding predicted probabilities from the directory storing the models' output. This gives two arrays of shape (number of firms, number of weeks, 1), one containing the historical returns and the other containing the models' output. We then apply `np.hstack` to each array so that each row represents a specific week and each column holds the values for a specific firm (the predicted probability or the historical return). The resulting return and prediction arrays each have shape (number of weeks, number of firms). This process is carried out by the function `returns_based_predictions`, shown below.

In [8]:
## Get arrays for historical returns and model's predictions.
ret_stack, pred_stack = stock_eval.returns_based_predictions(ret_dir, pred_dir)
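
The effect of `np.hstack` on per-firm columns can be seen on a toy example. The data here is made up, with three hypothetical firms and four weeks each; it only illustrates the reshaping, not the loading logic inside `returns_based_predictions`.

```python
import numpy as np

# Three hypothetical firms, four weeks each; every array has shape (4, 1).
per_firm = [np.arange(4).reshape(4, 1) + 10 * firm for firm in range(3)]

# Stacking the columns side by side gives one row per week, one column per firm.
stacked = np.hstack(per_firm)
print(stacked.shape)  # (4, 3)
print(stacked[0])     # values for week 0 across the three firms
```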

Now we can sort the historical returns by their predicted probabilities. First, we sort the predicted probabilities in each row of `pred_stack` in ascending order, using `np.argsort` to keep track of the sorted indices. We then use those indices to reorder the return values in `ret_stack`, so that the historical returns are ordered by the probabilities the models predicted for them. This is done for each row (the week index) in `ret_stack`. As a result, the value at index 0 of row 0 in the sorted array is the actual weekly return of the firm with the lowest predicted probability in the first week.

In [9]:
## Sort the returns based on their predicted probabilities.
ret_sorted = stock_eval.sort_returns(ret_stack, pred_stack)
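
A row-wise sort of this kind can be sketched with `np.argsort` and `np.take_along_axis`. The function name `sort_returns_sketch` and the toy numbers are assumptions; `stock_eval.sort_returns` may be implemented differently (for example, with an explicit loop over weeks).

```python
import numpy as np

def sort_returns_sketch(returns, predictions):
    """Reorder each week's returns by that week's predicted probabilities."""
    # Indices that would sort each row of predictions in ascending order.
    order = np.argsort(predictions, axis=1)
    # Apply the same per-row ordering to the returns.
    return np.take_along_axis(returns, order, axis=1)

# Two weeks (rows) and three firms (columns) of made-up values.
preds = np.array([[0.6, 0.4, 0.5],
                  [0.3, 0.7, 0.5]])
rets = np.array([[0.02, -0.01, 0.00],
                 [-0.03, 0.04, 0.01]])

sorted_rets = sort_returns_sketch(rets, preds)
print(sorted_rets)
```

In the first row, the firm with the lowest probability (0.4) had a return of -0.01, so that value moves to index 0.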

To find the average annual returns of our sorted firms, we'll use the Compound Annual Growth Rate formula, where $t$ is the number of years.
\begin{equation*}
CAGR = \left( \frac{Value_{final}}{Value_{begin}} \right)^{\frac{1}{t}} - 1 = \left[ \prod_{i = 1}^{\text{Num of Weeks}} \left( 1 + \text{Weekly Return}_{i} \right) \right]^{\frac{1}{\text{Num of Years}}} - 1
\end{equation*}
Our returns are stored in the `ret_sorted` array, where each row holds the historical returns for one week, ordered by predicted probability. Hence, we can add $1$ to every value in the array and multiply down the rows (across weeks) within each column to get the overall growth over the span of $7$ years. To get the compound annual growth rate, we raise these values to the power of $\frac{1}{7}$ and subtract $1$. This is done by the `get_comp_return` function, shown below.

In [10]:
## Get the annual compound returns and save them to a DataFrame.
ret_df = stock_eval.get_comp_return(ret_sorted)

## Show the annual growth rates for each predicted probability.
print(ret_df)

       Returns
0    -0.757282
1    -0.792960
2    -0.662229
3    -0.407155
4    -0.746025
...        ...
1087  0.656889
1088  0.613227
1089  0.977258
1090  0.270810
1091  0.698200

[1092 rows x 1 columns]
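
The compounding step can be sketched as below. This is an illustration under assumptions: the name `compound_annual_returns_sketch` and the explicit `num_years` argument are hypothetical, and `get_comp_return` may fix the number of years internally.

```python
import numpy as np
import pandas as pd

def compound_annual_returns_sketch(sorted_returns, num_years):
    """Compound weekly returns down each column and annualize the result."""
    # Total growth factor per column: product of (1 + weekly return) over weeks.
    growth = np.prod(1.0 + sorted_returns, axis=0)
    # Annualize the growth factor and convert it back to a rate.
    cagr = growth ** (1.0 / num_years) - 1.0
    return pd.DataFrame({'Returns': cagr})

# Made-up data: 104 weeks (2 years), one flat column and one gaining 1% per week.
weekly = np.zeros((104, 2))
weekly[:, 1] = 0.01

demo_df = compound_annual_returns_sketch(weekly, num_years=2)
print(demo_df)
```

For the second column, 104 weeks of 1% growth over 2 years annualizes to $1.01^{52} - 1$, roughly 68% per year, while the flat column stays at 0.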


We can now examine the compound annual growth rate by predicted-probability rank. The first element in `ret_df` is the annual growth for the lowest predicted probabilities, and the last element is the annual growth for the highest predicted probabilities. We then sort these values into decile portfolios by splitting the row positions into $10$ equal-width bins with `pd.cut`. For each bin, we average the values to get the average annual growth for that decile. This lets us see the models' performance clearly: the first decile corresponds to the lowest predicted probabilities and the last decile to the highest.

In [11]:
## Assign each row to one of 10 decile groups based on its position.
ret_df['Decile'] = pd.cut(ret_df.index, bins = 10, labels = False)

## Find the average annual growth rate per decile.
decile_means = ret_df.groupby('Decile')['Returns'].mean()

## Print out the average growth rate per decile.
for i, mean in enumerate(decile_means):
    print(f'Decile {i + 1} mean: {mean}')

Decile 1 mean: -0.2720944385058328
Decile 2 mean: 0.00984343890829152
Decile 3 mean: 0.021044340640320944
Decile 4 mean: 0.06459797501896818
Decile 5 mean: 0.10604151658975122
Decile 6 mean: 0.14210747969697113
Decile 7 mean: 0.182349429766024
Decile 8 mean: 0.24672336338500783
Decile 9 mean: 0.32090490415409295
Decile 10 mean: 0.42171993331102015
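
Because `pd.cut` builds equal-width bins over the row positions rather than equal-count bins, it is worth checking that the groups come out nearly the same size. The sketch below does this on synthetic, steadily increasing values (not the notebook's results) with the same number of rows as `ret_df`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for ret_df: 1092 steadily increasing values.
demo = pd.DataFrame({'Returns': np.linspace(-0.5, 0.5, 1092)})

# Same binning as in the cell above: 10 equal-width bins over the row positions.
demo['Decile'] = pd.cut(demo.index, bins=10, labels=False)

group_sizes = demo.groupby('Decile').size()
decile_means_demo = demo.groupby('Decile')['Returns'].mean()

print(group_sizes.tolist())
print(decile_means_demo.round(3).tolist())
```

Since the index is a run of consecutive integers, the group sizes differ by at most one row, so each bin behaves like a decile of the ranked firms.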
