In [1]:
import pandas as pd
import os
import sys
import numpy as np

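# Add the project root (two levels up) to sys.path so that `src` is importable.
current_dir = os.getcwd()
parent_dir = os.path.abspath(os.path.join(current_dir, '../..'))
sys.path.append(parent_dir)
from src.Evaluation.Others.testing import *  # testing helpers used below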

## LSTM Architecture
 
LSTMs (Long Short-Term Memory networks) are well suited to time-series data because their gating mechanism mitigates the vanishing gradient problem of plain RNNs (Recurrent Neural Networks). RNNs in general are a natural fit for sequential data: they process inputs whose current value depends on previous ones, which lets them model temporal dynamics. LSTMs additionally capture long-term dependencies, which makes them a good candidate for mid-price trend prediction.

**Architecture:**  

![LSTM architecture](../images/LSTM_Architecture.png)

1. **Input**: Sequence of vectors from time $t-T+1$ to $t$ (where $T$ is the sequence length).
2. **LSTM Layers**: Two stacked LSTM layers for learning complex temporal dependencies.
3. **Layer Normalization**: Stabilizes the training process.
4. **Dropout**: Helps prevent overfitting while training on noisy financial data.
5. **Fully Connected Layer**: Maps the last hidden state to output class logits.
6. **Softmax Function**: Converts the logits into class probabilities for multi-class classification.
7. **Output**: Down (D), Stable (S), or Up (U) predictions.
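
The steps above can be sketched in PyTorch as follows. This is a minimal illustration, not the project's actual model code: the class name `LSTMClassifier`, the feature count `input_size=40`, the use of `nn.LSTM`'s built-in inter-layer dropout, and the order of layer normalization and dropout are all assumptions. The module returns raw logits, and softmax is applied only when probabilities are needed, since `nn.CrossEntropyLoss` expects logits.

```python
import torch
import torch.nn as nn


class LSTMClassifier(nn.Module):
    """Hypothetical sketch: stacked LSTM -> LayerNorm -> Dropout -> Linear."""

    def __init__(self, input_size, hidden_size=50, num_layers=2,
                 dropout=0.2, num_classes=3):
        super().__init__()
        # batch_first=True means inputs are (batch, sequence_length, input_size).
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers,
                            batch_first=True, dropout=dropout)
        self.norm = nn.LayerNorm(hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, _ = self.lstm(x)
        last = out[:, -1, :]  # top-layer hidden state at the last time step
        last = self.dropout(self.norm(last))
        return self.fc(last)  # raw logits for the classes D, S, U


model = LSTMClassifier(input_size=40)  # 40 features is an assumed value
model.eval()                           # disable dropout for inference
x = torch.randn(64, 10, 40)            # (batch_size, sequence_length, features)
with torch.no_grad():
    logits = model(x)
    probs = torch.softmax(logits, dim=1)  # class probabilities per sample
```

Here `logits` and `probs` both have shape `(64, 3)`, and the predicted class is the index of the largest probability in each row.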
   
**Optimization:**  
Cross-entropy loss with the Adam optimizer

**General Hyperparameters:**  
- `sequence_length`: $10$
- `batch_size`: $64$  
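
To illustrate how `sequence_length` shapes the model input, here is a minimal NumPy sketch of sliding-window construction. It assumes the features are stored as a 2-D array with one row per time step. `make_windows` is a hypothetical helper, not the project's own preprocessing code.

```python
import numpy as np


def make_windows(features, sequence_length=10):
    """Return overlapping windows so that result[i] = features[i : i + sequence_length]."""
    n_windows = features.shape[0] - sequence_length + 1
    if n_windows <= 0:
        raise ValueError("need at least `sequence_length` time steps")
    return np.stack([features[i:i + sequence_length] for i in range(n_windows)])


# Toy data: 12 time steps with 2 features each.
features = np.arange(24, dtype=np.float32).reshape(12, 2)
windows = make_windows(features, sequence_length=10)
print(windows.shape)  # (number of windows, sequence_length, number of features)
```

With `batch_size` set to 64, each training batch would then be a tensor of shape `(64, 10, n_features)`.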

**Specific Hyperparameters:**  
- `num_layers`: $2$ (stacked LSTM layers)
- `hidden_size`: $50$ (hidden units per layer)
- `dropout`: $0.2$
- `learning_rate`: $0.001$
- `epochs`: $2$

We set the hyperparameters by hand to reduce training time and leave time for implementing more advanced architectures such as Transformers.

The number of layers is set to 2 to obtain a sufficiently expressive architecture while keeping the training cost reasonable. In practice, LSTM architectures rarely have more than 3 or 4 layers.

The hidden size (`hidden_size`) is set to 50, as larger values did not improve performance and only increased computational cost. Additionally, we included a dropout layer to prevent overfitting, given the high level of noise typically present in financial data.

Since we are using the Adam optimizer, which adapts the step size of each parameter from running estimates of the gradient moments, the exact choice of base learning rate tends to matter less than with plain SGD. For this reason we kept the default learning rate of 0.001.
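
To make this adaptive behaviour concrete, the following NumPy sketch implements a single Adam update with the standard default coefficients ($\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$). The function name `adam_step` and the toy gradients are illustrative and not taken from the project code. On the first step, the bias-corrected update has a magnitude close to the learning rate whatever the scale of the gradient.

```python
import numpy as np


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad       # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


params = np.zeros(2)
grads = np.array([0.01, 100.0])  # two gradients differing by four orders of magnitude
new_params, m, v = adam_step(params, grads, m=np.zeros(2), v=np.zeros(2), t=1)
steps = np.abs(new_params - params)
print(steps)  # both entries are close to lr = 0.001
```

The learning rate still sets the overall step scale, so this reduces sensitivity to the gradient scale rather than removing the need to choose a sensible value.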

Two epochs are sufficient to learn the pattern while keeping the training time manageable.

In [2]:
test_lstm_model()

Starting tests for LSTM model.

=== Predictions for LSTM Model | Horizon = 10 ===
Scaler loaded from ../trained_models/model\LSTM\scaler_lstm_horizon_10.pkl
LSTM model loaded from ../trained_models/model\LSTM\lstm_horizon_10.pth


Making Predictions LSTM Horizon 10: 39818it [01:21, 490.54it/s]


--- LSTM Model | Horizon = 10 ---
Accuracy: 0.972170
Weighted F1 Score: 0.971935

Classification Report:
              precision    recall  f1-score   support

           D       0.94      0.90      0.92    251726
           S       0.98      0.99      0.99   2053299
           U       0.92      0.90      0.91    242633

    accuracy                           0.97   2547658
   macro avg       0.95      0.93      0.94   2547658
weighted avg       0.97      0.97      0.97   2547658


--------------------------------------------------

=== Predictions for LSTM Model | Horizon = 20 ===
Scaler loaded from ../trained_models/model\LSTM\scaler_lstm_horizon_20.pkl
LSTM model loaded from ../trained_models/model\LSTM\lstm_horizon_20.pth


Making Predictions LSTM Horizon 20: 39817it [01:20, 495.72it/s]


--- LSTM Model | Horizon = 20 ---
Accuracy: 0.975886
Weighted F1 Score: 0.975657

Classification Report:
              precision    recall  f1-score   support

           D       0.96      0.93      0.95    409499
           S       0.98      1.00      0.99   1745124
           U       0.96      0.93      0.94    393025

    accuracy                           0.98   2547648
   macro avg       0.97      0.95      0.96   2547648
weighted avg       0.98      0.98      0.98   2547648


--------------------------------------------------

=== Predictions for LSTM Model | Horizon = 50 ===
Scaler loaded from ../trained_models/model\LSTM\scaler_lstm_horizon_50.pkl
LSTM model loaded from ../trained_models/model\LSTM\lstm_horizon_50.pth


Making Predictions LSTM Horizon 50: 39817it [01:21, 491.15it/s]


--- LSTM Model | Horizon = 50 ---
Accuracy: 0.908761
Weighted F1 Score: 0.907280

Classification Report:
              precision    recall  f1-score   support

           D       0.80      0.93      0.86    686217
           S       0.97      0.99      0.98   1200712
           U       0.94      0.74      0.83    660689

    accuracy                           0.91   2547618
   macro avg       0.90      0.89      0.89   2547618
weighted avg       0.91      0.91      0.91   2547618


--------------------------------------------------

=== Predictions for LSTM Model | Horizon = 100 ===
Scaler loaded from ../trained_models/model\LSTM\scaler_lstm_horizon_100.pkl
LSTM model loaded from ../trained_models/model\LSTM\lstm_horizon_100.pth


Making Predictions LSTM Horizon 100: 39816it [01:21, 491.33it/s]


--- LSTM Model | Horizon = 100 ---
Accuracy: 0.822822
Weighted F1 Score: 0.816769

Classification Report:
              precision    recall  f1-score   support

           D       0.70      0.89      0.78    819038
           S       0.91      0.97      0.94    930007
           U       0.90      0.59      0.71    798523

    accuracy                           0.82   2547568
   macro avg       0.84      0.81      0.81   2547568
weighted avg       0.84      0.82      0.82   2547568


--------------------------------------------------

=== Metrics Summary - LSTM ===
Model_Type Horizon  Accuracy  Weighted_F1
      LSTM      10  0.972170     0.971935
      LSTM      20  0.975886     0.975657
      LSTM      50  0.908761     0.907280
      LSTM     100  0.822822     0.816769


LSTM metrics summary saved at: ../trained_models/model\lstm_summary_metrics.csv
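
The weighted F1 score reported above is the support-weighted mean of the per-class F1 scores. As a quick sanity check for horizon 100, the sketch below recomputes it from the values printed in the classification report. Those inputs are rounded to two decimals, so the result only approximately matches the reported 0.816769.

```python
# Per-class (F1, support) copied from the horizon-100 classification report.
report = {
    "D": (0.78, 819038),
    "S": (0.94, 930007),
    "U": (0.71, 798523),
}

total_support = sum(support for _, support in report.values())
# Weighted F1: each class contributes in proportion to its number of samples.
weighted_f1 = sum(f1 * support for f1, support in report.values()) / total_support
# Macro F1: plain average, every class counts equally.
macro_f1 = sum(f1 for f1, _ in report.values()) / len(report)

print(f"weighted F1 ~ {weighted_f1:.3f}, macro F1 ~ {macro_f1:.3f}")
```

The macro average weighs the three classes equally, so it is more sensitive to the weaker D and U classes than the weighted average, which is dominated by the larger S class at short horizons.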

