<h1 align="center">Long Short-Term Memory (LSTM)</h1>

Cryptocurrency has become a popular and volatile investment in recent years, and forecasting the prices of various coins has become an important task for traders and investors. In this Jupyter notebook, we will use Long Short-Term Memory (LSTM) models to predict the closing prices of eight different cryptocurrencies. We will use PyTorch to implement the LSTM models and evaluate the performance of each model with various metrics.

## Steps

1. Data cleaning, feature engineering, and preprocessing: We will start by cleaning and preparing our data, which includes feature engineering and scaling the data.

2. Train-validation-test split: We will split the data into training, validation, and test sets, where the validation set will be used for monitoring the training process of the LSTM models, and the test set will be used for evaluating the final predictions.

3. PyTorch Dataset class initialization: We will define a PyTorch `Dataset` class that serves fixed-length sequences from our data, and wrap it in dataloaders for the LSTM models.

4. Multilayer LSTM model architecture initialization: We will initialize the architecture of our multilayer LSTM model.

5. Training: We will train one LSTM model for each of the eight coins and evaluate the progress of training with metrics such as mean squared error (MSE), mean absolute error (MAE), coefficient of determination (R^2), and root mean squared error (RMSE) using the validation set.

6. Evaluation: We will evaluate each model and then compute the mean prediction for the eight coins.

7. Saving: We will save the test predictions and actual values and metrics for each coin to plot and evaluate later.

By following these steps, we hope to generate accurate predictions for cryptocurrency prices that can be used for trading or investment decisions, while also identifying the best LSTM architectures and hyperparameters for this task.


### Package imports and system configuration

In [None]:
from datetime import datetime
from os.path import join
import math
import time
import json


from tqdm.notebook import tqdm

import pandas as pd
import numpy as np


import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
from IPython.display import Markdown

from sklearn.preprocessing import StandardScaler

import torch
import torch.nn as nn
import torch.optim as optim
import torchmetrics
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

### Necessary paths

In [None]:
data_raw_path = 'io/input/data_raw/Crypto_July_2019_2023/4H_2019'
export_path = 'io/output/exports/'
test_path = 'io/input/base_data/test.csv'

predictions_path = export_path + 'predictions/'
metrics_plot_path = export_path + 'metrics_plots/'
results_path = export_path + 'experiments_results/'

###  Loading data

In [None]:
available_coins = ['ADA', 'BNB', 'BTC', 'DASH', 'ETH', 'LINK', 'LTC', 'XRP']
df = pd.DataFrame({'Available coins': available_coins})
display(df)

In [None]:
def read_coin_data(coin_name: str) -> pd.DataFrame:    
    data_df = pd.read_csv(f"{data_raw_path}/{coin_name}/{coin_name.lower()}_2019.csv",index_col=False)
    return data_df

In [None]:
coin_name = 'BTC'
coin_df = read_coin_data(coin_name=coin_name)
coin_df = coin_df.rename(columns={"Time":"Date"})
coin_df.describe()

In [None]:
coin_df.head()

### Exploratory analysis

In [None]:
def plot_coin_interactive(plot_df):
    plot_df = plot_df.copy()  # work on a copy of the argument, not the global coin_df
    plot_df['Date'] = pd.to_datetime(plot_df['Date'])
    fig = px.line(plot_df, x='Date', y='Close', title=f'{coin_name} Close Price Over Time')
    fig.update_layout(xaxis_title='Date', yaxis_title='Close Price', xaxis_tickangle=-45)
    fig.show()
    
def plot_coin_static(plot_df):
    plot_df = plot_df.copy()  # work on a copy of the argument, not the global coin_df
    plot_df['Date'] = pd.to_datetime(plot_df['Date'])

    plt.plot(plot_df['Date'], plot_df['Close'])
    plt.title(f'{coin_name} Close Price Over Time')
    plt.xlabel('Date')
    plt.xticks(rotation=-45)
    plt.ylabel('Close Price')
    plt.show()

In [None]:
plot_coin_interactive(plot_df=coin_df)
plot_coin_static(plot_df=coin_df)

### Feature Engineering
* The purpose of the `append_date_features` function is to add additional date-related features to a pandas DataFrame. It enhances the original DataFrame by extracting various date components from a column named "Date" and appending them as separate columns.

* The purpose of the `create_trigonometric_columns` function is to create trigonometric representations of date-related columns in a pandas DataFrame. By converting the date components into sine and cosine values, **it captures cyclical patterns in a continuous numerical form**.
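
To illustrate why the sine/cosine pair captures cyclical structure, here is a minimal sketch using only the standard library. The helper names `encode_cyclical` and `distance` are hypothetical and not part of the notebook. The raw month numbers 12 and 1 are far apart, yet their encoded points are neighbors on the unit circle.

```python
import math

def encode_cyclical(value: int, period: int) -> tuple[float, float]:
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def distance(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Euclidean distance between two encoded points."""
    return math.dist(a, b)

january = encode_cyclical(1, 12)
june = encode_cyclical(6, 12)
december = encode_cyclical(12, 12)

# December and January are adjacent months, so their encodings are close;
# June is five months away from January, so its encoding is far.
print("Dec vs Jan:", round(distance(december, january), 4))
print("Jun vs Jan:", round(distance(june, january), 4))
```

The same idea applies to the day-of-month columns, with the period set to the length of the cycle.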



In [None]:
def append_date_features(df: pd.DataFrame) -> pd.DataFrame:
    df['Date'] = pd.to_datetime(df['Date'])
    df['Year'] = df['Date'].dt.year
    df['Month'] = df['Date'].dt.month
    df['Day'] = df['Date'].dt.day
    df['Week_of_Year'] = df['Date'].dt.isocalendar().week
    return df

In [None]:
def create_trigonometric_columns(df) -> pd.DataFrame:
    # Create sine and cosine columns for Year, Month and Day
    df['Year_sin'] = df['Year'].apply(lambda x: math.sin(2*math.pi*x/2023))
    df['Year_cos'] = df['Year'].apply(lambda x: math.cos(2*math.pi*x/2023))
    df['Month_sin'] = df['Month'].apply(lambda x: math.sin(2*math.pi*x/12))
    df['Month_cos'] = df['Month'].apply(lambda x: math.cos(2*math.pi*x/12))
    df['Day_sin'] = df['Day'].apply(lambda x: math.sin(2*math.pi*x/31))
    df['Day_cos'] = df['Day'].apply(lambda x: math.cos(2*math.pi*x/31))
    return df

In [None]:
coin_df = append_date_features(df=coin_df)
coin_df = create_trigonometric_columns(df=coin_df)
# Set date as index
coin_df.set_index('Date', inplace=True)
coin_df.head()

### Create the target variable

The `create_target_variable` function builds the target used to train the model. It shifts the `Close` column backward by `forecast_lead` time steps, so that each row is paired with the close price `forecast_lead` steps ahead. The trailing rows, which have no future value, are dropped.
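
As a small illustration of the shift, the sketch below uses a made-up four-row price series (the values are toy data, not taken from the dataset) and a lead of one step.

```python
import pandas as pd

# Toy close prices, purely illustrative
prices = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0]})
forecast_lead = 1

# shift(-1) moves each value one row up, so row t holds the close at t + 1
prices["Close_lead_1"] = prices["Close"].shift(-forecast_lead)

# The last row has no future value (NaN), so it is dropped
supervised = prices.iloc[:-forecast_lead]
print(supervised)
```

Each remaining row now pairs the current close with the next one, which is the quantity the model is asked to predict.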

In [None]:
def create_target_variable(df: pd.DataFrame, forecast_lead: int = 1) -> tuple[pd.DataFrame, str]:
    target_column = "Close"
    target_name = f"{target_column}_lead_{forecast_lead}"
    # The target at time t is the close price forecast_lead steps ahead
    df[target_name] = df[target_column].shift(-forecast_lead)
    # The last forecast_lead rows have no future value, so drop them
    df = df.iloc[:-forecast_lead]
    return df, target_name

In [None]:
coin_df, target = create_target_variable(df=coin_df)
display("Target added to dataframe", coin_df[['Close', target]].head(), coin_df.shape)

In [None]:
features = [col for col in coin_df.columns if col != target]
features_str = ', '.join(features)
display(Markdown(f"<strong>Features:</strong> {features_str}<br><strong>Target:</strong> {target}"))

### Split data
* The purpose of the `split_train_valid_test` function is to split a pandas DataFrame into training, validation, and testing sets based on specific date ranges.
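
The date-based split can be sketched on a toy frame as follows. The six daily rows and the two split dates here are assumptions chosen for illustration; the notebook itself uses 4-hour data and different dates.

```python
from datetime import datetime
import pandas as pd

# Six consecutive days starting 2021-12-30 (toy data)
index = pd.date_range("2021-12-30", periods=6, freq="D")
toy = pd.DataFrame({"Close": range(6)}, index=index)

split_1 = datetime(2022, 1, 1)
split_2 = datetime(2022, 1, 2)

# Boolean masks on the DatetimeIndex keep the three sets in time order
train = toy.loc[toy.index < split_1]
valid = toy.loc[(toy.index >= split_1) & (toy.index <= split_2)]
test = toy.loc[toy.index > split_2]

print(len(train), len(valid), len(test))
```

Because the sets do not overlap in time, the model is never evaluated on dates that precede its training data.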


In [None]:
def split_train_valid_test(data: pd.DataFrame):    
    # Split the data chronologically into training, validation, and test sets
    split_date_1 = datetime(2022, 1, 1)
    split_date_2 = datetime(2022, 12, 1)
    train_data = data.loc[data.index < split_date_1]
    valid_data = data.loc[(split_date_1<= data.index) & (data.index <= split_date_2)]
    test_data = data.loc[data.index > split_date_2]

    return train_data, valid_data, test_data

In [None]:
train_data, valid_data, test_data = split_train_valid_test(data=coin_df)
print("Train set percentage:", round(100 * len(train_data) / len(coin_df), 2), '%')
print("Valid set percentage:", round(100 * len(valid_data) / len(coin_df), 2), '%')
print("Test set percentage:", round(100 * len(test_data) / len(coin_df), 2), '%')
print("Train shape: ", train_data.shape)
train_data.head()

## Preprocessing

The `apply_scaling` function creates a `StandardScaler` and fits it on the training features only, so that no statistics from the validation or test periods leak into training.
The scaler then transforms the training, validation, and test features. Finally, the function concatenates the scaled features with their respective (unscaled) target variable and returns the resulting training, validation, and test data as pandas dataframes.
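
A minimal NumPy sketch of the same idea, with made-up numbers: the mean and standard deviation come from the training rows only and are then reused for the later rows. It uses the population standard deviation (`ddof=0`), which is what `StandardScaler` computes.

```python
import numpy as np

# Toy single-feature data, purely illustrative
train = np.array([[1.0], [2.0], [3.0]])
test = np.array([[4.0], [5.0]])

# Statistics are estimated on the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuses the training statistics

print(train_scaled.ravel())
print(test_scaled.ravel())
```

Note that the scaled test values fall outside the range seen in training, which is expected when prices trend upward over time.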

In [None]:
def apply_scaling(train_data, valid_data, test_data, target):
    # Separate the input features and target variable in each dataframe
    X_train = train_data.drop(columns=[target])
    y_train = train_data[target]

    X_val = valid_data.drop(columns=[target])
    y_val = valid_data[target]

    X_test = test_data.drop(columns=[target])
    y_test = test_data[target]

    # Define a scaler object and fit it on the training data only
    scaler = StandardScaler()
    X_train_scaled = pd.DataFrame(scaler.fit_transform(X_train), columns=X_train.columns, index=X_train.index)
    X_valid_scaled = pd.DataFrame(scaler.transform(X_val), columns=X_val.columns, index=X_val.index)
    X_test_scaled = pd.DataFrame(scaler.transform(X_test), columns=X_test.columns, index=X_test.index)
    
    train_scaled = pd.concat([X_train_scaled, y_train],axis = 1)
    valid_scaled = pd.concat([X_valid_scaled, y_val],axis = 1)
    test_scaled = pd.concat([X_test_scaled, y_test],axis = 1)
    return train_scaled, valid_scaled, test_scaled

In [None]:
train_scaled, valid_scaled, test_scaled = apply_scaling(train_data, valid_data, test_data, target)

## Dataset class

The custom PyTorch dataset `SequenceDataset` takes a pandas DataFrame and converts it into PyTorch tensors for sequence modeling. It includes:
- `__init__()`, which sets up the dataset with the desired target variable, input features, and sequence length.
- `__len__()`, which returns the number of samples in the dataset.
- `__getitem__()`, which returns a single sample as a tuple of the input sequence `x` and the corresponding target `y`. For the first `sequence_length - 1` samples, the sequence is left-padded with copies of the first row.
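
The windowing rule in `__getitem__` can be sketched in plain Python. The helper `window_indices` is hypothetical and only returns which rows would feed sample `i`; early samples repeat row 0 as left padding.

```python
def window_indices(i: int, sequence_length: int) -> list[int]:
    """Row indices that make up the input sequence for sample i."""
    start = i - sequence_length + 1
    # Indices before the start of the data are clamped to row 0 (left padding)
    return [max(j, 0) for j in range(start, i + 1)]

for i in (0, 1, 5):
    print(i, window_indices(i, sequence_length=3))
```

Every sample therefore has exactly `sequence_length` rows, which lets the dataloader stack samples into a single batch tensor.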

In [None]:
class SequenceDataset(Dataset):
    def __init__(self, dataframe, target, features, sequence_length=5):
        self.features = features
        self.target = target
        self.sequence_length = sequence_length
        self.y = torch.tensor(dataframe[target].values).float()
        self.X = torch.tensor(dataframe[features].values).float()

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, i): 
        if i >= self.sequence_length - 1:
            i_start = i - self.sequence_length + 1
            x = self.X[i_start:(i + 1), :]
        else:
            padding = self.X[0].repeat(self.sequence_length - i - 1, 1)
            x = self.X[0:(i + 1), :]
            x = torch.cat((padding, x), 0)

        return x, self.y[i]

In [None]:
i = 5
sequence_length = 3
features = [col for col in train_scaled.columns if col != target]


train_dataset = SequenceDataset(
    train_scaled,
    target=target,
    features=features,
    sequence_length=sequence_length
)

X, y = train_dataset[i]
X, y

In [None]:
# Compare against the scaled rows, since the dataset was built from train_scaled
train_scaled[features].iloc[(i - sequence_length + 1): (i + 1)]

### Creating and loading a PyTorch dataset and dataloader

- The **get_dataset_obj** function takes a pandas dataframe `dataframe`, a list of features `features`, a target variable `target`, and a sequence length `sequence_length`. It creates a `SequenceDataset` object from these inputs and returns it.

- The **get_dataloader** function takes a `dataset_obj` (a `SequenceDataset` object), a batch size `batch_size`, and a boolean flag `do_shuffle` that indicates whether to shuffle the data. It creates a `DataLoader` object from these inputs and returns it. The `DataLoader` is used to load the data in batches during training.
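
To show what a dataloader yields, here is a small sketch that uses `TensorDataset` as a stand-in for `SequenceDataset` (an assumption made to keep the example self-contained). The sizes are toy values.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 samples, each a sequence of 3 time steps with 2 features (toy sizes)
X = torch.zeros(10, 3, 2)
y = torch.arange(10, dtype=torch.float32)

loader = DataLoader(TensorDataset(X, y), batch_size=4, shuffle=False)

# Each batch has shape (batch, sequence_length, features); the last batch is smaller
batch_shapes = [tuple(xb.shape) for xb, _ in loader]
print(batch_shapes)  # [(4, 3, 2), (4, 3, 2), (2, 3, 2)]
```

The final batch holds only the two leftover samples, so code that consumes batches should not assume a fixed batch dimension.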

In [None]:
def get_dataset_obj(dataframe, features, target, sequence_length):
    sequence_dataset = SequenceDataset(
                            dataframe=dataframe,
                            target=target,
                            features=features,
                            sequence_length=sequence_length
                            )
    return sequence_dataset
    
def get_dataloader(dataset_obj, batch_size, do_shuffle = False):
    loader = DataLoader(dataset_obj, batch_size=batch_size, shuffle=do_shuffle)
    return loader

In [None]:
torch.manual_seed(99)
train_loader = DataLoader(train_dataset, batch_size=3)
X, y = next(iter(train_loader))
print(X.shape)

In [None]:
sequence_length = 16
train_dataset = get_dataset_obj(train_scaled, target=target, features=features, sequence_length=sequence_length)
validation_dataset = get_dataset_obj(valid_scaled, target=target, features=features, sequence_length=sequence_length)
test_dataset = get_dataset_obj(test_scaled, target=target, features=features, sequence_length=sequence_length)

batch_size = 16
train_loader = get_dataloader(train_dataset, batch_size=batch_size, do_shuffle=True)
validation_loader = get_dataloader(validation_dataset, batch_size=batch_size)
test_loader = get_dataloader(test_dataset, batch_size=batch_size)

X, y = next(iter(train_loader))
print("Features shape:", X.shape)
print("Target shape:", y.shape)

### Create DataLoaders for training

1. Load data of a crypto coin given its name and apply preprocessing.
2. Create a target variable and split the data into training, validation, and testing sets.
3. Apply scaling to the input features in each dataset.
4. Initialize PyTorch Dataset objects for the scaled data with a specified sequence length and target variable.
5. Initialize PyTorch DataLoader objects for each Dataset object with a specified batch size, and return the datasets, loaders, and target variable as output.

In [None]:
def prepare_data(coin: str, sequence_length, batch_size):
    df = read_coin_data(coin_name=coin)
    df.rename(columns={"Time":"Date"}, inplace=True)
    df = append_date_features(df=df)
    df = create_trigonometric_columns(df=df)
    df.set_index('Date', inplace=True)
    df, target = create_target_variable(df=df)
    train_df, valid_df, test_df = split_train_valid_test(data=df)
    datasets = (train_df, valid_df, test_df)
    train_scaled, valid_scaled, test_scaled = apply_scaling(train_df, valid_df, test_df, target)
    # Use the frames built inside this function, not the global train_data
    features = [col for col in train_scaled.columns if col != target]
    
    # initialize Dataset objects
    train_dataset = get_dataset_obj(train_scaled, target=target, features=features, sequence_length=sequence_length)
    validation_dataset = get_dataset_obj(valid_scaled, target=target, features=features, sequence_length=sequence_length)
    test_dataset = get_dataset_obj(test_scaled, target=target, features=features, sequence_length=sequence_length)

    # initialize DataLoader objects
    train_loader = get_dataloader(train_dataset, batch_size=batch_size)
    validation_loader = get_dataloader(validation_dataset, batch_size=batch_size)
    test_loader = get_dataloader(test_dataset, batch_size=batch_size)
    loaders = (train_loader, validation_loader, test_loader)
    
    return datasets, loaders, target  

In [None]:
datasets, loaders, target = prepare_data(coin='BTC', sequence_length=16, batch_size=16)
train_loader, validation_loader, test_loader = loaders

### LSTM architecture
- Input:
  - Number of features: `num_sensors`

- LSTM Layer:
  - Input size: `num_sensors`
  - Hidden size: `hidden_units`
  - Batch first: True
  - Number of layers: `num_layers`

- Fully Connected Layers:
  - `fc1`:
    - Input features: `hidden_units`
    - Output features: 64
  - Dropout (`dropout1`) with a probability of `dropout_prob`
  - Batch normalization (`bn1`)
  - ReLU activation function (`relu1`)
  
  - `fc2`:
    - Input features: 64
    - Output features: 16
  - Dropout (`dropout2`) with a probability of `dropout_prob`
  - Batch normalization (`bn2`)
  - ReLU activation function (`relu2`)
  
  - `fc3`:
    - Input features: 16
    - Output features: 1

- Forward Pass:
  - Initialize the hidden state and cell state of the LSTM with zeros.
  - Pass the input through the LSTM layer.
  - Select the final hidden state of the last LSTM layer (`hn[-1]`).
  - Pass it through `fc1`, `relu1`, `fc2`, `dropout2`, `relu2`, and `fc3`. Note that `dropout1`, `bn1`, and `bn2` are defined but currently commented out in `forward`.
  - Squeeze the output tensor to remove the last dimension of size 1.
  - Return the output.
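
The tensor shapes in the forward pass can be checked with a small sketch. The sizes below are arbitrary toy values. For a unidirectional LSTM with `batch_first=True`, the final hidden state of the last layer equals the last time step of the output sequence.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=8, num_layers=2, batch_first=True)
x = torch.randn(5, 16, 4)  # (batch, sequence_length, features)

with torch.no_grad():
    # When no initial state is passed, h0 and c0 default to zeros
    out, (hn, cn) = lstm(x)

print(out.shape)  # torch.Size([5, 16, 8]): one hidden vector per time step
print(hn.shape)   # torch.Size([2, 5, 8]): one final hidden vector per layer

# The last layer's final hidden state matches the last output time step
same = torch.allclose(out[:, -1, :], hn[-1])
print(same)
```

This is why selecting `hn[-1]` gives one summary vector per sequence to feed into the fully connected layers.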


In [None]:
class DeepRegressionLSTM(nn.Module):
    def __init__(self, num_sensors, hidden_units, num_layers, dropout_prob=0.2):
        super().__init__()
        self.num_sensors = num_sensors  # this is the number of features
        self.hidden_units = hidden_units
        self.num_layers = num_layers
        self.dropout_prob = dropout_prob

        self.lstm = nn.LSTM(
            input_size=num_sensors,
            hidden_size=hidden_units,  # Same hidden size for every stacked layer
            batch_first=True,
            num_layers=self.num_layers
        )


        self.fc1 = nn.Linear(in_features=hidden_units, out_features=64)
        self.dropout1 = nn.Dropout(p=self.dropout_prob)
        self.bn1 = nn.BatchNorm1d(64)
        self.relu1 = nn.ReLU()

        self.fc2 = nn.Linear(in_features=64, out_features=16)
        self.dropout2 = nn.Dropout(p=self.dropout_prob)
        self.bn2 = nn.BatchNorm1d(16)
        self.relu2 = nn.ReLU()

        self.fc3 = nn.Linear(in_features=16, out_features=1)

    def forward(self, x):
        batch_size = x.shape[0]
        # Zero initial hidden and cell states, created on the same device as the input
        h0 = torch.zeros(self.num_layers, batch_size, self.hidden_units, device=x.device)
        c0 = torch.zeros(self.num_layers, batch_size, self.hidden_units, device=x.device)

        # Pass the input through the stacked LSTM
        out, (hn, _) = self.lstm(x, (h0, c0))

        out = hn[-1]  # Final hidden state of the last LSTM layer
        out = self.fc1(out)
#         out = self.dropout1(out)
#         out = self.bn1(out)
        out = self.relu1(out)
        out = self.fc2(out)
        out = self.dropout2(out)
#         out = self.bn2(out)
        out = self.relu2(out)
        # Drop only the trailing dimension of size 1 so a batch of one keeps its batch dimension
        out = self.fc3(out).squeeze(-1)

        return out

In [None]:
num_hidden_units = 64
num_of_layers = 3
# Derive the input size from the feature list instead of hard-coding it
model = DeepRegressionLSTM(num_sensors=len(features), hidden_units=num_hidden_units, num_layers=num_of_layers)

## Training details

The **calculate_evaluation_metrics** function calculates evaluation metrics for a given set of predictions and true values. The evaluation metrics include:
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- R2 Score
- Root Mean Squared Error (RMSE)
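
As a worked example of the four metrics, the sketch below computes them by hand in plain Python on four made-up values, without PyTorch.

```python
import math

# Toy values, purely illustrative
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
n = len(y_true)

errors = [p - t for p, t in zip(y_pred, y_true)]

mse = sum(e ** 2 for e in errors) / n    # mean of squared errors
mae = sum(abs(e) for e in errors) / n    # mean of absolute errors
rmse = math.sqrt(mse)                    # square root of the MSE

# R2 compares the residual error with the variance of the true values
mean_true = sum(y_true) / n
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

print(mse, mae, round(rmse, 4), r2)
```

RMSE is in the same unit as the price, which makes it easier to interpret than MSE, while R2 is unitless and equals 1 for a perfect fit.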

In [None]:
def calculate_evaluation_metrics(y_pred, y_true, loss_fn):
    mse = loss_fn(y_pred, y_true)
    mae = torch.mean(torch.abs(y_pred - y_true))
    r2 = torchmetrics.functional.r2_score(y_pred.view(-1), y_true.view(-1))
    rmse = torch.sqrt(torch.mean(torch.pow(y_pred - y_true, 2)))
    
    return mse, mae, r2, rmse

1. `plot_comparison(actual, pred, coin)`: 
   - Goal: Plot a comparison between the actual and predicted values for a given coin's close prices.
   - Content: Plot the actual and predicted values on a graph.

2. `train_model(data_loader, model, loss_function, optimizer, ix_epoch)`: 
   - Goal: Train a given model using the provided data loader, loss function, and optimizer.
   - Content: 
     - Calculate evaluation metrics (MSE, MAE, R2, RMSE).
     - Return the trained model and computed metrics.

3. `evaluate_model(data_loader, model, loss_function, coin, ix_epoch = None)`: 
   - Goal: Evaluate a trained model using the provided data loader and loss function.
   - Content: 
     - Calculate the loss and evaluation metrics (MSE, MAE, R2, RMSE).
     - Plot a comparison between actual and predicted values if the epoch is divisible by 5.
     - Return the evaluated metrics as a dictionary.

4. `train_and_evaluate_model(train_loader, val_loader, model, loss_function, learning_rate, epochs, coin)`: 
   - Goal: Combine the training and evaluation processes for a given model.
   - Content: 
     - Train the model using the provided training data loader, loss function, optimizer, learning rate, and epochs.
     - Return the trained model.

5. `predict(data_loader, model)`: 
   - Goal: Predict output values using a trained model.
   - Content: Feed the input to the model and collect the predicted values.
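
One detail worth keeping in mind: `train_model` and `evaluate_model` report each metric as the mean of per-batch values. For RMSE (and likewise R2) this is generally not the same as the metric computed over the whole dataset. The sketch below shows the gap on two made-up batches of errors.

```python
import math

def rmse(errors: list[float]) -> float:
    """Root mean squared error of a list of prediction errors."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# Two toy batches of prediction errors
batch_a = [0.0, 0.0]
batch_b = [2.0, 2.0]

mean_of_batch_rmse = (rmse(batch_a) + rmse(batch_b)) / 2
global_rmse = rmse(batch_a + batch_b)

print(mean_of_batch_rmse)     # 1.0
print(round(global_rmse, 4))  # 1.4142
```

If dataset-level values are needed, one option is to collect all predictions first and compute the metrics once at the end.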

In [None]:
def plot_comparison(actual, pred, coin):
    plt.plot(actual, label='actual')
    plt.plot(pred, label='prediction')
    plt.xlabel('Date')
    plt.ylabel('Close price')
    plt.legend()
    plt.title(f'{coin} Validation actual vs prediction')
    plt.show()


def train_model(data_loader, model, loss_function, optimizer, ix_epoch) -> tuple:
    num_batches = len(data_loader)
    total_loss = 0
    model.train()
    
    mse_list, mae_list, r2_list, rmse_list = [], [], [], []
    
    for X, y in data_loader:
        output = model(X)
        loss = loss_function(output, y)

        optimizer.zero_grad()
        # computes gradients of the loss
        loss.backward()
        # updates the model parameters
        optimizer.step()

        total_loss += loss.item()
    
        mse, mae, r2, rmse = calculate_evaluation_metrics(y_pred=output, y_true=y, loss_fn=loss_function)
        # .item() converts each 0-d tensor to a plain Python float
        mse_list.append(mse.item())
        mae_list.append(mae.item())
        r2_list.append(r2.item())
        rmse_list.append(rmse.item())
    
    mse = sum(mse_list) / num_batches
    mae = sum(mae_list) / num_batches
    r2 = sum(r2_list) / num_batches
    rmse = sum(rmse_list) / num_batches
    print("Epoch {}, Train || MSE: {:.7f}, MAE: {:.7f}, R2: {:.7f}, RMSE: {:.7f}".format(ix_epoch, mse, mae, r2, rmse))
    metrics = {'mse': mse, 'mae': mae, 'r2': r2, 'rmse': rmse}
    return model, metrics

def evaluate_model(data_loader, model, loss_function, coin, ix_epoch = None) -> dict:
    
    num_batches = len(data_loader)
    total_loss = 0

    mse_list, mae_list, r2_list, rmse_list = [], [], [], []
    
    model.eval()
    actual_, pred_ = [], []
    with torch.no_grad():
        for X, y in data_loader:
            output = model(X)
            total_loss += loss_function(output, y).item()
            mse, mae, r2, rmse = calculate_evaluation_metrics(y_pred=output, y_true=y, loss_fn=loss_function)
            # .item() yields plain Python floats, which can later be written to JSON
            mse_list.append(mse.item())
            mae_list.append(mae.item())
            r2_list.append(r2.item())
            rmse_list.append(rmse.item())
            
            actual_.append(y.numpy().reshape(-1))
            pred_.append(output.numpy().reshape(-1))
            
        actual_ = np.hstack(actual_)
        pred_ = np.hstack(pred_)
        

    mse = sum(mse_list) / num_batches
    mae = sum(mae_list) / num_batches
    r2 = sum(r2_list) / num_batches
    rmse = sum(rmse_list) / num_batches
    if ix_epoch is not None:
        print("Epoch {}, Evaluation || MSE: {:.7f}, MAE: {:.7f}, R2: {:.7f}, RMSE: {:.7f}".format(ix_epoch, mse, mae, r2, rmse))
    metrics = {'mse': mse, 'mae': mae, 'r2': r2, 'rmse': rmse}
    
    # plot every 5 epochs
    if ix_epoch is not None and ix_epoch % 5 == 0:   
        plot_comparison(actual=actual_, pred=pred_, coin=coin)
    return metrics

def train_and_evaluate_model(train_loader, val_loader, model, loss_function, learning_rate, epochs, coin):

    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    for ix_epoch in tqdm(range(epochs), desc=f"Training {coin} coin..."):
        print("\n---------")
        # One pass over the training data, then a validation pass with the updated weights
        model, _ = train_model(train_loader, model, loss_function, optimizer, ix_epoch=ix_epoch + 1)
        evaluate_model(val_loader, model, loss_function, coin, ix_epoch=ix_epoch + 1)
        print()

    return model
 
    
def predict(data_loader, model):

    output = torch.tensor([])
    model.eval()
    with torch.no_grad():
        for X, _ in data_loader:
            y_pred = model(X)
            output = torch.cat((output, y_pred), 0)
    
    return output

### Train one model for each coin
1. `train_all_coins(coin_list: list, epochs, learning_rate, loss_function, num_hidden_units, num_of_layers, batch_size, sequence_length)`:
   - **Goal:** Train and evaluate regression models for a list of coins using the given parameters.
   - **Content:**
     - Initialize dictionaries and dataframes for storing model results, predictions, and actual values.
     - Iterate over each coin in the coin list.
     - Prepare data for the current coin.
     - Create an instance of the `DeepRegressionLSTM` model.
     - Train and evaluate the model using the training and validation data loaders.
     - Evaluate the model on the test data and store the test metrics.
     - Generate predictions using the trained model on the test data and store them in a dataframe.
     - Store the actual values of the test data in a dataframe.
     - Store the results, test predictions, and actual values for each coin.
     - Return the model results, predictions dataframe, and actual values dataframe.

2. `append_means(predictions_df, actual_df)`:
   - **Goal:** Append the mean across coins to the predictions and actual dataframes.
   - **Content:**
     - Compute the row-wise mean (`axis=1`) over the coin columns of each dataframe, i.e. one mean per timestamp.
     - Append it as a new `mean` column to the respective dataframe.
     - Return the updated predictions and actual dataframes.

3. `compute_mean_metrics(coin_results: dict)`:
   - **Goal:** Compute the mean values of the test metrics for all the coins.
   - **Content:**
     - Initialize variables to store the sum of the test metrics and the total number of coins.
     - Iterate over the results of each coin.
     - Accumulate the test metric values.
     - Calculate the mean values of the test metrics.
     - Create a new dictionary with the mean results and return it.
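
The averaging step can be sketched on a toy results dictionary. The metric values below are made up, and the structure mirrors the nested `results` layout described above. The divisor is the number of coins, not the number of top-level keys.

```python
# Toy results with the same nesting as the notebook's output (values are made up)
coin_results = {
    "learning_rate": 0.001,
    "results": {
        "BTC": {"test_metrics": {"mse": 4.0, "mae": 2.0}},
        "ETH": {"test_metrics": {"mse": 2.0, "mae": 1.0}},
    },
}

per_coin = coin_results["results"]
metric_names = ["mse", "mae"]

# Average each metric over the coins
mean_metrics = {
    f"mean_{name}": sum(r["test_metrics"][name] for r in per_coin.values()) / len(per_coin)
    for name in metric_names
}
print(mean_metrics)  # {'mean_mse': 3.0, 'mean_mae': 1.5}
```

Keep in mind that averaging unscaled errors across coins lets high-priced coins dominate the mean, since their errors are larger in absolute terms.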


In [None]:
def train_all_coins(coin_list: list, epochs, learning_rate, loss_function, num_hidden_units,
                    num_of_layers, batch_size, sequence_length):
    
    
    model_results = {"learning_rate": learning_rate, "epochs": epochs, "batch_size": batch_size}
    model_results['results'] = {}
    predictions_df = pd.DataFrame()
    actual_df = pd.DataFrame()

    for coin in tqdm(coin_list, desc="Processing coins..."):
        results = {}
        datasets, loaders, target = prepare_data(coin=coin, sequence_length=sequence_length, batch_size=batch_size)
        train_dataset, validation_dataset, test_dataset = datasets
        train_loader, validation_loader, test_loader = loaders
        
        features = train_dataset.shape[1]-1
        model = DeepRegressionLSTM(num_sensors=features, hidden_units=num_hidden_units, num_layers=num_of_layers)
        
        trained_model = train_and_evaluate_model(train_loader, validation_loader, model, loss_function,
                                                  learning_rate, epochs, coin)
        test_metrics = evaluate_model(test_loader, trained_model, loss_function, coin=coin, ix_epoch=None)
        results['test_metrics'] = test_metrics
        test_predictions = predict(test_loader, trained_model).numpy()
        predictions_df[coin] = list(test_predictions)
        actual_df[coin] = test_dataset[target].tolist()
        model_results['results'][coin] = results
        
        
    predictions_df.index = test_dataset.index
    actual_df.index = test_dataset.index
    return model_results, predictions_df, actual_df

def append_means(predictions_df, actual_df):
    predictions_df['mean'] = predictions_df.mean(axis=1)
    actual_df['mean'] = actual_df.mean(axis=1)
    return predictions_df, actual_df

def compute_mean_metrics(coin_results: dict):
    results_dict = coin_results['results']
    # Count the coins themselves, not the top-level keys of coin_results
    num_of_coins = len(results_dict)
    sum_mse, sum_mae, sum_r2, sum_rmse = 0, 0, 0, 0
    for coin, results in results_dict.items():
        sum_mse += results['test_metrics']['mse']
        sum_mae += results['test_metrics']['mae']
        sum_r2 += results['test_metrics']['r2']
        sum_rmse += results['test_metrics']['rmse']

    mean_results = coin_results    
    mean_results['mean_mse'] = sum_mse/num_of_coins
    mean_results['mean_mae'] = sum_mae/num_of_coins
    mean_results['mean_r2'] = sum_r2/num_of_coins
    mean_results['mean_rmse'] = sum_rmse/num_of_coins
    
    return mean_results

In [None]:
epochs = 50
learning_rate = 0.001
loss_function = nn.MSELoss()
# num_hidden_units = (64, 128)
num_hidden_units = 128
num_of_layers = 1
batch_size = 16
sequence_length = 16
coin_results, predictions_df, actual_df = train_all_coins(available_coins, epochs, learning_rate, loss_function,
                                                          num_hidden_units, num_of_layers, batch_size, sequence_length)

### Save results


In [None]:
timestamp = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
predictions_df, actual_df = append_means(predictions_df, actual_df)
predictions_df.to_csv(join(predictions_path, f'LSTM_predictions_{timestamp}_epochs_{epochs}.csv'))
actual_df.to_csv(join(predictions_path, f'LSTM_actual_{timestamp}_epochs_{epochs}.csv'))
print('Dataframes saved!')

mean_results = compute_mean_metrics(coin_results=coin_results)
print(f"mean mse: {round(mean_results['mean_mse'], 6)}")
print(f"mean mae: {round(mean_results['mean_mae'], 6)}")
print(f"mean r2: {round(mean_results['mean_r2'], 6)}")
print(f"mean rmse: {round(mean_results['mean_rmse'], 6)}")

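# Append the results of this run to the metrics file
with open(join(metrics_plot_path, 'lstm_metrics.json'), 'a') as f:
    # default=float converts NumPy scalars, which json cannot serialize directly
    json.dump(mean_results, f, indent=4, default=float)
    f.write('\n')  # separate the records of successive runs
print("Results saved to json file!")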

In [None]:
def plot_test_actual_vs_pred(actual_df, pred_df, coin):
#     plot_df = pd.DataFrame(index=pred_df.index)
    plt.figure(figsize=(12, 8))
    predictions = pred_df[coin]
    actual_values = actual_df[coin]
    plt.plot(pred_df.index, predictions, label='Predictions')
    plt.plot(pred_df.index, actual_values, label='Actual')

    plt.xlabel('Date')
    plt.ylabel('Close')
    plt.title(f'{coin}: Closes vs Predictions')
    
    plt.legend()
    plt.savefig(join(metrics_plot_path, f'LSTM_{coin}_pred_vs_true.pdf'))
    plt.show()

In [None]:
coins_for_plot = ['BTC', 'ETH']
for coin in coins_for_plot:
    plot_test_actual_vs_pred(actual_df=actual_df, pred_df=predictions_df, coin=coin)