# <span style='font-family: CMU Sans Serif, sans-serif;'> Feed-Forward Neural Networks  </span> 

## <span style='font-family: CMU Sans Serif, sans-serif;'> Workflow  </span> 

Below we look at different feed-forward architectures. To find the *optimal models* within each architecture, we implement grid search with successive halving, which stops poorly performing models early. We take a Pythonic approach to building the models (*i.e. we create modules for each model*). For each architecture we define the following hyper-parameters.

| Hyper-parameter               | Variable                                                           |
|-------------------------------|--------------------------------------------------------------------|
| Number of epochs              | $E$                                                                |
| Number of hidden layers       | $R$                                                                |
| Number of neurons per layer   | $n^{(r)}$                                                          |
| Learning rate                 | $\alpha$                                                           |
| Mini-batch size               | $m$                                                                |
| $L_1$ or $L_2$ regularization | $L_1$ or $L_2$                                                     |
| Regularization factor         | $\lambda$                                                          |
| Optimization algorithm        | $\text{Adam}$, or $\text{RMSprop}$, or $\text{SGD}$ with momentum. |
| Momentum in $\text{SGD}$      | $\mu$                                                              |
| Activation function           | $\text{ReLU}$, $\text{PReLU}$, $\text{Leaky ReLU}$                 |
| Dropout (bool)                | $\text{Dropout}^{T}$ or $\text{Dropout}^{F}$                       |
| Dropout parameter             | $p$                                                                |
| Last hidden layer $\tanh$     | $\text{True}$ or $\text{False}$                                    |
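The early-stopping search described above is commonly known as successive halving: every configuration is trained on a small budget, only the best fraction survives, and the budget grows for the survivors. Below is a minimal sketch of that idea, not the implementation used in this notebook. The names `successive_halving`, `toy_loss`, `min_budget` and `eta` are assumptions made for illustration, and `toy_loss` is a made-up stand-in for "validation loss after `budget` epochs".

```python
import itertools
import math


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the configurations each round and
    multiply the training budget by eta until one configuration is left."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Lower score is better (e.g. validation loss)
        scored = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget *= eta
    return survivors[0]


# Hypothetical grid over two of the hyper-parameters from the table
grid = [
    {"learning_rate": lr, "batch_size": m}
    for lr, m in itertools.product([0.1, 0.01, 0.001], [32, 64])
]


def toy_loss(cfg, budget):
    """Illustrative score only: favours learning_rate=0.01 and small batches."""
    lr_penalty = abs(math.log10(cfg["learning_rate"]) + 2)
    return lr_penalty + cfg["batch_size"] / 1000 + 1 / budget


best = successive_halving(grid, toy_loss)
print(best)
```

In a real search, `evaluate` would train a model with the given configuration for `budget` epochs and return its validation loss.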

## <span style='font-family: CMU Sans Serif, sans-serif;'> Package Import  </span> 

In [48]:
# Data handling
from typing import Union, List, Dict, Optional
from sklearn.model_selection import train_test_split
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
import pandas as pd
import numpy as np
import json
import uuid
import os


# Neural networks
import tensorflow as tf
from tensorflow.keras.utils import plot_model

import keras.optimizers
import keras.losses
import keras.metrics

## <span style='font-family: CMU Sans Serif, sans-serif;'> Data </span> 

### <span style='font-family: CMU Sans Serif, sans-serif;'> Import  </span> 

Below we import the main dataframe and the list of remaining primary features.

In [115]:
# Import main data
data_gfd_usa = pd.read_csv('../data__clean/usa__gfd__cleaned_v2.csv')

# Import primary features remaining
with open('../data__clean/listPrimaryFeaturesRemaining.json', 'r') as f:
    list_primary_features = json.load(f)

### <span style='font-family: CMU Sans Serif, sans-serif;'> Train/Test Split  </span> 

We split the data into training and test sets for model fitting and validation. This is done below with scikit-learn's `train_test_split`, holding out 20% of the observations as the test set.

In [116]:
# Extract features and labels from data
data_features = data_gfd_usa[list_primary_features]
data_labels = data_gfd_usa['ret_exc_lead1m']

# Split features and labels into test and train
data_train_features, data_test_features, data_train_labels, data_test_labels = train_test_split(data_features, data_labels, test_size=0.2)
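`train_test_split` shuffles the rows before splitting, so the split above changes between runs unless a `random_state` is passed. The pure-Python sketch below illustrates the idea of a seeded shuffle split. It is a simplified stand-in, not scikit-learn's implementation, and `shuffled_split` and its `seed` argument are hypothetical names.

```python
import math
import random


def shuffled_split(rows, test_size=0.2, seed=None):
    """Shuffle row indices with a seeded generator, then cut off the test share."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [rows[i] for i in train_idx]
    test = [rows[i] for i in test_idx]
    return train, test


rows = list(range(10))
train_a, test_a = shuffled_split(rows, test_size=0.2, seed=42)
train_b, test_b = shuffled_split(rows, test_size=0.2, seed=42)
print(len(train_a), len(test_a))
```

With the same seed the two calls return identical splits, which is what fixing `random_state` achieves in scikit-learn.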

# <span style='font-family: CMU Sans Serif, sans-serif;'> Building Modules  </span> 

This section contains (1) custom functions and classes that can be used with all models, and (2) specific builders for each architecture (or for variations in complexity within an architecture).

## <span style='font-family: CMU Sans Serif, sans-serif;'> Functions and classes  </span> 

### <span style='font-family: CMU Sans Serif, sans-serif;'> Get optimizer function  </span> 

To keep things (*semi*) simple, only three optimizers are supported: Adam, SGD, and RMSprop.

In [46]:
def get_optimizer(optimizer: str = "adam", **kwargs):
    """Return a Keras optimizer; keyword arguments override the defaults below."""
    # Default parameter values for each supported optimizer
    dflt_vals = {
        "sgd": {"learning_rate": 0.01, "momentum": 0.0},
        "adam": {
            "learning_rate": 0.01,
            "beta_1": 0.9,
            "beta_2": 0.999,
            "epsilon": 1e-07,
        },
        "rmsprop": {
            "learning_rate": 0.01,
            "rho": 0.9,
            "momentum": 0.0,
            "epsilon": 1e-07,
            "centered": False,
        },
    }

    # Optimizer class for each optimizer
    optimizer_classes = {
        "adam": tf.keras.optimizers.Adam,
        "sgd": tf.keras.optimizers.SGD,
        "rmsprop": tf.keras.optimizers.RMSprop,
    }

    # Normalize optimizer name
    optimizer_name = optimizer.lower()

    # Check if optimizer is supported
    if optimizer_name not in dflt_vals:
        raise ValueError(f"Optimizer '{optimizer_name}' not supported.")

    # Get default values for chosen optimizer
    dflt = dflt_vals[optimizer_name]

    # Get optimizer params (dict); keyword arguments not listed in the defaults are ignored
    optimizer_params = {param: kwargs.get(param, dflt[param]) for param in dflt}

    # Get optimizer class
    optimizer_class = optimizer_classes[optimizer_name]

    # Return optimizer with given or default params
    return optimizer_class(**optimizer_params)
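The dictionary comprehension in `get_optimizer` only looks up keys that exist in the defaults, so an unrecognised keyword (for example `beta_1` passed to SGD) is dropped without warning. The sketch below isolates that merge step in plain Python so the behaviour is easy to see. `merge_params` and its `strict` flag are hypothetical helpers, not part of the notebook or of Keras.

```python
def merge_params(defaults: dict, strict: bool = False, **overrides) -> dict:
    """Overlay user-supplied values on the defaults, keeping only known keys."""
    unknown = sorted(set(overrides) - set(defaults))
    if strict and unknown:
        # Optional safety net: fail loudly instead of ignoring a misspelled name
        raise TypeError(f"Unknown parameters: {unknown}")
    return {name: overrides.get(name, value) for name, value in defaults.items()}


sgd_defaults = {"learning_rate": 0.01, "momentum": 0.0}

# 'momentum' is overridden; 'beta_1' is not an SGD default and is dropped
params = merge_params(sgd_defaults, momentum=0.9, beta_1=0.5)
print(params)  # {'learning_rate': 0.01, 'momentum': 0.9}
```

Calling `merge_params(sgd_defaults, strict=True, beta_1=0.5)` raises a `TypeError`, which can help catch misspelled parameter names during a grid search.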

### <span style='font-family: CMU Sans Serif, sans-serif;'> Callback function  </span> 

In [35]:
def get_callbacks(earlystop=False, checkpoint=False, **kwargs):
    callback_map = {
        "earlystop": (
            EarlyStopping,
            {
                "monitor": "val_loss",
                "min_delta": 0,
                "patience": 0,
                "verbose": 0,
                "mode": "auto",
                "baseline": None,
                "restore_best_weights": False,
                "start_from_epoch": 0,
            },
        ),
        "checkpoint": (
            ModelCheckpoint,
            {
                "filepath": "test",
                "monitor": "val_loss",
                "verbose": 0,
                "save_best_only": False,
                "save_weights_only": False,
                "mode": "auto",
                "save_freq": "epoch",
                "initial_value_threshold": None,
            },
        ),
    }

    # Map each callback name to its on/off flag (avoids the locals() lookup)
    enabled = {"earlystop": earlystop, "checkpoint": checkpoint}

    callbacks = []
    all_params = {name: None for name in callback_map}

    for name, (constructor, defaults) in callback_map.items():
        if not enabled[name]:
            continue

        # Defaults, overridden by any matching keyword arguments
        merged_params = {
            **defaults,
            **{k: v for k, v in kwargs.items() if k in defaults},
        }
        callbacks.append(constructor(**merged_params))
        all_params[name] = merged_params

    return callbacks, all_params

### <span style='font-family: CMU Sans Serif, sans-serif;'> Model manager class  </span> 

In [123]:
# TODO:
#   See this chat for additional ideas on model manager class https://chatgpt.com/c/67e5e6d4-c9b0-8004-98c4-375eb9e89c38


class ModelManager:
    # Class-level attribute to store the names of every model
    _all_model_names = set()

    @classmethod
    def get_all_model_names(cls):
        return list(cls._all_model_names)

    @classmethod
    def remove_model_name(cls, model_name):
        cls._all_model_names.discard(model_name)

    def __init__(self, model, optimizer, model_name):

        self.model = model
        self.model_name = model_name
        self.optimizer = optimizer
        self.callbacks = []
        self.training_history = {}
        ModelManager._all_model_names.add(model_name)

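    def compile_model(self, loss: str = "mse", metrics: Optional[list] = None):
        # Avoid a mutable default argument; fall back to MSE when no metrics are given
        metrics = ["mse"] if metrics is None else metrics
        self.model.compile(optimizer=self.optimizer, loss=loss, metrics=metrics)
        print(f"Model '{self.model_name}' compiled successfully.")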

    def setup_callbacks(self, earlystop=False, checkpoint=False, **kwargs):
        # Default parameter values for the callbacks
        callback_map = {
            "earlystop": (
                EarlyStopping,
                {
                    "monitor": "val_loss",
                    "min_delta": 0,
                    "patience": 0,
                    "verbose": 0,
                    "mode": "auto",
                    "baseline": None,
                    "restore_best_weights": False,
                    "start_from_epoch": 0,
                },
            ),
            "checkpoint": (
                ModelCheckpoint,
                {
                    "filepath": "test",  # Placeholder filepath
                    "monitor": "val_loss",
                    "verbose": 0,
                    "save_best_only": False,
                    "save_weights_only": False,
                    "mode": "auto",
                    "save_freq": "epoch",
                    "initial_value_threshold": None,
                },
            ),
        }

        # Start with an empty list of callbacks
        self.callbacks = []

        # EarlyStopping Callback
        if earlystop:
            # Extract the class and default params from the callback_map
            callback_class, default_params = callback_map["earlystop"]
            # Merge defaults with any kwargs passed by the user
            merged_params = {
                **default_params,
                **{k: v for k, v in kwargs.items() if k in default_params},
            }
            earlystop_callback = callback_class(**merged_params)
            self.callbacks.append(earlystop_callback)
            print(f"Configured early stop params: {merged_params}")

        # ModelCheckpoint Callback
        if checkpoint:
            # Extract the class and default params from the callback_map
            callback_class, default_params = callback_map["checkpoint"]
            # Create the 'check_points' folder if it doesn't exist
            checkpoint_dir = "check_points"
            os.makedirs(checkpoint_dir, exist_ok=True)

            # Merge defaults with any kwargs passed by the user
            merged_params = {
                **default_params,
                **{k: v for k, v in kwargs.items() if k in default_params},
            }
            # Use the model_name for the checkpoint file path
            merged_params["filepath"] = os.path.join(
                checkpoint_dir, f"{self.model_name}_check_points.keras"
            )
            checkpoint_callback = callback_class(**merged_params)
            self.callbacks.append(checkpoint_callback)
            print(f"Configured checkpoint params: {merged_params}")

        print(f"Callbacks for model '{self.model_name}' have been set up.")

    def train_model(
        self,
        x,
        y,
        epochs=10,
        batch_size=32,
        verbose=1,
        validation_split=0.2,
    ):
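        # Keep the History object so the per-epoch metrics can be inspected later
        history = self.model.fit(
            x,
            y,
            epochs=epochs,
            batch_size=batch_size,
            verbose=verbose,
            callbacks=self.callbacks,
            validation_split=validation_split,
        )
        self.training_history = history.history
        print(f"Model '{self.model_name}' trained successfully.")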

# <span style='font-family: CMU Sans Serif, sans-serif;'> Building networks  </span> 

## <span style='font-family: CMU Sans Serif, sans-serif;'> Vanilla FNN  </span> 

### <span style='font-family: CMU Sans Serif, sans-serif;'> Vanilla FNN function  </span> 

Below we define a function for building vanilla feed-forward neural networks.

We allow for varying the following parameters dynamically:
- number of neurons in each layer;
- type of activation function used in each layer.

In [140]:
def build_vanilla_fnn(
    input_shape=(141,),
    layer_neurons=[32, 16, 1],
    activations=["relu", "tanh", None],
):
    # Same length of inputs
    assert len(layer_neurons) == len(
        activations
    ), "layer_neurons and activations must have same length."
    model_id = str(uuid.uuid4())
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=input_shape))

    for i, (units, activation) in enumerate(zip(layer_neurons[:-1], activations[:-1])):
        model.add(
            tf.keras.layers.Dense(
                units, activation=activation, name=f"{model_id}_dense_{i+1}"
            )
        )
    model.add(
        tf.keras.layers.Dense(
            layer_neurons[-1],
            activation=activations[-1],
            name=f"{model_id}_dense_{len(layer_neurons)}",
        )
    )

    return model
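As a sanity check on the default arguments of `build_vanilla_fnn`, the number of trainable parameters can be worked out by hand: each dense layer has one weight per input-output pair plus one bias per unit. The helper below is a small sketch of that arithmetic; `dense_param_count` is a hypothetical name and does not build a Keras model.

```python
def dense_param_count(input_dim, layer_neurons):
    """Total weights and biases of a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for units in layer_neurons:
        total += fan_in * units + units  # weights + biases
        fan_in = units
    return total


# Defaults of build_vanilla_fnn: 141 inputs -> 32 -> 16 -> 1
n_params = dense_param_count(141, [32, 16, 1])
print(n_params)  # 4544 + 528 + 17 = 5089
```

The result can be compared against the total reported by `model.summary()` for the same architecture.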

### <span style='font-family: CMU Sans Serif, sans-serif;'> Creating a NN  </span> 

Let's try creating a first neural network.

In [141]:
data_train_features.shape

(351524, 141)
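With 351,524 training rows and the defaults of `train_model` (`validation_split=0.2`, `batch_size=32`), we can estimate how many mini-batches one epoch contains. The sketch below assumes the validation share is removed first and the remaining rows are cut into batches, with a final partial batch counted as a step; `steps_per_epoch` is a hypothetical helper for this arithmetic.

```python
import math


def steps_per_epoch(n_samples, validation_split=0.2, batch_size=32):
    """Number of mini-batches per epoch after holding out the validation share."""
    n_train = math.floor(n_samples * (1.0 - validation_split))
    return math.ceil(n_train / batch_size)


steps = steps_per_epoch(351524)
print(steps)  # 281219 training rows / 32 per batch -> 8789 steps
```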

In [142]:
simple_optimizer = get_optimizer('adam')
simple_fnn = build_vanilla_fnn()

In [143]:
fnn_mm = ModelManager(model = simple_fnn, optimizer = simple_optimizer, model_name = "first_vanilla_fnn")

In [144]:
fnn_mm.compile_model()

Model 'first_vanilla_fnn' compiled successfully.


In [145]:
fnn_mm.setup_callbacks(earlystop=True)

Configured early stop params: {'monitor': 'val_loss', 'min_delta': 0, 'patience': 0, 'verbose': 0, 'mode': 'auto', 'baseline': None, 'restore_best_weights': False, 'start_from_epoch': 0}
Callbacks for model 'first_vanilla_fnn' have been set up.


In [146]:
fnn_mm.callbacks

[<keras.src.callbacks.early_stopping.EarlyStopping at 0x177b2d0d0>]

In [147]:
fnn_mm.train_model(data_train_features, data_train_labels)

Epoch 1/10
8789/8789 ━━━━━━━━━━━━━━━━━━━━ 4s 432us/step - loss: 0.0267 - mse: 0.0267 - val_loss: 0.0267 - val_mse: 0.0267
Epoch 2/10
8789/8789 ━━━━━━━━━━━━━━━━━━━━ 4s 426us/step - loss: 0.0171 - mse: 0.0171 - val_loss: 0.0175 - val_mse: 0.0175
Epoch 3/10
8789/8789 ━━━━━━━━━━━━━━━━━━━━ 4s 430us/step - loss: 0.0177 - mse: 0.0177 - val_loss: 0.0181 - val_mse: 0.0181
Model 'first_vanilla_fnn' trained successfully.
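Training stops after three of the ten epochs because early stopping was configured with `patience=0`: the validation loss rose from 0.0175 to 0.0181 in epoch 3, and the first epoch without improvement ends the run. The sketch below replays that rule on the logged `val_loss` values. It is a simplified model of patience-based stopping written for illustration, not Keras's `EarlyStopping` code, and `epochs_run` is a hypothetical name.

```python
import math


def epochs_run(val_losses, patience=0, min_delta=0.0):
    """Return how many epochs are executed before patience-based stopping."""
    best = math.inf
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement: remember it and reset the counter
            best = loss
            wait = 0
            continue
        wait += 1
        # Stop once the counter reaches the patience (never after the very first epoch)
        if wait >= patience and epoch > 1:
            return epoch
    return len(val_losses)


logged_val_loss = [0.0267, 0.0175, 0.0181]
print(epochs_run(logged_val_loss, patience=0))  # 3
```

Under this rule, a larger `patience` would let training continue through a temporary increase in validation loss.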
