# Hyperparameter Tuning

## Details

### Time Period Considered

This project tests forecast accuracy over the entire year of 2024. To keep that test period untouched, hyperparameter tuning is evaluated only against data from 2023. We use 6 forecast periods, one for each month from July through December, each starting on the first of the month and extending 48 hours into the future. To limit computational cost, each forecast period uses 6 months of training data.

Note that this training data is not fully representative of the year, since it does not cover a full seasonal cycle. The goal here is only to select a strong model architecture, not to train the final model.
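As a rough illustration of the forecast periods described above, the sketch below builds the six 48-hour windows with the standard library. It assumes the periods fall in 2023 and that each 6-month training window ends exactly where its forecast window begins; neither detail is stated explicitly here, and the names `months_before` and `forecast_periods` are hypothetical.

```python
from datetime import datetime, timedelta

def months_before(ts, n):
    """Return the first-of-month timestamp n months before ts (ts must be on day 1)."""
    idx = ts.year * 12 + (ts.month - 1) - n
    return ts.replace(year=idx // 12, month=idx % 12 + 1)

forecast_periods = []
for month in range(7, 13):  # July through December
    forecast_start = datetime(2023, month, 1)  # assumed tuning year
    forecast_periods.append({
        # Assumption: training data is the 6 months immediately before the forecast
        "train_start": months_before(forecast_start, 6),
        "forecast_start": forecast_start,
        "forecast_end": forecast_start + timedelta(hours=48),
    })

for p in forecast_periods:
    print(p["train_start"].date(), p["forecast_start"].date(), p["forecast_end"].date())
```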

### Restricted Grid Search

In this project, we are conducting a "restricted" grid search. This means that we are fixing a number of hyperparameters to make the search computationally feasible. Notably, we are fixing the feature list across all machine learning models used.

For the RNN hyperparameter tuning, we use a two-step optimization to simplify the search:
1. Tune the model architecture, including the type of layers, number of layers, and number of units.
2. Tune the optimization hyperparameters, including batch size and learning rate.
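The two steps above can be sketched generically as follows. This is a minimal illustration, not the project's tuning code: `two_step_search`, `toy_error`, and the toy grids are hypothetical, and the use of a fixed `default_opt` during the architecture step is an assumption about how step 1 is run.

```python
def two_step_search(arch_grid, opt_grid, evaluate, default_opt):
    """Pick the best architecture first, then the best optimization settings for it.

    evaluate(arch, opt) is assumed to return a validation error (lower is better).
    """
    # Step 1: compare architectures under one fixed set of optimization settings
    best_arch = min(arch_grid, key=lambda arch: evaluate(arch, default_opt))
    # Step 2: compare optimization settings for the chosen architecture only
    best_opt = min(opt_grid, key=lambda opt: evaluate(best_arch, opt))
    return best_arch, best_opt

# Toy stand-in for "train a model and return its validation error"
def toy_error(arch, opt):
    return abs(arch["units"] - 32) + abs(opt["learning_rate"] - 0.001)

toy_arch_grid = [{"units": u} for u in (16, 32, 64)]
toy_opt_grid = [{"learning_rate": lr} for lr in (0.01, 0.001, 0.0001)]

best_arch, best_opt = two_step_search(
    toy_arch_grid, toy_opt_grid, toy_error, default_opt={"learning_rate": 0.01}
)
print(best_arch, best_opt)
```

The appeal of this design is cost: the number of model fits is the sum of the two grid sizes rather than their product, at the price of ignoring interactions between architecture and optimization settings.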


For the **model architecture search**, we use the following constraints:
* We consider only LSTM and dense layers
* Recurrent layers always come first
* We consider 1 or 2 recurrent layers
* We consider 0, 1 or 2 dense layers
* For each layer, we consider a grid of 3 unit counts: 16, 32, 64
* Layers follow a funnel structure, where the number of units stays the same or decreases from one layer to the next. So 64 units can feed into 64, 32, or 16, while 16 units can only feed into 16
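To make the size of this search space concrete, the constraints above can be enumerated directly. The sketch below is an independent illustration and is not the `model_grid` function used in this notebook; the names `is_funnel` and `architecture_grid` and the output dictionary keys are hypothetical.

```python
from itertools import product

UNITS = (16, 32, 64)

def is_funnel(units):
    """True if the unit counts never increase from one layer to the next."""
    return all(a >= b for a, b in zip(units, units[1:]))

def architecture_grid(units=UNITS, n_recurrent=(1, 2), n_dense=(0, 1, 2)):
    """Enumerate LSTM-then-dense stacks that satisfy the funnel constraint."""
    grid = []
    for n_rec, n_den in product(n_recurrent, n_dense):
        # Recurrent layers come first, so the first n_rec entries are LSTM units
        for layer_units in product(units, repeat=n_rec + n_den):
            if is_funnel(layer_units):
                grid.append({
                    "lstm_units": list(layer_units[:n_rec]),
                    "dense_units": list(layer_units[n_rec:]),
                })
    return grid

arch_candidates = architecture_grid()
print(len(arch_candidates))  # 50
```

Under these constraints there are 50 candidate architectures, compared with 360 if the funnel rule were dropped.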

For optimization-related hyperparameters, we tune the following in a grid search:
* Learning Rate: 0.01, 0.001, or 0.0001
* Batch Size: 32, 64, or 128
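The optimization grid above is small enough to write out in full. The following sketch uses `itertools.product` for illustration; it is not the `optimization_grid` function from `models.moisture_rnn`, and the dictionary keys are assumed names.

```python
from itertools import product

learning_rates = [0.01, 0.001, 0.0001]
batch_sizes = [32, 64, 128]

# Full Cartesian product: every learning rate paired with every batch size
opt_candidates = [
    {"learning_rate": lr, "batch_size": bs}
    for lr, bs in product(learning_rates, batch_sizes)
]

print(len(opt_candidates))  # 9
```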

We use early stopping with a patience of 5, meaning that training is halted if accuracy on the validation set does not improve for 5 consecutive epochs. We set the maximum number of epochs to 100, which is larger than needed in practice, since early stopping typically halts training before that limit is reached.
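The patience rule can be illustrated without any deep learning framework. The sketch below assumes validation accuracy is tracked as an error value where lower is better; `stopping_epoch` is a hypothetical helper, not the callback used in training.

```python
def stopping_epoch(val_errors, patience=5, max_epochs=100):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation error has failed to improve on its
    best value for `patience` consecutive epochs, or when epochs run out.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, err in enumerate(val_errors[:max_epochs], start=1):
        if err < best:
            best = err
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return min(len(val_errors), max_epochs)

# Best error occurs at epoch 3; the next 5 epochs do not beat it
errors = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.73, 0.74, 0.6]
print(stopping_epoch(errors))  # 8
```

In this example the improvement at epoch 9 is never seen, which is the trade-off a small patience value accepts in exchange for shorter training runs.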

### Fixed Hyperparameters

Some hyperparameters are fixed to commonly accepted defaults. These include:

| Hyperparameter | Default | 
|---------------|---------| 
| **Features List** | (see paper description) | 
| **Sequence Length (aka Timesteps)** | `48` | 
| **LSTM Activation Function** | `tanh` | 
| **Dense Activation Function** | `relu` | 
| **Recurrent Activation Function (LSTM)** | `sigmoid` | 
| **Batch Normalization (On/Off)** | Off | 
| **Optimizer Type** | `Adam` | 
| **Dropout Rate (incl. Recurrent Dropout)** | `0.2` | 

## Setup

In [None]:
from sklearn.model_selection import ParameterGrid
from itertools import product
import sys
sys.path.append("../src")
from utils import Dict, read_yml
from models.moisture_rnn import model_grid, optimization_grid

In [None]:
# Full RNN Model Params
params_rnn = Dict(read_yml("../etc/params_models.yaml", subkey="rnn"))
params_rnn

In [None]:
# Hyperparam Tuning Setup
hyper_params = Dict(read_yml("../etc/rnn_hyperparam_tuning_config.yaml"))
hyper_params

In [None]:
hyper_params['optimization']

## Create Grids

In [None]:
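# Reload the module so that edits to models/moisture_rnn.py are picked up
# without restarting the kernel, then re-import the grid helpers
import importlib
import models.moisture_rnn
importlib.reload(models.moisture_rnn)
from models.moisture_rnn import optimization_grid, model_grid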

In [None]:
hyper_params['model_architecture']

In [None]:
model_params_grid = model_grid(hyper_params['model_architecture'])
model_params_grid

In [None]:
opt_grid = optimization_grid(hyper_params['optimization'])
opt_grid