## Hyperparameter tuning: changing the settings that control how the model is built
This notebook demonstrates how changing the hyperparameters of a random forest model changes its output.

The hyperparameters we will be changing are:
 - Number of trees: This is the number of decision trees that make up our forest.
 - Depth of trees: This is the maximum number of levels of splits we allow each tree to have. Without this parameter, the trees keep growing until every branch ends in a 'leaf' that cannot be split any further (in the extreme, a single data point).
 - Maximum features: This is how many features are considered when choosing the split at each node. If `None`, all features are available at every node during model training, and we lose the per-split feature randomness of a random forest.
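
As a minimal sketch of how these three settings show up in a fitted model, assuming the model behind `predict_cruise` is scikit-learn's `RandomForestRegressor` (the dictionary keys used in this notebook match its argument names, but the helper itself lives in `functions.ipynb` and is not shown here). The data below is synthetic and only illustrative; it is not the cruise dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption: NOT the cruise data used in this notebook).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 9))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=300)

hyperparameters = {'n_estimators': 200, 'max_depth': 12, 'max_features': 'sqrt'}
model = RandomForestRegressor(random_state=0, **hyperparameters).fit(X, y)

# Number of trees in the forest.
n_trees = len(model.estimators_)
# Depth of the deepest tree; it can be smaller than max_depth but never larger.
deepest = max(tree.get_depth() for tree in model.estimators_)
# Number of features each split is allowed to consider ('sqrt' of 9 features).
features_per_split = model.estimators_[0].max_features_

print(n_trees, deepest, features_per_split)
```

With 9 input features, `'sqrt'` means each split chooses among 3 randomly drawn features rather than all 9.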

In [1]:
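# Set the working directory to the root of the git repository
#!pip install GitPython
import git
import os

# Search upward from the current directory for the enclosing repository
repo = git.Repo('.', search_parent_directories=True)
os.chdir(repo.working_tree_dir)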

In [2]:
# Run the functions notebook so its helper functions are available here
%run 'cross_validation/functions.ipynb'

We are going to predict the biomass of cruise KM2010 with different hyperparameters to illustrate the differences.

In [3]:
import plotly.graph_objects as go
train_features, test_features, train_labels, test_labels = k_fold(features_pro, labels_pro, 8)

#### These are the hyperparameters we have been using so far:

In [4]:
hyperparameters={'n_estimators': 200, 'max_depth': 12, 'max_features': 'sqrt'}
predict_cruise(hyperparameters, 'pro')


#### Here we train the model with far fewer trees and a shallower tree depth

In [None]:
hyperparameters={'n_estimators': 40, 'max_depth': 4, 'max_features': 'sqrt'}
predict_cruise(hyperparameters,'pro')

#### Many trees but shallow depth

In [None]:
hyperparameters={'n_estimators': 400, 'max_depth': 4, 'max_features': 'sqrt'}
predict_cruise(hyperparameters, 'pro')

#### Few trees but large depth

In [None]:
hyperparameters={'n_estimators': 40, 'max_depth': 40, 'max_features': 'sqrt'}
predict_cruise(hyperparameters,'pro')

#### Many trees and large depth

In [None]:
# Use 400 trees here: with 40 this cell would repeat the "few trees" run above
hyperparameters = {'n_estimators': 400, 'max_depth': 40, 'max_features': 'sqrt'}
predict_cruise(hyperparameters, 'pro')

#### Moderate number of trees and depth, no max features
Setting the max features to `None` means every split can choose from all features, so we lose the feature randomization that makes the trees differ from one another. This is not recommended for predictions.

In [None]:
hyperparameters={'n_estimators': 80, 'max_depth': 10, 'max_features': None}
predict_cruise(hyperparameters,'pro')
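
One way to see what the feature randomization does is to measure how similar the individual trees' predictions are under different `max_features` settings. The sketch below does this on synthetic data, assuming a scikit-learn `RandomForestRegressor`; the helper `mean_tree_correlation` is hypothetical and not part of `functions.ipynb`. Trees that all see every feature would usually be expected to agree more with each other, but the actual numbers depend on the data, so none are quoted here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption: NOT the cruise data used in this notebook).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.2, size=400)
X_train, X_test, y_train = X[:300], X[300:], y[:300]

def mean_tree_correlation(max_features):
    """Average pairwise correlation between the trees' test-set predictions."""
    forest = RandomForestRegressor(n_estimators=80, max_depth=10,
                                   max_features=max_features, random_state=0)
    forest.fit(X_train, y_train)
    # One row of predictions per tree in the forest.
    per_tree = np.array([tree.predict(X_test) for tree in forest.estimators_])
    corr = np.corrcoef(per_tree)
    # Drop the diagonal (each tree correlates perfectly with itself).
    off_diagonal = corr[~np.eye(len(corr), dtype=bool)]
    return off_diagonal.mean()

correlations = {mf: mean_tree_correlation(mf) for mf in (None, 'sqrt', 2)}
for setting, value in correlations.items():
    print(f"max_features={setting!r}: mean tree-to-tree correlation {value:.3f}")
```

Highly correlated trees give the forest less to gain from averaging, which is the reason for restricting the features available at each split.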

#### Very low max features

In [None]:
hyperparameters={'n_estimators': 120, 'max_depth': 10, 'max_features': 2}
predict_cruise(hyperparameters, 'pro')

### Optional: this is a function that iterates through every combination of a defined set of hyperparameters and returns the combination with the best score for each test metric

This will take 5-20 minutes to run, depending on the hyperparameter grid you define.

This is how we first found hyperparameters for the model, but it is not the best way to tune your model. Understanding how each hyperparameter affects performance is very important.

In [None]:
param_grid = {
    'n_estimators': [60, 80, 120, 160],
    'max_depth': [6, 8, 12, 18],
    'max_features': ['sqrt'],
}

metrics = ['neg_root_mean_squared_error', 'r2']
# neg_root_mean_squared_error is the root mean squared error multiplied by -1.
# The search picks the highest score, so the error has to be negated for "best" to mean "lowest error".

grid_search_hyperparams(param_grid, metrics, features_pro, labels_pro)

Fitting 8 folds for each of 16 candidates, totalling 128 fits
Best hyperparameters for  neg_root_mean_squared_error are: {'max_depth': 8, 'max_features': 'sqrt', 'n_estimators': 160}
Fitting 8 folds for each of 16 candidates, totalling 128 fits
Best hyperparameters for  r2 are: {'max_depth': 6, 'max_features': 'sqrt', 'n_estimators': 120}
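
The `grid_search_hyperparams` helper is defined in `functions.ipynb` and is not shown in this notebook. The "Fitting 8 folds for each of 16 candidates" lines above look like output from scikit-learn's `GridSearchCV`, so the following is a rough sketch of what such a helper could look like under that assumption. The function name `grid_search_sketch`, the `folds` argument, and the synthetic data are all hypothetical, and the grid is kept small so it runs in seconds.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

def grid_search_sketch(param_grid, metrics, features, labels, folds=8):
    """Hypothetical stand-in for grid_search_hyperparams: one search per metric."""
    best = {}
    for metric in metrics:
        search = GridSearchCV(
            RandomForestRegressor(random_state=0),
            param_grid,
            scoring=metric,  # higher is always better for scikit-learn scorers
            cv=folds,        # number of cross-validation folds
        )
        search.fit(features, labels)
        best[metric] = (search.best_params_, search.best_score_)
    return best

# Synthetic stand-in data (assumption: NOT features_pro / labels_pro).
rng = np.random.default_rng(0)
X = rng.normal(size=(160, 5))
y = X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=160)

small_grid = {'n_estimators': [10, 20], 'max_depth': [3, 6], 'max_features': ['sqrt']}
results = grid_search_sketch(small_grid, ['neg_root_mean_squared_error', 'r2'],
                             X, y, folds=4)
for metric, (params, score) in results.items():
    print(metric, params, round(score, 3))
```

The number of fits is the number of grid combinations times the number of folds, which is why 16 candidates with 8 folds give 128 fits per metric in the output above. The two metrics can also disagree on the best combination, as they do there.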
