# Model Tuning
In this notebook we tune the hyperparameters of the **Random Forest** and **XGBoost** classifiers using **Random Search** with **K-Fold Cross-Validation** on a dataset split 80%/20%. See the utilities in the `wfml` class for details.
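To make the procedure concrete, here is a minimal standard-library sketch of the two ingredients: drawing a few random candidates from a discrete parameter grid, and partitioning sample indices into K folds. It is not the code inside `wfml.hyperparameter_tune`, which is not shown in this notebook. The helper names `sample_candidates` and `kfold_indices` and the toy grid are hypothetical, and the folds here are plain contiguous blocks with no shuffling or stratification.

```python
import itertools
import random


def sample_candidates(grid, n_iter, seed=0):
    """Draw up to n_iter distinct parameter combinations from a discrete grid."""
    keys = sorted(grid)
    combos = list(itertools.product(*(grid[k] for k in keys)))
    rng = random.Random(seed)
    picked = rng.sample(combos, min(n_iter, len(combos)))
    return [dict(zip(keys, combo)) for combo in picked]


def kfold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


# Toy grid for illustration only; not the grid tuned in this notebook.
toy_grid = {"max_depth": [4, 8, 16], "n_estimators": [50, 100]}
candidates = sample_candidates(toy_grid, n_iter=3)
folds = list(kfold_indices(n_samples=10, k=5))

# Each candidate is trained and validated once per fold.
n_fits = len(candidates) * len(folds)
print(f"{len(candidates)} candidates x {len(folds)} folds = {n_fits} fits")
```

In a real run, each candidate would be fitted on every `train` index set and scored on the matching `val` set, and the candidate with the best mean score would be kept.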

In [1]:
import import_ipynb
from ipynb.fs.full.WFML_Utilities import wfml

## XGBoost

In [2]:
wfml.hyperparameter_tune(
    n_jobs        = 2,  # Careful not to exhaust the CPU - check the capacity of the underlying infrastructure
    classifier    = 'XGB',
    cv_dimensions = [5, 10],
    parameters    = {  # Latest optimization
        'max_depth': range(30, 60, 2),
        'n_estimators': [120, 160, 200, 240, 280],
        'subsample': [1],
        'alpha': [5, 10, 20],
        'gamma': [0.01, 0.1, 1],
        'lambda': range(20, 40, 5),
        'scale_pos_weight': [80, 100, 120]
})


cv =  5
Fitting 5 folds for each of 5 candidates, totalling 25 fits
Elapsed Time: 00:01:15
Best Score: 0.990
Best Parameters: {'subsample': 1, 'scale_pos_weight': 80, 'n_estimators': 120, 'max_depth': 56, 'lambda': 35, 'gamma': 0.1, 'alpha': 20}
Fold 5 completed in 75.71 seconds.

cv =  10
Fitting 10 folds for each of 10 candidates, totalling 100 fits
Elapsed Time: 00:05:33
Best Score: 0.991
Best Parameters: {'subsample': 1, 'scale_pos_weight': 100, 'n_estimators': 280, 'max_depth': 34, 'lambda': 30, 'gamma': 1, 'alpha': 10}
Fold 10 completed in 333.07 seconds.
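
The log above reports 5 and 10 candidates per run, so it can help to see how small a share of the XGBoost grid that is. The sketch below only counts combinations. It assumes the grid is treated as a discrete Cartesian product of the listed values, which is how a list-valued random search is usually sampled; the variable names are illustrative.

```python
from math import prod

# Same value lists as the XGBoost cell above.
xgb_grid = {
    "max_depth": range(30, 60, 2),
    "n_estimators": [120, 160, 200, 240, 280],
    "subsample": [1],
    "alpha": [5, 10, 20],
    "gamma": [0.01, 0.1, 1],
    "lambda": range(20, 40, 5),
    "scale_pos_weight": [80, 100, 120],
}

# Size of the full Cartesian product of all value lists.
n_combinations = prod(len(values) for values in xgb_grid.values())
print(f"Full grid: {n_combinations} combinations")

# Share of the grid visited by each run reported in the log.
for n_candidates in (5, 10):
    share = n_candidates / n_combinations
    print(f"{n_candidates} candidates cover {share:.3%} of the grid")
```

With so few candidates, the two runs landing on different best parameters with nearly identical scores is unsurprising; more candidates per run would be needed to say which settings actually matter.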


## Random Forest

In [3]:
wfml.hyperparameter_tune(
    n_jobs        = 2,  # Careful not to exhaust the CPU - check the capacity of the underlying infrastructure
    classifier    = 'RF',
    cv_dimensions = [5, 10],
    parameters    = {  # Latest optimization
        "max_depth": [50, 100],
        "n_estimators": [250, 300, 400],
        "min_samples_split": [2, 4, 8],  # No effect here: min_samples_leaf >= 10 already requires >= 20 samples to split a node
        "min_samples_leaf": [10, 30],
        "criterion": ["gini", "entropy"],
        "max_features": [12, 20],
        "bootstrap": [True, False],
        "class_weight": [{0: 1, 1: 9}]
})


cv =  5
Fitting 5 folds for each of 5 candidates, totalling 25 fits
Elapsed Time: 00:20:54
Best Score: 0.991
Best Parameters: {'n_estimators': 300, 'min_samples_split': 2, 'min_samples_leaf': 30, 'max_features': 12, 'max_depth': 100, 'criterion': 'gini', 'class_weight': {0: 1, 1: 9}, 'bootstrap': False}
Fold 5 completed in 1254.79 seconds.

cv =  10
Fitting 10 folds for each of 10 candidates, totalling 100 fits
Elapsed Time: 01:12:58
Best Score: 0.992
Best Parameters: {'n_estimators': 250, 'min_samples_split': 8, 'min_samples_leaf': 30, 'max_features': 12, 'max_depth': 50, 'criterion': 'gini', 'class_weight': {0: 1, 1: 9}, 'bootstrap': False}
Fold 10 completed in 4378.75 seconds.
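
For a rough cost comparison between the two classifiers, the reported totals can be divided by the number of fits. This is a back-of-the-envelope sketch using only the figures printed in the logs above. The result is average wall-clock time per fit with `n_jobs = 2`, not CPU time per model, and it assumes the reported duration covers the whole search for that `cv` value.

```python
# (classifier, cv folds, total fits, reported seconds), copied from the logs above.
runs = [
    ("XGB", 5, 25, 75.71),
    ("XGB", 10, 100, 333.07),
    ("RF", 5, 25, 1254.79),
    ("RF", 10, 100, 4378.75),
]

# Average wall-clock seconds per fit for each run.
per_fit = {(name, cv): seconds / fits for name, cv, fits, seconds in runs}

for (name, cv), seconds in per_fit.items():
    print(f"{name:>3} cv={cv:<2} -> {seconds:6.2f} s per fit (wall clock)")

# How much slower a Random Forest fit is than an XGBoost fit at the same cv.
for cv in (5, 10):
    ratio = per_fit[("RF", cv)] / per_fit[("XGB", cv)]
    print(f"cv={cv}: RF is about {ratio:.0f}x slower per fit")
```

On these figures, the Random Forest search dominates the total tuning time while its best score is within 0.001 of the XGBoost one.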
