In [None]:
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 
import seaborn as sns 
import warnings 

from sklearn.model_selection import train_test_split

warnings.filterwarnings('ignore')

# Parameters VS Hyperparameters

## Parameters

**Parameters** are components of the model learned during the modeling process. 

You cannot set them. They are discovered by the algorithm.

In linear models, these parameters are the coefficients and the intercept. In tree-based algorithms, they are the node decisions (the feature and the value to split on).
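As a small illustration of learned parameters, the sketch below fits a linear regression on synthetic, noise-free data and reads back the coefficient and intercept. The data and the names `X_lin`, `y_lin` and `lin_reg` are made up for this example and are not part of the credit card dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data generated from y = 3x + 2 (illustrative only)
X_lin = np.arange(10, dtype=float).reshape(-1, 1)
y_lin = 3.0 * X_lin.ravel() + 2.0

lin_reg = LinearRegression()
lin_reg.fit(X_lin, y_lin)

# coef_ and intercept_ are parameters: they are learned during fit, not set by us
print(lin_reg.coef_, lin_reg.intercept_)
```

The trailing underscore in `coef_` and `intercept_` is the sklearn convention for attributes that only exist after fitting.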

## Hyperparameters

**Hyperparameters** are values you set before the modeling process starts.

The algorithm does not learn them during training.

To see the hyperparameters of a model, create one and print it out:


In [None]:
from sklearn.ensemble import RandomForestClassifier

rfc = RandomForestClassifier() 
rfc

To learn the meaning of each hyperparameter, check the sklearn documentation page for the estimator.

Some hyperparameters are more important than others. Some won't help model performance at all, for instance the ones that control what information is kept about the training, the verbosity level or the random state.
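Recent versions of scikit-learn print only the values that differ from the defaults, so printing a fresh estimator may show very little. A minimal sketch of an alternative, using `get_params()` to list every hyperparameter (the names `rfc_demo` and `hyperparams` are just for this example):

```python
from sklearn.ensemble import RandomForestClassifier

rfc_demo = RandomForestClassifier()

# get_params() returns a dict mapping each hyperparameter name to its current value
hyperparams = rfc_demo.get_params()
for name, value in sorted(hyperparams.items()):
    print(f"{name}: {value}")
```

Entries such as `verbose`, `n_jobs` and `random_state` show up in this list even though they do not affect how well the model can fit the data.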


In [None]:
df = pd.read_csv('../data/credit-card-full.csv')


# Hyperparameter Values

There is no one-size-fits-all choice. Sensible values depend on the algorithm and on the hyperparameter itself, although there are some best practices.

## Conflicting Values

Be aware of conflicting hyperparameter choices. This should be clearly stated in the sklearn documentation.
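One example of a conflict, sketched on synthetic data: a random forest can only compute an out-of-bag score when it draws bootstrap samples, so asking for `oob_score=True` together with `bootstrap=False` fails at fit time. The data and variable names here are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X_conf, y_conf = make_classification(n_samples=100, n_features=5, random_state=0)

# oob_score needs bootstrap samples, so these two settings contradict each other
conflicting_rf = RandomForestClassifier(n_estimators=10, bootstrap=False, oob_score=True)

conflict_message = None
try:
    conflicting_rf.fit(X_conf, y_conf)
except ValueError as err:
    # sklearn rejects the combination when fit is called
    conflict_message = str(err)

print(conflict_message)
```

Note that the estimator is created without complaint; the error only appears when `fit` is called.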

## Silly Values

Some values are unlikely to yield good results for a given algorithm: a very low number of trees in a random forest, a single neighbor in KNN, or candidate values that differ from each other by only a tiny increment.

# Automating Hyperparameter Choice

The choice can be automated with a for loop over a list of candidate values, storing the score of each one, but this becomes unmanageable once we consider several hyperparameters at the same time.
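A minimal sketch of the single-hyperparameter loop, assuming a KNN classifier and synthetic data (neither comes from this notebook's dataset):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X_loop, y_loop = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_loop, y_loop, test_size=0.3, random_state=0)

# Candidate values for the single hyperparameter we tune
neighbors_list = [3, 5, 10, 20]

accuracy_by_k = {}
for k in neighbors_list:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_tr, y_tr)
    # Store the test accuracy obtained with this value
    accuracy_by_k[k] = knn.score(X_te, y_te)

print(accuracy_by_k)
```

Adding a second hyperparameter would require a second nested loop, a third one a third loop, and so on.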

## Grid Search 

We could nest one loop per hyperparameter, but the number of models to train then grows exponentially with the number of hyperparameters.

Grid search consists of training a model for each combination of values of the different hyperparameters.
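To get a feel for the cost, the sketch below uses sklearn's `ParameterGrid` to enumerate the combinations of a small illustrative grid. The grid and the fold count are assumptions for this example.

```python
from sklearn.model_selection import ParameterGrid

# Illustrative grid: 4 depths x 2 feature strategies
demo_grid = {'max_depth': [2, 4, 8, 15], 'max_features': ['sqrt', 'log2']}

combinations = list(ParameterGrid(demo_grid))
n_folds = 5

print(len(combinations))            # number of hyperparameter combinations
print(len(combinations) * n_folds)  # number of model fits with 5-fold cross validation
```

Each extra hyperparameter multiplies the number of combinations by the number of values it can take.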

### Pros

You don't have to write thousands of lines of code.

The best model within the grid values is guaranteed to be found (unless the grid contains silly or conflicting values).

### Cons 

Computationally expensive

It's uninformed: each model doesn't take into account the outcome of previous trainings.

## Grid Search in SciKit Learn

Steps to perform grid search:
1. Choose an algorithm whose hyperparameters you want to tune (the estimator)
2. Define the hyperparameters to be tuned
3. Define a range of values for each hyperparameter
4. Set a cross validation scheme
5. Define a scoring function
6. Include extra useful information or functions

[sklearn GridSearchCV documentation](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html)

In [None]:
from sklearn.model_selection import GridSearchCV

# Create a Random Forest Classifier with specified criterion
rf_class = RandomForestClassifier(criterion='entropy')

# Create the parameter grid
# ('auto' was removed from max_features in scikit-learn 1.3, so 'log2' is used instead)
param_grid = {'max_depth': [2, 4, 8, 15], 'max_features': ['sqrt', 'log2']}

# Create a GridSearchCV object
grid_rf_class = GridSearchCV(
    estimator=rf_class,
    param_grid=param_grid,
    scoring='roc_auc',
    n_jobs=4,
    cv=5,
    refit=True,
    return_train_score=True,
)
print(grid_rf_class)

After fitting, a GridSearchCV object exposes:
- A results log (*.cv_results_*)
- The best results (*.best_index_*, *.best_params_* and *.best_score_*)
- Extra Information (*.scorer_*, *.n_splits_* and *.refit_time_*)
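A minimal sketch of how these attributes can be inspected after fitting, assuming synthetic data and a deliberately small grid so that it runs quickly (none of the names below come from the notebook's dataset):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X_gs, y_gs = make_classification(n_samples=200, n_features=8, random_state=0)

demo_search = GridSearchCV(
    estimator=RandomForestClassifier(n_estimators=20, random_state=0),
    param_grid={'max_depth': [2, 4], 'max_features': ['sqrt', 'log2']},
    scoring='roc_auc',
    cv=3,
    refit=True,
    return_train_score=True,
)
demo_search.fit(X_gs, y_gs)

# Results log: one row per hyperparameter combination
cv_results_df = pd.DataFrame(demo_search.cv_results_)
print(cv_results_df[['params', 'mean_test_score', 'rank_test_score']])

# Best results: best_index_ is the row of cv_results_ holding the best combination
print(demo_search.best_index_, demo_search.best_params_, demo_search.best_score_)

# Extra information
print(demo_search.scorer_, demo_search.n_splits_, demo_search.refit_time_)
```

Because `refit=True`, the search object can also be used directly as the best model, for example through `demo_search.predict`.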

# Random Search 

Random search is very similar to grid search in terms of setup, but instead of training a model for every single combination of hyperparameter values, it randomly samples a fixed number of combinations.



In [None]:
from sklearn.model_selection import RandomizedSearchCV

# Create a Random Forest Classifier with specified criterion
rf_class = RandomForestClassifier(criterion='entropy')

# Create the parameter grid
# ('auto' was removed from max_features in scikit-learn 1.3, so 'log2' is used instead)
param_grid = {'max_depth': [2, 4, 8, 15], 'max_features': ['sqrt', 'log2']}

# Create a RandomizedSearchCV object
# n_iter must stay below the 8 possible combinations,
# otherwise the search falls back to trying all of them
rand_rf_class = RandomizedSearchCV(
    estimator=rf_class,
    param_distributions=param_grid,
    scoring='roc_auc',
    n_jobs=4,
    cv=5,
    refit=True,
    return_train_score=True,
    n_iter=5,
)
print(rand_rf_class)
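Random search becomes more useful when the candidate values are given as distributions rather than short lists. The sketch below is one way to do that with `scipy.stats.randint`; the synthetic data, the chosen ranges and the variable names are assumptions for illustration.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X_rs, y_rs = make_classification(n_samples=200, n_features=8, random_state=0)

# randint(low, high) samples integers from low up to high - 1
demo_distributions = {
    'max_depth': randint(2, 16),
    'min_samples_leaf': randint(1, 11),
}

demo_rand_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(n_estimators=20, random_state=0),
    param_distributions=demo_distributions,
    n_iter=6,          # number of sampled combinations
    scoring='roc_auc',
    cv=3,
    random_state=0,    # makes the sampled combinations reproducible
)
demo_rand_search.fit(X_rs, y_rs)

sampled_params = demo_rand_search.cv_results_['params']
print(len(sampled_params))
print(demo_rand_search.best_params_)
```

The fitted object exposes the same attributes as a grid search (`cv_results_`, `best_params_`, `best_score_` and so on).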

# Informed Search 

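An informed search uses the results of earlier trials to decide where to look next. One example is coarse-to-fine tuning: an initial random search identifies the most promising regions, and a grid search then explores those regions in detail.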
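A rough sketch of the idea of a random search followed by a narrower grid search, assuming synthetic data and a single hyperparameter (`max_depth`). The window of two values on either side of the best coarse result is an arbitrary choice for this example.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X_cf, y_cf = make_classification(n_samples=200, n_features=8, random_state=0)
base_rf = RandomForestClassifier(n_estimators=20, random_state=0)

# Coarse step: a few random draws over a wide range of depths (2 to 30)
coarse = RandomizedSearchCV(base_rf, {'max_depth': randint(2, 31)},
                            n_iter=5, cv=3, random_state=0)
coarse.fit(X_cf, y_cf)
best_depth = coarse.best_params_['max_depth']

# Fine step: exhaustive grid in a narrow window around the best coarse value
fine_depths = [d for d in range(best_depth - 2, best_depth + 3) if d >= 1]
fine = GridSearchCV(base_rf, {'max_depth': fine_depths}, cv=3)
fine.fit(X_cf, y_cf)

print(best_depth, fine.best_params_)
```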

## Bayesian Hyperparameter Tuning

It applies Bayes' theorem, updating beliefs about which hyperparameter values work well as evidence on model performance accumulates.

The library [Hyperopt](https://hyperopt.github.io/hyperopt/) implements it.

In [None]:
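from hyperopt import fmin, hp, tpe
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# NOTE: X_train and y_train are assumed to exist already, e.g. from a
# train_test_split of df; this notebook does not define them.

# Set up space dictionary with specified hyperparameters
# (hp.quniform is assumed here: the original call name was missing, and the
# four arguments match its signature of label, low, high, q)
space = {
    'max_depth': hp.quniform('max_depth', 2, 10, 2),
    'learning_rate': hp.uniform('learning_rate', 0.001, 0.9),
}

# Set up objective function: hyperopt minimizes, so return 1 - accuracy
def objective(params):
    # quniform samples floats, so max_depth is cast to int
    params = {'max_depth': int(params['max_depth']), 'learning_rate': params['learning_rate']}
    gbm_clf = GradientBoostingClassifier(n_estimators=100, **params)
    best_score = cross_val_score(gbm_clf, X_train, y_train, scoring='accuracy', cv=2, n_jobs=4).mean()
    loss = 1 - best_score
    return loss

# Run the algorithm
best = fmin(fn=objective, space=space, max_evals=20, rstate=np.random.default_rng(42), algo=tpe.suggest)
print(best)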

## Genetic Algorithms

[TPOT](https://epistasislab.github.io/tpot/) implements this