# Optimization
**__Article__**:

https://towardsdatascience.com/hyperparameter-optimization-intro-and-implementation-of-grid-search-random-search-and-bayesian-b2f16c00578a

These are my notes on the Towards Data Science article on hyperparameter optimization linked above.

## Grid Search
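* Grid search is a brute-force method that loops through every combination of hyperparameters in the given search space.
* It works, but it's time-consuming: the search space below has 4 × 5 × 3 × 3 = 180 combinations, and each one is cross-validated with 5 folds.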
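
To make the brute-force loop concrete, here is a minimal standard-library sketch that enumerates the same search space used in the cell below and counts the work involved. It only mirrors the idea and is not how `GridSearchCV` is implemented; the helper name `enumerate_grid` and the variable `cv_folds` are my own assumptions for illustration.

```python
from itertools import product

# Same search space as the GridSearchCV cell in this notebook
search_space = {'n_estimators': [10, 100, 500, 1000],
                'max_depth': [2, 10, 25, 50, 100],
                'min_samples_split': [2, 5, 10],
                'min_samples_leaf': [1, 5, 10]}


def enumerate_grid(space):
    """Return every combination of values as a list of dicts (hypothetical helper)."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


grid = enumerate_grid(search_space)
cv_folds = 5  # assumption: matches cv=5 used in the notebook
n_cv_fits = len(grid) * cv_folds  # excludes any final refit on the full data

print(f"combinations: {len(grid)}")
print(f"cross-validation fits: {n_cv_fits}")
```

Every extra value added to any of the lists multiplies the total, which is why grid search scales poorly as the search space grows.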

In [17]:
# Import libraries
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
import time

# Load Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Define the hyperparameter search space
search_space = {'n_estimators': [10, 100, 500, 1000],
              'max_depth': [2, 10, 25, 50, 100],
              'min_samples_split': [2, 5, 10],
              'min_samples_leaf': [1, 5, 10]}

# Define the random forest classifier
clf = RandomForestClassifier(random_state=1234)

# Create the optimizer object
optimizer = GridSearchCV(clf, search_space, cv=5, scoring='accuracy')

# Store start time to calculate total elapsed time
start_time = time.time()

# Fit the optimizer on the data
optimizer.fit(X, y)

# Store end time to calculate total elapsed time
end_time = time.time()

# Print the best set of hyperparameters and corresponding score
print("selected hyperparameters:")
print(optimizer.best_params_)
print("")
print(f"best_score: {optimizer.best_score_}")
print(f"elapsed_time: {round(end_time-start_time, 1)}")

selected hyperparameters:
{'max_depth': 10, 'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 500}

best_score: 0.9666666666666668
elapsed_time: 211.0


## Random Search
* Random search samples hyperparameter combinations at random from the given search space instead of trying every combination.
* It can be much faster than grid search, but it may miss the best combination because only a subset is evaluated (here `n_iter=50` of the 180 combinations).
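
As a rough illustration of the sampling step, the sketch below draws 50 distinct combinations from the same search space using only the standard library. It draws without replacement, which, as far as I know, is also what `RandomizedSearchCV` does when every parameter is given as a list. The helper `sample_combinations` is hypothetical, and the draws will not match the ones scikit-learn makes with the same seed.

```python
import math
import random
from itertools import product

# Same search space as the grid search cell in this notebook
search_space = {'n_estimators': [10, 100, 500, 1000],
                'max_depth': [2, 10, 25, 50, 100],
                'min_samples_split': [2, 5, 10],
                'min_samples_leaf': [1, 5, 10]}


def sample_combinations(space, n_iter, seed):
    """Draw n_iter distinct combinations at random (hypothetical helper)."""
    names = list(space)
    all_combos = [dict(zip(names, values)) for values in product(*space.values())]
    rng = random.Random(seed)  # local generator so the draw is reproducible
    return rng.sample(all_combos, n_iter)  # sampling without replacement


total = math.prod(len(values) for values in search_space.values())
candidates = sample_combinations(search_space, n_iter=50, seed=1234)

print(f"sampled {len(candidates)} of {total} combinations")
print(f"first candidate: {candidates[0]}")
```

Only the sampled candidates are ever fitted, so the cost is fixed by `n_iter` rather than by the size of the search space.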

In [18]:
# Import libraries
from sklearn.model_selection import RandomizedSearchCV

# Create a RandomizedSearchCV object
optimizer = RandomizedSearchCV(clf, param_distributions=search_space,
                               n_iter=50, cv=5, scoring='accuracy',
                               random_state=1234)

# Store start time to calculate total elapsed time
start_time = time.time()

# Fit the optimizer on the data
optimizer.fit(X, y)

# Store end time to calculate total elapsed time
end_time = time.time()

# Print the best set of hyperparameters and corresponding score
print("selected hyperparameters:")
print(optimizer.best_params_)
print("")
print(f"best_score: {optimizer.best_score_}")
print(f"elapsed_time: {round(end_time-start_time, 1)}")

selected hyperparameters:
{'n_estimators': 500, 'min_samples_split': 2, 'min_samples_leaf': 1, 'max_depth': 100}

best_score: 0.9666666666666668
elapsed_time: 68.3


## Bayesian Optimization
* Bayesian optimization uses a probabilistic model to "learn" from previously evaluated hyperparameter combinations and to decide which combination to try next.
* Broken into 4 steps:
    1. Define a prior model: a probabilistic model of what we currently believe about how the score depends on the hyperparameters.
    2. Evaluate the actual model (train and score it) for a sample of hyperparameters.
    3. Update the prior model to a posterior model using the observed score. Think y and y_hat: the model's prediction is corrected by what was actually observed.
    4. Repeat until y and y_hat converge, resources are exhausted, or some other pre-defined stopping criterion is met.
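
The four steps above can be sketched on a toy one-dimensional problem. Everything here is an assumption for illustration: `toy_objective` stands in for a cross-validated score, the surrogate is a zero-mean Gaussian process with an RBF kernel written directly in NumPy, and the next point is chosen with an upper-confidence-bound rule controlled by `kappa`. `BayesSearchCV` may use a different surrogate and acquisition function internally, so treat this as a sketch of the idea rather than of that library.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_seen, y_seen, x_grid, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP on x_grid."""
    k_ss = rbf_kernel(x_seen, x_seen) + noise * np.eye(len(x_seen))
    k_sg = rbf_kernel(x_seen, x_grid)
    solved = np.linalg.solve(k_ss, k_sg)
    mean = solved.T @ y_seen
    var = 1.0 - np.sum(k_sg * solved, axis=0)  # prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))


def toy_objective(x):
    """Hypothetical stand-in for a validation score, highest at x = 0.7."""
    return -(x - 0.7) ** 2


def bayes_optimize(objective, n_iter=10, kappa=2.0):
    x_grid = np.linspace(0.0, 1.0, 101)  # candidate hyperparameter values
    # Step 1: the prior is the zero-mean GP; start from two evaluated points
    x_seen = np.array([0.0, 1.0])
    y_seen = objective(x_seen)
    for _ in range(n_iter):  # Step 4: repeat for a fixed budget
        # Step 3: posterior given everything observed so far
        mean, std = gp_posterior(x_seen, y_seen, x_grid)
        # Pick the candidate with the highest optimistic estimate
        x_next = x_grid[np.argmax(mean + kappa * std)]
        # Step 2: evaluate the objective at the chosen point
        x_seen = np.append(x_seen, x_next)
        y_seen = np.append(y_seen, objective(x_next))
    best = np.argmax(y_seen)
    return x_seen[best], y_seen[best]


best_x, best_y = bayes_optimize(toy_objective)
print(f"best x: {best_x:.2f}, best score: {best_y:.4f}")
```

The `kappa` parameter trades exploration against exploitation: larger values favor candidates where the surrogate is uncertain, smaller values favor candidates where the predicted score is already high.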

In [20]:
# Import libraries
from skopt import BayesSearchCV

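# Perform Bayesian Optimization
# Note: this RandomForestClassifier has no random_state (unlike clf above),
# so scores can vary between runs; the random_state below seeds only the search
optimizer = BayesSearchCV(estimator=RandomForestClassifier(),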
                          search_spaces=search_space,
                          n_iter=10,
                          cv=5,
                          scoring='accuracy',
                          random_state=1234)

# Store start time to calculate total elapsed time
start_time = time.time()

optimizer.fit(X, y)

# Store end time to calculate total elapsed time
end_time = time.time()

# Print the best set of hyperparameters and corresponding score
print("selected hyperparameters:")
print(optimizer.best_params_)
print("")
print(f"best_score: {optimizer.best_score_}")
print(f"elapsed_time: {round(end_time-start_time, 1)}")

selected hyperparameters:
OrderedDict([('max_depth', 25), ('min_samples_leaf', 5), ('min_samples_split', 5), ('n_estimators', 10)])

best_score: 0.9666666666666668
elapsed_time: 19.0
