Grid Search

- Grid search is a method of exhaustively searching through a collection of possible parameter values.
- For example, if you have 2 hyperparameters you would like to tune, and 4 possible values for each hyperparameter, then a grid search over that parameter space would try all 16 possible parameter configurations.
- Number of models = the number of distinct values per hyperparameter, multiplied together across all hyperparameters.
- In grid search you try every parameter configuration, evaluate some metric for each configuration, and pick the configuration that gives you the best value for the metric, which in our case is the root mean squared error (RMSE).
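
The counting rule above can be checked with a few lines of standard-library Python. This is only a sketch: the grid values are the ones used in the next cell, while `n_models` and `configurations` are names made up for this illustration and are not part of scikit-learn or XGBoost.

```python
from itertools import product
from math import prod

# Same grid as the GridSearchCV cell: 4 x 1 x 3 candidate values
param_grid = {
    'learning_rate': [0.01, 0.1, 0.5, 0.9],
    'n_estimators': [200],
    'subsample': [0.3, 0.5, 0.9],
}

# Number of models = product of the number of values per hyperparameter
n_models = prod(len(values) for values in param_grid.values())

# Enumerate every combination explicitly, one dict per configuration
names = list(param_grid)
configurations = [dict(zip(names, combo)) for combo in product(*param_grid.values())]

print(n_models)             # 12
print(len(configurations))  # 12
print(configurations[0])
```

With 4-fold cross-validation each of the 12 configurations is fitted 4 times, which is where the 48 fits reported by `GridSearchCV` come from.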

In [1]:
import pandas as pd
import xgboost as xgb
import numpy as np
from sklearn.model_selection import GridSearchCV


In [3]:
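# Load the Ames housing data; the last column is the target, the rest are features
housing_data = pd.read_csv('../data/ames_housing.csv')
X, y = housing_data[housing_data.columns[:-1]], housing_data[housing_data.columns[-1]]
# DMatrix is XGBoost's native data container; GridSearchCV below works on X and y directly
housing_dmatrix = xgb.DMatrix(data=X, label=y)

# 4 learning rates x 1 n_estimators value x 3 subsample values = 12 configurations
gbm_params_grid = {'learning_rate': [0.01, 0.1, 0.5, 0.9], 'n_estimators': [200], 'subsample': [0.3, 0.5, 0.9]}

gbm = xgb.XGBRegressor()
# scikit-learn maximizes scores, so the MSE is exposed as its negative
grid_mse = GridSearchCV(estimator=gbm, param_grid=gbm_params_grid, scoring='neg_mean_squared_error', cv=4, verbose=2)

grid_mse.fit(X, y)

print('Best parameters found: ', grid_mse.best_params_)
# best_score_ is a negative MSE: take the absolute value, then the square root, to report RMSE
print('Lowest RMSE found: ', np.sqrt(np.abs(grid_mse.best_score_)))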

Fitting 4 folds for each of 12 candidates, totalling 48 fits
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.3; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.3; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.3; total time=   0.1s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.3; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.5; total time=   0.1s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.5; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.5; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.5; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.9; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.9; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, subsample=0.9; total time=   0.2s
[CV] END learning_rate=0.01, n_estimators=200, s

Random Search

- Random search differs significantly from grid search in that the number of models you are required to iterate over does not grow as you expand the overall hyperparameter space.
- Create a (possibly infinite) range of values for each hyperparameter that you would like to search over.
- In random search you decide how many models, or iterations, you want to try before stopping.
- During each iteration, random search simply draws a random combination of possible values for the hyperparameters being searched over, then trains and evaluates a model with those hyperparameters.
- After reaching the maximum number of iterations, select the hyperparameter configuration with the best evaluated score.
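
The loop described above can be sketched in plain Python. This is a minimal illustration, not how `RandomizedSearchCV` is implemented: `random_search` and `toy_rmse` are hypothetical names, and `toy_rmse` is an invented stand-in for training a model and measuring its cross-validated RMSE.

```python
import random


def random_search(param_space, evaluate, n_iter, seed=0):
    """Draw n_iter random configurations and keep the one with the lowest score."""
    rng = random.Random(seed)
    best_params, best_score = None, float('inf')
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently and uniformly
        params = {name: rng.choice(values) for name, values in param_space.items()}
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_rmse(params):
    # Hypothetical objective: stands in for "train and evaluate a model"
    return (params['learning_rate'] - 0.1) ** 2 + (params['subsample'] - 0.8) ** 2


param_space = {
    'learning_rate': [round(0.05 * i, 2) for i in range(1, 21)],
    'subsample': [round(0.05 * i, 2) for i in range(1, 21)],
}

# 400 possible combinations, but only 25 are ever evaluated
best_params, best_score = random_search(param_space, toy_rmse, n_iter=25)
print(best_params, best_score)
```

The cost is fixed by `n_iter`, so widening `param_space` makes the search sparser rather than slower.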

In [5]:
from sklearn.model_selection import RandomizedSearchCV

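# Candidate values in steps of 0.05 for learning_rate and subsample; n_estimators stays fixed
gbm_params_grid = {'learning_rate': np.arange(0.05, 1.05, .05), 'n_estimators': [200],
                   'subsample': np.arange(0.05, 1.05, 0.05)}
gbm = xgb.XGBRegressor()
# n_iter=25 caps the search at 25 randomly drawn configurations, whatever the size of the space
randomized_rmse = RandomizedSearchCV(estimator=gbm, param_distributions=gbm_params_grid, n_iter=25,
                                     scoring='neg_mean_squared_error', cv=4, verbose=1)
randomized_rmse.fit(X, y)
print('Best parameters found: ', randomized_rmse.best_params_)
# best_score_ is a negative MSE: take the absolute value, then the square root, to report RMSE
print('Lowest RMSE found: ', np.sqrt(np.abs(randomized_rmse.best_score_)))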

Fitting 4 folds for each of 25 candidates, totalling 100 fits
Best parameters found:  {'subsample': 0.2, 'n_estimators': 200, 'learning_rate': 0.05}
Lowest RMSE found:  28928.97143360788


Limits of GridSearchCV and RandomizedSearchCV
- Grid Search
  - The number of models you must build grows quickly with every additional hyperparameter.

- Random Search
  - The parameter space to explore can be massive.
  - Randomly jumping throughout the space looking for a best result becomes a waiting game.
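
To put rough numbers on how quickly a grid grows, the sketch below assumes 4 candidate values per hyperparameter and 4-fold cross-validation, matching the figures used earlier in this notebook. The variable names are illustrative only.

```python
values_per_param = 4  # assumed: 4 candidate values for every hyperparameter
cv_folds = 4          # same number of folds as the cells above

# Grid size is values_per_param ** (number of hyperparameters)
grid_sizes = {k: values_per_param ** k for k in range(1, 7)}

for k, n_models in grid_sizes.items():
    print(f"{k} hyperparameters: {n_models} models, {n_models * cv_folds} fits")
```

Each added hyperparameter multiplies the number of fits by 4, whereas a random search with a fixed `n_iter` keeps the same budget and simply covers a smaller fraction of the space.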
     