# Hyperparameter tuning 

Some models have many hyperparameters, and some of those hyperparameters can take an infinite number of possible values.

It is, of course, impossible to check every possibility manually to find the best hyperparameters for a model. To search more efficiently, we can use tools such as Randomized Search Cross Validation (`RandomizedSearchCV`) or Grid Search Cross Validation (`GridSearchCV`).

Examples:
* hyperparameter tuning on ridge/lasso regression and choosing the right value of alpha,
* choosing the right value of n_neighbors in knn model.
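
The second example can be sketched as follows. This is a minimal illustration, not a recommended setup: the iris dataset and the candidate range 1 to 15 for `n_neighbors` are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier

# Illustrative dataset (an assumption; any classification data would do)
X, y = load_iris(return_X_y=True)

# Candidate values for n_neighbors (range chosen only for illustration)
param_grid = {"n_neighbors": list(range(1, 16))}

# 5-fold cross-validation with shuffling, as in the rest of this notebook
kf = KFold(n_splits=5, shuffle=True, random_state=42)

knn_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=kf)
knn_search.fit(X, y)

print("Best n_neighbors:", knn_search.best_params_["n_neighbors"])
print("Best CV accuracy:", knn_search.best_score_)
```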

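While choosing the best hyperparameters we use cross-validation to avoid overfitting to a single validation split. We can also do a traditional train/test split beforehand, so that the tuned model is evaluated on data that was not used during tuning.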
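
Combining a hold-out split with cross-validated tuning might look like the sketch below. The diabetes dataset, the 80/20 split, and the list of `alpha` values are all assumptions made for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, train_test_split

# Illustrative dataset (an assumption)
X, y = load_diabetes(return_X_y=True)

# Hold out 20% of the data; it is never seen during tuning
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate alphas (values chosen only for illustration)
param_grid = {"alpha": [0.001, 0.01, 0.1, 1.0, 10.0]}
kf = KFold(n_splits=5, shuffle=True, random_state=42)

# Cross-validation runs on the training portion only
search = GridSearchCV(Ridge(), param_grid, cv=kf)
search.fit(X_train, y_train)

# Score the refitted best model (R^2 for Ridge) on the held-out test set
test_r2 = search.score(X_test, y_test)
print("Best alpha:", search.best_params_["alpha"])
print("Mean CV R^2:", search.best_score_)
print("Test R^2:", test_r2)
```

The test score is the more honest estimate of performance, because the cross-validation score was itself used to pick `alpha`.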

In [5]:
from sklearn.datasets import load_boston
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.linear_model import Ridge
import numpy as np

# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version
X, y = load_boston(return_X_y=True)
ridge = Ridge()


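# Define the hyperparameter grid
param_grid = {
    # Note: np.arange(0.0001, 1, 10) uses a step of 10, so it yields a single value (0.0001).
    # np.linspace(0.0001, 1, 10) would give 10 evenly spaced values instead;
    # the output shown for this cell comes from the single-value grid.
    'alpha': np.arange(0.0001, 1, 10),
    'solver': ['auto', 'sag', 'cholesky', 'lsqr', 'sparse_cg']
}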

# Create the GridSearchCV object with Ridge regression and k-fold cross-validation
grid_search = GridSearchCV(estimator=ridge, 
                           param_grid=param_grid, 
                           cv=KFold(n_splits=5,
                                   shuffle=True,
                                   random_state=42))

# Fit the model to the data
grid_search.fit(X, y)

# Print the best hyperparameters and corresponding score
print("Best Hyperparameters: ", grid_search.best_params_)
print("Best Score: ", grid_search.best_score_)

Best Hyperparameters:  {'alpha': 0.0001, 'solver': 'auto'}
Best Score:  0.7152217272017058


**Grid Search limitations**

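The main problem with Grid Search is its scaling. The number of fits equals the number of hyperparameter combinations (the product of the number of candidate values for each hyperparameter) multiplied by the number of folds. For example, 3 values of one hyperparameter and 4 values of another with 5-fold cross-validation require 3 × 4 × 5 = 60 fits.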
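
The fit count can be checked before running a search with scikit-learn's `ParameterGrid`, which enumerates every combination in a grid. The grid below is a hypothetical one chosen for illustration.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical grid: 3 alphas x 4 solvers
param_grid = {
    "alpha": [0.01, 0.1, 1.0],
    "solver": ["auto", "cholesky", "lsqr", "sparse_cg"],
}
n_splits = 5

# Number of distinct hyperparameter combinations
n_candidates = len(ParameterGrid(param_grid))

# Each candidate is fitted once per fold
# (GridSearchCV adds one final refit on all data when refit=True)
n_fits = n_candidates * n_splits

print("Candidates:", n_candidates)
print("Cross-validation fits:", n_fits)
```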

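To make the search for hyperparameters more efficient we can use RandomizedSearchCV, which evaluates only a fixed number of randomly sampled combinations (set by `n_iter`) instead of the whole grid.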

In [7]:
from sklearn.datasets import load_boston
from sklearn.model_selection import RandomizedSearchCV, KFold
from sklearn.linear_model import Ridge
import numpy as np

# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version
X, y = load_boston(return_X_y=True)
ridge = Ridge()


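# Define the hyperparameter search space
param_grid = {
    # Note: np.arange(0.0001, 1, 10) uses a step of 10, so it yields a single value (0.0001).
    # np.linspace(0.0001, 1, 10) would give 10 evenly spaced values instead;
    # the output shown for this cell comes from the single-value grid.
    'alpha': np.arange(0.0001, 1, 10),
    'solver': ['auto', 'sag', 'cholesky', 'lsqr', 'sparse_cg']
}
# Define the number of folds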
kf = KFold(n_splits=5, shuffle=True, random_state=42)

r_cv = RandomizedSearchCV(ridge, param_grid, cv=kf, n_iter=2)

# Fit the model to the data
r_cv.fit(X, y)

# Print the best hyperparameters and corresponding score
print("Best Hyperparameters: ", r_cv.best_params_)
print("Best Score: ", r_cv.best_score_)

Best Hyperparameters:  {'solver': 'auto', 'alpha': 0.0001}
Best Score:  0.7152217272017058
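
RandomizedSearchCV also accepts distributions instead of fixed lists, which suits hyperparameters with infinitely many possible values such as `alpha`. The sketch below is an illustration under assumptions: the diabetes dataset, the log-uniform range from 1e-4 to 10, and `n_iter=20` are not taken from the cells above.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, RandomizedSearchCV

# Illustrative dataset (an assumption)
X, y = load_diabetes(return_X_y=True)

# alpha is sampled from a continuous log-uniform distribution;
# solver is sampled uniformly from the list
param_distributions = {
    "alpha": loguniform(1e-4, 1e1),
    "solver": ["auto", "cholesky", "lsqr", "sparse_cg"],
}

kf = KFold(n_splits=5, shuffle=True, random_state=42)

# n_iter fixes the number of sampled candidates: 20 candidates x 5 folds
r_search = RandomizedSearchCV(
    Ridge(),
    param_distributions,
    n_iter=20,
    cv=kf,
    random_state=42,  # makes the sampled candidates reproducible
)
r_search.fit(X, y)

print("Best Hyperparameters: ", r_search.best_params_)
print("Best Score: ", r_search.best_score_)
```

Here the cost of the search is controlled by `n_iter` rather than by the size of the search space.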
