## Hyperparameter Search Methods

A model's hyperparameters affect its accuracy, and choosing and adjusting their values is usually called hyperparameter tuning.

In practice, a person usually defines the range of each hyperparameter, and the machine then searches within that range. This article demonstrates the four basic hyperparameter search methods supported in scikit-learn:

- GridSearch
- RandomizedSearch
- HalvingGridSearch
- HalvingRandomSearch

In [14]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold  # for K-fold cross-validation
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn import metrics

### Initial Model
As a baseline for comparison, we first train a random forest with default hyperparameters and compute its accuracy on the test set:

In [15]:
# Load the data
df = pd.read_csv('heart.csv')
X = df.drop(columns=['output'])
y = df['output']

# Split into train and test sets, stratified by label.
# No random_state is set, so the split and the scores vary between runs.
x_train, x_test, y_train, y_test = train_test_split(X, y, stratify=y)

# Train the baseline model and report its test accuracy
clf = RandomForestClassifier(random_state=0)
clf.fit(x_train, y_train)
clf.score(x_test, y_test)

0.8289473684210527

### GridSearch
GridSearch is the most basic hyperparameter search method: it evaluates every combination in the hyperparameter grid and keeps the one with the best cross-validated score.

The code below searches over 4 hyperparameters, and the grid contains 5 * 3 * 2 * 3 = 90 combinations. Each combination is evaluated with 5-fold cross-validation (the `GridSearchCV` default), so the search performs 90 * 5 = 450 model fits.
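The size of a grid can be checked before launching a search. The following is a minimal sketch that repeats the grid from this section and counts its combinations with `ParameterGrid`; the variable names `n_candidates`, `n_folds` and `n_fits` are illustrative and not part of the original notebook.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as used in this section
parameters = {
    'max_depth': [2, 4, 5, 6, 7],
    'min_samples_leaf': [1, 2, 3],
    'min_weight_fraction_leaf': [0, 0.1],
    'min_impurity_decrease': [0, 0.1, 0.2],
}

n_candidates = len(ParameterGrid(parameters))  # 5 * 3 * 2 * 3 combinations
n_folds = 5                                    # GridSearchCV's default cv
n_fits = n_candidates * n_folds

print(n_candidates, n_fits)  # 90 450
```

Adding one more value to any list multiplies the total, which is why exhaustive grids become expensive quickly.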

In [18]:
from sklearn.model_selection import GridSearchCV 
parameters = {
    'max_depth': [2,4,5,6,7],
    'min_samples_leaf': [1,2,3],
    'min_weight_fraction_leaf': [0, 0.1],
    'min_impurity_decrease': [0, 0.1, 0.2]
}

# Fitting 5 folds for each of 90 candidates, totalling 450 fits
clf = GridSearchCV(
    RandomForestClassifier(random_state=0),
    parameters, refit=True, verbose=1,
)
clf.fit(x_train, y_train)
clf.best_estimator_.score(x_test, y_test)

Fitting 5 folds for each of 90 candidates, totalling 450 fits


0.8552631578947368
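RandomizedSearch, the second method in the list at the top, replaces the exhaustive grid with a fixed number of randomly sampled combinations. The following is a minimal sketch using `RandomizedSearchCV`. It assumes a synthetic stand-in dataset from `make_classification` instead of `heart.csv`, and the distributions, `n_iter=20` and `n_estimators=25` are illustrative choices, not values from the original notebook.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (an assumption; swap in x_train / y_train from above)
X_demo, y_demo = make_classification(n_samples=300, n_features=13, random_state=0)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, stratify=y_demo, random_state=0)

# Distributions covering roughly the same ranges as the grid above
param_distributions = {
    'max_depth': randint(2, 8),                   # integers 2..7
    'min_samples_leaf': randint(1, 4),            # integers 1..3
    'min_weight_fraction_leaf': uniform(0, 0.1),  # floats in [0, 0.1]
    'min_impurity_decrease': uniform(0, 0.2),     # floats in [0, 0.2]
}

random_search = RandomizedSearchCV(
    RandomForestClassifier(n_estimators=25, random_state=0),
    param_distributions,
    n_iter=20,        # 20 sampled combinations -> 20 * 5 = 100 fits
    cv=5,
    random_state=0,
)
random_search.fit(Xd_train, yd_train)

print(random_search.best_params_)
print(random_search.score(Xd_test, yd_test))
```

The budget is set directly through `n_iter`, so the cost no longer grows with the number of values per hyperparameter.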

### HalvingGridSearch
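HalvingGridSearch starts from the same grid as GridSearch, but instead of giving every combination the full budget, it eliminates candidates over several iterations (successive halving).

**In the first iteration, every hyperparameter combination is evaluated with the smallest amount of resources. Only the best-scoring combinations are kept, the resources are increased, and the survivors are evaluated again.**

This is closely related to Hyperband, which builds on the same idea; HalvingGridSearch is the simplest form of it. A small budget is used to screen the hyperparameter combinations first, and a larger budget is then used to verify the accuracy of the most promising ones. The resource is the number of training samples by default; the example below uses the number of trees (`n_estimators`) instead. Despite the name, scikit-learn's default `factor=3` keeps roughly the best third of the candidates at each iteration rather than half.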
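The elimination schedule described above can be sketched in plain Python. This is a simplified illustration, not scikit-learn's exact scheduling logic, which has additional rules (for example `min_resources='exhaust'` and `aggressive_elimination`). The function name `halving_schedule` and its arguments are hypothetical.

```python
import math


def halving_schedule(n_candidates, min_resources, max_resources, factor=3):
    """Return (candidates, resources) pairs for a simplified successive halving."""
    schedule = []
    resources = min_resources
    while n_candidates >= 1 and resources <= max_resources:
        schedule.append((n_candidates, resources))
        if n_candidates == 1:
            break  # a single survivor is the winner
        # Keep roughly the best 1/factor of the candidates ...
        n_candidates = math.ceil(n_candidates / factor)
        # ... and give each survivor factor times more resources
        resources *= factor
    return schedule


# 90 candidates, starting at 20 samples, with at most 1000 samples available
print(halving_schedule(90, 20, 1000))
# [(90, 20), (30, 60), (10, 180), (4, 540)]
```

Most of the candidates are only ever evaluated on the small budget, which is where the savings over a full grid search come from.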

In [24]:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa
from sklearn.model_selection import HalvingGridSearchCV

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0)

param_grid = {"max_depth": [3, None],
              "min_samples_split": [5, 10]}
search = HalvingGridSearchCV(clf, param_grid, resource='n_estimators',
                             max_resources=10,
                             random_state=0).fit(X, y)
search.best_params_  


{'max_depth': None, 'min_samples_split': 5, 'n_estimators': 9}
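To see how the candidates were eliminated, the fitted search object can be inspected. The sketch below reruns the same search and prints the `n_candidates_` and `n_resources_` attributes per iteration; no expected output is shown because it may differ between scikit-learn versions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa
from sklearn.model_selection import HalvingGridSearchCV

X, y = load_iris(return_X_y=True)
param_grid = {"max_depth": [3, None], "min_samples_split": [5, 10]}

search = HalvingGridSearchCV(
    RandomForestClassifier(random_state=0), param_grid,
    resource='n_estimators',  # the budget is the number of trees
    max_resources=10,
    random_state=0,
).fit(X, y)

# One entry per iteration: how many candidates ran, and with how many trees
for i, (n_cand, n_res) in enumerate(zip(search.n_candidates_, search.n_resources_)):
    print(f"iteration {i}: {n_cand} candidates, n_estimators={n_res}")
```

Because `n_estimators` is the resource here, it also appears in `best_params_` with the value used in the final iteration.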

### HalvingRandomSearch
HalvingRandomSearch works like HalvingGridSearch: it gradually increases the resources while reducing the number of hyperparameter combinations. The difference is that the initial combinations are sampled at random from the given lists or distributions instead of being enumerated from a full grid.

In [25]:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa
from sklearn.model_selection import HalvingRandomSearchCV
from scipy.stats import randint
import numpy as np

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0)
np.random.seed(0)

param_distributions = {"max_depth": [3, None],
                       "min_samples_split": randint(2, 11)}
search = HalvingRandomSearchCV(clf, param_distributions,
                               resource='n_estimators',
                               max_resources=10,
                               random_state=0).fit(X, y)
search.best_params_  


{'max_depth': 3, 'min_samples_split': 3, 'n_estimators': 9}
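The example above uses the number of trees as the resource. For comparison, here is a minimal sketch with the default resource, the number of training samples, so that each iteration trains the surviving candidates on more data. The synthetic dataset, `n_candidates=12` and `n_estimators=20` are illustrative assumptions.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa
from sklearn.model_selection import HalvingRandomSearchCV

# Synthetic data, used only for illustration
X_big, y_big = make_classification(n_samples=600, n_features=10, random_state=0)

sample_search = HalvingRandomSearchCV(
    RandomForestClassifier(n_estimators=20, random_state=0),
    {"max_depth": [3, 5, None], "min_samples_split": randint(2, 11)},
    n_candidates=12,       # random combinations drawn for the first iteration
    resource='n_samples',  # the default: grow the training set each iteration
    factor=3,
    random_state=0,
).fit(X_big, y_big)

print(sample_search.n_candidates_)  # candidates per iteration
print(sample_search.n_resources_)   # samples per iteration
print(sample_search.best_params_)
```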

### Conclusion
HalvingGridSearch and HalvingRandomSearch are better suited to large datasets, where discarding weak candidates early shortens the training time. When computing resources are sufficient, GridSearch and HalvingGridSearch tend to give better results, because they consider every combination in the grid.