# Tuning the hyper-parameters of an estimator

In machine learning, hyper-parameter optimization (or tuning) is the problem of choosing an optimal set of hyper-parameters for a learning algorithm. A hyper-parameter is a parameter whose value controls the learning process and is fixed before training starts. By contrast, model parameters (typically weights) are learned from the data.
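
To make the distinction concrete, here is a minimal, dependency-free sketch. It assumes a toy one-dimensional linear model trained by gradient descent; the function name `fit_weight` and the data are made up for illustration. The learning rate and the number of epochs are hyper-parameters chosen by the caller, while the weight `w` is the learned parameter.

```python
def fit_weight(xs, ys, learning_rate, n_epochs):
    """Fit y ~ w * x by gradient descent on the mean squared error."""
    w = 0.0  # parameter: learned from the data
    n = len(xs)
    for _ in range(n_epochs):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with a true weight of 2

# Same data, same algorithm; only the hyper-parameters differ.
w_good = fit_weight(xs, ys, learning_rate=0.05, n_epochs=200)
w_slow = fit_weight(xs, ys, learning_rate=0.0001, n_epochs=200)

print("learning_rate=0.05   ->", w_good)
print("learning_rate=0.0001 ->", w_slow)
```

With the larger learning rate the weight converges to the true value, while the smaller one is still far from it after the same number of epochs. Tuning is the search for settings like the first.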

In scikit-learn, hyper-parameters are passed as arguments to the constructor of the estimator classes. Typical examples include `C`, `kernel` and `gamma` for the Support Vector Classifier, `alpha` for Lasso, etc.

It is possible, and recommended, to search the hyper-parameter space for the best cross-validation score.

A search consists of:
- an estimator (regressor or classifier such as `XGBClassifier()`);
- a parameter space;
- a method for searching or sampling candidates;
- a cross-validation scheme; and
- a score function.
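
The five ingredients listed above can be sketched without any library. This is an illustrative toy, not how scikit-learn implements its search classes: the estimator is a one-dimensional ridge regression with a closed-form fit, the search method is an exhaustive grid, and all function names are hypothetical.

```python
import random


def fit_ridge(xs, ys, alpha):
    """Estimator: 1-D ridge regression without intercept (closed form)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def neg_mse(w, xs, ys):
    """Score function: negative mean squared error (higher is better)."""
    return -sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_indices(n_samples, k):
    """Cross-validation scheme: k contiguous folds of (train, test) indices."""
    fold_size = n_samples // k
    for i in range(k):
        test = list(range(i * fold_size, (i + 1) * fold_size))
        held_out = set(test)
        train = [j for j in range(n_samples) if j not in held_out]
        yield train, test


def cv_score(xs, ys, alpha, k=3):
    """Average the score of one candidate over all folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        w = fit_ridge([xs[j] for j in train], [ys[j] for j in train], alpha)
        scores.append(neg_mse(w, [xs[j] for j in test], [ys[j] for j in test]))
    return sum(scores) / len(scores)


# Synthetic data: y = 3 * x plus a little noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(30)]
ys = [3.0 * x + rng.gauss(0, 0.1) for x in xs]

param_space = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]  # parameter space
# Search method: evaluate every candidate in the grid.
results = {alpha: cv_score(xs, ys, alpha) for alpha in param_space}
best_alpha = max(results, key=results.get)

print("best alpha:", best_alpha)
print("best cross-validation score:", results[best_alpha])
```

Swapping the exhaustive grid for random sampling or for an evolutionary strategy changes only the search method; the other four ingredients stay the same.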


In [14]:
from collections import Counter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import accuracy_score

from xgboost import XGBClassifier
from sklearn_genetic import GASearchCV
from sklearn_genetic.space import Integer, Continuous

## Dataset

In [15]:
from sklearn.datasets import load_wine
from sklearn.preprocessing import LabelEncoder

dataset = load_wine()
m, n = dataset.data.shape
features = dataset.feature_names
print('The dataset contains %d samples and %d features:\n' %(m, n))
print(features)

df = pd.DataFrame(data=np.concatenate((dataset.data, dataset.target.reshape(m,1)), axis=1),
                     columns=dataset.feature_names + ['target'])

X = df[features]

le = LabelEncoder()
y = le.fit_transform(df['target'])

display(df)

The dataset contains 178 samples and 13 features:

['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']


,alcohol,malic_acid,ash,alcalinity_of_ash,magnesium,total_phenols,flavanoids,nonflavanoid_phenols,proanthocyanins,color_intensity,hue,od280/od315_of_diluted_wines,proline,target
0,14.23,1.71,2.43,15.6,127.0,2.80,3.06,0.28,2.29,5.64,1.04,3.92,1065.0,0.0
1,13.20,1.78,2.14,11.2,100.0,2.65,2.76,0.26,1.28,4.38,1.05,3.40,1050.0,0.0
2,13.16,2.36,2.67,18.6,101.0,2.80,3.24,0.30,2.81,5.68,1.03,3.17,1185.0,0.0
3,14.37,1.95,2.50,16.8,113.0,3.85,3.49,0.24,2.18,7.80,0.86,3.45,1480.0,0.0
4,13.24,2.59,2.87,21.0,118.0,2.80,2.69,0.39,1.82,4.32,1.04,2.93,735.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
173,13.71,5.65,2.45,20.5,95.0,1.68,0.61,0.52,1.06,7.70,0.64,1.74,740.0,2.0
174,13.40,3.91,2.48,23.0,102.0,1.80,0.75,0.43,1.41,7.30,0.70,1.56,750.0,2.0
175,13.27,4.28,2.26,20.0,120.0,1.59,0.69,0.43,1.35,10.20,0.59,1.56,835.0,2.0
176,13.17,2.59,2.37,20.0,120.0,1.65,0.68,0.53,1.46,9.30,0.60,1.62,840.0,2.0


## Estimator

In [16]:
estimator = XGBClassifier()
estimator.get_params()

{'objective': 'binary:logistic',
 'use_label_encoder': None,
 'base_score': None,
 'booster': None,
 'callbacks': None,
 'colsample_bylevel': None,
 'colsample_bynode': None,
 'colsample_bytree': None,
 'early_stopping_rounds': None,
 'enable_categorical': False,
 'eval_metric': None,
 'feature_types': None,
 'gamma': None,
 'gpu_id': None,
 'grow_policy': None,
 'importance_type': None,
 'interaction_constraints': None,
 'learning_rate': None,
 'max_bin': None,
 'max_cat_threshold': None,
 'max_cat_to_onehot': None,
 'max_delta_step': None,
 'max_depth': None,
 'max_leaves': None,
 'min_child_weight': None,
 'missing': nan,
 'monotone_constraints': None,
 'n_estimators': 100,
 'n_jobs': None,
 'num_parallel_tree': None,
 'predictor': None,
 'random_state': None,
 'reg_alpha': None,
 'reg_lambda': None,
 'sampling_method': None,
 'scale_pos_weight': None,
 'subsample': None,
 'tree_method': None,
 'validate_parameters': None,
 'verbosity': None}

## Parameter grid

In [17]:
param_grid = {
    'n_estimators' : Integer(40, 400),
    'learning_rate' : Continuous(0.01, 0.2),
    'max_depth' : Integer(3, 10),
    'subsample' : Continuous(0.5, 1),
    'colsample_bytree' : Continuous(0.5, 1),
    'gamma': Integer(0, 1),  # integer space: only whole values are sampled; use Continuous(0, 1) for fractional gamma
    'reg_lambda' : Continuous(0, 10),
    'reg_alpha' : Continuous(0, 10),
    'min_child_weight' : Continuous(0, 10)
}
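
To illustrate what such a search space describes, the sketch below draws random candidates from a plain-Python stand-in for part of the space above. It assumes uniform sampling with inclusive integer bounds, which may differ from what `sklearn_genetic` does internally; the tuple encoding and the `sample_candidate` helper are hypothetical.

```python
import random

# Plain-Python stand-in for part of the search space (illustrative only):
# ("int", low, high) mimics Integer, ("float", low, high) mimics Continuous.
space = {
    "n_estimators": ("int", 40, 400),
    "learning_rate": ("float", 0.01, 0.2),
    "max_depth": ("int", 3, 10),
    "subsample": ("float", 0.5, 1.0),
    "gamma": ("int", 0, 1),
}


def sample_candidate(space, rng):
    """Draw one hyper-parameter combination uniformly from the space."""
    candidate = {}
    for name, (kind, low, high) in space.items():
        if kind == "int":
            candidate[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            candidate[name] = rng.uniform(low, high)
    return candidate


rng = random.Random(42)
population = [sample_candidate(space, rng) for _ in range(10)]
gamma_values = {candidate["gamma"] for candidate in population}

for candidate in population[:3]:
    print(candidate)
```

Each dictionary could be passed to an estimator constructor as keyword arguments. A population of ten such candidates corresponds to the starting point of a search with `population_size=10`.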

## Genetic optimization for hyper-parameter tuning

In [19]:
evolved_estimator = GASearchCV(
    estimator=estimator,
    cv=3,
    scoring='accuracy',
    param_grid=param_grid,
    n_jobs=-1,
    verbose=True,
    population_size=10,
    generations=5
).fit(X, y)

print('Best score: {}'.format(evolved_estimator.best_score_))

gen	nevals	fitness 	fitness_std	fitness_max	fitness_min
0  	10    	0.911328	0.0194935  	0.938324   	0.865066   
1  	20    	0.927646	0.0095514  	0.938324   	0.910169   
2  	20    	0.935508	0.00518326 	0.938324   	0.921469   
3  	20    	0.938324	1.11022e-16	0.938324   	0.938324   
4  	20    	0.938324	1.11022e-16	0.938324   	0.938324   
5  	20    	0.938324	1.11022e-16	0.938324   	0.938324   
Best score: 0.9383239171374765
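
For intuition about what happens across generations, here is a generic genetic algorithm in plain Python. It is a sketch under stated assumptions, not the implementation used by `GASearchCV`: the toy `fitness` function stands in for a cross-validated score, individuals are two-gene lists, and the operators (tournament selection, uniform crossover, Gaussian mutation, single-elite elitism) are common textbook choices.

```python
import random
import statistics

LOW, HIGH = -10.0, 10.0  # bounds of the toy search space


def fitness(individual):
    """Toy stand-in for a cross-validated score; the maximum is 0 at (3, -1)."""
    x, y = individual
    return -((x - 3) ** 2) - ((y + 1) ** 2)


def tournament(population, scores, rng, size=3):
    """Selection: return the fittest of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]


def mutate(individual, rng, rate=0.3, scale=0.5):
    """Perturb each gene with probability `rate`, clipped to the bounds."""
    mutated = []
    for gene in individual:
        if rng.random() < rate:
            gene = min(HIGH, max(LOW, gene + rng.gauss(0.0, scale)))
        mutated.append(gene)
    return mutated


def evolve(population_size=10, generations=5, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(LOW, HIGH), rng.uniform(LOW, HIGH)]
                  for _ in range(population_size)]
    history = []  # rows of (generation, mean, max, min) fitness
    for gen in range(generations + 1):
        scores = [fitness(ind) for ind in population]
        history.append((gen, statistics.mean(scores), max(scores), min(scores)))
        if gen == generations:
            break
        # Elitism: the best individual survives unchanged.
        elite = population[scores.index(max(scores))]
        children = [
            mutate(crossover(tournament(population, scores, rng),
                             tournament(population, scores, rng), rng), rng)
            for _ in range(population_size - 1)
        ]
        population = [elite] + children
    return max(population, key=fitness), history


best, history = evolve()
for gen, mean_fit, max_fit, min_fit in history:
    print(gen, round(mean_fit, 4), round(max_fit, 4), round(min_fit, 4))
print("best individual:", best)
```

Like the log above, the sketch reports one row per generation, from generation 0 to generation 5. Because the elite is carried over unchanged, the maximum fitness can never decrease from one generation to the next. The sketch does not try to reproduce the `nevals` column of the log.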
