# Introduction: Hyperparameter Optimization

In this notebook we will explore several options for hyperparameter optimization of a machine learning algorithm. We will start with some of the basic methods such as random search, and then proceed to more sophisticated methods using Guassian Processes.

In [39]:
import pandas as pd
import numpy as np

# Modeling
import lightgbm as lgb

# Evaluation of the model
from sklearn.model_selection import KFold

In [41]:
# Read in data and separate into training and testing sets
data = pd.read_csv('data/caravan-insurance-challenge.csv')
train = data[data['ORIGIN'] == 'train']
test = data[data['ORIGIN'] == 'test']

# Extract the labels and format properly
train_labels = np.array(train['CARAVAN'].astype(np.int32)).reshape((-1,))
test_labels = np.array(test['CARAVAN'].astype(np.int32)).reshape((-1,))

# Drop the unneeded columns
train = train.drop(columns = ['ORIGIN', 'CARAVAN'])
test = test.drop(columns = ['ORIGIN', 'CARAVAN'])

# Convert to numpy array for splitting in cross validation
features = np.array(train)
labels = train_labels

print('Train shape: ', train.shape)

train.head()

Train shape:  (5822, 85)


Unnamed: 0,MOSTYPE,MAANTHUI,MGEMOMV,MGEMLEEF,MOSHOOFD,MGODRK,MGODPR,MGODOV,MGODGE,MRELGE,...,ALEVEN,APERSONG,AGEZONG,AWAOREG,ABRAND,AZEILPL,APLEZIER,AFIETS,AINBOED,ABYSTAND
0,33,1,3,2,8,0,5,1,3,7,...,0,0,0,0,1,0,0,0,0,0
1,37,1,2,2,8,1,4,1,4,6,...,0,0,0,0,1,0,0,0,0,0
2,37,1,2,2,8,0,4,2,4,3,...,0,0,0,0,1,0,0,0,0,0
3,9,1,3,3,3,2,3,2,4,5,...,0,0,0,0,1,0,0,0,0,0
4,40,1,4,2,10,1,4,1,4,7,...,0,0,0,0,1,0,0,0,0,0


# Random Search by Hand

The first method we can implement is simply random search. Each iteration, choose a random set of model hyperparameters from a search space. Empirically, random search is very effective, returning nearly as good results as grid search with a significant reduction in time spent searching. 

Random search can be implement in the Scikit-Learn library with the LightGBM Sklearn API. However, this does not support training with early stopping, which is the most effective method for determining the best number of iterations to use. Therefore, we will implement random search ourselves with a defined parameter grid, and using Early Stopping.

In [10]:
import random

In [12]:
lgb.LGBMClassifier()

LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=1.0,
        learning_rate=0.1, max_depth=-1, min_child_samples=20,
        min_child_weight=0.001, min_split_gain=0.0, n_estimators=100,
        n_jobs=-1, num_leaves=31, objective=None, random_state=None,
        reg_alpha=0.0, reg_lambda=0.0, silent=True, subsample=1.0,
        subsample_for_bin=200000, subsample_freq=1)

# Domain 

In random search, as in Bayesian optimization, we have a domain over which we search for the best hyperparameters. In terms of a random or grid search, this is generally known as a hyperparameter grid

In [42]:
# Hyperparameter grid
param_grid = {
    'class_weight': [None, 'balanced'],
    'boosting_type': ['gbdt', 'goss', 'dart'],
    'num_leaves': list(range(30, 151)),
    'learning_rate': list(np.logspace(np.log(0.01), np.log(0.2), base = np.exp(1), num = 100)),
    'subsample_for_bin': list(range(20000, 300000, 20000)),
    'min_child_samples': list(range(20, 500, 5)),
    'reg_alpha': list(np.linspace(0, 1)),
    'reg_lambda': list(np.linspace(0, 1)),
    'colsample_bytree': list(np.linspace(0.6, 1, 10))
}

# Subsampling (only applicable with 'goss')
subsample_dist = list(np.linspace(0.5, 1, 100))

In [43]:
evals = 50

# Dataframe to hold cv results
results = pd.DataFrame(columns = ['params', 'train_scores', 'train', 'valid_scores', 'valid', 'estimators'],
                       index = list(range(evals)))

In [None]:
%%capture 
# Iterate through the specified number of evaluations
for i in range(evals):
    
    k_fold = KFold(n_splits = 5)
    
    # Randomly sample parameters for gbm
    params = {key: random.sample(value, 1)[0] for key, value in param_grid.items()}
    
    
    if params['boosting_type'] == 'goss':
        # Cannot subsample with goss
        params['subsample'] = 1.0
    else:
        # Subsample supported for gdbt and dart
        params['subsample'] = random.sample(subsample_dist, 1)[0]
        
        
    # Create the model with the parameters
    model = lgb.LGBMClassifier(class_weight = params['class_weight'], boosting_type = params['boosting_type'], 
                               num_leaves = params['num_leaves'], learning_rate = params['learning_rate'], 
                               subsample_for_bin = params['subsample_for_bin'], min_child_samples = params['min_child_samples'], 
                               reg_alpha = params['reg_alpha'], reg_lambda = params['reg_lambda'], 
                               colsample_by_tree = params['colsample_bytree'], subsample = params['subsample'], 
                               n_estimators = 10000, n_jobs = -1, objective = 'binary', verbose=-1, verbose_eval = False)
    
    # Empty lists for records
    valid_scores = []
    train_scores = []
    number_estimators = []
    
    # Split the data
    for (train_indices, valid_indices) in k_fold.split(features):
        
        # Training data and validation set
        train_features, train_labels = features[train_indices], labels[train_indices]
        valid_features, valid_labels = features[valid_indices], labels[valid_indices]
        
        # Fit the model using early stopping
        model.fit(train_features, train_labels, eval_set = [(train_features, train_labels), (valid_features, valid_labels)],
                  eval_metric = 'auc', eval_names = ['train', 'valid'], early_stopping_rounds = 200, verbose = -1);
        
        
        valid_scores.append(model.best_score_['valid']['auc'])
        train_scores.append(model.best_score_['train']['auc'])
        number_estimators.append(model.best_iteration_)
        
    # Average the scores
    valid = np.mean(valid_scores)
    train = np.mean(train_scores)
    estimators = np.mean(number_estimators)
    
    # Add results to next row in dataframe
    results.loc[i, :] = [params, train_scores, train, valid_scores, valid, estimators]

In [None]:
results = results.sort_values('valid', ascending = False)
results.head()

In [None]:
results.iloc[0, 0]

## Bayesian Hyperparameter Optimization using Hyperopt

In [None]:
from hyperopt import hp, fmin, tpe, STATUS_OK, Trials
from hyperopt.pyll.stochastic import sample

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
learning_rate = {'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.2))}
learning_rate_dist = []

for _ in range(10000):
    learning_rate_dist.append(sample(learning_rate)['learning_rate'])
    
plt.hist(learning_rate_dist, bins = 10, edgecolor = 'k');
plt.title('Learning Rate Distribution');

In [None]:
num_leaves = {'num_leaves': hp.quniform('num_leaves', 30, 150, 1)}
num_leaves_dist = []

for _ in range(10000):
    num_leaves_dist.append(sample(num_leaves)['num_leaves'])
    
plt.hist(num_leaves_dist, bins = 10, edgecolor = 'k');
plt.title('Number of Leaves Distribution');

# Objective Function

The objective function will be the cross validation score evaluate over 5 folds. We need to make sure the objective function returns a single, real-value metric. We can return more information in the form of a dictionary where one of the keys must be 'loss' and another must be 'STATUS'. The other keys can hold information such as the hyperparameters used or the evaluation time.

In [None]:
import csv
from timeit import default_timer as timer

In [None]:
def objective(params):
    
    global iteration
    
    iteration += 1
    
    k_fold = KFold(n_splits = 5)
    
    # Retrieve the subsample if present otherwise set to 1.0
    subsample = params['boosting_type'].get('subsample', 1.0)
    
    # Extract the boosting type
    params['boosting_type'] = params['boosting_type']['boosting_type']
    params['subsample'] = subsample
    
    # Make sure parameters that need to be integers are integers
    for parameter_name in ['num_leaves', 'subsample_for_bin', 'min_child_samples']:
        params[parameter_name] = int(params[parameter_name])
    
    
    model = lgb.LGBMClassifier(n_estimators = 10000, **params, objective = 'binary', n_jobs = -1, verbose = -1)
    
    valid_scores = []
    train_scores = []
    number_estimators = []
    
    start = timer()
    for (train_indices, valid_indices) in k_fold.split(features):
        
        # Training data and validation set
        train_features, train_labels = features[train_indices], labels[train_indices]
        valid_features, valid_labels = features[valid_indices], labels[valid_indices]
        
        # Fit the model using early stopping
        model.fit(train_features, train_labels, eval_set = [(train_features, train_labels), (valid_features, valid_labels)],
                  eval_metric = 'auc', eval_names = ['train', 'valid'], early_stopping_rounds = 200, verbose = -1)
    
        valid_scores.append(model.best_score_['valid']['auc'])
        train_scores.append(model.best_score_['train']['auc'])
        number_estimators.append(model.best_iteration_)
        
    end = timer()
    
    run_time = end - start
    
    # fmin needs a loss to minimize
    valid = -1 * np.mean(valid_scores)
    train = -1 * np.mean(train_scores)
    
    # average number of estimators
    estimators = np.mean(number_estimators)

    o_f = open(out_file, 'a')
    writer = csv.writer(o_f)
    
    writer.writerow([valid, train, estimators, run_time, params, iteration])
    
    # Dictionary with information for evaluation
    return {'loss': valid, 'train': train, 'estimators': estimators, 
            'train_time': run_time, 'status': STATUS_OK, 'params': params}

In [None]:
# Define the search space
space = {
    'class_weight': hp.choice('class_weight', [None, 'balanced']),
    'boosting_type': hp.choice('boosting_type', [{'boosting_type': 'gbdt', 'subsample': hp.uniform('gdbt_subsample', 0.5, 1)}, 
                                                 {'boosting_type': 'dart', 'subsample': hp.uniform('dart_subsample', 0.5, 1)},
                                                 {'boosting_type': 'goss'}]),
    'num_leaves': hp.quniform('num_leaves', 30, 150, 1),
    'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.2)),
    'subsample_for_bin': hp.quniform('subsample_for_bin', 20000, 300000, 20000),
    'min_child_samples': hp.quniform('min_child_samples', 20, 500, 5),
    'reg_alpha': hp.uniform('reg_alpha', 0.0, 1.0),
    'reg_lambda': hp.uniform('reg_lambda', 0.0, 1.0),
    'colsample_bytree': hp.uniform('colsample_by_tree', 0.6, 1.0)
}

In [None]:
boosting_type = {'boosting_type': hp.choice('boosting_type', [{'boosting_type': 'gbdt', 'subsample': hp.uniform('subsample', 0.5, 1)}, 
                                                 {'boosting_type': 'dart', 'subsample': hp.uniform('subsample', 0.5, 1)},
                                                 {'boosting_type': 'goss'}])}

In [None]:
sample(boosting_type)

In [None]:
sample(boosting_type)

In [None]:
learning_rate = {'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.5))}

In [None]:
sample(learning_rate)

In [None]:
sample(learning_rate)

In [None]:
x = sample(space)

### Example of Sampling from the space

After finding the boosting type (which is in a nested dictionary), we assign the boosting type to make it a top level value. We use the dictionary get method to find the 'subsample' if it is in the dictionary (indicating the boosting type is 'gbdt' or 'dart') or set it to 1.0 otherwise (if boosting type is 'goss'). The goss boosting type cannot use bagging. 

This entire step is necessary because of the conditional logic used for the boosting type and subsample ratio.

In addition, we can see the other variables in the dictionary. These will change every time we sample the space.

In [None]:
x = sample(space)
subsample = x['boosting_type'].get('subsample', 1.0)
x['boosting_type'] = x['boosting_type']['boosting_type']
x['subsample'] = subsample
x

In [None]:
x = sample(space)
subsample = x['boosting_type'].get('subsample', 1.0)
x['boosting_type'] = x['boosting_type']['boosting_type']
x['subsample'] = subsample
x

## Optimization

In [112]:
%%capt

out_file = 'gbm_trials1.csv'
iteration = 1

o_f = open(out_file, 'w')
writer = csv.writer(o_f)
writer.writerow(['loss', 'train', 'estimators', 'train_time', 'params', 'iteration'])
o_f.close()

trials = Trials()

best = fmin(fn = objective, space = space, algo = tpe.suggest, 
            max_evals = 5, trials = trials, verbose = 1)

Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[20]	train's auc: 0.796388	valid's auc: 0.781494
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[302]	train's auc: 0.826292	valid's auc: 0.763413
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[127]	train's auc: 0.814329	valid's auc: 0.770288
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[394]	train's auc: 0.836541	valid's auc: 0.750156
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[405]	train's auc: 0.825125	valid's auc: 0.79806
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[57]	train's auc: 0.824137	valid's auc: 0.779204
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[193]	train's auc

In [113]:
trials.results

[{'estimators': 249.6,
  'loss': -0.7726819522791677,
  'params': {'boosting_type': 'goss',
   'class_weight': 'balanced',
   'colsample_bytree': 0.885238016885687,
   'learning_rate': 0.08297750566782122,
   'min_child_samples': 400,
   'num_leaves': 78,
   'reg_alpha': 0.8570474624562273,
   'reg_lambda': 0.20061022850531263,
   'subsample': 1.0,
   'subsample_for_bin': 100000},
  'status': 'ok',
  'train': -0.819735249958874,
  'train_time': 1.6798017310136402},
 {'estimators': 84.8,
  'loss': -0.7678879800156242,
  'params': {'boosting_type': 'goss',
   'class_weight': None,
   'colsample_bytree': 0.7309008963265395,
   'learning_rate': 0.07449403943678422,
   'min_child_samples': 110,
   'num_leaves': 101,
   'reg_alpha': 0.056076459327698336,
   'reg_lambda': 0.6389491698840711,
   'subsample': 1.0,
   'subsample_for_bin': 220000},
  'status': 'ok',
  'train': -0.8455731392374876,
  'train_time': 1.9323893199166946},
 {'estimators': 197.0,
  'loss': -0.7722024973823665,
  'params

In [114]:
trials_results = sorted(trials.results, key = lambda x: x['loss'])

In [115]:
trials_results[:2]

[{'estimators': 249.6,
  'loss': -0.7726819522791677,
  'params': {'boosting_type': 'goss',
   'class_weight': 'balanced',
   'colsample_bytree': 0.885238016885687,
   'learning_rate': 0.08297750566782122,
   'min_child_samples': 400,
   'num_leaves': 78,
   'reg_alpha': 0.8570474624562273,
   'reg_lambda': 0.20061022850531263,
   'subsample': 1.0,
   'subsample_for_bin': 100000},
  'status': 'ok',
  'train': -0.819735249958874,
  'train_time': 1.6798017310136402},
 {'estimators': 197.0,
  'loss': -0.7722024973823665,
  'params': {'boosting_type': 'dart',
   'class_weight': 'balanced',
   'colsample_bytree': 0.8523279607641367,
   'learning_rate': 0.07662953225152201,
   'min_child_samples': 240,
   'num_leaves': 68,
   'reg_alpha': 0.5464116066586574,
   'reg_lambda': 0.8197593496690962,
   'subsample': 0.5611109141166888,
   'subsample_for_bin': 260000},
  'status': 'ok',
  'train': -0.8483165945497552,
  'train_time': 8.46129112636487}]

In [116]:
import json

with open('trials.json', 'w') as f:
    f.write(json.dumps(trials_results))

In [117]:
out_file = 'gbm_trials2.csv'
iteration = 1

o_f = open(out_file, 'w')
writer = csv.writer(o_f)
writer.writerow(['loss', 'train', 'estimators', 'train_time', 'params', 'iteration'])
o_f.close()

trials = Trials()

best = fmin(fn = objective, space = space, algo = tpe.suggest, 
            max_evals = 100, trials = trials, verbose = 1)

Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[12]	train's auc: 0.776723	valid's auc: 0.772825
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[160]	train's auc: 0.843645	valid's auc: 0.745357
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[123]	train's auc: 0.833289	valid's auc: 0.760781
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[92]	train's auc: 0.825811	valid's auc: 0.748693
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[9]	train's auc: 0.761589	valid's auc: 0.806248
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[75]	train's auc: 0.851791	valid's auc: 0.779517
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[143]	train's auc: 

Early stopping, best iteration is:
[10]	train's auc: 0.766436	valid's auc: 0.74121
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[319]	train's auc: 0.797332	valid's auc: 0.765907
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[517]	train's auc: 0.825063	valid's auc: 0.750652
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[473]	train's auc: 0.813919	valid's auc: 0.795799
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[69]	train's auc: 0.875316	valid's auc: 0.766236
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[438]	train's auc: 0.925936	valid's auc: 0.738406
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[6]	train's auc: 0.842953	valid's auc: 0.764706
Training until validation scores

Early stopping, best iteration is:
[13]	train's auc: 0.789325	valid's auc: 0.739742
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[10]	train's auc: 0.782484	valid's auc: 0.736106
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[13]	train's auc: 0.786649	valid's auc: 0.789607
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[265]	train's auc: 0.774541	valid's auc: 0.770998
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[814]	train's auc: 0.821163	valid's auc: 0.753636
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[416]	train's auc: 0.789567	valid's auc: 0.763914
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[712]	train's auc: 0.816376	valid's auc: 0.745596
Training until validation scor

Early stopping, best iteration is:
[376]	train's auc: 0.823911	valid's auc: 0.750586
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[398]	train's auc: 0.815433	valid's auc: 0.794553
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[45]	train's auc: 0.793628	valid's auc: 0.782551
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[90]	train's auc: 0.807217	valid's auc: 0.737958
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[107]	train's auc: 0.802812	valid's auc: 0.771562
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[10]	train's auc: 0.795741	valid's auc: 0.750156
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[8]	train's auc: 0.771456	valid's auc: 0.789703
Training until validation scores

Early stopping, best iteration is:
[331]	train's auc: 0.972385	valid's auc: 0.779805
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[20]	train's auc: 0.792195	valid's auc: 0.786699
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[301]	train's auc: 0.813795	valid's auc: 0.764035
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[172]	train's auc: 0.809558	valid's auc: 0.768141
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[301]	train's auc: 0.818658	valid's auc: 0.751869
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[370]	train's auc: 0.806036	valid's auc: 0.797431
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[9]	train's auc: 0.777582	valid's auc: 0.771018
Training until validation scor

Early stopping, best iteration is:
[188]	train's auc: 0.813657	valid's auc: 0.780659
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[18]	train's auc: 0.796172	valid's auc: 0.745671
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[16]	train's auc: 0.795453	valid's auc: 0.758346
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[210]	train's auc: 0.818567	valid's auc: 0.755026
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[341]	train's auc: 0.825643	valid's auc: 0.800346
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[43]	train's auc: 0.768146	valid's auc: 0.771194
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[720]	train's auc: 0.844749	valid's auc: 0.761958
Training until validation scor

Early stopping, best iteration is:
[276]	train's auc: 0.820121	valid's auc: 0.76635
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[341]	train's auc: 0.825849	valid's auc: 0.769114
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[342]	train's auc: 0.831149	valid's auc: 0.75314
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[224]	train's auc: 0.807646	valid's auc: 0.802704
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[1]	train's auc: 0.762298	valid's auc: 0.786817
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[38]	train's auc: 0.804313	valid's auc: 0.754685
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[104]	train's auc: 0.818398	valid's auc: 0.767457
Training until validation scores

Early stopping, best iteration is:
[135]	train's auc: 0.822856	valid's auc: 0.760432
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[108]	train's auc: 0.816008	valid's auc: 0.748157
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[52]	train's auc: 0.775012	valid's auc: 0.798115
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[48]	train's auc: 0.803636	valid's auc: 0.781115
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[636]	train's auc: 0.879137	valid's auc: 0.750727
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[180]	train's auc: 0.827154	valid's auc: 0.767047
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[284]	train's auc: 0.843194	valid's auc: 0.754795
Training until validation sco

Early stopping, best iteration is:
[24]	train's auc: 0.792106	valid's auc: 0.747733
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[354]	train's auc: 0.791911	valid's auc: 0.797274
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[1]	train's auc: 0.762298	valid's auc: 0.786817
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[38]	train's auc: 0.806126	valid's auc: 0.752664
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[251]	train's auc: 0.835966	valid's auc: 0.766128
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[173]	train's auc: 0.834023	valid's auc: 0.752922
Training until validation scores don't improve for 200 rounds.
Early stopping, best iteration is:
[251]	train's auc: 0.834057	valid's auc: 0.800697
Training until validation score

# Conclusions

In this notebook, we saw how to use Hyperopt and the Tree Parzen Estimator to optimize the hyperparameters of a gradient boosting machine. Bayesian model-based optimization is more efficient than random search, finding a better set of model hyperparameters in fewer objective function (train-predict-evaluate) calls. In later notebooks we will examine using hyperparameter optimization on additional problems. 