Best result e6:

Training with params: {'booster': 'gbtree', 'colsample_bytree': 0.9500000000000001, 'eta': 0.015, 'eval_metric': 'mae', 'gamma': 0.75, 'max_depth': 6, 'min_child_weight': 10, 'n_estimators': 20, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.9500000000000001, 'tree_method': 'gpu_hist'}
Score: 2.0134776818828346

100 Runs:

Score: 2.094291906478197

300 Runs:

Score: 2.1418

2nd best result e6:

Training with params: {'booster': 'gbtree', 'colsample_bytree': 0.8, 'eta': 0.075, 'eval_metric': 'mae', 'gamma': 0.9, 'max_depth': 4, 'min_child_weight': 9, 'n_estimators': 21, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.8500000000000001, 'tree_method': 'gpu_hist'}
Score: 2.0137338312412885

Best result e3:

Training with params: {'booster': 'gbtree', 'colsample_bytree': 0.7000000000000001, 'eta': 0.15, 'eval_metric': 'mae', 'gamma': 0.75, 'max_depth': 4, 'min_child_weight': 4, 'n_estimators': 21, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.8, 'tree_method': 'gpu_hist'}
Score: 2.0309477396962037

100 Runs:

Score: 2.1099816599340055

300 Runs:

Score: 2.1556

2nd best result e3:

Training with params: {'booster': 'gbtree', 'colsample_bytree': 0.65, 'eta': 0.155, 'eval_metric': 'mae', 'gamma': 0.75, 'max_depth': 4, 'min_child_weight': 3, 'n_estimators': 21, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.9, 'tree_method': 'gpu_hist'}
Score: 2.0309701359369408

3rd best result e3:

Training with params: {'booster': 'gbtree', 'colsample_bytree': 1.0, 'eta': 0.01, 'eval_metric': 'mae', 'gamma': 1.0, 'max_depth': 4, 'min_child_weight': 6, 'n_estimators': 20, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.9, 'tree_method': 'gpu_hist'}
Score: 2.0309965301727493

Example of a bad result e3:

Training with params: {'booster': 'gbtree', 'colsample_bytree': 0.9, 'eta': 0.08, 'eval_metric': 'mae', 'gamma': 0.65, 'max_depth': 3, 'min_child_weight': 1, 'n_estimators': 10, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.6000000000000001, 'tree_method': 'gpu_hist'}
Score: 2.37749189196555
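
Several values in the logs above, such as 0.9500000000000001 and 0.7000000000000001, are floating-point artefacts rather than distinct settings. hyperopt documents hp.quniform as returning round(uniform(low, high) / q) * q, and multiplying by a step such as 0.05 is not exact in binary floating point. A minimal sketch of tidying such values before recording them follows; the helper name snap_to_step is hypothetical and not part of hyperopt.

```python
def snap_to_step(value, step, decimals=3):
    """Round value to the nearest multiple of step, then trim float noise."""
    return round(round(value / step) * step, decimals)


# Values copied from the logged parameter dicts above.
logged = {
    "colsample_bytree": 0.9500000000000001,
    "subsample": 0.8500000000000001,
}

# Both parameters are sampled with a step of 0.05 in the search space
# defined later in this notebook.
cleaned = {name: snap_to_step(value, 0.05) for name, value in logged.items()}
print(cleaned)
```

The cleaned values describe the same configuration; only the printed form changes.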

In [None]:
!pip install hyperopt

In [1]:
import sys
sys.path.append('../../src/')
from folds.folds import CustomFold
from hyperopt import fmin, hp, tpe, STATUS_OK
from hyperopt.pyll.base import scope
from sklearn import preprocessing
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor
import pandas as pd
import numpy as np

In [12]:
df = pd.read_hdf('../../data/e2.h5', key='table')

train_data = df.drop(['ttf'], axis=1)
y_train_data = df['ttf']

In [13]:
def score(params, n=10):
    """Return the mean validation MAE of XGBRegressor(**params) over n repeats of 9-fold CV."""
    print(f"Training with params: {params}")

    # Seed NumPy's global RNG before building the folds.
    np.random.seed(0)
    folds = CustomFold(n_splits=9, shuffle=True, fragmentation=0, pad=150)
    loss_list = []
    for _ in range(n):
        for train_index, valid_index in folds.split(train_data):
            X_train, X_valid = train_data.iloc[train_index], train_data.iloc[valid_index]
            y_train, y_valid = y_train_data.iloc[train_index], y_train_data.iloc[valid_index]

            # Fit the scaler on the training fold only, then apply it to both folds.
            preprocessor = preprocessing.StandardScaler()
            X_train = preprocessor.fit_transform(X_train)
            X_valid = preprocessor.transform(X_valid)

            model = XGBRegressor(**params)
            model.fit(X_train, y_train.values)
            predictions = model.predict(X_valid)
            # scikit-learn's signature is (y_true, y_pred).
            loss_list.append(mean_absolute_error(y_valid, predictions))

    loss = np.mean(loss_list)
    print(f"Score: {loss}, std: {np.std(loss_list)}")
    return {'loss': loss, 'status': STATUS_OK}

In [4]:
def optimize(random_state=314159265):
    """
    Search the hyperparameter space defined below with TPE, using score as the
    objective, and return the best point found by fmin. random_state is passed to
    XGBoost as its seed; it does not seed the hyperopt search itself.
    """

    space = {
        'n_estimators': scope.int(hp.quniform('n_estimators', 10, 300, 5)),
        'eta': hp.quniform('eta', 0.005, 0.5, 0.005),
        'max_depth': scope.int(hp.quniform('max_depth', 1, 5, 1)),
        'min_child_weight': scope.int(hp.quniform('min_child_weight', 1, 10, 1)),
        'subsample': hp.quniform('subsample', 0.5, 1, 0.05),
        'gamma': hp.quniform('gamma', 0.5, 1, 0.05),
        'colsample_bytree': hp.quniform('colsample_bytree', 0.5, 1, 0.05),
        'eval_metric': 'mae',
        'objective': 'gpu:reg:linear',
        'booster': 'gbtree',
        'tree_method': 'gpu_hist',
        'silent': 1,
        'seed': random_state
    }

    best = fmin(score,
                space,
                algo=tpe.suggest,
                max_evals=2000)
    return best

In [None]:
best_hyperparams = optimize()
print("The best hyperparameters are: ", "\n")
print(best_hyperparams)
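
fmin returns the raw sampled value for each hp label, so the scope.int casts and the constant entries of the space (objective, tree_method, and so on) are not reflected in best_hyperparams; integer-valued parameters typically come back as floats. The sketch below rebuilds a dict that could be passed to score. The helper to_xgb_params and the example input are hypothetical; hyperopt's own space_eval(space, best) is an alternative when the space object is in scope.

```python
# Parameters that the search space wraps in scope.int.
INT_PARAMS = ("n_estimators", "max_depth", "min_child_weight")

# Constant entries copied from the search space in optimize().
FIXED_PARAMS = {
    "eval_metric": "mae",
    "objective": "gpu:reg:linear",
    "booster": "gbtree",
    "tree_method": "gpu_hist",
    "silent": 1,
    "seed": 314159265,
}


def to_xgb_params(best, fixed=FIXED_PARAMS, int_params=INT_PARAMS):
    """Merge fmin's raw output with the fixed settings and restore int types."""
    params = dict(fixed)
    for name, value in best.items():
        params[name] = int(round(value)) if name in int_params else value
    return params


# Hypothetical fmin output, for illustration only.
example_best = {
    "n_estimators": 20.0,
    "max_depth": 4.0,
    "min_child_weight": 6.0,
    "eta": 0.01,
    "gamma": 1.0,
    "subsample": 0.9,
    "colsample_bytree": 1.0,
}

params = to_xgb_params(example_best)
print(params)
```

With the notebook's own objects this would be called as score(to_xgb_params(best_hyperparams)).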

In [5]:
score({"eta": 0.01, 
       "max_depth": 4, 
       "n_estimators": 51, 
       "subsample": 0.7, 
       "objective": "gpu:reg:linear",
       "eval_metric": "mae",
       "tree_method": "gpu_hist",
       "verbosity": 3,
       "nthread": 15})

Training with params: {'eta': 0.01, 'max_depth': 4, 'n_estimators': 51, 'subsample': 0.7, 'objective': 'gpu:reg:linear', 'eval_metric': 'mae', 'tree_method': 'gpu_hist', 'verbosity': 3, 'nthread': 15}
Score: 2.06215859193786


{'loss': 2.06215859193786, 'status': 'ok'}

In [6]:
score({'booster': 'gbtree', 
       'colsample_bytree': 1.0, 
       'eta': 0.085, 
       'eval_metric': 'mae', 
       'gamma': 0.5, 
       'max_depth': 4, 
       'min_child_weight': 7, 
       'n_estimators': 15, 
       'objective': 'gpu:reg:linear', 
       'seed': 314159265, 'silent': 1, 'subsample': 0.9, 'tree_method': 'gpu_hist'})

Training with params: {'booster': 'gbtree', 'colsample_bytree': 1.0, 'eta': 0.085, 'eval_metric': 'mae', 'gamma': 0.5, 'max_depth': 4, 'min_child_weight': 7, 'n_estimators': 15, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.9, 'tree_method': 'gpu_hist'}
Score: 2.0585525864873393


{'loss': 2.0585525864873393, 'status': 'ok'}

In [14]:
score({'booster': 'gbtree', 'colsample_bytree': 0.9500000000000001, 'eta': 0.015, 'eval_metric': 'mae', 'gamma': 0.75, 'max_depth': 6, 'min_child_weight': 10, 'n_estimators': 20, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.9500000000000001, 'tree_method': 'gpu_hist'})

Training with params: {'booster': 'gbtree', 'colsample_bytree': 0.9500000000000001, 'eta': 0.015, 'eval_metric': 'mae', 'gamma': 0.75, 'max_depth': 6, 'min_child_weight': 10, 'n_estimators': 20, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.9500000000000001, 'tree_method': 'gpu_hist'}



Score: 2.100954299646639, std: 0.709332285344933


{'loss': 2.100954299646639, 'status': 'ok'}

In [9]:
score({'booster': 'gbtree', 'colsample_bytree': 0.7000000000000001, 'eta': 0.15, 'eval_metric': 'mae', 'gamma': 0.75, 'max_depth': 4, 'min_child_weight': 4, 'n_estimators': 21, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.8, 'tree_method': 'gpu_hist'})

Training with params: {'booster': 'gbtree', 'colsample_bytree': 0.7000000000000001, 'eta': 0.15, 'eval_metric': 'mae', 'gamma': 0.75, 'max_depth': 4, 'min_child_weight': 4, 'n_estimators': 21, 'objective': 'gpu:reg:linear', 'seed': 314159265, 'silent': 1, 'subsample': 0.8, 'tree_method': 'gpu_hist'}
Score: 2.155649124495926, std: 0.7786928471426826


{'loss': 2.155649124495926, 'status': 'ok'}
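
The two means above differ by about 0.055 while the standard deviations of the individual losses are above 0.7, so it helps to look at the standard error of each mean before ranking the configurations. The sketch below assumes 90 losses per call (n=10 repeats of 9 folds, the defaults of score as defined above; the call that produced the second result may have used a different n) and treats the losses as independent, which repeated folds over the same data are not, so it is only a rough guide. The function name standard_error is hypothetical.

```python
import math


def standard_error(std, n_losses):
    """Standard error of a mean computed from n_losses values with spread std."""
    return std / math.sqrt(n_losses)


N_LOSSES = 90  # assumption: n=10 repeats x 9 folds

# (mean MAE, std) copied from the two score() outputs above.
results = {
    "e6 best params": (2.100954299646639, 0.709332285344933),
    "e3 best params": (2.155649124495926, 0.7786928471426826),
}

for name, (mean, std) in results.items():
    se = standard_error(std, N_LOSSES)
    print(f"{name}: {mean:.4f} +/- {se:.4f}")

gap = results["e3 best params"][0] - results["e6 best params"][0]
print(f"gap between means: {gap:.4f}")
```

If the gap is smaller than the standard errors, the ordering of the two configurations should be read with caution.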