## Restoring Gradient Boosted Regression Model After Optimization

In [15]:
from hyperspace.kepler.data_utils import load_results

#### Reload the objective function

In [18]:
def objective(params):
    """
    Objective function to be minimized.
    Parameters
    ----------
    * params [list, len(params)=n_hyperparameters]
        Settings of each hyperparameter for a given optimization iteration.
        - Controlled by HyperSpace's hyperdrive function.
        - Order preserved from list passed to hyperdrive's hyperparameters argument.
    """
    # Unpack all five hyperparameters; learning_rate is used in set_params below.
    max_depth, learning_rate, max_features, min_samples_split, min_samples_leaf = params

    reg.set_params(max_depth=max_depth,
                   learning_rate=learning_rate,
                   max_features=max_features,
                   min_samples_split=min_samples_split,
                   min_samples_leaf=min_samples_leaf)

    return -np.mean(cross_val_score(reg, X, y, cv=5, n_jobs=-1,
                                    scoring="neg_mean_absolute_error"))
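
The leading minus sign in the return statement follows from scikit-learn's convention that scores are maximized, so `neg_mean_absolute_error` yields the negated MAE for each fold. Below is a minimal pure-Python sketch of that sign convention. The helper names (`mae`, `neg_mae_score`, `objective_value`) and the fold data are made up for illustration and are not part of HyperSpace or scikit-learn:

```python
from statistics import mean

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def neg_mae_score(y_true, y_pred):
    """Mimics the sign convention of scoring="neg_mean_absolute_error"."""
    return -mae(y_true, y_pred)

def objective_value(fold_scores):
    """Negate the mean fold score so that lower is better."""
    return -mean(fold_scores)

# Illustrative folds only; these numbers do not come from the Boston data.
folds = [
    ([3.0, 5.0], [2.0, 7.0]),   # absolute errors 1.0 and 2.0 -> MAE 1.5
    ([10.0, 4.0], [9.5, 4.5]),  # absolute errors 0.5 and 0.5 -> MAE 0.5
]
fold_scores = [neg_mae_score(t, p) for t, p in folds]
loss = objective_value(fold_scores)
print(fold_scores)  # [-1.5, -0.5]
print(loss)         # 1.0
```

The value handed to the optimizer is therefore a plain, positive mean absolute error.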

#### Load results from optimization

In [34]:
gbm_results = load_results("../gbm_results", sort=True)
best = gbm_results[0]

# Get the hyperparameter values
print('Hyperparameters of our best model:\n {}'.format(best.x))

max_depth = best.x[0]
learning_rate = best.x[1]
max_features = best.x[2]
min_samples_split = best.x[3]
min_samples_leaf = best.x[4]

Number of results: 32

Hyperparameters of our best model:
 [7, 0.18920799259946858, 6, 2, 1]
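
Indexing `best.x` position by position is easy to get out of order. As a sketch, the values can instead be paired with names, assuming the order matches the list passed to hyperdrive's hyperparameters argument. The `names` list and the `best_x` stand-in are illustrative, with `best_x` copied from the printed output above:

```python
# Stand-in for best.x, copied from the printed output.
best_x = [7, 0.18920799259946858, 6, 2, 1]

# Assumed order: matches the index assignments above.
names = ["max_depth", "learning_rate", "max_features",
         "min_samples_split", "min_samples_leaf"]

# Guard against a mismatch between names and values.
if len(names) != len(best_x):
    raise ValueError("expected one value per hyperparameter name")

best_params = dict(zip(names, best_x))
print(best_params["learning_rate"])  # 0.18920799259946858
```

A dictionary built this way can be unpacked into the estimator with `reg.set_params(**best_params)`.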


#### Retraining the Gradient Boosted Regressor with Optimal Hyperparameters

In [35]:
import numpy as np
from sklearn.datasets import load_boston
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

In [43]:
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release.
boston = load_boston()
X, y = boston.data, boston.target
n_features = X.shape[1]

reg = GradientBoostingRegressor(n_estimators=50, random_state=0)

reg.set_params(max_depth=max_depth,
               learning_rate=learning_rate,
               max_features=max_features,
               min_samples_split=min_samples_split,
               min_samples_leaf=min_samples_leaf)

final_results = -np.mean(cross_val_score(reg, X, y, cv=5, n_jobs=-1, scoring="neg_mean_absolute_error"))
# final_results negates the neg_mean_absolute_error score, so it is the (positive) MAE.
print('Mean Absolute Error with optimal hyperparameters: {}'.format(final_results))

Mean Absolute Error with optimal hyperparameters: 2.9032924372635147


#### Notes

HyperSpace keeps track of each distributed model's evaluations. The minimum objective function value found in each distributed run is stored in that result's `.fun` attribute. We can verify that the model re-evaluated above returns the same mean absolute error as the one reported by HyperSpace:

In [44]:
final_results == best.fun

True
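
Exact `==` on floats succeeds here, presumably because the same computation was repeated deterministically, but it can fail when results are aggregated in a different order or on different hardware. A hedged alternative is a tolerance-based check. The helper name `scores_match` and the tolerance are assumptions, and the perturbed value is purely illustrative:

```python
import math

def scores_match(a, b, rel_tol=1e-9):
    """Return True when two objective values agree within a relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)

reported = 2.9032924372635147   # value printed above
perturbed = reported + 1e-15    # illustrative rounding-level difference

print(reported == perturbed)              # False
print(scores_match(reported, perturbed))  # True
```

In the notebook itself, the equivalent check would be `scores_match(final_results, best.fun)`.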