Advanced

REMBO

Random EMbedding Bayesian Optimization (REMBO) tackles the problem of high dimensional input spaces with low effective dimensionality. It creates a random matrix to perform a random projection from the high dimensional space into a smaller embedded subspace (rembo-paper). If you want to use REMBO for your objective function, you just have to derive from the REMBO task and call its __init__() function in the constructor:

import numpy as np

# Branin and REMBO are task classes shipped with RoBO
class BraninInBillionDims(REMBO):
    def __init__(self):
        self.b = Branin()
        # Pad Branin's two effective dimensions with 999998 dummy dimensions
        X_lower = np.concatenate((self.b.X_lower, np.zeros([999998])))
        X_upper = np.concatenate((self.b.X_upper, np.ones([999998])))
        super(BraninInBillionDims, self).__init__(X_lower, X_upper, d=2)

    def objective_function(self, x):
        # Only the first two dimensions actually influence the function value
        return self.b.objective_function(x[:, :2])

Afterwards you can simply optimize your task like any other task. RoBO will then automatically perform Bayesian optimization in the lower-dimensional embedded subspace to find a new configuration. To evaluate a configuration, it is transformed back to the original space.

task = BraninInBillionDims()
kernel = GPy.kern.Matern52(input_dim=task.n_dims)
model = GPyModel(kernel, optimize=True, noise_variance=1e-3, num_restarts=10)
acquisition_func = EI(model, task.X_lower, task.X_upper, compute_incumbent)
maximizer = CMAES(acquisition_func, task.X_lower, task.X_upper)
bo = BayesianOptimization(acquisition_fkt=acquisition_func,
                          model=model,
                          maximize_fkt=maximizer,
                          task=task)

bo.run(500)
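Under the hood, the task draws a random matrix with one row per original dimension and one column per embedded dimension, optimizes in the low-dimensional subspace, and maps each candidate back to the original space (clipped to the box bounds) before evaluating it. The following is only a minimal conceptual sketch of that mapping, roughly following the rembo-paper; the names A, y and embed_to_original are illustrative assumptions, not RoBO's internal API:

import numpy as np

D, d = 1000000, 2             # original and embedded dimensionality
A = np.random.randn(D, d)     # random projection matrix (assumed Gaussian entries)

def embed_to_original(y, X_lower, X_upper):
    # Map a point y from the embedded subspace back to the original space
    x = A.dot(y)
    # Clip to the original box bounds before evaluating the objective function
    return np.clip(x, X_lower, X_upper)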

Bayesian optimization with MCMC sampling of the GP's hyperparameters

So far we have optimized the GP's hyperparameters by maximizing the marginal log-likelihood. If you want to marginalise over the hyperparameters instead, you can use the GPyModelMCMC module:

kernel = GPy.kern.Matern52(input_dim=branin.n_dims)
model = GPyModelMCMC(kernel, burnin=20, chain_length=100, n_hypers=10)

It uses the HMC method implemented in GPy to draw hyperparameter samples based on the marginal log-likelihood. Afterwards you can simply plug it into your acquisition function:

acquisition_func = EI(model, X_upper=branin.X_upper, X_lower=branin.X_lower, compute_incumbent=compute_incumbent, par=0.1)

maximizer = Direct(acquisition_func, branin.X_lower, branin.X_upper)
bo = BayesianOptimization(acquisition_fkt=acquisition_func,
                          model=model,
                          maximize_fkt=maximizer,
                          task=branin)

bo.run(10)

RoBO will then compute a marginalised acquisition value by computing the acquisition value based on each single GP (one per hyperparameter sample) and summing over all of them.
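A minimal sketch of this marginalisation, assuming the MCMC model exposes one GP per hyperparameter sample; the names single_gps and acquisition_value below are illustrative assumptions, not RoBO's actual API:

def marginalised_acquisition(x, single_gps, acquisition_value):
    # Evaluate the acquisition function once per GP (one GP per
    # hyperparameter sample drawn by HMC) and combine the values
    return sum(acquisition_value(gp, x) for gp in single_gps)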

Fabolas

The general idea of Fabolas is to extend the traditional way of modelling the objective function by an additional input s that specifies the amount of training data used to evaluate a point x, i.e. f(x, s).

In the end we want to find the best point x on the full dataset, s = s_max. Because of that, Fabolas uses the information gain acquisition function, but models the distribution over the minimum p_min(x, s) only on the subspace s = s_max, i.e. p_min(x, s = s_max).

By additionally modelling the evaluation time of a point x and dividing the information gain by the cost c(x, s) it would take to evaluate x on s, Fabolas evaluates points only on small subsets of the data and extrapolates their error to the full dataset size. For more details have a look at the paper: http://arxiv.org/abs/1605.07079
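In other words, candidate points are scored by information gain per unit cost. A minimal sketch of that scoring rule, where information_gain(x, s) and predicted_cost(x, s) are hypothetical helpers used only for illustration, not RoBO's API:

def fabolas_score(x, s, information_gain, predicted_cost):
    # Information gain about the minimum on the full dataset,
    # normalised by the predicted cost of evaluating x on subset size s
    return information_gain(x, s) / predicted_cost(x, s)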

Fabolas has the same interface as RoBO's fmin function (see fmin). First you have to define your objective function, which now should depend on both x and s:

def objective_function(x, s):
    # Train your algorithm here with x on the dataset subset with length s
    # Estimate the validation error and the cost on the validation data set
    return np.array([[validation_error]]), np.array([[cost]])

Your objective function should return the validation error and the total cost c(x, s) of the point x. Normally the cost is the time it took to train and validate x. After defining your objective function you also have to define the input bounds for x and s. Make sure that the dataset size s is the last dimension. It is often a good idea to put the dataset size on a log scale.

X_lower = np.array([-10, -10, s_min])
X_upper = np.array([10, 10, s_max])
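If you put the dataset size on a log scale, s_min and s_max would be log-transformed sizes; the concrete values below are purely illustrative:

import numpy as np

# Illustrative choice: subsets between 100 and 50000 data points, on a log scale
s_min = np.log(100)
s_max = np.log(50000)

The objective function would then recover the actual subset size via int(np.exp(s)).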

Then you can call Fabolas by:

x_best = fabolas_fmin(objective_function, X_lower, X_upper, num_iterations=100)

You can find a full example for training a support vector machine on MNIST here.
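As a rough illustration of what such an objective function can look like (this is only a sketch with assumed conventions, not RoBO's bundled MNIST example: x is taken to be a 1-d array holding log C and log gamma, s the log of the subset size, and scikit-learn's small digits dataset stands in for MNIST):

import time
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X_train, X_val, y_train, y_val = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=0)

def objective_function(x, s):
    start_time = time.time()
    subset_size = min(int(np.exp(s)), X_train.shape[0])  # dataset size on a log scale
    C, gamma = np.exp(x[0]), np.exp(x[1])                 # hyperparameters on a log scale

    # Train only on a subset of the training data
    clf = SVC(C=C, gamma=gamma)
    clf.fit(X_train[:subset_size], y_train[:subset_size])

    validation_error = 1 - clf.score(X_val, y_val)
    cost = time.time() - start_time                       # training + validation time
    return np.array([[validation_error]]), np.array([[cost]])

With bounds over log C, log gamma and the log dataset size, such a function can then be passed to fabolas_fmin as shown above.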