Bayesian Optimization of Combinatorial Structures (BOCS)
What is the BOCS algorithm?
The BOCS algorithm is a method for finding a global minimizer of an expensive-to-evaluate black-box function that is defined over discrete inputs given a finite budget of function evaluations. The algorithm combines an adaptive generative model and semidefinite programming techniques for scalability of the acquisition function to large combinatorial domains. For more information on the details of the algorithm, please see the ICML paper.
Ricardo Baptista (MIT) and Matthias Poloczek (University of Arizona)
The BOCS algorithm is implemented in MATLAB and only requires the CVX package to be available in the local path for performing convex optimization. The scripts for comparing to other discrete optimization methods require an installation of SMAC, the python wrapper pySMAC and the bayesopt function in MATLAB for Bayesian Optimization with the Expected Improvement acquisition function.
Example running BOCS on a benchmark problem
We provide an example for running the BOCS algorithm the Ising model sparsification benchmark problem on a 9-node grid graph with a budget of 100 sample evaluations. The code can also be run from MATLAB using the file
The script first defines the input parameters in the
inputs struct. These include the number of discrete variables (i.e., edges in the grid graph)
n_vars, the sample evaluation budget
evalBudget, the number of samples initially used to build the statistical model
n_init, the lambda parameter
lambda, and the horseshoe
estimator used for the regression model. Other options for
estimator are bayes and mle.
inputs = struct; inputs.n_vars = 12; inputs.evalBudget = 100; inputs.n_init = 20; inputs.lambda = 1e-4; inputs.estimator = 'horseshoe';
We randomly instantiate the edge weights of the Ising model and define the objective function based on the KL divergence in
inputs.model. We add an l_1 penalty scaled by the lambda parameter in
% Generate random graphical model Theta = rand_ising_grid(9); Moments = ising_model_moments(Theta); % Save objective function and regularization term inputs.model = @(x) KL_divergence_ising(Theta, Moments, x); inputs.penalty = @(x) inputs.lambda*sum(x,2);
We sample the initial values for the statistical model and evaluate the objective function at these samples. Using these inputs, we run the BOCS-SA and BOCS-SDP algorithms with the order 2 regression model.
% Generate initial samples for statistical models inputs.x_vals = sample_models(inputs.n_init, inputs.n_vars); inputs.y_vals = inputs.model(inputs.x_vals); % Run BOCS-SDP and BOCS-SA (order 2) B_SA = BOCS(inputs.model, inputs.penalty, inputs, 2, 'SA'); B_SDP = BOCS(inputs.model, inputs.penalty, inputs, 2, 'sdp');
We also provide an example of running the python implementation of BOCS on the binary quadratic programming benchmark problem. The code can be run using the command
python BOCSpy/example_bqp.py. The script produces a plot with the convergence results from running BOCS-SA and BOCS-SDP on an instance of the problem.
Comparison to other discrete optimization algorithms
To compare BOCS to other algorithms, we provided files in the scripts folder that run all cases for the quadratic programming problem, the Ising model sparsification problem, and the contamination control problem. The algorithms that are compared include:
- RS: Random search (Bergstra & Bengio, 2012)
- SA: Simulated annealing (Spears, 1993) (Pankratov & Borodin, 2010)
- EI: Expected improvement (Jones et al., 2008)
- OLS: Local search (Khanna et al., 1998)
- PS: Sequential Monte Carlo particle search (Schäfer, 2012)
- SMAC (Hutter et al., 2011)
- BOCS-SA, BOCS-SDP with Bayesian linear regression model
- BOCS-SA, BOCS-SDP with MLE model
- BOCS-SA, BOCS-SDP with sparse linear regression model
To run these examples, we provided three files in the scripts folder named
opt_runs_cont for the benchmark problems. In each of the files, the
n_func parameter defines multiple random instances of the test function and the
n_runs parameter defines the number of repeated evaluations of each algorithm starting from different initial datasets. The
n_proc parameter can be used to specify the number of available processors to parallelize the execution of these tests. Lastly, running
scripts/create_plots will plot the average output of these tests and present statistics of the solutions found by all algorithms, while running
scripts/cross_validate followed by
scripts/plot_crossval can be used to validate the statistical models for each benchmark problem.