
Optimizing over discrete parameter domains #177

Closed
avimit opened this issue Jun 19, 2019 · 19 comments
Labels
enhancement New feature or request

Comments

@avimit

avimit commented Jun 19, 2019

Can BoTorch be used over discrete parameter domains?
(if so, then this is a feature inquiry, not a feature "request")

We have a use case of domains which are partly continuous, partly discrete, like:
[{"name": "param1", "type": "continuous", "domain": [-5, 10]}, {"name": "param2", "type": "continuous", "domain": [1, 15]}, {"name": "param3", "type": "discrete", "domain": [1, 1.5, 2, 2.5, 3, 3.5, 4]}]

The functions under "botorch/optim/optimize.py" accept an argument called "bounds", which you define as: "bounds: A 2 x d tensor of lower and upper bounds for each column of X".

These are obviously bounds for a continuous search space. Can BoTorch be used for searching over discrete spaces?

Thank you so much for the package!
Avi

@avimit avimit added the enhancement New feature or request label Jun 19, 2019
@avimit avimit changed the title from [Feature Request] Optimizing over discrete parameter domains to Optimizing over discrete parameter domains Jun 19, 2019
@Balandat
Contributor

Hi @avimit, BoTorch itself at this point doesn't directly deal with discrete parameters, but if your goal is simply to run BayesOpt (rather than developing new BayesOpt primitives), you can easily do this using the higher-level Ax interface. Ax will perform the appropriate transformations (e.g. rounding for ordinal parameters, or one-hot encoding for categorical ones) when talking to BoTorch under the hood.

We are also working on improving support for discrete parameters directly in BoTorch (e.g. by using a concrete relaxation to still enable gradient-based optimization).

Finally, if you're willing to write your own Kernels or even models that comply with our basic API, you can do anything you like directly with discrete parameters and still use BoTorch's acquisition functions. Note, however, that optimizing over mixed domains will then also require using a different optimization algorithm, i.e., you can't use the functions in optim/optimize.py directly.

Hope this helps!

@avimit
Author

avimit commented Jun 19, 2019

Hi @Balandat,

Thank you for the answer!

I will have a look at Ax, but I feel that we need the more "low-level" character of BoTorch.

Explanation: we have our own small platform for distributing & managing jobs (mostly DL training) on the cloud, and on top of it we sometimes run HPO. Our HPO framework treats the training of models as a "black box" function, launching training jobs as resources are freed and collecting evaluations when ready. Our use case for BoTorch is as a supplier of next points for evaluation; I managed to integrate it successfully (and enjoyed the process!) as long as I don't use discrete parameters.

Thanks,
Avi

@Balandat
Contributor

Glad you were able to use BoTorch directly for this.

That said, Ax is designed for exactly your use case, and provides a number of different APIs (including lower-level ask/tell style ones) that should make that integration quite straightforward. You can also stick your own BoTorch components into these APIs if you want to retain the flexibility of customizing your own models/acquisition functions. See the tutorial on this.

As I mentioned, we are working on improving support for discrete parameters directly in BoTorch. For now, you could perform the parameter transformations that Ax does in this module yourself, directly in torch.
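For instance, a minimal sketch of such a rounding transform written directly in torch (assuming the ordinal param3 from the original post sits in column 2; round_to_choices is a hypothetical helper, not an Ax or BoTorch function):

import torch

# allowed values for the ordinal parameter (param3 from the example above)
choices = torch.tensor([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])

def round_to_choices(X, dim=2):
    # snap column `dim` to the nearest allowed discrete value
    X = X.clone()
    idx = (X[..., dim : dim + 1] - choices).abs().argmin(dim=-1)
    X[..., dim] = choices[idx]
    return X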

@avimit
Author

avimit commented Jun 19, 2019

I will surely give it a try!
Thanks a lot,
Avi

@eytan
Contributor

eytan commented Jun 19, 2019

@avimit, based on your description, I think that Ax is the appropriate tool to plug into. For the ordered categorical parameter (param3), you'll want to use a ChoiceParameter (as described here) with the keyword argument is_ordered=True; Ax will take care of all the necessary encoding, normalization of the search space and outcomes, etc.
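For illustration, a sketch of that search space using Ax's developer-API classes (class names per the Ax docs; exact signatures may vary across versions):

from ax import ChoiceParameter, ParameterType, RangeParameter, SearchSpace

search_space = SearchSpace(parameters=[
    RangeParameter(name="param1", parameter_type=ParameterType.FLOAT, lower=-5.0, upper=10.0),
    RangeParameter(name="param2", parameter_type=ParameterType.FLOAT, lower=1.0, upper=15.0),
    # ordered choice: Ax handles the encoding under the hood
    ChoiceParameter(name="param3", parameter_type=ParameterType.FLOAT,
                    values=[1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0], is_ordered=True),
])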

Your setup sounds kind of like what one might do with Slurm or Ray Tune. We have forthcoming interfaces between Ax and these platforms, and they were quite straightforward to implement.

@eytan eytan closed this as completed Jun 19, 2019
@eytan
Contributor

eytan commented Jun 20, 2019

You might want to look at the source code for the Ray Tune interface as an example.

@avimit
Author

avimit commented Jun 20, 2019

@eytan, ChoiceParameter does look like what I need, and the is_ordered switch covers all options; the example in Ray Tune is indeed instructive. Thanks.

Taking a parameter transformation from Ax transforms is also a viable option, @Balandat; thank you too.

Now it's time for me to do the work ;}

@avimit
Author

avimit commented Jul 4, 2019

Can you please direct me to a function in Ax which returns next candidates for evaluation, similar to joint_optimize() in BoTorch?

(I have been doing something like this:

# fit a GP to the observed evaluations
model = botorch.models.SingleTaskGP(X, Y)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(model.likelihood, model)
botorch.fit.fit_gpytorch_model(mll)

# optimize qNEI to get the next batch of candidates
acquisition_function = botorch.acquisition.qNoisyExpectedImprovement(model, X_baseline)
X_candidates_tensor = botorch.optim.joint_optimize(
    acquisition_function, bounds=bounds, q=batch_size, num_restarts=1, raw_samples=len(X)
)

)

Thanks!

@Balandat
Contributor

Balandat commented Jul 4, 2019

If you use Ax's service API (see API examples here), get_next_trial does what you want.

You can also use the developer API (next tab on the above page); you can access the generated candidates ("arms") via the arms property of the GeneratorRun returned by the gen function.
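Schematically, that developer-API flow might look like this (a sketch: experiment, data, and batch_size are assumed to exist from your Ax setup, and Models.BOTORCH refers to the Ax modelbridge registry; names may differ across versions):

from ax.modelbridge.registry import Models

# `experiment` and `data` come from your Ax experiment setup
model = Models.BOTORCH(experiment=experiment, data=data)
generator_run = model.gen(n=batch_size)
candidates = [arm.parameters for arm in generator_run.arms]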

@avimit
Author

avimit commented Jul 4, 2019

Thanks! I was there, but never noticed the tabs...

Aren't you supposed to be resting on 4th of July? ;)

@avimit
Author

avimit commented Jul 6, 2019

Hi,

I must admit that I find integrating Ax for my task much harder than BoTorch.

The five schematic BoTorch lines I quote above do what I want: supply X new candidates for evaluation, given the known+pending evaluations.

Haven't yet found a way to do it with Ax...
I've been trying to use the BotorchModel (for continuous params) via the developer API.

Do I have to provide an evaluation function when defining an "experiment"? Our "evaluation" is training & testing of an ML model done on a cloud server. I just want to feed the results to the model.

Also, I couldn't find how to load the known+pending evaluations into the model (and are the objective_weights that the gen() function of BotorchModel requires factors for low/high-fidelity evals?).

Thanks,
Avi

@Balandat
Contributor

Balandat commented Jul 6, 2019

Our "evaluation" is a training & testing of an ML model done on a cloud sever

If you want to manage that manually, you could just use the service API: calling parameters, trial_index = ax.get_next_trial() will give you a new set of parameters to evaluate. You can then communicate with your cloud server to run the ML evaluation and get results back however you want. Calling ax.complete_trial(trial_index=trial_index, raw_data=results) will incorporate that data into the experiment state, and calling ax.get_next_trial() again gives you the next set of parameters.
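Schematically (a sketch, not a drop-in script: run_training_job is a hypothetical stand-in for launching your cloud job, and the create_experiment arguments follow the service-API examples linked above; details may vary by Ax version):

from ax.service.ax_client import AxClient

ax = AxClient()
ax.create_experiment(
    name="hpo",
    parameters=[
        {"name": "lr", "type": "range", "bounds": [1e-5, 1e-1], "log_scale": True},
        {"name": "width", "type": "choice", "values": [64, 128, 256], "is_ordered": True},
    ],
    objective_name="val_accuracy",
)

for _ in range(20):
    parameters, trial_index = ax.get_next_trial()
    # run_training_job is hypothetical: launch the cloud job and collect the metric
    result = run_training_job(parameters)
    ax.complete_trial(trial_index=trial_index, raw_data=result)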

The benefit of using Ax for this is that it keeps track of all the metadata, so you can, for instance, store a complete history of your experiment in a SQL database (if you hook Ax up to one with the appropriate tables).

The developer API gives you even more flexibility. For instance you can write a Runner that implements the logic for talking to the ML training job on the cloud server. For more information on this see https://ax.dev/docs/runner.html
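A minimal sketch of such a Runner (submit_job is a hypothetical function for your platform; see the Runner docs above for the full interface):

from ax import Runner

class CloudRunner(Runner):
    def run(self, trial):
        # submit_job is hypothetical: launch one training job per arm
        job_ids = [submit_job(arm.parameters) for arm in trial.arms]
        # the returned metadata is stored on the trial by Ax
        return {"job_ids": job_ids}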

I hope this helps. If you have specific questions about Ax, we should probably move the discussion to an issue in the Ax repo. There you'll also have folks like @lena-kashtelyan, @amurthy1, @sdsingh who understand the Ax APIs much better than I do.

@taikichi-pc

from botorch.models import MixedSingleTaskGP
from botorch.fit import fit_gpytorch_model
from botorch.acquisition import qExpectedImprovement
from botorch.optim import optimize_acqf_mixed
from gpytorch.mlls import ExactMarginalLogLikelihood

def get_next_point(init_x, init_y, best_init_y, bounds, candidates_num=1):
    # GP that treats dimensions 2 & 3 as categorical
    gp = MixedSingleTaskGP(init_x, init_y, cat_dims=[2, 3])
    mll = ExactMarginalLogLikelihood(likelihood=gp.likelihood, model=gp)
    fit_gpytorch_model(mll)
    acq_func = qExpectedImprovement(model=gp, best_f=best_init_y)
    candidates, _ = optimize_acqf_mixed(
        acq_function=acq_func,
        bounds=bounds,
        fixed_features_list=[{3: 1}, {3: 2}, {2: 1}, {2: 2}],
        q=candidates_num,
        num_restarts=200,
        raw_samples=512,
        options={"batch_limit": 5, "maxiter": 200},
    )
    return candidates

I wrote fixed_features_list = [{3: 1}, {3: 2}, {2: 1}, {2: 2}] in order to fix the discrete features (indices 2 & 3; 0 & 1 are continuous), but only index 3 of the returned candidate is fixed; index 2 comes back as a float, as if it were continuous.

When I want a candidate where two or more inputs are discrete (in this case x2 and x3), how should I use it?

@Balandat
Contributor

So doing fixed_features_list = [{3: 1}, {3: 2}, {2: 1}, {2: 2}] will loop over the four dicts in the list, perform an optimization with the features specified in each dict held fixed, and then pick the best value across all four dicts.

If you want to optimize the continuous features and fix the discrete ones, you should be doing something like fixed_features_list = [{2: 1, 3: 1}], which will run one optimization with features 2 and 3 both fixed to 1. If you want multiple combinations of fixed features, you can add additional dicts to the list, as in the sketch below.
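For instance, to cover every combination of the two discrete features from your example (values 1 and 2 for each), one could build the list programmatically; a sketch:

from itertools import product

# one dict per combination of the discrete features at indices 2 and 3
fixed_features_list = [
    {2: v2, 3: v3} for v2, v3 in product([1, 2], [1, 2])
]
# -> [{2: 1, 3: 1}, {2: 1, 3: 2}, {2: 2, 3: 1}, {2: 2, 3: 2}]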

@jackliu333

Is this feature available now in BoTorch? I found the optimize_acqf() function only accepts continuous bounds in the bounds parameter.

@Balandat
Contributor

Yes it is, check out optimize_acqf_mixed.

We also have an incoming PR (#1533) that implements the method from [1], but it'll take a bit to get that fully cleaned up and merged.

[1] S. Daulton et al. Bayesian Optimization over Discrete and Mixed Spaces via Probabilistic Reparameterization. NeurIPS 2022

@jackliu333

So if my understanding is correct, in optimize_acqf_mixed the bounds parameter handles continuous features and fixed_features_list handles discrete ones. So if we supply both arguments, such as the learning rate for bounds and the number of nodes in a layer for fixed_features_list, the function will condition on each unique discrete value in fixed_features_list and search for the optimal continuous values? In other words, it's the same as if we had an outer for loop over the discrete values and returned the best one?

@Balandat
Contributor


That is correct, it's a convenience function for essentially doing that. While this is the gold standard for optimizing, it obviously doesn't scale with the number of discrete features, which is why we worked on [1]; that support is incoming.

If you have discrete parameters with many ordered values (e.g. integers over a large range), it's usually fine to do a continuous relaxation for those (optimize over the continuous range and then round), as in the sketch below.
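A sketch of that relaxation (assuming an acquisition function acq_func and bounds as before, with a single integer-valued column at index 2):

from botorch.optim import optimize_acqf

# optimize the integer dimension as if it were continuous...
candidates, acq_value = optimize_acqf(
    acq_function=acq_func, bounds=bounds, q=1, num_restarts=20, raw_samples=256,
)
# ...then round that column back onto the integer grid
candidates[..., 2] = candidates[..., 2].round()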

@jackliu333

Sounds pretty cool! Looking forward to it!
