
discrete parameter domain of GP, high-dimensional GP, multi-output BO, output-constrained BO #268

Closed
jmren168 opened this issue Sep 17, 2019 · 7 comments
Labels
enhancement New feature or request

Comments

@jmren168
Contributor

jmren168 commented Sep 17, 2019

🚀 Feature Request

Hi, I'm new to this package, and I'm trying to find out whether botorch fits my situation.

What I would like to do is

  1. create a surrogate model on a sample pool, e.g., a GP over a high-dimensional (dim. > 40), discrete parameter domain,
  2. create an acquisition function, e.g., for multi-output BO,
  3. optimize the acquisition function, e.g., with output constraints,
  4. get the next recommended sample, say X, and do a REAL experiment to get REAL_Y,
  5. augment the sample pool with (X, REAL_Y), and
  6. repeat steps 1-5 until some criterion is satisfied
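To make the loop concrete, here is a minimal sketch of steps 1-6 in plain NumPy. Everything here is a stand-in, not botorch API: the RBF-kernel GP, the UCB acquisition, the candidate pool, and the `real_experiment` objective are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def real_experiment(x):
    # stand-in for the REAL experiment in step 4 (hypothetical objective)
    return -np.sum((x - 0.5) ** 2)

def fit_gp(X, y, lengthscale=0.3, noise=1e-4):
    # step 1: minimal RBF-kernel GP "fit" (precompute Cholesky factor and weights)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * d2 / lengthscale**2) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return X, L, alpha, lengthscale

def ucb(model, Xq, beta=2.0):
    # steps 2-3: upper-confidence-bound acquisition, scored on candidates
    X, L, alpha, ls = model
    d2 = ((Xq[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    Ks = np.exp(-0.5 * d2 / ls**2)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    sd = np.sqrt(np.clip(1.0 - (v**2).sum(0), 1e-12, None))
    return mean + beta * sd

X = rng.random((5, 3))                          # initial sample pool
y = np.array([real_experiment(x) for x in X])
for _ in range(10):
    model = fit_gp(X, y)                        # step 1: surrogate on the pool
    pool = rng.random((256, 3))                 # finite candidate set
    x_next = pool[np.argmax(ucb(model, pool))]  # steps 2-4: recommendation
    y_next = real_experiment(x_next)            # step 4: REAL experiment
    X = np.vstack([X, x_next])                  # step 5: augment the pool
    y = np.append(y, y_next)                    # step 6: repeat
```

The stopping criterion in step 6 is a fixed iteration budget here; in practice it could be convergence of the best observed value.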

What I found is that botorch does not support a discrete parameter domain for the GP, and the recommendation is to use Ax instead (#177), but I'm not sure whether Ax covers my situation.

Any comments are highly appreciated.

@jmren168 jmren168 added the enhancement New feature or request label Sep 17, 2019
@Balandat
Contributor

Well, this reads like a standard BayesOpt loop, so on a high level, yes, that is what botorch does. A couple of questions:

  1. How many of your parameters are discrete? Are they categorical or are they ordered? How many values can they take on?
  2. What do you mean by multi-output BO here? A multi-output model on which you want to evaluate some scalar objective? Or is your goal to do something like Pareto-front optimization without scalarizing/weighting the outputs?

@jmren168
Contributor Author

jmren168 commented Sep 17, 2019

Hi Balandat,

Thanks for the quick reply.

  1. 30 parameters are continuous, and at least 10 are discrete and ordered, with 10 to 20 possible values each.
  2. I have at least three outputs (continuous variables), and would like to optimize them at the same time. So it should be a Pareto-front optimization problem.

Via BO with a GP, I would like to find the Pareto set and run REAL experiments at the Pareto-optimal points to get REAL_Y. Hope this makes my problem clearer, and many thanks again.

@Balandat
Contributor

  1. The fact that the discrete parameters are ordered is helpful, it should be possible to use a continuous relaxation for these for the modeling and optimization. The fact that you have >40 parameters is somewhat daunting though unless you either
    a. have a ton of data and a scalable model
    b. have reason to believe that the outcome is well-modeled on a lower-dimensional subspace/manifold of the parameters.
    Re a., how many data points do you expect to be able to gather?

  2. We're working on some algorithms for Pareto-optimization, but right now we don't have anything packaged up yet. The simplest approach would be to do some random scalarization similar to e.g. https://arxiv.org/abs/1805.12168 - this would be pretty straightforward to implement. Note though that your problem is relatively high-dimensional, and so I'd first see whether you can get a good handle on the modeling in 1. before going down that road.
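For what it's worth, both ideas above — a continuous relaxation of the ordered discrete parameters, and ParEGO-style random scalarization in the spirit of the linked paper — can be sketched in a few lines of plain NumPy. All names here are hypothetical stand-ins, not botorch API, and the model predictions are faked with random numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

def round_discrete(X, discrete_idx, n_levels=10):
    # continuous relaxation: optimize in [0, 1], then snap the ordered
    # discrete dimensions back to their nearest grid level
    X = X.copy()
    X[:, discrete_idx] = np.round(X[:, discrete_idx] * (n_levels - 1)) / (n_levels - 1)
    return X

def chebyshev_scalarize(Y, weights, rho=0.05):
    # augmented Chebyshev scalarization (ParEGO-style): collapse m outputs
    # into one scalar so a single-output acquisition function can be used
    wY = weights * Y
    return wY.min(axis=-1) + rho * wY.sum(axis=-1)

d, m = 40, 3
discrete_idx = list(range(10))          # say the first 10 params are ordered/discrete
candidates = round_discrete(rng.random((256, d)), discrete_idx)

w = rng.dirichlet(np.ones(m))           # fresh random weights each BO iteration
Y_pred = rng.random((256, m))           # stand-in for model predictions
x_next = candidates[np.argmax(chebyshev_scalarize(Y_pred, w))]
```

Drawing a new weight vector each iteration is what steers successive candidates toward different parts of the Pareto front.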

@jmren168
Contributor Author

Hi Balandat,

Thanks for the guides.

  1. Although we face a high-dimensional optimization problem, the good news is that most of the parameters are fixed, and only 1 to 3 parameters change between REAL sequential experiments. The number of data points is at most 30. This may not be enough to construct BO models, but based on domain expertise we believe that a few SUB-PARAMETERS dominate the outcomes. For this case, do you recommend any subspace optimization techniques? Either comments or papers are highly welcome. Thanks again.

  2. We have tried applying the "Dragonfly" package to our case, but it requires a function caller rather than REAL experiments. It may be best to implement the random scalarization approach ourselves.

@Balandat
Contributor

What's the motivation behind modeling around ~40 parameters if you're optimizing over only 1-3? Is this a contextual optimization problem where you're trying to find optimal settings conditional on other parameters? Standard GP models are likely to fail if you try to model 30-40 dimensions with only 30 data points.
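One common pattern for this kind of conditional optimization is fixed-feature optimization: hold the context parameters at their current values and search the acquisition function over the few free dimensions only. A rough sketch, with a hypothetical stand-in acquisition function and random candidate search rather than botorch API:

```python
import numpy as np

rng = np.random.default_rng(0)

def acquisition(X):
    # stand-in for any acquisition function over the full 40-d input
    return -np.sum((X - 0.5) ** 2, axis=-1)

def optimize_fixed_features(acq, x_current, free_idx, n_candidates=128):
    # hold every parameter at x_current except those in free_idx,
    # and search only over the free ones
    X = np.tile(x_current, (n_candidates, 1))
    X[:, free_idx] = rng.random((n_candidates, len(free_idx)))
    return X[np.argmax(acq(X))]

x_now = rng.random(40)                    # current experimental settings
x_next = optimize_fixed_features(acquisition, x_now, free_idx=[2, 39])
```

(botorch's `FixedFeatureAcquisitionFunction` wraps an acquisition function in essentially this way, exposing only the free columns to the optimizer.)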

@jmren168
Contributor Author

Here's the situation, say, x1,...,x40, where x1,...,x10 are discrete and the others are continuous. y1,...,y3 are the 3 outcomes of REAL sequential experiments.

Since this is a SEQUENTIAL experiment, the setting of the 2nd experiment depends on the result of the 1st. E.g., x3 of the 2nd data point is modified while the others stay the same as in the 1st data point. Then, for the 3rd data point, maybe only x2 and x40 are modified compared to the 2nd. That's what we mean by most parameters being FIXED. Hope this makes our case clearer.

@saitcakmak
Contributor

Closing this since many of the features are now supported.
