
Feature request: Optimizing hyperparameters under unknown constraints #884

sethaxen opened this issue Oct 12, 2022 · 7 comments

@sethaxen

Description

The ability to optimize a univariate hyperparameter under an unknown/hidden constraint on that parameter, where invalid parameter values are identified by failed function evaluations.

This may be related to #403.

Specific application

For the simplest case I have, we can assume:

  • the hyperparameter $x$ is independent of all other hyperparameters
  • all valid values occur below some threshold $x_c$. All $x \ge x_c$ are invalid.
  • so once an invalid value is encountered, we never want to test a value larger than it.
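As a purely illustrative sketch (the threshold value 0.7 and the sine objective below are made up, not from any real model), a target with this structure could look like:

```python
import numpy as np

X_C = 0.7  # hidden threshold; the optimizer never sees this value directly


def target(x: float) -> float:
    """Toy objective: every evaluation with x >= X_C simply fails."""
    if x >= X_C:
        raise RuntimeError("evaluation failed: x is in the invalid region")
    return float(np.sin(5 * x) + 0.5 * x)  # arbitrary smooth objective
```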

An initial idea

The KI Campus MOOC on AutoML mentioned that the Expected Constrained Improvement acquisition function, given in Eq. 4 of https://engineering.ucsc.edu/sites/default/files/technical-reports/UCSC-SOE-10-10.pdf as
$$x' = \arg\max_{x \in \mathcal{X}} \mathbb{E}[I(x)] \, h(x),$$
is one approach that could work for something like this, where $I$ is an improvement statistic and $h(x)$ is the predicted probability that the parameter $x$ is valid, for which they use a random forest classifier.

I suppose a more general approach would allow a user to provide

  1. a function $h: \mathcal{X} \to [0, 1]$, which could be multiplied by any acquisition function
  2. a method to update $h$ given a new evaluation $x_k$.
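A minimal sketch of the interface implied by (1) and (2), with placeholder names that are not tied to SMAC's API, might be:

```python
from typing import Protocol

import numpy as np


class ValidityModel(Protocol):
    """Hypothetical interface for the two user-supplied pieces above."""

    def h(self, X: np.ndarray) -> np.ndarray:
        """Return P(x is valid) in [0, 1] for each row of X."""
        ...

    def update(self, x_k: np.ndarray, valid: bool) -> None:
        """Incorporate the success/failure outcome of a new evaluation x_k."""
        ...
```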
@sethaxen changed the title from "Optimizing hyperparameters under unknown constraints" to "Feature request: Optimizing hyperparameters under unknown constraints" on Oct 12, 2022
@sethaxen
Author

sethaxen commented Dec 9, 2022

@mlindauer we spoke about this briefly at the AutoML fall school, and you said it might be straightforward to add. If so, I could try to contribute this feature, but it would be helpful if someone familiar with the codebase could point me in the right direction.

@mlindauer
Contributor

Hi,
Thanks for pinging us again. We are focusing all the manpower on the next major SMAC release right now.
Nevertheless, here are some pointers:

  1. You need to build a binary classification dataset of successful vs. unsuccessful runs, extracted from the runhistory: https://github.com/automl/SMAC3/tree/main/smac/runhistory/encoder
  2. You need to feed that data to a model (https://github.com/automl/SMAC3/tree/main/smac/model), but doing probabilistic classification instead of regression. You can, e.g., slightly modify the RF for that, since it also supports classification and probabilistic predictions.
  3. Last but not least, you need to implement a new acquisition function that takes that into account (e.g., in the same way as you wrote above): https://github.com/automl/SMAC3/tree/main/smac/acquisition/function
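To make the three steps concrete, here is a stand-alone sketch with scikit-learn's random forest as a stand-in for a classification-mode SMAC RF; the data and all names are illustrative, not SMAC API:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Step 1: binary dataset of successful (1) vs. unsuccessful (0) configurations,
# which in SMAC would be extracted via the runhistory encoder.
X_train = np.array([[0.10], [0.35], [0.55], [0.80], [0.95]])
y_train = np.array([1, 1, 1, 0, 0])

# Step 2: a probabilistic classifier predicting P(configuration is valid).
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)


def h(X: np.ndarray) -> np.ndarray:
    """Predicted probability that each configuration is valid."""
    return clf.predict_proba(X)[:, 1]  # column 1 = the "successful" class


# Step 3: a new acquisition function would multiply an improvement statistic
# by h, e.g. ei_values * h(X_candidates).
```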

Does that help you?
I hope that either Rene (only available next week) or Difan can provide more pointers in case you need more detailed help.

Best,
Marius

PS: I point you directly to the new code base since it would be wasted effort to do it in the old code base now.

@sethaxen
Author

Thanks @mlindauer for the pointers! These were very helpful. I was able to put together an example that seems to work well on some toy target functions with unknown constraints: https://gist.github.com/sethaxen/331402e156537a933121133fe9965573 . Major thanks to @KEggensperger who helped me navigate SMAC.

I'm planning to deploy this shortly on an expensive model, so if someone has ideas for improvement, I'd love to hear them. Also, if you think this would be a useful example to include in the docs, I'm happy to contribute it.

@alexandertornede
Contributor

Thanks for posting your outcome, @sethaxen. We will have a look at it during our meeting next week!

@dengdifan
Contributor

dengdifan commented Feb 2, 2023

@sethaxen Thanks for the updates! It looks quite promising to me and we would like to integrate this as an example for our model. We would appreciate it if you could provide a PR for us (ideally for the development 2.0 branch); you can also come to us if you need any support with the codebase.
Just a small question: does this approach only work with the EI acquisition function?

@sethaxen
Author

sethaxen commented Feb 2, 2023

> We would appreciate it if you could provide a PR for us (ideally for the development 2.0 branch); you can also come to us if you need any support with the codebase.

Sure, I'd be happy to! This would mean adding a new section to https://github.com/automl/SMAC3/tree/development/examples, right?

> Just a small question: does this approach only work with the EI acquisition function?

I'm new to AutoML, so I can't say for certain. I came across examples where a classifier was incorporated into the acquisition function in more complicated ways (e.g. https://arxiv.org/abs/1004.4027), and the two papers I read that took the approach used here both used EI.

Naively, it seems intuitive that this could work with other acquisition functions, so I could generalize EIConstrained to a ConstraintWeightedAcquisition to which one would pass a standard acquisition function and a classifier.
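A rough, framework-agnostic sketch of what I mean (the attribute names and the scikit-learn-style predict_proba call are assumptions; a real version would subclass SMAC's acquisition-function base class):

```python
import numpy as np


class ConstraintWeightedAcquisition:
    """Hypothetical wrapper: weight any base acquisition by P(x is valid)."""

    def __init__(self, base_acquisition, classifier):
        self.base_acquisition = base_acquisition  # e.g. EI or PI, callable on X
        self.classifier = classifier              # probabilistic validity classifier

    def __call__(self, X: np.ndarray) -> np.ndarray:
        acq = self.base_acquisition(X)                    # shape (n,)
        p_valid = self.classifier.predict_proba(X)[:, 1]  # P(x is valid)
        return acq * p_valid
```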

@dengdifan
Contributor

> Sure, I'd be happy to! This would mean adding a new section to https://github.com/automl/SMAC3/tree/development/examples, right?

Thanks! Exactly, the interface is overall the same, so it should not be too hard to port it to the development branch.

> Naively, it seems intuitive that this could work with other acquisition functions,

I think this might only work for improvement-based acquisition functions (EI and PI). If you use LCB as the acquisition function, its value can become negative, and multiplying a negative value by the predicted validity probability might not work as expected.
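A purely illustrative pair of numbers shows why: with a negative acquisition value, a low validity probability pulls the product toward zero, so the weighting makes an unlikely-to-be-valid point look better rather than worse when maximizing:

```python
# Illustrative values only, not output from any real acquisition function.
acq_a, p_valid_a = -2.0, 0.9  # promising point, likely valid
acq_b, p_valid_b = -2.0, 0.1  # equally promising point, likely invalid

print(acq_a * p_valid_a)  # -1.8
print(acq_b * p_valid_b)  # -0.2  -> ranked higher when maximizing the product
```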
