What happens when the range of parameters changes? #822

Closed
louisabraham opened this issue Dec 29, 2019 · 3 comments
Labels
question Question about Optuna.

Comments

@louisabraham

Hi,

Suppose I run a study with some parameters, e.g. x has distribution {"name": "UniformDistribution", "attributes": {"low": 0, "high": 1}}, and I find that x=0.99 is frequently proposed.

If I restart my study with {"name": "UniformDistribution", "attributes": {"low": 0, "high": 5}}, how will the samplers react?

If the type of the observations changes (numerical / categorical), I don't think it will work.

If the range is simply expanded, I think (from my understanding of the code) that TPESampler will just be able to use the observations. Am I correct?

What if the range is shrunk (or shifted)? The problem is that the previous observations are not restricted to the current distribution range, so at the very least there will be problems with n_startup_trials, which will not be respected.

IMO:

  • The behavior should be documented because I don't think I'm the first to ask the question

  • In case of a range modification, I think it would be useful and not very difficult to ignore the observations outside of the current distribution.

  • In case of a type modification, I think categorical -> numerical could be supported. Numerical to categorical would just ignore the past.
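
The second suggestion above (ignoring observations outside the current range) could look roughly like the following sketch. It is plain Python rather than Optuna code: `filter_in_range`, the `(value, score)` tuple layout, and the startup check are all hypothetical and chosen only for illustration; only the name `n_startup_trials` comes from the discussion.

```python
from typing import List, Tuple


def filter_in_range(
    observations: List[Tuple[float, float]], low: float, high: float
) -> List[Tuple[float, float]]:
    """Hypothetical helper (not part of Optuna): keep only the past
    observations whose parameter value lies inside [low, high]."""
    return [(x, score) for x, score in observations if low <= x <= high]


# (parameter value, objective value) pairs from an earlier run on [0, 5].
past = [(0.2, 3.1), (0.99, 0.4), (2.5, 1.7), (4.8, 2.9)]

# The study is resumed with the range shrunk to [0, 1].
usable = filter_in_range(past, low=0.0, high=1.0)

# Hypothetical startup check: if fewer usable observations remain than
# n_startup_trials, a sampler could fall back to random sampling.
n_startup_trials = 3
use_random_sampling = len(usable) < n_startup_trials
```

With this kind of filtering, a shrunk range can push the number of usable observations back below the startup threshold, which is one way to make `n_startup_trials` meaningful again after a restart.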

louisabraham added the "question" label on Dec 29, 2019
@hvy
Member

hvy commented Jan 7, 2020

Sorry for the delayed response and thank you for bringing this up. You are right that this can be confusing. There seem to be two issues, and I suggest we start with the former.

  • The behavior has not been discussed thoroughly enough.
  • The behavior is not documented.

@hvy
Member

hvy commented Jan 7, 2020

It's up to each sampler how to handle these dynamic changes, and we currently have no means to control this universally. As for the TPE sampler:

  1. All parameter-value pairs previously collected will be used for the upper/lower kernel density estimations. The range is ignored here.
  2. When sampling from the lower density estimation, the parameters are actually guaranteed to be within the range according to this logic https://github.com/optuna/optuna/blob/master/optuna/samplers/tpe/sampler.py#L315.
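
To make the two points above concrete, here is a toy sketch. It is not Optuna's actual implementation: it assumes a simple Gaussian kernel mixture centred on every past observation (whether or not it lies in the current range) and uses rejection sampling to keep draws inside the current range. `sample_within_range`, `bandwidth`, and `max_tries` are hypothetical names; the real TPE sampler has its own estimator and its own way of enforcing the range.

```python
import random
from typing import List


def sample_within_range(
    observed: List[float],
    low: float,
    high: float,
    bandwidth: float,
    rng: random.Random,
    max_tries: int = 1000,
) -> float:
    """Toy stand-in for points 1 and 2: every past observation contributes
    a kernel, but the returned value always lies in [low, high]."""
    for _ in range(max_tries):
        center = rng.choice(observed)  # point 1: the range is ignored here
        candidate = rng.gauss(center, bandwidth)
        if low <= candidate <= high:  # point 2: enforce the current range
            return candidate
    # Assumed fallback when no draw lands in range: sample uniformly.
    return rng.uniform(low, high)


rng = random.Random(0)
# Observations collected on [0, 5]; the study is resumed on [0, 1].
observed = [0.3, 0.99, 2.5, 4.9]
samples = [sample_within_range(observed, 0.0, 1.0, 0.5, rng) for _ in range(100)]
```

In this sketch the out-of-range observations at 2.5 and 4.9 still take part in the density, yet every value in `samples` falls inside the new range.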

Changing 1. to only look at parameter-value pairs within the current range would probably make more sense, as you say. I'm not so sure about handling new types. In my understanding, users might casually want to resume studies with slightly adjusted ranges, but are there cases besides those?

@louisabraham
Author

Hi, thank you very much for your answer.

I think 1. is good as it exploits all available information.
I didn't notice the behavior of 2. but it basically solves my issue.

To summarize:

  • all past data is used
  • the suggested parameters are always in the current required range

This behavior seems very logical. It could just be documented.
