What happens when the range of parameters changes? #822

Closed
louisabraham opened this issue Dec 29, 2019 · 3 comments

@louisabraham louisabraham commented Dec 29, 2019

Hi,

Suppose I run a study with some parameters, e.g. x has distribution {"name": "UniformDistribution", "attributes": {"low": 0, "high": 1}}, and I find that x=0.99 is frequently proposed.

If I restart my study with {"name": "UniformDistribution", "attributes": {"low": 0, "high": 5}}, how will the samplers react?

But if the type of the observations changes (numerical / categorical), I don't think it will work.

If the range is simply expanded, I think (from my understanding of the code) that TPESampler will just be able to use the observations. Am I correct?

What if the range is shrunk (or shifted)? The problem is that the previous observations are not restricted to the current distribution range, so at the very least there will be problems with n_startup_trials, which will not be respected.

IMO:

  • The behavior should be documented, because I don't think I'm the first to ask this question.

  • In case of a range modification, I think it would be useful and not very difficult to ignore the observations outside of the current distribution.

  • In case of a type modification, I think categorical -> numerical could be supported. Numerical to categorical would just ignore the past.
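To illustrate the second point, here is a minimal sketch in plain Python, not Optuna's internals. The names `in_range_observations` and `startup_phase_over` are hypothetical, and observations are assumed to be simple (parameter value, objective value) pairs:

```python
from typing import List, Tuple

Observation = Tuple[float, float]  # (parameter value, objective value)


def in_range_observations(
    observations: List[Observation], low: float, high: float
) -> List[Observation]:
    """Keep only observations whose parameter value lies in [low, high]."""
    return [(x, y) for x, y in observations if low <= x <= high]


def startup_phase_over(
    observations: List[Observation], low: float, high: float, n_startup_trials: int
) -> bool:
    """Count only in-range observations against n_startup_trials."""
    return len(in_range_observations(observations, low, high)) >= n_startup_trials


# History collected with range [0, 5]; the study is then resumed with range [0, 1].
history = [(0.2, 1.0), (0.9, 0.4), (3.5, 2.0), (4.8, 3.1)]
kept = in_range_observations(history, low=0.0, high=1.0)
```

With this filtering, only two of the four past observations count toward the startup phase after the range is shrunk, so a sampler with `n_startup_trials=3` would still be in its startup phase.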

Member

@hvy hvy commented Jan 7, 2020

Sorry for the delayed response and thank you for bringing this up. You are right that this can be confusing. There seem to be two issues, and I suggest we start with the former.

  • The behavior has not been discussed thoroughly enough.
  • The behavior is not documented.

Member

@hvy hvy commented Jan 7, 2020

It's up to each sampler how to handle these dynamic changes, and we currently have no means to control it universally. As for the TPE sampler:

  1. All parameter-value pairs previously collected will be used for the upper/lower kernel density estimations. The range is ignored here.
  2. When sampling from the lower density estimation, the parameters are actually guaranteed to be within the range according to this logic: https://github.com/optuna/optuna/blob/master/optuna/samplers/tpe/sampler.py#L315
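
The two points above can be illustrated with a toy sketch. This is not Optuna's implementation: the function name `sample_within_range`, the fixed `bandwidth`, and the use of rejection sampling are all assumptions made for brevity. It only shows that a density can be built from every past value while the returned sample still respects the current range:

```python
import random
from typing import List


def sample_within_range(
    past_values: List[float],
    low: float,
    high: float,
    rng: random.Random,
    bandwidth: float = 0.5,  # hypothetical fixed kernel width
    max_tries: int = 10_000,
) -> float:
    """Draw from a Gaussian mixture centred on all past values, restricted to [low, high]."""
    for _ in range(max_tries):
        # Point 1: every past value is a kernel centre, whether in range or not.
        centre = rng.choice(past_values)
        candidate = rng.gauss(centre, bandwidth)
        # Point 2: only candidates inside the current range are returned.
        if low <= candidate <= high:
            return candidate
    # Fallback so that the result is always inside the current range.
    return rng.uniform(low, high)


# Values observed under the old range [0, 5]; the current range is [0, 1].
rng = random.Random(0)
past = [0.3, 0.9, 2.5, 4.7]
samples = [sample_within_range(past, 0.0, 1.0, rng) for _ in range(100)]
```

Every element of `samples` lies in [0, 1], even though two of the four kernel centres are outside that range.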

Changing 1. to only look at parameter-value pairs within the current range would probably make more sense, as you say. I'm not so sure about handling new types. In my understanding, users might casually want to resume studies with slightly adjusted ranges, but are there cases besides those?

Author

@louisabraham louisabraham commented Jan 22, 2020

Hi, thank you very much for your answer.

I think 1. is good as it exploits all available information.
I didn't notice the behavior of 2. but it basically solves my issue.

To summarize:

  • all past data is used
  • the suggested parameters are always in the current required range

This behavior seems very logical. It could just be documented.
