Fix bugs in constant_liar option #4073

Merged
merged 5 commits into from
Oct 25, 2022


contramundum53
Member

Motivation

Currently, the constant_liar option first assigns inf to the values of running trials and then splits all trials into "below" and "above". An example with 5 concurrent workers (and n_startup_trials = 3) is given below:

| Event | constant_liar=False | constant_liar=True |
| --- | --- | --- |
| ask 0 | RandomSampler | RandomSampler |
| ask 1 | RandomSampler | RandomSampler |
| ask 2 | RandomSampler | B, A = split({0…1}, γ(3)) |
| ask 3 | RandomSampler | B, A = split({0…2}, γ(4)) |
| ask 4 | RandomSampler | B, A = split({0…3}, γ(5)) |
| tell 0, ask 5 | RandomSampler | B, A = split({0…4}, γ(6)) |
| tell 1, ask 6 | RandomSampler | B, A = split({0…5}, γ(7)) |
| tell 2, ask 7 | B, A = split({0…2}, γ(3)) | B, A = split({0…6}, γ(8)) |
| tell 3, ask 8 | B, A = split({0…3}, γ(4)) | B, A = split({0…7}, γ(9)) |
| tell 4, ask 9 | B, A = split({0…4}, γ(5)) | B, A = split({0…8}, γ(10)) |

Here, B stands for "below" and A stands for "above".

This behavior has the following problems:

  • In the "ask 2...4" steps, no trial has produced a result yet. However, if γ(3)...γ(5) is nonzero, some running trials are randomly chosen as "below", and subsequent trials are wrongly attracted toward them.
  • In the "ask 5...6" steps, only 1 or 2 trials have finished, but we still choose γ(6) or γ(7) "below" trials. Since all running trials have the value inf, the earliest trials are unconditionally chosen as "below", no matter how good or bad their actual values are. Again, subsequent trials are wrongly attracted toward these early trials.
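
The second problem can be reproduced with a small standalone sketch. It does not use Optuna's code: `split_with_inf_fill` is a hypothetical helper that mimics the inf-fill-then-split behavior described above, `n_below` stands in for the value of γ, and breaking ties by trial number is an assumption made only to keep the output deterministic.

```python
import math
from typing import Dict, List, Optional, Tuple


def split_with_inf_fill(
    values: Dict[int, Optional[float]], n_below: int
) -> Tuple[List[int], List[int]]:
    """Fill running trials (value None) with inf, then split by sorted value."""
    filled = {num: math.inf if v is None else v for num, v in values.items()}
    # Assumption: ties are broken by trial number so the result is deterministic.
    order = sorted(filled, key=lambda num: (filled[num], num))
    return order[:n_below], order[n_below:]


# "tell 0, ask 5": trial 0 finished with a poor value; trials 1..4 are still running.
values = {0: 100.0, 1: None, 2: None, 3: None, 4: None}
below, above = split_with_inf_fill(values, n_below=1)
print(below, above)  # [0] [1, 2, 3, 4]
```

Trial 0 lands in "below" whatever its value is, because every other trial carries inf.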

In practice, this degrades performance when the degree of concurrency is high, such as 50. One benchmark example, run on HPOBench, is shown below. (In the image below, "cl" stands for constant_liar=True.)

[Image: constant_liar1]

As can be seen from the image, constant_liar=True performs worse than constant_liar=False until the budget reaches around 150.

Description of the changes

We change the behavior as follows:

| Event | constant_liar=False | constant_liar=True |
| --- | --- | --- |
| ask 0 | RandomSampler | RandomSampler |
| ask 1 | RandomSampler | RandomSampler |
| ask 2 | RandomSampler | RandomSampler |
| ask 3 | RandomSampler | RandomSampler |
| ask 4 | RandomSampler | RandomSampler |
| tell 0, ask 5 | RandomSampler | RandomSampler |
| tell 1, ask 6 | RandomSampler | RandomSampler |
| tell 2, ask 7 | B, A = split({0…2}, γ(3)) | B, A = split({0…6}, γ(3)) |
| tell 3, ask 8 | B, A = split({0…3}, γ(4)) | B, A = split({0…7}, γ(4)) |
| tell 4, ask 9 | B, A = split({0…4}, γ(5)) | B, A = split({0…8}, γ(5)) |
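
The two tables can be read as a rule about which trials count toward n_startup_trials. The sketch below is inferred from the tables, not taken from Optuna's code. `first_tpe_ask` is a hypothetical helper, and it assumes that trials finish in order, that each tell is immediately followed by an ask, and that the old constant_liar=True behavior counted running trials, including the one being asked.

```python
def first_tpe_ask(n_workers: int, n_startup_trials: int, count_running: bool) -> int:
    """Return the index of the first ask that no longer uses RandomSampler."""
    ask = 0
    while True:
        # Trials finish in order once all workers are busy.
        n_finished = max(0, ask - (n_workers - 1))
        # Running trials, including the trial being asked right now.
        n_running = ask + 1 - n_finished
        n_counted = n_finished + (n_running if count_running else 0)
        if n_counted >= n_startup_trials:
            return ask
        ask += 1


# Old constant_liar=True behavior: running trials count toward n_startup_trials.
print(first_tpe_ask(n_workers=5, n_startup_trials=3, count_running=True))   # 2
# New behavior (and constant_liar=False): only finished trials count.
print(first_tpe_ask(n_workers=5, n_startup_trials=3, count_running=False))  # 7
```

Under these assumptions the sketch reproduces the first non-random step in each table: "ask 2" before the change and "tell 2, ask 7" after it.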

Note that B, A = split({0…6}, γ(3)) is equivalent to

B, A' = split({0…2}, γ(3))
A = A' + {3...6}

where we first split all finished trials, and then append all running trials to "above".
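
A minimal sketch of this two-step split, written as standalone Python rather than Optuna's actual implementation. The helper name `split_finished_then_append_running`, the example values, and `n_below=1` (standing in for γ(3)) are all illustrative assumptions.

```python
from typing import Dict, List, Tuple


def split_finished_then_append_running(
    finished: Dict[int, float], running: List[int], n_below: int
) -> Tuple[List[int], List[int]]:
    """Split finished trials by value, then append all running trials to 'above'."""
    # Sort finished trials by value (smaller is better), ties by trial number.
    order = sorted(finished, key=lambda num: (finished[num], num))
    below = order[:n_below]
    above = order[n_below:] + sorted(running)
    return below, above


# "tell 2, ask 7": trials 0..2 are finished, trials 3..6 are still running.
finished = {0: 0.9, 1: 0.2, 2: 0.5}
running = [3, 4, 5, 6]
below, above = split_finished_then_append_running(finished, running, n_below=1)
print(below, above)  # [1] [2, 0, 3, 4, 5, 6]
```

As long as n_below does not exceed the number of finished trials, a running trial can never end up in "below".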

This improves performance on HPOBench:
[Image: constant_liar2]

One side effect of this change is that running trials will be ranked after pruned trials. (Before this change, running trials are ranked before pruned trials.)
Another side effect is that it will now be trivial to support constant_liar in multi-objective settings.

@github-actions github-actions bot added the optuna.samplers Related to the `optuna.samplers` submodule. This is automatically labeled by github-actions. label Oct 13, 2022
@codecov-commenter

Codecov Report

Merging #4073 (b024474) into master (1a520bd) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #4073   +/-   ##
=======================================
  Coverage   90.06%   90.07%           
=======================================
  Files         160      160           
  Lines       12591    12591           
=======================================
+ Hits        11340    11341    +1     
+ Misses       1251     1250    -1     
| Impacted Files | Coverage Δ |
| --- | --- |
| optuna/samplers/_tpe/sampler.py | 96.96% <100.00%> (ø) |
| optuna/storages/_rdb/storage.py | 94.04% <0.00%> (+0.18%) ⬆️ |

@toshihikoyanase toshihikoyanase added the bug Issue/PR about behavior that is broken. Not for typos/examples/CI/test but for Optuna itself. label Oct 14, 2022
@toshihikoyanase
Member

@HideakiImamura @knshnb Could you review this PR, please?
@not522 @c-bata You may also be interested in this PR, so please feel free to add comments.

Member

@knshnb knshnb left a comment

Thanks for the improvement and careful benchmarks! The algorithm change makes sense to me. I left small comments.
[Note] I personally think the change on the priority of running and pruned trials is reasonable. Please comment if anyone has concerns about this.

contramundum53 and others added 2 commits October 24, 2022 10:39
Co-authored-by: Kenshin Abe <abe.kenshin@gmail.com>
Co-authored-by: Kenshin Abe <abe.kenshin@gmail.com>
@contramundum53
Member Author

@knshnb Thanks for your comments! I applied your suggestions. PTAL.

Member

@knshnb knshnb left a comment

LGTM.

@knshnb knshnb removed their assignment Oct 25, 2022
Member

@HideakiImamura HideakiImamura left a comment

Looks very good. LGTM.

@HideakiImamura HideakiImamura merged commit 3580c37 into optuna:master Oct 25, 2022
@HideakiImamura HideakiImamura added this to the v3.1.0 milestone Oct 25, 2022
eukaryo pushed a commit to eukaryo/optuna that referenced this pull request Dec 19, 2022
Development

Successfully merging this pull request may close these issues.

Add the specification of the behavior of n_startup_trials for constant_liar option