Fix bugs in `constant_liar` option #4073
Codecov Report
@@ Coverage Diff @@
## master #4073 +/- ##
=======================================
Coverage 90.06% 90.07%
=======================================
Files 160 160
Lines 12591 12591
=======================================
+ Hits 11340 11341 +1
+ Misses 1251 1250 -1
@HideakiImamura @knshnb Could you review this PR, please?
Thanks for the improvement and careful benchmarks! The algorithm change makes sense to me. I left small comments.
[Note] I personally think the change on the priority of running and pruned trials is reasonable. Please comment if anyone has concerns about this.
Co-authored-by: Kenshin Abe <abe.kenshin@gmail.com>
@knshnb Thanks for your comments! I applied your suggestions. PTAL.
LGTM.
Looks very good. LGTM.
Motivation
Currently, the `constant_liar` option first fills `inf` in the values of running trials and splits them into "below" and "above". An example of the 5-concurrent case (with `n_startup_trials = 3`) is given below:

| `constant_liar=False` | `constant_liar=True` |
| --- | --- |
|  | B, A = split({0…1}, γ(3)) |
|  | B, A = split({0…2}, γ(4)) |
|  | B, A = split({0…3}, γ(5)) |
|  | B, A = split({0…4}, γ(6)) |
|  | B, A = split({0…5}, γ(7)) |
| B, A = split({0…2}, γ(3)) | B, A = split({0…6}, γ(8)) |
| B, A = split({0…3}, γ(4)) | B, A = split({0…7}, γ(9)) |
| B, A = split({0…4}, γ(5)) | B, A = split({0…8}, γ(10)) |
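As a toy illustration of the old fill-with-`inf` behavior (a sketch with hypothetical helper names, not Optuna's internal API): when every candidate in the split is a running trial, "below" is decided by tie-breaking alone rather than by any real objective value.

```python
import math

def split(values, n_below):
    # Toy TPE-style split for minimization: the n_below lowest values
    # go to "below", the rest to "above".
    order = sorted(range(len(values)), key=lambda i: values[i])
    return set(order[:n_below]), set(order[n_below:])

# Old constant_liar behavior: running trials are filled with inf and
# mixed into the split together with finished trials.
finished = [0.9, 0.2, 0.5]   # trials 0..2 (finished)
running = [math.inf] * 4     # trials 3..6 (running, filled with inf)

below, above = split(finished + running, 2)
print(below)  # {1, 2}: finished trials always outrank the inf-filled ones

# When *all* candidates are running, every value is inf, so "below" is
# determined purely by tie-breaking (stable sort order in this toy code),
# not by the trials' actual quality.
below_all_running, _ = split([math.inf] * 5, 2)
print(below_all_running)  # {0, 1}
```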
Here, `B` stands for "below" and `A` stands for "above".

This behavior has the following problems:
- If γ(3), …, γ(5) is nonzero, some running trials are randomly chosen to be "below", and subsequent trials will be wrongly attracted toward those trials.
- With γ(6) or γ(7) "below" trials, since all running trials have the value `inf`, the early trials are unconditionally chosen to be "below", no matter how good or bad their actual values are. Again, subsequent trials will be wrongly attracted toward the early trials.

In practice, this causes performance deterioration when the concurrency is large, such as 50. One benchmark example is given below, using `HPOBench`. (In the image below, "cl" stands for `constant_liar=True`.)

As can be seen from the image, `constant_liar=True` performs worse than `constant_liar=False` until the budget reaches around 150.

Description of the changes
We change the behavior to the following:
| `constant_liar=False` | `constant_liar=True` |
| --- | --- |
| B, A = split({0…2}, γ(3)) | B, A = split({0…6}, γ(3)) |
| B, A = split({0…3}, γ(4)) | B, A = split({0…7}, γ(4)) |
| B, A = split({0…4}, γ(5)) | B, A = split({0…8}, γ(5)) |
Note that `B, A = split({0…6}, γ(3))` is equivalent to first splitting all finished trials, i.e. `B, A = split({0…2}, γ(3))`, and then appending all running trials `{3…6}` to "above".
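The fixed rule above can be sketched as follows (toy code with hypothetical names; a sketch of the idea, not Optuna's actual implementation):

```python
def split(values, n_below):
    # Toy TPE-style split for minimization: the n_below lowest values
    # go to "below", the rest to "above".
    order = sorted(range(len(values)), key=lambda i: values[i])
    return order[:n_below], order[n_below:]

def split_constant_liar(finished_values, n_running, n_below):
    # New behavior: split only the finished trials, then append every
    # running trial (indexed after the finished ones) to "above".
    below, above = split(finished_values, n_below)
    running_ids = list(range(len(finished_values),
                             len(finished_values) + n_running))
    return below, above + running_ids

finished = [0.9, 0.2, 0.5]  # trials 0..2 (finished)
below, above = split_constant_liar(finished, n_running=4, n_below=1)
print(below)  # [1]: the best finished trial
print(above)  # [2, 0, 3, 4, 5, 6]: remaining finished + all running trials
```

Because running trials never enter "below", subsequent samples are no longer attracted toward unfinished trials with unknown values.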
This improves the performance of HPOBench:
One side effect of this change is that running trials will be ranked after pruned trials. (Note that currently running trials are ranked before pruned trials.)
Another side effect is that it will now be trivial to support `constant_liar` in multi-objective settings.