Cache split in TPE for high-dimensional optimization #5464
Conversation
@not522 commented on optuna/_hypervolume/utils.py (outdated):

@@ -1,3 +1,5 @@
from __future__ import annotations
I believe this change is not relevant to this PR. This change would be good to include when features from __future__ are actually used.
Done!
Sorry, I am temporarily busy because my primary computer is not working, and I suppose @HideakiImamura-san is the appropriate reviewer. @HideakiImamura, could you review this PR?
Codecov Report: all modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           master    #5464      +/-   ##
==========================================
+ Coverage   89.52%   89.74%   +0.22%
==========================================
  Files         194      195       +1
  Lines       12626    12592      -34
==========================================
- Hits        11303    11301       -2
+ Misses       1323     1291      -32
Holding the cache on the sampler instance causes data inconsistency across different processes. Could you give me your opinion on that?
We discussed internally and decided to close this PR.
Just as a reminder for the future, I will leave some comments.

This PR does not cause any issues between processes: since each trial is sampled in a single thread, and the sampling of a specific trial is never scattered across multiple processes or threads, we do not have to be concerned about cached data being missing in another process. However, when using multiple threads, we need to take care that another thread may overwrite the cache before one trial completes its sampling. In any case, a missing cache entry does not cause correctness issues; we simply recompute the split, which incurs some extra computational cost.
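The observation above, that a missing or overwritten cache entry only costs a recomputation, can be sketched with a minimal per-sampler cache. All names here (SplitCache, get_split) are hypothetical illustrations, not Optuna's actual implementation:

```python
import threading


class SplitCache:
    """Sketch of a per-sampler cache whose misses are harmless.

    If another thread or process never sees a cached entry, the split is
    simply recomputed: the result is deterministic for a given key, so a
    miss adds computational cost but never produces incorrect results.
    """

    def __init__(self):
        self._cache = {}  # key: tuple of trial numbers -> split result
        self._lock = threading.Lock()

    def get_split(self, trial_numbers, compute_split):
        key = tuple(trial_numbers)
        with self._lock:
            cached = self._cache.get(key)
        if cached is not None:
            return cached
        # Cache miss: recompute outside the lock to avoid blocking others.
        result = compute_split(trial_numbers)
        with self._lock:
            # Another thread may have written this key in the meantime;
            # overwriting is safe because the result is deterministic.
            self._cache[key] = result
        return result
```

Because lookup and store are separate critical sections, two threads racing on the same key may both compute the split once, which is exactly the benign duplicated work described above.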
Motivation
As TPE slows down significantly for high-dimensional optimization, this PR introduces a caching mechanism for the TPE split.
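For context, the kind of split being cached can be sketched as follows. This is a generic illustration of a TPE-style split of trials into "good" and "bad" groups, not Optuna's exact code; the function name, the gamma parameter, and the ceil rule are assumptions for illustration:

```python
import math


def split_trials(values, gamma=0.25):
    """Split trial indices into a 'below' (good) and 'above' (bad) group.

    Trials are ranked by objective value (lower is better), and the best
    ceil(gamma * n) trials form the 'below' group. In high-dimensional
    optimization this split would otherwise be recomputed for every
    parameter dimension, which is why caching it pays off.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_below = math.ceil(gamma * len(values))
    return order[:n_below], order[n_below:]
```

Since the split depends only on the observed trials and not on which parameter is being sampled, computing it once per trial and reusing it across all dimensions avoids redundant work.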
Description of the changes
For n_trials=1000 and dim=10, the runtimes are the following: