Cache split in TPE for high-dimensional optimization #5464
Conversation
@not522 commented on optuna/_hypervolume/utils.py (outdated):

@@ -1,3 +1,5 @@
from __future__ import annotations
I believe this change is not relevant to this PR. This change would be good to include when features from __future__ are actually used.
Done!
Sorry, I am temporarily busy because my primary computer is not working, and I suppose @HideakiImamura-san is the appropriate reviewer. @HideakiImamura, could you review this PR?
Codecov Report: all modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           master    #5464      +/-   ##
==========================================
+ Coverage   89.52%   89.74%   +0.22%
==========================================
  Files         194      195       +1
  Lines       12626    12592      -34
==========================================
- Hits        11303    11301       -2
+ Misses       1323     1291      -32
Holding the cache on the sampler instance causes data inconsistency across different processes. Could you give me your opinion on that?
We discussed internally and decided to close this PR.
Just as a reminder for the future, I will leave some comments.

This PR does not cause any issues between processes: since each trial is sampled in a single thread, and the sampling of a specific trial is never scattered across multiple processes or threads, we do not have to be concerned about cached data being missing in another process. However, when using multiple threads, we need to take care that another thread may overwrite the cache before one trial completes its sampling. In any case, a missing cache entry does not cause correctness issues; we simply recompute the split, which incurs some extra computational cost.
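The observation above, that a missing or overwritten cache entry only costs a recomputation, can be sketched with a minimal per-sampler cache. All names here (SplitCache, get_split) are hypothetical illustrations, not Optuna's actual implementation:

```python
import threading


class SplitCache:
    """Sketch of a per-sampler cache whose misses are harmless.

    If another thread or process never sees a cached entry, the split is
    simply recomputed: the result is deterministic for a given key, so a
    miss adds computational cost but never produces incorrect results.
    """

    def __init__(self):
        self._cache = {}  # key: tuple of trial numbers -> split result
        self._lock = threading.Lock()

    def get_split(self, trial_numbers, compute_split):
        key = tuple(trial_numbers)
        with self._lock:
            cached = self._cache.get(key)
        if cached is not None:
            return cached
        # Cache miss: recompute outside the lock to avoid blocking others.
        result = compute_split(trial_numbers)
        with self._lock:
            # Another thread may have written this key in the meantime;
            # overwriting is safe because the result is deterministic.
            self._cache[key] = result
        return result
```

Because lookup and store are separate critical sections, two threads racing on the same key may both compute the split once, which is exactly the benign duplicated work described above.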
Motivation
As TPE slows down significantly for high-dimensional optimization, this PR introduces a caching mechanism for the TPE split.
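For context, the kind of split being cached can be sketched as follows. This is a generic illustration of a TPE-style split of trials into "good" and "bad" groups, not Optuna's exact code; the function name, the gamma parameter, and the ceil rule are assumptions for illustration:

```python
import math


def split_trials(values, gamma=0.25):
    """Split trial indices into a 'below' (good) and 'above' (bad) group.

    Trials are ranked by objective value (lower is better), and the best
    ceil(gamma * n) trials form the 'below' group. In high-dimensional
    optimization this split would otherwise be recomputed for every
    parameter dimension, which is why caching it pays off.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_below = math.ceil(gamma * len(values))
    return order[:n_below], order[n_below:]
```

Since the split depends only on the observed trials and not on which parameter is being sampled, computing it once per trial and reusing it across all dimensions avoids redundant work.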
Description of the changes
For n_trials=1000 and dim=10, the runtimes are the following: