Cache the latest result of HSSP for speedup of MOTPE #5454

Closed
wants to merge 11 commits

Conversation

nabenabe0928
Collaborator

@nabenabe0928 nabenabe0928 commented May 24, 2024

Motivation

The speed bottleneck of MOTPE is the HSSP calculation performed at each iteration, so this PR introduces caching for the HSSP results.

This is especially important for n_objectives > 2.

Note that the split cache relates to the following PR:

It is algorithmically wrong to consider different splits for different dimensions.

Description of the changes

I added a cache of the latest HSSP result to study.system_attrs.
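As a rough illustration of the idea, here is a hedged sketch (not the PR's actual implementation; the key name, fingerprinting scheme, and function names are assumptions), with a plain dict standing in for `study.system_attrs`:

```python
import json

# Illustrative key name; the PR's actual attribute key may differ.
_CACHE_KEY = "motpe:hssp_cache"


def solve_hssp_with_cache(system_attrs, loss_vals, subset_size, solver):
    """Reuse the latest HSSP result when the inputs have not changed.

    `system_attrs` plays the role of study.system_attrs: a key-value store
    shared across trials (and processes, when backed by a shared storage).
    """
    # Fingerprint the inputs; a cache hit requires identical loss values.
    fingerprint = json.dumps([loss_vals, subset_size])
    cached = system_attrs.get(_CACHE_KEY)
    if cached is not None and cached["fingerprint"] == fingerprint:
        return cached["indices"]  # cache hit: skip the expensive HSSP call
    indices = solver(loss_vals, subset_size)  # expensive HSSP computation
    system_attrs[_CACHE_KEY] = {"fingerprint": fingerprint, "indices": indices}
    return indices


calls = []


def fake_hssp_solver(loss_vals, k):  # stand-in for the real greedy HSSP solver
    calls.append(1)
    return list(range(k))


attrs = {}  # stands in for study.system_attrs
losses = [[1.0, 2.0], [3.0, 0.5]]
print(solve_hssp_with_cache(attrs, losses, 1, fake_hssp_solver))  # [0]
print(solve_hssp_with_cache(attrs, losses, 1, fake_hssp_solver))  # [0] (cache hit)
print(len(calls))  # 1: the solver ran only once
```

Because the cache lives in the study's attributes rather than in process memory, any worker sharing the same storage can reuse it.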

Benchmarking Results

I benchmarked using a 3-objective function over a 2-dimensional search space with n_trials=1000.
This PR lets MOTPE finish the optimization roughly four times faster while producing identical optimization results.
| | Master | This PR |
| :-- | --: | --: |
| Runtime (s) | 409 | 116 |
Code
from __future__ import annotations

import time

import optuna


def objective3d(trial: optuna.Trial) -> tuple[float, float, float]:
    x = trial.suggest_float("x", -5, 5)
    y = trial.suggest_float("y", -5, 5)
    return x ** 2 + y ** 2, (x - 2) ** 2 + (y - 2) ** 2, (x + 2) ** 2 + (y + 2) ** 2


if __name__ == "__main__":
    start = time.time()
    n_obj = 3
    sampler = optuna.samplers.TPESampler(seed=42)
    study = optuna.create_study(sampler=sampler, directions=["minimize"] * n_obj)

    study.optimize(objective3d, n_trials=1000)

    print(time.time() - start)

@HideakiImamura
Member

It seems this PR includes not only the cache for HSSP results but also a cache for the TPE split. Could you separate the split-cache change into another PR? That would make it easier to discuss each item.

@eukaryo
Collaborator

eukaryo commented May 29, 2024

FYI: how about utilizing functools.lru_cache to add a caching mechanism to a function or a method?
(Note that there is a pitfall when caching a method with lru_cache. cf: https://www.youtube.com/watch?v=sVjtp6tGo0g , https://qiita.com/fumitoh/items/f36dc494eec35a13f513 )
This comment may also be related to PR #5464.
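The pitfall mentioned above can be shown in a few lines (a minimal sketch; the class and method names are made up): decorating an instance method with lru_cache puts `self` in the cache key, so the cache holds a strong reference to every instance it has seen and is shared across all instances of the class.

```python
import gc
import weakref
from functools import lru_cache


class Solver:  # hypothetical class, for illustration only
    @lru_cache(maxsize=None)  # the cache key includes `self`
    def square(self, n: int) -> int:
        return n * n


s = Solver()
ref = weakref.ref(s)
s.square(3)  # caches the result under the key (s, 3)

del s
gc.collect()
# The lru_cache still holds a strong reference to the instance,
# so it is never garbage-collected:
print(ref() is not None)  # True
```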

@nabenabe0928
Collaborator Author

@eukaryo thank you for the suggestion!
I didn't choose `lru_cache` because the caching needs to work across different trials: another process can reuse the cache via the study attrs.

@eukaryo
Collaborator

eukaryo commented May 29, 2024

@nabenabe0928 Thank you for your rapid reply!
My current understanding is that lru_cache works to some extent, but not perfectly, especially in a distributed environment, whereas study attrs always work well in distributed environments, which is why you chose them. Your choice seems reasonable; however, I still think lru_cache is another "Pareto solution" in the trade-off between code complexity and speedup benefit.

@nabenabe0928
Collaborator Author

@eukaryo I got your point and tried lru_cache, but lru_cache accepts only hashable arguments, and neither a list nor a NumPy array is hashable, so it didn't work.

@eukaryo
Collaborator

eukaryo commented May 29, 2024

I see, that's unfortunate. I really appreciate your effort!

Member

@HideakiImamura HideakiImamura left a comment


Could you add unit tests validating the cases where the cache hits and where it doesn't?

@nabenabe0928
Collaborator Author

> Could you add unit tests validating the cases where the cache hits and where it doesn't?

I added a unit test!

Member

@HideakiImamura HideakiImamura left a comment


Thank you for the update. I have a comment for the test. PTAL.

"storage",
(optuna.storages.RDBStorage("sqlite:///:memory:"), optuna.storages.InMemoryStorage()),
)
def test_solve_hssp_with_cache(storage: optuna.storages.BaseStorage) -> None:
Member


At first glance, this test involves too much preparation and logic. How about pre-determining the arguments of _solve_hssp_and_check_cache?

@nabenabe0928
Collaborator Author

We discussed internally and decided to close this PR.
However, this PR might be re-opened in the future.
