Reduce the time complexity of HSSP 2d from O(NK^2 log K) to O((N - K)K) #5346
Benchmarking using MO-TPE:

Benchmarking Code

from __future__ import annotations

import time

import optuna


def objective2d(trial: optuna.Trial) -> tuple[float, float]:
    x = trial.suggest_float("x", -5, 5)
    y = trial.suggest_float("y", -5, 5)
    return x ** 2 + y ** 2, (x - 2) ** 2 + (y - 2) ** 2


if __name__ == "__main__":
    start = time.time()
    # optuna.logging.set_verbosity(optuna.logging.CRITICAL)
    n_trials = 1000
    sampler = optuna.samplers.TPESampler(seed=42)
    study = optuna.create_study(sampler=sampler, directions=["minimize"]*2)
    study.optimize(objective2d, n_trials=n_trials)
    print(f"n_obj=2, {study.sampler}", time.time() - start)

The results are here:
Note that the unit of the table is seconds.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5346      +/-   ##
==========================================
+ Coverage   89.52%   89.66%   +0.13%
==========================================
  Files         194      210      +16
  Lines       12626    13309     +683
==========================================
+ Hits        11303    11933     +630
- Misses       1323     1376      +53
@gen740 @contramundum53 Could you review this PR?
@contramundum53 I unassigned you as discussed online for now!
This pull request has not seen any recent activity.
@eukaryo Could you review this PR?
I have interpreted HSSP as referring to the Hypervolume Subset Selection Problem.
Yes, HSSP is the Hypervolume Subset Selection Problem.
LGTM. I have reviewed the algorithm and your implementation. The logic is sound and the code is well-written.
While the inherent complexity of the algorithm naturally leads to challenging code, your implementation is clear and relatively easy to follow.
@not522 Could you review this PR?
Motivation
As the bottleneck of MO-TPE is HSSP, I reduced the time complexity of HSSP for 2D from $O(NK^2 \log K)$ to $O((N-K)K)$, where $N$ is the number of trials to be considered and $K$ is the subset size.
I further sped up the implementation by using numpy vectorization.
Description of the changes
Another breaking change is in the reproducibility when `study` has `inf`, `-inf`, and `nan`.
However, this is actually a bug in the original implementation because the original implementation gets `nan` for `inf` hypervolume contribution.
It happened because hypervolume contributions were incrementally calculated by doing `inf - inf = nan`.
Meanwhile, the new implementation correctly calculates `inf` hypervolume contributions.
You can check such behavior using:

In this example, the reference point is `[np.inf, np.inf]`, so the hypervolumes for `solutions[0]` and `solutions[3]` cannot be computed because they exhibit `nan` from `(np.inf - np.inf) * np.inf`.
On the other hand, the hypervolumes for `solutions[1]` and `solutions[2]` are `inf`.

Verification Code
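As a minimal sketch of the `inf - inf = nan` failure mode described above (this is not the PR's own verification code; the helper `hv2d` and the two points `p1` and `p2` are hypothetical and chosen only for illustration), the snippet below compares an incremental contribution, computed as a difference of two hypervolumes, with a direct rectangle-area computation when the reference point is `[np.inf, np.inf]`:

```python
import numpy as np


def hv2d(points: np.ndarray, ref: np.ndarray) -> float:
    """Hypervolume of a 2D non-dominated set (minimization). Hypothetical helper."""
    # Sort by the first objective; the second objective then descends.
    pts = points[np.argsort(points[:, 0])]
    prev_y = ref[1]
    total = 0.0
    for x, y in pts:
        # Each point adds the horizontal slab between its y and the previous y.
        total += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return total


ref = np.array([np.inf, np.inf])
p1 = np.array([1.0, 3.0])  # assumed already selected
p2 = np.array([2.0, 1.0])  # candidate whose contribution we want

# Incremental: HV({p1, p2}) - HV({p1}) = inf - inf, which is nan.
with np.errstate(invalid="ignore"):
    incremental = hv2d(np.array([p1, p2]), ref) - hv2d(np.array([p1]), ref)

# Direct: area of the rectangle between p2 and its diagonal point
# [ref_x, p1_y], i.e. (inf - 2) * (3 - 1), which is inf.
diag = np.array([ref[0], p1[1]])
direct = float(np.prod(diag - p2))

print(incremental, direct)
```

The direct form never subtracts two infinite hypervolumes, which is why it can still report an `inf` contribution in this case.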
The figure below shows the solutions ($v_i$, where $v_{1,1} \leq v_{2,1} \leq v_{3,1} \leq v_{4,1}$ and $v_{1,2} \geq v_{2,2} \geq v_{3,2} \geq v_{4,2}$; recall that $\{v_i\}$ is a non-dominated set) and their diagonal points ($d_i$).
Note that I defined $d_i$ (pink dots) so that the hypervolume contribution (the gray rectangle) of the $i$-th solution becomes the rectangle area $(d_{i,1} - v_{i,1})(d_{i,2} - v_{i,2})$ given a current state.
In the first figure, we have not picked any point.
In the second figure, we picked $v_2$ as a member of the subset.
Then the diagonal point for $v_1$ becomes $d_1 \leftarrow [\min(v_{2,1}, d_{1,1}), d_{1,2}]$ and the diagonal points for $v_3, v_4$ become $d_3 \leftarrow [d_{3,1}, \min(v_{2,2}, d_{3,2})]$ and $d_4 \leftarrow [d_{4,1}, \min(v_{2,2}, d_{4,2})]$.
In the third figure, we picked $v_3$ as another member of the subset.
Then the diagonal point for $v_1$ becomes $d_1 \leftarrow [\min(v_{3,1}, d_{1,1}), d_{1,2}]$ and the diagonal point for $v_4$ becomes $d_4 \leftarrow [d_{4,1}, \min(v_{3,2}, d_{4,2})]$.
In principle, when $v_i$ is picked, $d_j$ will be updated as $d_j \leftarrow [\min(v_{i,1}, d_{j,1}), d_{j,2}]$ for $j < i$ and $d_j \leftarrow [d_{j,1}, \min(v_{i,2}, d_{j,2})]$ for $j > i$.
These updates take $O(N)$ time and are repeated $K$ times, so the time complexity is $O(NK)$.
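The update rule above can be sketched in a few lines of numpy. This is an illustrative sketch under stated assumptions, not the code merged in this PR: the function name `greedy_hssp_2d` and its arguments are hypothetical, the input is assumed to be a non-dominated set already sorted by the first objective, and both objectives are minimized.

```python
import numpy as np


def greedy_hssp_2d(sorted_front: np.ndarray, ref: np.ndarray, k: int) -> list[int]:
    """Greedily pick up to k indices from a sorted 2D non-dominated set.

    Hypothetical helper: sorted_front[:, 0] must ascend, so sorted_front[:, 1]
    descends. Returns the indices in the order they were picked.
    """
    n = len(sorted_front)
    # Before anything is picked, every diagonal point d_i is the reference point.
    diag = np.tile(ref.astype(float), (n, 1))
    picked = np.zeros(n, dtype=bool)
    order: list[int] = []
    for _ in range(min(k, n)):
        # Contribution of v_i is the area of the rectangle spanned by v_i and d_i.
        contribs = np.prod(diag - sorted_front, axis=1)
        contribs[picked] = -np.inf  # never pick the same solution twice
        i = int(np.argmax(contribs))
        picked[i] = True
        order.append(i)
        # j < i: clip the first coordinate of d_j by v_{i,1}.
        diag[:i, 0] = np.minimum(diag[:i, 0], sorted_front[i, 0])
        # j > i: clip the second coordinate of d_j by v_{i,2}.
        diag[i + 1:, 1] = np.minimum(diag[i + 1:, 1], sorted_front[i, 1])
    return order


front = np.array([[0.0, 3.0], [1.0, 1.0], [2.5, 0.0]])
selected = greedy_hssp_2d(front, np.array([4.0, 4.0]), k=2)
print(selected)
```

Each round does a constant number of length-$N$ vectorized operations, so $K$ rounds cost $O(NK)$ in this sketch; it does not skip already-picked solutions, which is one place a tighter bound could come from.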