Add constraints option to TPESampler #3506
Conversation
Could you please be a reviewer @HideakiImamura @toshihikoyanase?
I checked that it works on a toy example, modified slightly from the Binh and Korn function with constraints.

```python
from typing import Tuple

import optuna


def objective(trial: optuna.Trial) -> Tuple[float, float]:
    # Binh and Korn function with constraints.
    x = trial.suggest_float("x", -15, 30)
    y = trial.suggest_float("y", -15, 30)
    v0 = 4 * x**2 + 4 * y**2
    v1 = (x - 5) ** 2 + (y - 5) ** 2
    # Modified constraints function.
    trial.set_user_attr("constraint", (1000 - v0,))
    return v0, v1


def constraints(trial: optuna.Trial) -> Tuple[float]:
    return trial.user_attrs["constraint"]


if __name__ == "__main__":
    samplers = {
        "nsgaii": optuna.samplers.NSGAIISampler,
        "motpe": optuna.samplers.TPESampler,
    }
    for use_const in [False, True]:
        for sampler_name in ["nsgaii", "motpe"]:
            sampler = samplers[sampler_name](
                seed=0, constraints_func=constraints if use_const else None
            )
            study = optuna.create_study(directions=["minimize", "minimize"], sampler=sampler)
            study.optimize(objective, n_trials=500)
            optuna.visualization.plot_pareto_front(
                study, target_names=["v0", "v1"], constraints_func=constraints
            ).write_image(f"result/{sampler_name}_{'constraints' if use_const else ''}_pareto.jpg")
```
MOTPE without constraints seemed to exploit too much in early trials and did not work as well as NSGA-II without constraints. Considering constraints therefore improved MOTPE a lot, and it worked comparably to NSGA-II with constraints.
As discussed privately with @HideakiImamura and @toshihikoyanase, I'll consider some other strategies for how to calculate the weights.
I tried another toy example that has a smaller feasible region, as suggested by @HideakiImamura.

```python
from typing import Tuple

import numpy as np

import optuna


def objective(trial: optuna.Trial) -> Tuple[float, float]:
    x = trial.suggest_float("x", 0, 6)
    y = trial.suggest_float("y", 0, 6)
    c = np.sin(x) * np.sin(y) + 0.95
    trial.set_user_attr("constraint", (c,))
    f1 = np.sin(x) - y
    f2 = np.cos(x) + y**2
    return f1, f2


def constraints(trial: optuna.Trial) -> Tuple[float]:
    return trial.user_attrs["constraint"]


if __name__ == "__main__":
    samplers = {
        "nsgaii": optuna.samplers.NSGAIISampler,
        "motpe": optuna.samplers.TPESampler,
    }
    for use_const in [False, True]:
        for sampler_name in ["nsgaii", "motpe"]:
            sampler = samplers[sampler_name](
                seed=0, constraints_func=constraints if use_const else None
            )
            study = optuna.create_study(sampler=sampler, directions=["minimize", "minimize"])
            study.optimize(objective, n_trials=400)
            fig = optuna.visualization.plot_pareto_front(study, constraints_func=constraints)
            fig.write_image(
                f"result/small_feasible/{sampler_name}_{'constraints' if use_const else ''}_pareto.jpg"
            )
```

The current implementation already works fine (it seems slightly better than NSGA-II with constraints), but I will try another strategy for the weight calculation anyway.
In the current implementation, I set … Another option we can consider is to use the same weight-calculation strategy as the single-objective TPE: put more weight on newer trials. With this (knshnb#1), it seems to exploit in the earlier phase and does not explore well.
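As a loose illustration of the "put more weight on newer trials" idea (this is not Optuna's actual weighting code; the ramp shape and endpoints are made up for this sketch):

```python
import numpy as np

# Hypothetical weighting sketch: give each past trial a weight that grows
# linearly with recency, then normalize so the weights can serve as mixture
# coefficients in a TPE-style kernel density estimate.
n_trials = 8
weights = np.linspace(0.5, 1.0, n_trials)  # index 0 = oldest trial
weights /= weights.sum()
```

With such a ramp, the newest trial always receives the largest weight, which is what makes the sampler exploit recent regions more aggressively.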
I tried another strategy (knshnb#2) for the weight calculation: just ignore all the infeasible trials in …
As a side note, there is some room for improvement, such as …
However, I think the current implementation is enough as a straightforward extension of the method used in …
Thanks for the great improvement of TPESampler! I have several comments. PTAL.
optuna/samplers/_tpe/sampler.py (outdated):

```python
) -> Tuple[
    Dict[str, List[Optional[float]]],
    List[Tuple[float, List[float]]],
    Optional[List[Sequence[float]]],
```
Why not force it to be a tuple, for example, instead of an arbitrary `Sequence`? Ambiguity would be reduced and bugs would be less likely to creep in later.
The return type of `constraints_func` is `Sequence[float]`, following `NSGAIISampler`. Do you suggest casting from `Sequence[float]` to `Tuple[float]` somewhere?

optuna/samplers/nsgaii/_sampler.py, line 121 in 0964da2:

```python
constraints_func: Optional[Callable[[FrozenTrial], Sequence[float]]] = None,
```
Yes, I mean we can cast it.
Now we do not need to cast, as of 2f1ff5a.
optuna/samplers/_tpe/sampler.py (outdated):

```python
return values, scores
if get_constraints:
    assert constraints is not None
    constraints.append(cast(Sequence[float], trial.system_attrs.get(_CONSTRAINTS_KEY)))
```
If `trial.system_attrs` is missing the `_CONSTRAINTS_KEY` key, the result of `cast(Sequence[float], trial.system_attrs.get(_CONSTRAINTS_KEY))` will be `None`, so `constraints` can include `None`.

(1) The type of `constraints` should be `Optional[List[Optional[Sequence[float]]]]`.
(2) The handling of constraints after this function returns should be implemented with the consideration that there may be `None` inside. (I have not checked the handling after this function yet.)
Resolved by 2f1ff5a.
optuna/samplers/_tpe/sampler.py (outdated):

```python
# 3. Feasible trials are sorted by loss_vals.
if constraints is not None:
    # 1-dimensional violation value (violation_1d == 0 <=> feasible, > 0 <=> infeasible).
    violation_1d = np.maximum(np.array(constraints), 0).sum(1)
```
If `constraints` includes `None`, the type of each item in `np.array(constraints)` includes `None`. How about specifying the dtype, like `np.array(constraints, dtype=float)`? It converts `None` to `np.nan`, so we need to filter out `np.nan` before taking `np.maximum` to compare `violation_1d[idx[n_below]] > 0`.
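For the case where each trial has a single scalar constraint value, the `dtype=float` suggestion can be sketched like this (hypothetical data, not Optuna internals):

```python
import numpy as np

# Hypothetical per-trial constraint values; the middle trial is missing its
# value (None), e.g. because the key was absent in system_attrs.
constraints = [-2.0, None, 3.0]
arr = np.array(constraints, dtype=float)  # None is converted to np.nan
mask = ~np.isnan(arr)                     # filter missing values first
violation_1d = np.maximum(arr[mask], 0)   # feasible -> 0.0, infeasible -> positive
```

As the follow-up reply notes, this only works for scalar or uniformly shaped constraint values; tuple-valued constraints mixed with `None` cannot be converted this way.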
I did not consider the case where `constraints` includes `None`. When `constraints` includes `None`, it cannot simply be converted using `np.array`, because `constraints` might be something like `[(1.0, 1.0), None, (2.0, 2.0)]` and the dimensions do not match. I need to consider how to handle this.
I decided to handle `None` inside `_get_observation_pairs`. This makes `_get_observation_pairs` a bit more complex, but the overall logic and the return type are simpler than if `None` had to be considered afterwards.

2f1ff5a#diff-2cc3a4088f95a10e806b68655d375f6f7f5fe2de161a0182c16168e378be8065R713-R715
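For reference, one simple way to handle the tuple-valued case outside `np.array` is to drop the `None` entries in plain Python first (a sketch only; the PR handles `None` inside `_get_observation_pairs` instead):

```python
import numpy as np

# Hypothetical constraint tuples; the second trial has no recorded value.
constraints = [(1.0, 1.0), None, (2.0, 2.0)]
valid = [c for c in constraints if c is not None]  # drop missing entries
# Same 1-d violation as in the quoted diff: clip negatives to 0, sum per trial.
violation_1d = np.maximum(np.array(valid), 0).sum(axis=1)
```

Note that dropping entries loses the alignment between array indices and trials, which is one reason handling `None` at the point where the pairs are collected is cleaner.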
By the way, I think it is a good idea to add some unit tests to validate the behavior.
@toshihikoyanase Thanks for pointing out the case that causes an error! It happens when … I added a test for that case (2290708) and fixed the code (7a6ca0b). @HideakiImamura Thanks for the comments! Please take a look again 🙏
@knshnb Thank you for your investigation! I confirmed that the current …
Thanks for the update. I have several comments. PTAL.
optuna/samplers/_tpe/sampler.py (outdated):

```python
if n_below >= len(idx) or violation_1d[idx[n_below]] > 0:
    # if violation_1d[idx[n_below]] > 0:
    # Below is filled by all feasible trials and trials with smaller violation values.
    indices_below = idx[:n_below]
    indices_above = idx[n_below:]
```
Let me clarify this part for my own understanding (I assume that we remove taking the max in line 713).

`violation_1d[idx]` is the sequence of violation values in ascending order, for example `[-10.0, -1.0, 0.0, 1.0, 2.0, 5.0, 10.0]`. The first 3 values are feasible. Let `n_below = 5`. Then `violation_1d[idx[n_below]] = 5.0 > 0`, so all feasible trials `[-10.0, -1.0, 0.0]` are included in the below. In addition, we should include `1.0` and `2.0` in the below: we select the infeasible trials with smaller violations.
```diff
-def test_multi_objective_get_observation_pairs() -> None:
+@pytest.mark.parametrize("constraints_enabled", [False, True])
+@pytest.mark.parametrize("constraint_value", [-2, 2])
+def test_multi_objective_get_observation_pairs(
```
Agree.
```python
)
assert list(indices_below) == [0, 3]
assert list(indices_above) == [1, 2]


def test_split_observation_pairs_with_constraints() -> None:
```
Why not test both cases then, i.e., both when `n_below >= len(idx) or violation_1d[idx[n_below]] > 0` holds and when it does not?
The code basically looks good to me. Let me add some cosmetic comments.
optuna/samplers/_tpe/sampler.py (outdated):

```python
warnings.warn(
    "The constraints_func option is an experimental feature."
    " The interface can change in the future.",
    ExperimentalWarning,
)
```
Do we add a test for this `ExperimentalWarning`? The `multivariate` argument has such a test case.

optuna/tests/samplers_tests/tpe_tests/test_sampler.py, lines 35-37 in 83c8022:

```python
def test_multivariate_experimental_warning() -> None:
    with pytest.warns(optuna.exceptions.ExperimentalWarning):
        optuna.samplers.TPESampler(multivariate=True)
```
```diff
-def test_multi_objective_get_observation_pairs() -> None:
+@pytest.mark.parametrize("constraints_enabled", [False, True])
+@pytest.mark.parametrize("constraint_value", [-2, 2])
+def test_multi_objective_get_observation_pairs(
```
I agree with @HideakiImamura. This test changes the expected values depending on the parametrized value (i.e., `constraints_enabled`), and I think it is a kind of conditional test logic. As you mentioned, we can work on it in a follow-up PR.
```python
assert _tpe.sampler._get_observation_pairs(
    study, ["y"], True, constraints_enabled=constraints_enabled
) == ({"y": []}, [], [] if constraints_enabled else None)
```
Ditto. This test function also changes the expected values depending on the parametrized value.
Co-authored-by: Toshihiko Yanase <toshihiko.yanase@gmail.com>
@HideakiImamura @toshihikoyanase I updated the code! PTAL.
Thanks for the update. Almost LGTM. I have a nit comment.
By the way, since we have slightly changed the implementation, could you run the benchmark again to confirm that there is no performance degradation?
```python
else:
    # All trials in below are feasible.
    # Feasible trials with smaller loss_vals are selected.
    (feasible_idx,) = (violation_1d == 0).nonzero()
```
Suggested change:

```diff
-(feasible_idx,) = (violation_1d == 0).nonzero()
+(feasible_idx,) = (violation_1d <= 0).nonzero()
```
As we discussed privately, I keep this line because `violation_1d` is guaranteed to be greater than or equal to 0.

https://github.com/optuna/optuna/pull/3506/files#diff-2cc3a4088f95a10e806b68655d375f6f7f5fe2de161a0182c16168e378be8065R712-R713
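A quick check of that guarantee (illustrative values only): clipping each constraint at 0 before summing makes every entry of `violation_1d` nonnegative, so `== 0` and `<= 0` select the same trials.

```python
import numpy as np

constraints = np.array([[-3.0, 1.0], [-1.0, -2.0]])   # hypothetical values
violation_1d = np.maximum(constraints, 0).sum(axis=1)  # clip, then sum per trial

# Every entry is >= 0, so the feasible mask is identical either way.
assert (violation_1d >= 0).all()
assert ((violation_1d == 0) == (violation_1d <= 0)).all()
```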
Since there has been a large amount of code modification, I confirmed that the current code works fine on the toy problem in #3506 (comment).
Thanks for the long running work! LGTM!
LGTM. Thank you for accomplishing the long running PR!
[Notes] I checked the behavior of constrained MOTPE with the following simple code.

```python
import optuna


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    y = trial.suggest_float("y", -10, 10)
    trial.set_user_attr("constraint", [y - 5])
    return x, y


def constraints_func(trial):
    return trial.user_attrs["constraint"]


study = optuna.create_study(
    directions=["maximize", "maximize"],
    sampler=optuna.samplers.TPESampler(
        multivariate=True,
        constraints_func=constraints_func,
    ),
)
study.optimize(objective, n_trials=100)

optuna.visualization.plot_pareto_front(study, constraints_func=constraints_func).show()
optuna.visualization.plot_optimization_history(study, target=lambda t: t.values[0]).show()
optuna.visualization.plot_optimization_history(study, target=lambda t: t.values[1]).show()
```
The constrained MOTPE intensively searches around `x=10, y=5` instead of `x=10, y=10`, since `y > 5` is infeasible. By comparing the optimization histories of `x` and `y`, we can see that the sampler gradually focused on `y=5`, and I guess it successfully chooses the search points considering the constraint.
I found a minor bug in …
Is it possible to implement constraints for single-objective TPE? If so, are you planning to add this feature?
Actually, constrained optimization for single-objective TPE itself is supported by this PR as well. However, we haven't developed the related features (such as logging, visualization, etc.) or dogfooded it much yet. Feel free to file a request if you encounter any problems!

```python
import optuna


def objective(trial):
    x = trial.suggest_float("x", -1, 1)
    trial.set_user_attr("constraint", (-x,))
    return x


def constraints_func(trial):
    return trial.user_attrs["constraint"]


sampler = optuna.samplers.TPESampler(constraints_func=constraints_func)
study = optuna.create_study(sampler=sampler)
study.optimize(objective, n_trials=30)
```

Sorry, the log of the best value does not take the constraint into account, but the optimization itself works.
The reason I mentioned this is that I get the following warning on all trials when running single-objective TPE with constraints:
Could you share the reproducible code?
This happens when `constant_liar=True`.
Is it possible to provide a parameter to the constraints function? Example: `constraints_func(trial=trial, metrics=metrics)`
Thanks for the comment! Could you create a new issue with a more detailed explanation of how you want to change the constraints during the study?
Motivation

I implemented a constrained optimization option in the multi-objective `TPESampler` with a similar strategy to NSGA-II (#2175). The interfaces are the same as `NSGAIISampler`'s.

Description of the changes

- `_get_observation_pairs` also returns the constraints of trials.

I'm considering whether I should enable constrained optimization in `TPESampler`, which can be introduced naturally when implementing constrained optimization in `MOTPESampler`.