Add constraints option to TPESampler #3506

Merged
merged 30 commits on Jun 14, 2022

Conversation

@knshnb (Member) commented Apr 24, 2022

Motivation

I implemented a constrained optimization option in the multi-objective TPESampler with a strategy similar to NSGA-II's (#2175).
The interface is the same as NSGAIISampler's.

Description of the changes

  • Add a constraints_func argument
  • Make _get_observation_pairs also return the constraint values of trials
  • When dividing trials into below and above in TPE, consider constraint values before HSSP

I'm considering whether I should also enable constrained optimization in the single-objective TPESampler, since it can be introduced naturally when implementing constrained optimization in MOTPESampler.
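The below/above split described in the bullets above can be sketched roughly as follows. This is a minimal standalone illustration, not the actual Optuna internals; `split_with_constraints` and its signature are hypothetical, and the real implementation uses HSSP on the feasible trials rather than a plain loss sort.

```python
import numpy as np


def split_with_constraints(loss_vals, violations, n_below):
    """Hypothetical sketch: order trials so that feasible trials
    (violation == 0) come first, sorted by loss, followed by infeasible
    trials sorted by total violation. The first n_below indices form
    "below"; the rest form "above"."""
    loss_vals = np.asarray(loss_vals, dtype=float)
    violations = np.asarray(violations, dtype=float)
    # lexsort sorts by the last key first: violation is primary, loss breaks ties.
    order = np.lexsort((loss_vals, violations))
    return order[:n_below], order[n_below:]


# Trial 1 is infeasible (violation 2.0), so it is pushed to "above"
# even though it has the smallest loss.
below, above = split_with_constraints(
    loss_vals=[0.3, 0.1, 0.2, 0.4],
    violations=[0.0, 2.0, 0.0, 0.0],
    n_below=2,
)
```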

@github-actions github-actions bot added the optuna.samplers Related to the `optuna.samplers` submodule. This is automatically labeled by github-actions. label Apr 24, 2022
@toshihikoyanase toshihikoyanase added the feature Change that does not break compatibility, but affects the public interfaces. label Apr 25, 2022
@himkt (Member) commented Apr 25, 2022

Could you please review this, @HideakiImamura @toshihikoyanase?

@knshnb knshnb changed the title [WIP] Add constraints option to MOTPESampler [WIP] Add constraints option to TPESampler Apr 25, 2022
@knshnb (Member, Author) commented Apr 25, 2022

I checked that it works on a toy example, modified slightly from the Binh and Korn function with constraints.

from typing import Tuple

import optuna


def objective(trial: optuna.Trial) -> Tuple[float, float]:
    # Binh and Korn function with constraints.
    x = trial.suggest_float("x", -15, 30)
    y = trial.suggest_float("y", -15, 30)

    v0 = 4 * x**2 + 4 * y**2
    v1 = (x - 5) ** 2 + (y - 5) ** 2

    # Modified constraints function.
    trial.set_user_attr("constraint", (1000 - v0,))

    return v0, v1


def constraints(trial: optuna.Trial) -> Tuple[float]:
    return trial.user_attrs["constraint"]


if __name__ == "__main__":
    samplers = {
        "nsgaii": optuna.samplers.NSGAIISampler,
        "motpe": optuna.samplers.TPESampler,
    }
    for use_const in [False, True]:
        for sampler_name in ["nsgaii", "motpe"]:
            sampler = samplers[sampler_name](
                seed=0, constraints_func=constraints if use_const else None
            )
            study = optuna.create_study(directions=["minimize", "minimize"], sampler=sampler)
            study.optimize(objective, n_trials=500)
            optuna.visualization.plot_pareto_front(
                study, target_names=["v0", "v1"], constraints_func=constraints
            ).write_image(f"result/{sampler_name}_{'constraints' if use_const else ''}_pareto.jpg")

@knshnb (Member, Author) commented Apr 25, 2022

  • MOTPE without constraints
    image

  • MOTPE with constraints (this PR)
    image

  • NSGA2 without constraints
    image

  • NSGA2 with constraints
    image

MOTPE without constraints seemed to exploit too much in early trials and did not work well compared to NSGA2 without constraints. Therefore, MOTPE improved a lot by considering constraints and worked comparably to NSGA2 with constraints.

@knshnb (Member, Author) commented Apr 25, 2022

As discussed privately with @HideakiImamura and @toshihikoyanase, I'll consider some other strategies for calculating the _ParzenEstimator weights by testing on a task with a smaller feasible region.

@knshnb (Member, Author) commented May 6, 2022

I tried another toy example that has a smaller feasible region, as suggested by @HideakiImamura.

from typing import Tuple

import numpy as np

import optuna


def objective(trial: optuna.Trial) -> Tuple[float, float]:
    x = trial.suggest_float("x", 0, 6)
    y = trial.suggest_float("y", 0, 6)

    c = np.sin(x) * np.sin(y) + 0.95
    trial.set_user_attr("constraint", (c,))

    f1 = np.sin(x) - y
    f2 = np.cos(x) + y**2

    return f1, f2


def constraints(trial: optuna.Trial) -> Tuple[float]:
    return trial.user_attrs["constraint"]


if __name__ == "__main__":
    samplers = {
        "nsgaii": optuna.samplers.NSGAIISampler,
        "motpe": optuna.samplers.TPESampler,
    }
    for use_const in [False, True]:
        for sampler_name in ["nsgaii", "motpe"]:
            sampler = samplers[sampler_name](
                seed=0, constraints_func=constraints if use_const else None
            )
            study = optuna.create_study(sampler=sampler, directions=["minimize", "minimize"])
            study.optimize(objective, n_trials=400)

            fig = optuna.visualization.plot_pareto_front(study, constraints_func=constraints)
            fig.write_image(
                f"result/small_feasible/{sampler_name}_{'constraints' if use_const else ''}_pareto.jpg"
            )
  • MOTPE without constraints
    image

  • MOTPE with constraints (this PR)
    image

  • NSGA2 without constraints
    image

  • NSGA2 with constraints
    image

The current implementation already works fine (it seems slightly better than NSGA2 with constraints), but I will try another strategy for the weight calculation anyway.

@knshnb (Member, Author) commented May 6, 2022

In the current implementation, I set the _ParzenEstimator weights of all infeasible trials to EPS.

Another option we can consider is to use the same weight-calculation strategy as the single-objective TPE: put more weight on newer trials. With this (knshnb#1), it seems to exploit in an earlier phase and does not explore well.
image
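For concreteness, the two weighting strategies being compared can be sketched like this. This is a rough numpy illustration; the EPS value, the function names, and the linear ramp are assumptions for the sketch, not the exact Optuna implementation.

```python
import numpy as np

EPS = 1e-12  # assumed placeholder; the actual constant in Optuna may differ


def weights_eps_for_infeasible(hv_contribs, feasible):
    # Strategy in this PR (sketch): hypervolume-based weights for feasible
    # trials, a negligible EPS weight for infeasible ones.
    w = np.where(feasible, hv_contribs, EPS)
    return w / w.sum()


def weights_recency(n_trials):
    # Alternative strategy (knshnb#1, sketch): single-objective-TPE-style
    # linear ramp that puts more weight on newer trials, ignoring feasibility.
    w = np.linspace(1.0 / n_trials, 1.0, n_trials)
    return w / w.sum()


hv = np.array([3.0, 1.0, 2.0, 4.0])
feasible = np.array([True, False, True, True])
w1 = weights_eps_for_infeasible(hv, feasible)  # trial 1 gets ~zero weight
w2 = weights_recency(4)                        # newest trial gets the most weight
```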

@knshnb (Member, Author) commented May 6, 2022

I tried another weight-calculation strategy (knshnb#2): simply ignore all infeasible trials in below (similar to this PR, but different in that infeasible trials are also ignored in the hypervolume calculation).
I think this is natural, and it seems to work fine.
image

@knshnb (Member, Author) commented May 12, 2022

I implemented another strategy: ignore all infeasible trials in the hypervolume calculation for _ParzenEstimator's weights, and set infeasible trials' weights to EPS.

The experimental results are fine overall, although they depend on the random seed.
Seed 0:
image
Seed 1:
image
Seed 2:
image
Seed 3:
image
Seed 4:
image

@knshnb (Member, Author) commented May 12, 2022

As a side note, there is some room for improvement, such as:

  • the EPS weight for infeasible trials might be too small
    • it depends on the scale of the hypervolume, so we might need to consider normalization in the output space
  • we set the same weight for all infeasible trials in below, regardless of their constraint values

However, I think the current implementation is enough as a straightforward extension of the method used in NSGAIISampler.

@knshnb (Member, Author) commented May 12, 2022

I added support for sample_independent and tested with multivariate=True as well.
The results seem to be less dependent on random seeds when multivariate=True.

Seed 0:
image
Seed 1:
image
Seed 2:
image
Seed 3:
image
Seed 4:
image

@HideakiImamura (Member) left a review comment

Thanks for the great improvement of TPESampler! I have several comments. PTAL.

) -> Tuple[
    Dict[str, List[Optional[float]]],
    List[Tuple[float, List[float]]],
    Optional[List[Sequence[float]]],
@HideakiImamura (Member):

Why not force it to be a tuple instead of an arbitrary Sequence? Ambiguity would be reduced and bugs would be less likely to be introduced in the future.

@knshnb (Member, Author) replied May 18, 2022:

The return type of constraints_func is Sequence[float], following NSGAIISampler. Do you suggest casting from Sequence[float] to Tuple[float] somewhere?

constraints_func: Optional[Callable[[FrozenTrial], Sequence[float]]] = None,

@HideakiImamura (Member):

Yes, I mean we can cast it.

@knshnb (Member, Author):

Now we do not need to cast, as of 2f1ff5a.

return values, scores
if get_constraints:
    assert constraints is not None
    constraints.append(cast(Sequence[float], trial.system_attrs.get(_CONSTRAINTS_KEY)))
@HideakiImamura (Member):

If trial.system_attrs is missing the key _CONSTRAINTS_KEY, the result of cast(Sequence[float], trial.system_attrs.get(_CONSTRAINTS_KEY)) will be None. So constraints can include None.

(1) The type of constraints should be Optional[List[Optional[Sequence[float]]]].
(2) The handling of constraints after this function returns should be implemented with the consideration that there may be None inside. (I have not checked the handling after this function yet.)

@knshnb (Member, Author):

Resolved by 2f1ff5a.

# 3. Feasible trials are sorted by loss_vals.
if constraints is not None:
    # 1-dimensional violation value (violation_1d == 0 <=> feasible, > 0 <=> infeasible).
    violation_1d = np.maximum(np.array(constraints), 0).sum(1)
@HideakiImamura (Member):

If constraints includes None, the items of np.array(constraints) can include None. How about specifying the dtype, like np.array(constraints, dtype=float)? It converts None to np.nan, so we need to filter out np.nan before taking np.maximum to compare violation_1d[idx[n_below]] > 0.

@knshnb (Member, Author):

I did not consider the case where constraints includes None. When constraints includes None, it cannot simply be converted using np.array, because constraints might be something like [(1.0, 1.0), None, (2.0, 2.0)], where the dimensions do not match. I need to consider how to handle this.
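One conceivable way around the dimension mismatch (a hypothetical sketch only; `to_violation_array` is not part of this PR, which instead handles None inside _get_observation_pairs) is to pad missing entries with NaN tuples before converting:

```python
import math

import numpy as np


def to_violation_array(constraints):
    # Hypothetical helper: replace a missing (None) constraint entry with a
    # tuple of NaNs of the right dimension, so np.array gets rectangular input.
    n_constraints = next(len(c) for c in constraints if c is not None)
    filled = [
        c if c is not None else (float("nan"),) * n_constraints
        for c in constraints
    ]
    return np.array(filled, dtype=float)


# The problematic example from above now converts cleanly to a (3, 2) array,
# with the None row turned into [nan, nan].
arr = to_violation_array([(1.0, 1.0), None, (2.0, 2.0)])
```

The NaN rows would still need to be filtered out before any comparison such as `violation_1d > 0`, as noted in the review comment above.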

@knshnb (Member, Author):

I decided to handle None inside _get_observation_pairs. This makes _get_observation_pairs a bit more complex, but the overall logic and return type are simpler than if None had to be handled afterward.
2f1ff5a#diff-2cc3a4088f95a10e806b68655d375f6f7f5fe2de161a0182c16168e378be8065R713-R715

@HideakiImamura (Member) commented:

By the way, I think it is a good idea to add some unit tests to validate the behavior.

@knshnb (Member, Author) commented Jun 3, 2022

@toshihikoyanase Thanks for pointing out the case that causes an error! It happens when n_startup_trials is 0 or 1 (it is not related to constant_liar). It was caused because I did not consider the case where _split_observation_pairs is called with a very small number of trials (by default, RandomSampler is used in such cases).

I added a test for that case (2290708) and fixed the code (7a6ca0b).

@HideakiImamura Thanks for the comments! Please take a look again 🙏

@toshihikoyanase (Member) commented:

@knshnb Thank you for your investigation! I confirmed that the current TPESampler works with n_startup_trials=1. I'll check the details of the code next!

@HideakiImamura (Member) left a review comment

Thanks for the update. I have several comments. PTAL.

Comment on lines 732 to 736
if n_below >= len(idx) or violation_1d[idx[n_below]] > 0:
    # if violation_1d[idx[n_below]] > 0:
    # Below is filled by all feasible trials and trials with smaller violation values.
    indices_below = idx[:n_below]
    indices_above = idx[n_below:]
@HideakiImamura (Member):

Let me clarify this part for my own understanding (I assume that we remove taking max in line 713.)

violation_1d[idx] is a sequence of the values of violations in ascending order, for example, [-10.0, -1.0, 0.0, 1.0, 2.0, 5.0, 10.0]. The first 3 values are feasible.

Let n_below = 5. Then violation_1d[idx[n_below]] = 5.0 > 0, so all feasible trials [-10.0, -1.0, 0.0] are included in the below. In addition, we should include 1.0 and 2.0 in the below. We select the infeasible trials with smaller violations.
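The example above can be checked numerically. This is a small sketch of the same selection logic, not the PR's code:

```python
import numpy as np

# Violations sorted in ascending order; the first 3 trials are feasible
# (note: feasible values can be negative if the max-clipping is removed).
violation_sorted = np.array([-10.0, -1.0, 0.0, 1.0, 2.0, 5.0, 10.0])
n_below = 5

# violation_sorted[n_below] == 5.0 > 0, so "below" takes all feasible trials
# plus the infeasible trials with the smallest violations (1.0 and 2.0).
below = violation_sorted[:n_below]
above = violation_sorted[n_below:]
```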

def test_multi_objective_get_observation_pairs() -> None:
@pytest.mark.parametrize("constraints_enabled", [False, True])
@pytest.mark.parametrize("constraint_value", [-2, 2])
def test_multi_objective_get_observation_pairs(
@HideakiImamura (Member):

Agree.

)
assert list(indices_below) == [0, 3]
assert list(indices_above) == [1, 2]


def test_split_observation_pairs_with_constraints() -> None:
@HideakiImamura (Member):

Why not test both cases then, i.e., whether `n_below >= len(idx) or violation_1d[idx[n_below]] > 0` holds or not?

@toshihikoyanase (Member) left a review comment

The code basically looks good to me. Let me add some cosmetic comments.

Comment on lines 301 to 305
warnings.warn(
    "The constraints_func option is an experimental feature."
    " The interface can change in the future.",
    ExperimentalWarning,
)
@toshihikoyanase (Member):

Shall we add a test for this ExperimentalWarning? The multivariate argument has such a test case.

def test_multivariate_experimental_warning() -> None:
    with pytest.warns(optuna.exceptions.ExperimentalWarning):
        optuna.samplers.TPESampler(multivariate=True)

def test_multi_objective_get_observation_pairs() -> None:
@pytest.mark.parametrize("constraints_enabled", [False, True])
@pytest.mark.parametrize("constraint_value", [-2, 2])
def test_multi_objective_get_observation_pairs(
@toshihikoyanase (Member):

I agree with @HideakiImamura. This test changes the expected values depending on the parametrized value (i.e., constraints_enabled), which is a kind of conditional test logic. As you mentioned, we can work on it in a follow-up PR.

Comment on lines +830 to +832
assert _tpe.sampler._get_observation_pairs(
    study, ["y"], True, constraints_enabled=constraints_enabled
) == ({"y": []}, [], [] if constraints_enabled else None)
@toshihikoyanase (Member):

Ditto. This test function also changes the expected values depending on the parametrized value.

@knshnb (Member, Author) commented Jun 9, 2022

@HideakiImamura @toshihikoyanase I updated the code! PTAL.

@HideakiImamura (Member) left a review comment

Thanks for the update. Almost LGTM. I have a nit comment.

By the way, since we have slightly changed the implementation, could you run the benchmark again to confirm that there is no performance degradation?

else:
    # All trials in below are feasible.
    # Feasible trials with smaller loss_vals are selected.
    (feasible_idx,) = (violation_1d == 0).nonzero()
@HideakiImamura (Member):

Suggested change
(feasible_idx,) = (violation_1d == 0).nonzero()
(feasible_idx,) = (violation_1d <= 0).nonzero()

@knshnb (Member, Author):

As we privately discussed, I keep this line because violation_1d is guaranteed to be greater than or equal to 0.
https://github.com/optuna/optuna/pull/3506/files#diff-2cc3a4088f95a10e806b68655d375f6f7f5fe2de161a0182c16168e378be8065R712-R713
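The guarantee can be seen in a quick sketch of the computation (illustrative only; the values are made up):

```python
import numpy as np

# Each row holds one trial's constraint values; negative values mean "satisfied".
constraints = np.array([[-10.0, -1.0], [0.0, 3.0], [2.0, -5.0]])

# np.maximum(..., 0) clips negative (satisfied) values to 0, so the per-trial
# sum is always >= 0, making `violation_1d == 0` equivalent to `violation_1d <= 0`.
violation_1d = np.maximum(constraints, 0).sum(axis=1)
# violation_1d == [0.0, 3.0, 2.0]; only the first trial is feasible
```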

@knshnb (Member, Author) commented Jun 10, 2022

Since there has been a large amount of code modification, I confirmed that the current code works fine on the toy problem #3506 (comment).

seed=0
image
seed=1
image

@HideakiImamura (Member) left a review comment

Thanks for the long-running work! LGTM!

@toshihikoyanase (Member) left a review comment

LGTM. Thank you for accomplishing this long-running PR!

[Notes] I checked the behavior of constrained MOTPE with the following simple code.

import optuna

def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    y = trial.suggest_float("y", -10, 10)
    trial.set_user_attr("constraint", [y - 5])
    return x, y

def constraints_func(trial):
    return trial.user_attrs["constraint"]

study = optuna.create_study(
    directions=["maximize", "maximize"],
    sampler=optuna.samplers.TPESampler(
        multivariate=True,
        constraints_func=constraints_func,
    ),
)
study.optimize(objective, n_trials=100)

optuna.visualization.plot_pareto_front(study, constraints_func=constraints_func).show()
optuna.visualization.plot_optimization_history(study, target=lambda t: t.values[0]).show()
optuna.visualization.plot_optimization_history(study, target=lambda t: t.values[1]).show()

The constrained MOTPE intensively searches around (x, y) = (10, 5) instead of (10, 10), since y > 5 is infeasible.

image

By comparing the optimization histories of x and y, we can see that the sampler gradually focused on y = 5, and I believe it successfully chose search points with the constraint taken into account.

Optimization history of x
image

Optimization history of y
image

@toshihikoyanase toshihikoyanase added this to the v3.0.0-rc0 milestone Jun 14, 2022
@toshihikoyanase toshihikoyanase merged commit d3c53e0 into optuna:master Jun 14, 2022
@toshihikoyanase (Member) commented:

I found a minor bug in TPESampler during the review and reported it in #3670.

@knshnb knshnb deleted the motpe-constraints branch October 3, 2022 08:55
@okaikov commented Nov 30, 2022

Is it possible to implement constraints for single-objective TPE? If so, are you planning to add this feature?

@knshnb (Member, Author) commented Dec 1, 2022

Actually, constrained optimization for single-objective TPE is supported by this PR as well. However, we haven't developed the related features (such as logging, visualization, etc.) or dogfooded it well yet. Feel free to file a request if you encounter any problems!

import optuna


def objective(trial):
    x = trial.suggest_float("x", -1, 1)
    trial.set_user_attr("constraint", (-x,))
    return x


def constraints_func(trial):
    return trial.user_attrs["constraint"]


sampler = optuna.samplers.TPESampler(constraints_func=constraints_func)
study = optuna.create_study(sampler=sampler)
study.optimize(objective, n_trials=30)

Sorry, the log of the best value does not take the constraint into account, but the optimization itself works.

@okaikov commented Dec 1, 2022

The reason I mentioned this is that I get the following warning on every trial when running single-objective TPE with constraints:
optuna/samplers/_tpe/sampler.py:680: UserWarning: Trial 0 does not have constraint values. It will be treated as a lower priority than other trials.

@knshnb (Member, Author) commented Dec 1, 2022

Could you share the reproducible code?

@okaikov commented Dec 1, 2022

This happens when constant_liar=True; just add constant_liar=True to the TPESampler.

@knshnb (Member, Author) commented Dec 1, 2022

@okaikov Thanks for the report!
I created an issue. It seems that the warning is a bug and you can ignore it for now.

@okaikov commented Apr 27, 2023

Is it possible to provide a parameter to the constraints function? Example: constraints_func(trial=trial, metrics=metrics).
The use case is that I want to change the constraints during the study, so I don't want to set hardcoded values in user_attrs and read them from there; instead, I would compute the constraints dynamically.

@knshnb (Member, Author) commented Apr 27, 2023

Thanks for the comment! Could you create a new issue with a more detailed explanation of how you want to change the constraints during the study?

Labels: feature, optuna.samplers

6 participants