
Unify implementation of fast non-dominated sort #5160

Conversation

Alnusjaponica
Collaborator

Motivation

(Partially) resolves #5089

Description of the changes

  • Unify the backend of the non-dominated sort in sampler/_tpe/sampler.py and sampler/nsgaii/_elite_population_selection_strategy.py into study._multi_objective._fast_non_dominated_sort(), which
    • (theoretically) speeds up the non-dominated sort step used in TPESampler, and
    • speeds up the fast non-dominated sort step in NSGAIISampler and NSGAIIISampler by replacing a for loop with ndarray operations.
  • Add a helper function _constraints_evaluation._evaluate_penalty() to handle constraints. This eliminates _fast_non_dominated_sort()’s dependence on _constrained_dominates() and prevents duplicate penalty-value calculations.
  • Add a helper function _rank_population() for NSGA-II and NSGA-III. This bridges the gap between the I/O types of the non-dominated sort implementations.
  • Update the arguments of _validate_constraints() and move the constraint-value validation step from _constrained_dominates() into _validate_constraints(), which reduces duplicated validation.
  • Rename the module optuna.samplers.nsgaii._dominates to optuna.samplers.nsgaii._constraints_evaluation.
  • Move test_calculate_nondomination_rank(), which now tests study._multi_objective._fast_non_dominated_sort(), from samplers_tests/tpe_tests/test_multi_objective_sampler.py to study_tests/test_multi_objective.py.

Please note that this PR is based on #5157, so please merge that PR first.
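As a hedged illustration of the "replacing a for loop with ndarray operations" point above (minimization assumed; this is a sketch, not the actual Optuna internals), the pairwise dominance check can be vectorized with broadcasting:

```python
import numpy as np


def dominance_matrix(objective_values):
    # mat[i, j] is True iff trial i dominates trial j under minimization:
    # every objective of i is <= the corresponding objective of j,
    # and at least one objective is strictly smaller.
    v = np.asarray(objective_values, dtype=float)
    return np.all(v[:, None, :] <= v[None, :, :], axis=2) & np.any(
        v[:, None, :] < v[None, :, :], axis=2
    )
```

A single broadcasted comparison like this replaces the nested per-pair loop, at the cost of O(n²) memory for the matrix.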

@github-actions github-actions bot added optuna.samplers Related to the `optuna.samplers` submodule. This is automatically labeled by github-actions. optuna.testing Related to the `optuna.testing` submodule. This is automatically labeled by github-actions. labels Dec 22, 2023
@Alnusjaponica Alnusjaponica force-pushed the unifiy-implementation-of-fast-nondominated-sort branch from 149248b to 3137ef6 Compare December 22, 2023 12:41

codecov bot commented Dec 22, 2023

Codecov Report

Attention: Patch coverage is 98.90110% with 1 line in your changes missing coverage. Please review.

Project coverage is 89.59%. Comparing base (a8aa5c1) to head (542f011).
Report is 186 commits behind head on master.

| Files | Patch % | Lines |
| --- | --- | --- |
| ...ers/nsgaii/_elite_population_selection_strategy.py | 94.73% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5160      +/-   ##
==========================================
+ Coverage   89.01%   89.59%   +0.57%     
==========================================
  Files         213      209       -4     
  Lines       14523    13114    -1409     
==========================================
- Hits        12928    11749    -1179     
+ Misses       1595     1365     -230     


@Alnusjaponica Alnusjaponica force-pushed the unifiy-implementation-of-fast-nondominated-sort branch from 3137ef6 to a4cd1cd Compare December 22, 2023 14:36
@Alnusjaponica Alnusjaponica marked this pull request as ready for review December 22, 2023 16:23
Comment on lines +77 to +82
def _fast_non_dominated_sort(
objective_values: np.ndarray,
*,
penalty: np.ndarray | None = None,
n_below: int | None = None,
) -> np.ndarray:
Collaborator

from __future__ import annotations

from collections import defaultdict
import time

import numpy as np


def run_alnusjaponica(objective_values: np.ndarray) -> np.ndarray:
    domination_mat = np.all(
        objective_values[:, np.newaxis, :] >= objective_values[np.newaxis, :, :], axis=2
    ) & np.any(
        objective_values[:, np.newaxis, :] > objective_values[np.newaxis, :, :], axis=2
    )

    domination_list = np.nonzero(domination_mat)
    domination_map = defaultdict(list)
    for dominated_idx, dominating_idx in zip(*domination_list):
        domination_map[dominating_idx].append(dominated_idx)

    ranks = np.full(len(objective_values), -1)
    dominated_count = np.sum(domination_mat, axis=1)

    rank = -1
    ranked_idx_num = 0
    while ranked_idx_num < len(objective_values):
        (non_dominated_idxs,) = np.nonzero(dominated_count == 0)
        ranked_idx_num += len(non_dominated_idxs)
        rank += 1
        ranks[non_dominated_idxs] = rank

        dominated_count[non_dominated_idxs] = -1
        for non_dominated_idx in non_dominated_idxs:
            dominated_count[domination_map[non_dominated_idx]] -= 1
    return ranks


def is_pareto_front_nd(ordered_loss_values: np.ndarray, assume_unique: bool) -> np.ndarray:
    loss_values = ordered_loss_values.copy()
    n_trials = loss_values.shape[0]
    is_front = np.zeros(n_trials, dtype=bool)
    nondominated_indices = np.arange(n_trials)
    while len(loss_values):
        nondominated_and_not_top = np.any(loss_values < loss_values[0], axis=1)
        # NOTE: trials[j] cannot dominate trials[j] for i < j because of lexsort.
        # Therefore, nondominated_indices[0] is always non-dominated.
        if assume_unique:
            is_front[nondominated_indices[0]] = True
        else:
            # Mark every point equal to the current top as on the front.
            top_indices = nondominated_indices[np.all(loss_values == loss_values[0], axis=1)]
            is_front[top_indices] = True

        loss_values = loss_values[nondominated_and_not_top]
        nondominated_indices = nondominated_indices[nondominated_and_not_top]

    return is_front


def is_pareto_front_2d(ordered_loss_values: np.ndarray, assume_unique: bool) -> np.ndarray:
    n_trials = ordered_loss_values.shape[0]
    cummin_value1 = np.minimum.accumulate(ordered_loss_values[:, 1])
    is_value1_min = cummin_value1 == ordered_loss_values[:, 1]
    is_value1_new_min = cummin_value1[1:] < cummin_value1[:-1]

    on_front = np.ones(n_trials, dtype=bool)
    if assume_unique:
        on_front[1:] = is_value1_min[1:] & is_value1_new_min
    else:
        is_value0_same = ordered_loss_values[1:, 0] == ordered_loss_values[:-1, 0]
        on_front[1:] = is_value1_min[1:] & (is_value0_same | is_value1_new_min)

    return on_front


def is_pareto_front(ordered_loss_values: np.ndarray, assume_unique: bool) -> np.ndarray:
    (n_trials, n_objectives) = ordered_loss_values.shape
    if n_objectives == 1:
        return ordered_loss_values[:, 0] == ordered_loss_values[0]
    elif n_objectives == 2:
        return is_pareto_front_2d(ordered_loss_values, assume_unique)
    else:
        return is_pareto_front_nd(ordered_loss_values, assume_unique)


def calculate_nondomination_rank(loss_values: np.ndarray) -> np.ndarray:
    (n_trials, n_objectives) = loss_values.shape

    if n_objectives == 1:
        _, ranks = np.unique(loss_values[:, 0], return_inverse=True)
        return ranks
    else:
        # It ensures that trials[j] will not dominate trials[i] for i < j.
        # np.unique does lexsort.
        ordered_loss_values, order_inv = np.unique(loss_values, return_inverse=True, axis=0)

    n_unique = ordered_loss_values.shape[0]
    ranks = np.zeros(n_unique, dtype=int)
    rank = 0
    indices = np.arange(n_unique)
    while indices.size > 0:
        on_front = is_pareto_front(ordered_loss_values, assume_unique=True)
        ranks[indices[on_front]] = rank
        # Remove the recent Pareto solutions.
        indices = indices[~on_front]
        ordered_loss_values = ordered_loss_values[~on_front]
        rank += 1

    return ranks[order_inv]


def run_nabenabe(loss_values: np.ndarray) -> np.ndarray:
    return calculate_nondomination_rank(loss_values)


def measure_time(target, loss_values: np.ndarray) -> tuple[np.ndarray, float]:
    start = time.time()
    results = target(loss_values.copy())
    elapsed_time = (time.time() - start) * 1000
    return results, elapsed_time


if __name__ == "__main__":
    n_trials = 1000
    n_objectives = 1
    n_seeds = 5
    results = {"nabenabe": [], "alnusjaponica": []}
    for seed in range(n_seeds):
        rng = np.random.RandomState(seed)
        loss_values = rng.normal(size=(n_trials, n_objectives))
        ans_nabenabe, t = measure_time(run_nabenabe, loss_values)
        results["nabenabe"].append(t)
        ans_alnusjaponica, t = measure_time(run_alnusjaponica, loss_values)
        results["alnusjaponica"].append(t)
        print(np.all(ans_alnusjaponica == ans_nabenabe))

    print({k: f"{np.mean(v):.2f} +/- {np.std(v) / np.sqrt(n_seeds):.2f} [ms]" for k, v in results.items()})

For simplicity, I removed penalty and n_below, but this implementation produces the same results as your program while being much quicker.

Collaborator

The unit is milliseconds.

|  | Mine | Yours |
| --- | --- | --- |
| n_trials=1000, n_objectives=2 | 1.57 $\pm$ 0.1 | 84.5 $\pm$ 3.1 |
| n_trials=10000, n_objectives=2 | 18.13 $\pm$ 1.4 | 9858.0 $\pm$ 126.8 |
| n_trials=1000, n_objectives=3 | 11.95 $\pm$ 0.9 | 61.5 $\pm$ 1.0 |
| n_trials=10000, n_objectives=3 | 310.5 $\pm$ 5.0 | 7025.0 $\pm$ 181.4 |

Collaborator

I think your implementation of dominance with penalty unnecessarily considers the penalty amount for feasible cases.
Namely, the penalty amount should not be considered once a trial satisfies the constraints, but your implementation takes the penalty amount into account even for feasible trials.

Please check the following definition by Deb et al. [1]:

[Image: Definition 1 (constrained domination) from Deb et al.]

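As a hedged pairwise sketch of Deb's Definition 1 (minimization with a scalar penalty assumed; the function name is illustrative, not Optuna's API):

```python
import numpy as np


def constrained_dominates(f_i, f_j, penalty_i, penalty_j):
    """Pairwise constrained domination, following Deb's Definition 1 as I
    read it: i constrained-dominates j iff
      1. i is feasible and j is infeasible, or
      2. both are infeasible and i has the smaller penalty, or
      3. both are feasible and i dominates j in the objectives.
    """
    feasible_i, feasible_j = penalty_i <= 0, penalty_j <= 0
    if feasible_i and not feasible_j:
        return True
    if not feasible_i and not feasible_j:
        return penalty_i < penalty_j
    if feasible_i and feasible_j:
        f_i, f_j = np.asarray(f_i, dtype=float), np.asarray(f_j, dtype=float)
        return bool(np.all(f_i <= f_j) and np.any(f_i < f_j))
    return False  # i infeasible, j feasible
```

Note that in cases 1 and 3 the penalty amount plays no role for feasible trials; it only orders the infeasible ones.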
Plus, the implementation below is much quicker.

def calculate_nondomination_rank_with_penalty(
    loss_values: np.ndarray, penalty: np.ndarray | None = None
) -> np.ndarray:
    if penalty is None:
        return calculate_nondomination_rank(loss_values)

    # If values[i] constrained-dominates values[j] given penalty[i] and penalty[j],
    # one of the following must be satisfied:
    # 1. penalty[i] <= 0 and penalty[j] > 0,
    # 2. penalty[i] > 0 and penalty[j] > 0 and penalty[i] < penalty[j], or
    # 3. penalty[i] <= 0 and penalty[j] <= 0 and values[i] dominates values[j].
    # Therefore, if trials[i] is feasible and trials[j] is infeasible,
    # nondomination_rank[i] <= nondomination_rank[j] always holds and we can separate the sortings
    # for feasible trials and infeasible trials.
    # Ref: Definition 1 by K. Deb et al. in
    # `A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II`
    if len(penalty) != len(loss_values):
        raise ValueError(
            f"The length of penalty and loss_values must be same, but got "
            f"len(penalty)={len(penalty)} and len(loss_values)={len(loss_values)}."
        )

    penalty[np.isnan(penalty)] = np.inf
    is_feasible = penalty <= 0
    nondomination_rank = np.zeros(len(loss_values), dtype=int)
    nondomination_rank[is_feasible] += calculate_nondomination_rank(loss_values[is_feasible])
    nondomination_rank[~is_feasible] += calculate_nondomination_rank(
        penalty[~is_feasible, np.newaxis],
    ) + np.max(nondomination_rank[is_feasible], initial=-1) + 1
    return nondomination_rank

[1] K. Deb et al. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II
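The feasible/infeasible split described above can be demonstrated with a small self-contained sketch (illustrative names and a simple O(n²) front peeling, not Optuna's implementation): feasible trials are ranked by objective dominance, and infeasible trials are appended below them, ordered by penalty alone.

```python
import numpy as np


def _rank_by_domination(values):
    # Simple O(n^2) front peeling for minimization; fine for a demo.
    values = np.asarray(values, dtype=float)
    ranks = np.full(len(values), -1)
    remaining = np.arange(len(values))
    rank = 0
    while remaining.size:
        sub = values[remaining]
        # dominates[i, j]: remaining[i] dominates remaining[j].
        dominates = np.all(sub[:, None, :] <= sub[None, :, :], axis=2) & np.any(
            sub[:, None, :] < sub[None, :, :], axis=2
        )
        on_front = ~dominates.any(axis=0)
        ranks[remaining[on_front]] = rank
        remaining = remaining[~on_front]
        rank += 1
    return ranks


def rank_with_penalty(loss_values, penalty):
    penalty = np.where(np.isnan(penalty), np.inf, penalty)
    is_feasible = penalty <= 0
    ranks = np.zeros(len(loss_values), dtype=int)
    ranks[is_feasible] = _rank_by_domination(loss_values[is_feasible])
    # Infeasible trials are ranked by penalty only, strictly below all
    # feasible trials.
    ranks[~is_feasible] = (
        _rank_by_domination(penalty[~is_feasible, np.newaxis])
        + np.max(ranks[is_feasible], initial=-1)
        + 1
    )
    return ranks
```

Sorting the two groups separately is what makes the approach fast: the expensive dominance-based ranking only runs on the feasible subset.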

Collaborator Author

this implementation maintains the same results as your program yet much quicker.

Thanks for the suggestion. I just moved what was implemented in _tpe/sampler.py, so there is no need to stick to the first implementation. I'll consider how to reconcile the current _get_pareto_front_trials_by_trials function with your suggestion.

I think your implementation of dominance with penalty includes the unnecessary consideration of penalty for feasible cases.

In my understanding, the penalty is set to 0 when the trial is feasible, so it has no influence on the result. Anyway, I'll use the faster implementation from your suggestion.

Collaborator

from __future__ import annotations

import itertools

import numpy as np

# NOTE: `Trial` and `dominates` are placeholders here (e.g. optuna's
# FrozenTrial and a pairwise dominance predicate); this snippet is a memo
# and is not runnable as-is.


def is_pareto_front(trials: list[Trial]) -> np.ndarray:
    n_trials = len(trials)
    is_front = np.zeros(n_trials, dtype=bool)
    next_index = 0
    nondominated_indices = np.arange(n_trials)
    while next_index < len(trials):
        nondominated_mask = np.array([not dominates(t, trials[next_index]) for t in trials])
        trials = list(itertools.compress(trials, nondominated_mask))
        nondominated_indices = nondominated_indices[nondominated_mask]
        next_index = np.sum(nondominated_mask[:next_index]) + 1

    is_front[nondominated_indices] = True
    return is_front


def calculate_nondomination_rank(trials: list[Trial]) -> np.ndarray:
    n_trials = len(trials)
    ranks = np.zeros(n_trials, dtype=int)
    rank = 0
    indices = np.arange(n_trials)
    while indices.size > 0:
        on_front = is_pareto_front(trials)
        ranks[indices[on_front]] = rank
        indices = indices[~on_front]
        trials = list(itertools.compress(trials, ~on_front))
        rank += 1

    return ranks

This is just a memo for future discussion; please ignore it for now.

When using dominates, the runtime is much longer than with the vectorized version, but it still runs quicker than building the full dominance matrix.
This implementation is much slower because:

  1. it does not use vectorization,
  2. we cannot pre-sort the trials so that trials[j] cannot dominate trials[i] for $i < j$, and
  3. we cannot assume uniqueness of the trials.

@HideakiImamura HideakiImamura self-assigned this Dec 25, 2023
@HideakiImamura
Member

@Alnusjaponica It seems to me that, with the current PR included, the design would no longer pass the dominates function into the function that performs the fast non-dominated sort. What do you think about this?

Contributor

github-actions bot commented Jan 4, 2024

This pull request has not seen any recent activity.

@github-actions github-actions bot added the stale Exempt from stale bot labeling. label Jan 4, 2024
@nabenabe0928 nabenabe0928 removed the stale Exempt from stale bot labeling. label Jan 6, 2024
@nabenabe0928
Collaborator

nabenabe0928 commented Jan 7, 2024

Just for future reference, I found a Python implementation of a non-dominated sort algorithm with time complexity $O(N \log^{M - 1} N)$.
Repository / Paper.

Contributor

This pull request has not seen any recent activity.

@github-actions github-actions bot added the stale Exempt from stale bot labeling. label Jan 14, 2024
@Alnusjaponica
Collaborator Author

@HideakiImamura @nabenabe0928
I resolved all the review comments. Could you take another look?

Member

@HideakiImamura HideakiImamura left a comment


Thanks for the update. Almost LGTM. I have several minor comments. PTAL.

],
dtype=np.float64,
)
objective_values *= np.array(
Member

I think objective_values is appropriate here.

@nabenabe0928
Collaborator

@@ -91,26 +91,31 @@ def _fast_non_dominated_sort(
len(penalty), len(objective_values)
)
)
nondomination_rank = np.zeros(len(objective_values), dtype=int)
nondomination_rank = np.full(len(objective_values), -1)
Collaborator

Since max(nondomination_rank) is already bounded by n_below, and a rank of n_below + 1 will never be used, what about using n_below + 1 as the fill value?
Another reason to avoid -1 is that it might cause unexpected bugs in the future if a developer assumes that a lower nondomination_rank is always better.
Plus, this implementation requires ad-hoc handling of nondomination_rank=-1 in each place where the function is used.


# First, we calculate the domination rank for feasible trials.
is_feasible = np.logical_and(~is_nan, penalty <= 0)
ranks, feasible_bottom_rank = _calculate_nondomination_rank(
ranks, bottom_rank = _calculate_nondomination_rank(
Collaborator

If we define nondomination_rank as:

nondomination_rank = np.full(len(objective_values), n_below + 1)

then bottom_rank becomes bottom_rank = np.max(ranks).
Note that if np.max(ranks) == n_below + 1, the subsequent steps simply assign each remaining nondomination_rank a value of n_below + <positive integer>, so those ranks will be ignored.
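The suggested fill-value convention can be sketched as follows (variable and function names are assumptions, not the actual Optuna code): unranked trials get n_below + 1 rather than -1, so "a lower rank is always better" holds everywhere.

```python
import numpy as np


def assign_feasible_ranks(feasible_ranks, is_feasible, n_below):
    # Fill with n_below + 1 instead of -1: trials beyond the cutoff are
    # simply "worse than anything below n_below", no sentinel handling needed.
    nondomination_rank = np.full(len(is_feasible), n_below + 1, dtype=int)
    nondomination_rank[is_feasible] = feasible_ranks
    # Bottom rank among feasible trials (-1 if there are none).
    bottom_rank = int(np.max(feasible_ranks, initial=-1))
    return nondomination_rank, bottom_rank
```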

Collaborator Author

I totally agree with what you say, but it would make this PR even larger. Can I split this task off as a follow-up and address your comment in another PR?

Collaborator Author

The suggestion is a bit complicated, so I noted the comment on #5089.

@Alnusjaponica
Collaborator Author

Thanks for your review. I've addressed nearly all the comments, so please take another look.
As for whether the rank should be 0-indexed or not, I believe it's better to discuss that in another PR. Do you have any opinions on this?


@nabenabe0928 nabenabe0928 left a comment


Thank you for your work, and sorry for taking so long!
I left follow-ups, so let's address them in other PRs!

@nabenabe0928 nabenabe0928 removed their assignment Feb 29, 2024
@nabenabe0928 nabenabe0928 added code-fix Change that does not change the behavior, such as code refactoring. enhancement Change that does not break compatibility and not affect public interfaces, but improves performance. and removed code-fix Change that does not change the behavior, such as code refactoring. labels Feb 29, 2024
@Alnusjaponica
Collaborator Author

Thanks for your reviews. I noted the follow-up comments in the issue.
@HideakiImamura Could you take another look?


@HideakiImamura HideakiImamura left a comment


LGTM.

@HideakiImamura HideakiImamura merged commit 9c80d91 into optuna:master Feb 29, 2024
20 checks passed
@HideakiImamura HideakiImamura added this to the v3.6.0 milestone Feb 29, 2024
Successfully merging this pull request may close these issues.

Unify the implementations of non-dominated sort