This library is solely for non-dominated search and to find Pareto optimal solutions.
There are only two functions (is_pareto_front and nondominated_rank) in this library and we describe the usage below.
The time complexity of finding the Pareto solutions is n_observations
and n_objectives
. (Kung's algorithm)
Note
There is a package for 2d costs (pygmo), but this package does not support 3d or higher dimensional costs.
$ pip install fast-pareto
from fast_pareto import is_pareto_front, nondominated_rank
import numpy as np
is_pareto_front(np.array([[0, 1], [1, 0], [1, 1]]))
>>> array([True, True, False])
is_pareto_front(np.array([[0, -1], [1, 0], [1, -1]]), larger_is_better_objectives=[1])
>>> array([True, True, False])
nondominated_rank(np.array([[2], [1], [0]]))
>>> array([2, 1, 0], dtype=int32)
nondominated_rank(np.array([[2], [1], [0]]), larger_is_better_objectives=[0])
>>> array([0, 1, 2], dtype=int32)
This function determines the Pareto front from a provided set of costs.
The arguments are a numpy array with the shape of (n_observations, n_objectives)
and a list of indices that show which objectives are "larger is better".
If None
is provided, we consider all objectives should be minimized.
This function returns the true/false mask with the shape of (n_observations, )
.
True means the corresponding observation is on the Pareto front given a set of solutions.
Note
For 2d costs, we provide is_pareto_front2d
, which is much quicker.
We need exactly the same arguments as is_pareto_front
and users can optionally specify ordered=True
to further speed up.
This function calculates the non-dominated rank of each observation.
The arguments are a numpy array with the shape of (n_observations, n_objectives)
and a list of indices that show which objectives are "larger is better".
If None
is provided, we consider all objectives should be minimized.
This function returns the non-dominated rank with the shape of (n_observations, )
.
The non-dominated rank is better when it is smaller.
For this implementation, we return zero when those observations are on the Pareto front.
You can see the examples of the results obtained by this module below.
Note that we added the tie-breaking feature using average ranks in v0.0.5
and when you specify tie_break=True
, this function returns the ranks of each observation
with tie-breaking.
For example, when we have non-domination ranks of [0, 0, 1, 1]
with tie_break=False
,
then tie_break=True
tries to differentiate those values to be such as [1, 0, 2, 3]
.
When using this feature, we will not get non-domination ranks anymore,
but if the rank for the i-th observation r[i]
and that for the j-th observation r[j]
have the relationship of r[i] < r[j]
, it is guaranteed that the non-domination rank
of the i-th observation is lower or equal to that of the j-th observation.
for n_points in [100, 1000, 10000]:
for n_costs in [1, 5, 10, 50]:
print(f"n_points={n_points}, n_costs={n_costs}")
%time nondominated_rank(np.random.normal(size=(n_points, n_costs)))
print("\n")
n_points=100, n_costs=1
CPU times: user 13.4 ms, sys: 897 µs, total: 14.3 ms
Wall time: 14.3 ms
n_points=100, n_costs=5
CPU times: user 3.41 ms, sys: 0 ns, total: 3.41 ms
Wall time: 3.42 ms
n_points=100, n_costs=10
CPU times: user 2.86 ms, sys: 0 ns, total: 2.86 ms
Wall time: 2.87 ms
n_points=100, n_costs=50
CPU times: user 4.55 ms, sys: 0 ns, total: 4.55 ms
Wall time: 4.56 ms
n_points=1000, n_costs=1
CPU times: user 190 ms, sys: 0 ns, total: 190 ms
Wall time: 190 ms
n_points=1000, n_costs=5
CPU times: user 45.4 ms, sys: 0 ns, total: 45.4 ms
Wall time: 45.4 ms
n_points=1000, n_costs=10
CPU times: user 68.6 ms, sys: 0 ns, total: 68.6 ms
Wall time: 68.7 ms
n_points=1000, n_costs=50
CPU times: user 116 ms, sys: 0 ns, total: 116 ms
Wall time: 116 ms
n_points=10000, n_costs=1
CPU times: user 5.16 s, sys: 0 ns, total: 5.16 s
Wall time: 5.17 s
n_points=10000, n_costs=5
CPU times: user 2.18 s, sys: 2.16 ms, total: 2.19 s
Wall time: 2.19 s
n_points=10000, n_costs=10
CPU times: user 4.42 s, sys: 0 ns, total: 4.42 s
Wall time: 4.42 s
n_points=10000, n_costs=50
CPU times: user 15.7 s, sys: 7.47 ms, total: 15.7 s
Wall time: 15.7 s
To supplement the knowledge, I note the definition of non-dominated rank.
Suppose we would like to minimize the multiobjective function
When there is no such observation that dominates