Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent comparisons #158

Open
tcbrindle opened this issue Jan 24, 2024 · 0 comments · May be fixed by #164
Open

Consistent comparisons #158

tcbrindle opened this issue Jan 24, 2024 · 0 comments · May be fixed by #164
Labels
enhancement New feature or request

Comments

@tcbrindle
Copy link
Owner

We have several adaptors and algorithms which take a comparator, and require that the comparator produces a strict weak order over the sequence elements:

  • find_min/find_max/find_minmax
  • min/max/minmax
  • sort
  • set_union/set_difference/set_symmetric_difference/set_intersection
  • cmp::min/cmp::max (for a sequence of two elements)

The comparator for all of these functions follows the classic STL model of requiring a binary predicate which returns true if the first argument is "less than" the second, defaulting to std::ranges::less. We use the standard library std::strict_weak_order concept, in most cases via our own strict_weak_order_for<Seq1, Seq2> concept which checks std::strict_weak_order for all combinations of the sequences' element_t, value_t& and common_element_t (the Flux equivalent of Ranges' std::indirect_strict_weak_order).

The outlier here is flux::compare(seq1, seq2, cmp) which requires the comparator to return a value of one of the three C++20 comparison categories (std::partial_ordering, std::weak_ordering or std::strong_ordering) and performs a lexicographical comparison of the sequence elements returning that category -- that is, the same as std::lexicographical_compare_three_way.

First of all, this is inconsistent. We shouldn't have one algorithm with a completely different interface to all the others.

A simple solution would be to change our compare to be the equivalent of std::lexicographical_compare -- that is, require just a "less than" predicate like everything else.

A potentially better solution would be to consistently use three way comparison everywhere -- specifically, requiring a comparator for all of the above algorithms (except compare) to return at least std::weak_ordering. After all, we're a C++20 library, we can do things the modern way!

As with everything though, there are pros and cons to doing this:

Pros:

  • Correctness. At the moment, strict weak ordering is basically just a semantic requirement. Although a user could theoretically supply a custom comparator that returns std::weak_ordering but "lies" about it, this seems much less likely than accidentally passing an incorrect predicate.
  • Proper handling of NaNs: Right now, flux::sort(vector<float>) compiles but does the wrong thing if the data contains NaN values. With the proposed change (assuming std::compare_three_way as the default comparator) then this will fail to compile. This forces the user to think about how they want to handle NaNs; either by passing std::strong_order to use the IEEE total ordering, or by using a custom comparator that ignores NaN values.

Cons:

  • By default, only types with a spaceship operator could be used with the Flux algorithms. Of course, defaulting the operator is very easy in C++20, but there's probably a lot of code out there that only defines the classic relational operators
  • Potentially, worse codegen. These algorithms only really care about less-than, but custom comparators (or op<=> implementations) might need to do two compares rather than one. It's probably worth benchmarking this. In particular, using std::strong_order to compare floats generates a lot more code than std::less.
  • Ergonomics: right now it's easy to sort in descending order by saying sort(vec, std::greater{}). We'd probably want to provide a comparator that inverts ordering::greater and ordering::less to allow the same thing without the user needing to write a lambda to do it themselves.
  • Unfamiliarity: the STL has used less-than predicate comparators for 30 years, and now suddenly we're asking people to do something different. Likewise, "why can't I use flux::sort on my vector of doubles, std::sort works fine"

Overall, I think this is worth investigating -- at least, benchmarking what happens if we use our current sort with a comparator defined as (roughly):

bool cmp(T& lhs, T& rhs) { return std::is_lt(std::weak_order(lhs, rhs)); }

and seeing what happens.

@tcbrindle tcbrindle added the enhancement New feature or request label Jan 24, 2024
@tcbrindle tcbrindle linked a pull request Jan 30, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant