Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

`sort_by` takes a comparison fn, but `min_by`/`max_by` take a "scoring" fn #15311

Closed
ben0x539 opened this Issue Jul 1, 2014 · 21 comments

Comments

Projects
None yet
@ben0x539
Copy link
Contributor

ben0x539 commented Jul 1, 2014

This is sorta inconsistent :(

fn sort_by(self, compare: |&T, &T| -> Ordering)

fn max_by<B: Ord>(&mut self, f: |&A| -> B) -> Option<A>

Having a comparison function is kinda more useful because then you could use min_by/max_by with non-total Ord types more easily, but it's also a bit more effort in the easy case. I dunno. fwiw, ruby's sort_by takes a "scoring"-like block and haskell's sortBy takes a comparison function.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 5, 2014

cc @aturon

One possibility could be that foo_by takes T to U and works on U, where foo_with takes (T, T) and goes to whatever's being expected (in these cases an Ordering).

@P1start

This comment has been minimized.

Copy link
Contributor

P1start commented Jan 25, 2015

Has there been any progress on this? I only recently noticed this inconsistency, and it looks like sort_by has been stabilised even though {min,max}_by have not (due to this issue). I like @alexcrichton’s suggestion about _by v.s. _with, but the stable sort_by does the opposite, using a comparison function even though it ends in _by.

Python 2 provides both cmp (basically, (T, T) -> Ordering) and key (T -> U) parameters to its sorted built-in, but Python 3 switched to key alone, removing the old cmp parameter. Perhaps we should follow Python’s lead and switch to key functions, although I’m not entirely sure what the motivation for them is (other than simplicity in some cases).

@noctune

This comment has been minimized.

Copy link
Contributor

noctune commented Feb 1, 2015

Key functions would not work well for min/max things without a total order like floats. Ideally you would use comparison based {min,max}_by and then use a function like Haskells comparing, as @kmcallister mentioned in #17648, for when you only want to compare keys.

Unfortunately, currently Rust functions cannot return a closure, so creating a comparing-like function seems impossible right now. So if it is comparison based it ends up being unergonomic for key comparison, and if its key based it ends up being difficult or impossible to express min or max for things without a total order.

@noctune

This comment has been minimized.

Copy link
Contributor

noctune commented Feb 11, 2015

Scratch that, you actually CAN make a comparing-like function:

#![feature(unboxed_closures)]
#![feature(core)]

use std::cmp::Ordering;

pub fn comparing<F,T,B>(f: F) -> Comparing<F> where
    F: Fn(&T) -> B,
    B: Ord
{
    Comparing(f)
}

pub struct Comparing<F>(F);

impl<'a,'b,F,T,B> Fn<(&'a T, &'b T)> for Comparing<F> where
    F: Fn(&T) -> B,
    B: Ord
{
    type Output = Ordering;

    extern "rust-call" fn call(&self, args: (&T, &T)) -> Ordering {
        self.0(args.0).cmp(&self.0(args.1))
    }
}

#[derive(Debug)]
struct A(u8);

fn main() {
    let mut a = [A(4), A(2), A(6), A(9)];
    a.sort_by(comparing(|x: &A| x.0));
    println!("{:?}", a);
}

I can't get it to properly infer the parameter to the closure though. This would make it a lot more ergonomic to use a comparator based min_by and max_by.

@pnkfelix

This comment has been minimized.

Copy link
Member

pnkfelix commented Feb 19, 2015

1.0 beta, P-backcompat-libs, I-needs-decision.

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Feb 21, 2015

I'm 👍 of having {sort,min,max}_{by,with}, where *_by takes a Fn(U) -> T and *_with takes a Fn(U, U) -> Ordering, more-or-less.

@kmcallister

This comment has been minimized.

Copy link
Contributor

kmcallister commented Feb 21, 2015

2×3 = 6 functions isn't too bad as combinatorial explosion goes, but it still indicates that an important abstraction is missing. For example Option's methods had the same issue before we worked out as_ref() and such.

In this case the clear alternative is Haskell's comparing, which transforms the _with forms into the _by forms. I'm not certain that the ergonomics are acceptable in Rust, but I think they probably could be.

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Feb 22, 2015

It looks like comparing can call the conversion function multiple times for the same item, is that correct? That might be undesired if the conversion is expensive.

@noctune

This comment has been minimized.

Copy link
Contributor

noctune commented Feb 22, 2015

@shepmaster Yes, but you can avoid that cost by using a map, though somewhat less ergonomic. Let's say we want to find the max of some computationally expensive f(x). We could then just call i.map(|x| (x, f(x))).max_by(comparing(|(_,h)| h)).

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Feb 22, 2015

I agree with the "less ergonomic" comment - there's yet another map to get back to the original object:

i.map(|x| (x, f(x))).max_with(comparing(|(_, h)| h)).map(|(v, _)| v)

Maybe if nothing else, maybe this bug can be to just rename and normalize the *_by methods to *_with to prevent misunderstanding. *_with is probably the more powerful option, and future work could add comparing or _*by as determined.

@kmcallister

This comment has been minimized.

Copy link
Contributor

kmcallister commented Feb 22, 2015

Yeah,

i.map(|x| (x, f(x))).max_with(comparing(|(_, h)| h)).map(|(v, _)| v)

looks a lot better in Haskell:

map fst . maximumBy (comparing snd) $ map (\x -> (x, f x)) i

Although the right-to-left data flow always bugged me. Naturally, this code can be made even shorter and less clear.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 19, 2015

Upon thinking more on this, I'm starting to convince myself that the scoring fn variant is not all that useful for Rust unfortunately. For example if you have a struct Person which has a name: String and you want to sort an array of people by their name, you'd in theory want to do:

people.sort_by(|person| &person.name);

Unfortunately you cannot return a reference into the argument passed into the closure, so this will not compile. This means you'll need to either call clone (bad) or move into a wrapper struct (bad). This somewhat indicates to me that we want |A, B| -> C for these functions.

Additionally, the interaction between sort and max/min is interesting in that sort can greatly benefit from the closure returning an Ordering (e.g. the item in question could go either left or right in the sort). The min/max functions, however, only really need a bool return value indicating whether A is less than B (or greater than). For this case, returning an Ordering is probably overkill and may even be more inefficient in some cases.

There is a downside, however, to not returning Ordering, which is that it's not super clear what the bool returned means. Does it mean that A should be accepted? Does it mean that A is the smallest? Does it mean that B is the smallest? This is somewhat countered by the fact though that the documentation could clearly say to call a < b which may solve this problem.

All this is basically leading me to:

  1. Scoring functions may not be too useful, so comparison functions should probably be favored.
  2. Comparison functions may only want to return Ordering when required, otherwise a bool should be favored for performance reasons.

How does that sound to others?

@EdorianDark

This comment has been minimized.

Copy link
Contributor

EdorianDark commented Mar 22, 2015

For the min/max funcitons it is probably better to introduce a new enum {smaller, greatorOrEqual} or to return an bool.

The sort_by function is not very to discover. In servo there is even a other sort function https://github.com/servo/rust-selectors/blob/master/src/quicksort.rs .
They are also discussing the api servo/rust-selectors#11 .

I think it would be better to leave the api unstable and clean it with an rfc.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Mar 24, 2015

@alexcrichton You've raised some excellent points -- in particular about scoring not being as usable as one might imagine at first.

I'm going to take this off the milestone; I think the sort_by API, which is stable, is fine. We can work out exactly which API to use for max_by/min_by when we stabilize them, but there's not a great deal of pressure to stabilize them right this second.

triage: P-high ()

@aturon aturon removed this from the 1.0 beta milestone Mar 24, 2015

@gavinb

This comment has been minimized.

Copy link
Contributor

gavinb commented Apr 9, 2015

It would be great if max_by returned an Ordering like sort_by. This would enable comparisons with more than simply values, such as ordering by a value but returning a tuple so that you can get the index of the maximum value.

@alexcrichton alexcrichton referenced this issue Apr 9, 2015

Closed

APIs stabilization metabug #24028

5 of 29 tasks complete

Nemikolh added a commit to oil-lang/oil-rs that referenced this issue Apr 18, 2015

Added temporary wrapper to make f32 verify Ord.
Once max_by will be updated, this wrapper will be removed.
See rust-lang/rust#15311
@yongqli

This comment has been minimized.

Copy link

yongqli commented Jun 26, 2015

I'd like to see this issue resolved as well. It might also make sense to implement max_by for f64, like how there is a f64::max. We've currently had to write our own max_by for f64s.

@kornelski

This comment has been minimized.

Copy link
Contributor

kornelski commented Sep 6, 2015

I wouldn't mind if the function was renamed (max_with, max_using or just inconsistent name are all fine by me), but I like its simplicity. It's a very common pattern in my programs, so I like having a shortcut for this.

However, I wouldn't want to it to require Ordering, or some gymnastics with extra closures, since that would make it verbose to the point it's easier to write a for loop that does the same.

Clash with Ord-less f64 is a problem indeed, but it's a pervasive problem that affects all such functions in Rust, so I think it should be dealt with for all functions (on language level), not just by tweaking this one function.

@photino

This comment has been minimized.

Copy link

photino commented Sep 13, 2015

I would like to see this issue resolved. Usability, clarity, and consistency is more important. It will be great to implement max_by/min_by like sort_by using a comparison function returning an Ordering. Not just f32 and f64, there are many other types that don't have an Ord.

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 14, 2015

I do find scoring useful in practice, personally. I can see how the example "score" of person.name is bad, requiring cloning and all, but it's still useful when you're dealing with something that may be expensive to compute. Suppose I have a list of directories, and I want the one with the biggest total size -- that recursive traversal is potentially expensive, but the resulting score is a value I can just Copy. Yes, the map workaround is feasible, but a scoring max_by or max_with or whatever will still be nicer.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Jan 5, 2016

We decided in #27724 to go with max_by_key for the existing max_by functionality (resp. min_by_key & min_by), with max_by probably eventually taking the -> Ordering comparison function.

@huonw huonw closed this Jan 5, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.