Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should PartialOrd and Ord impls be equivalent? #41270

Closed
frankmcsherry opened this issue Apr 13, 2017 · 21 comments
Closed

Should PartialOrd and Ord impls be equivalent? #41270

frankmcsherry opened this issue Apr 13, 2017 · 21 comments
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. P-medium Medium priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@frankmcsherry
Copy link
Contributor

frankmcsherry commented Apr 13, 2017

Nightly introduces a change that alters the behavior of sorting, by using the PartialOrd methods lt, le and such, rather than Ords cmp method.

This has implications for me, on account of I am a weirdo with types that implement both PartialOrd and Ord, with the property that Ord is a strict refinement of PartialOrd. That is, whenever partial_cmp returns a non-None value then cmp returns the same value, but there are incomparable elements of the partial order that are ordered by Ord. This sounds like a mess, but is important for doing things like efficient deduplication (via sort(), dedup()). For example, I have pairs of integers that are partially ordered (via product order), and I need to be able to deduplicate them.

I think this is a "breaking change" in that code that previously did one thing now does another thing. It may be that it is intended, and that all implementations of PartialOrd not equivalent to their Ord implementation were in error and get no love, but (i) it would be great to have someone actually state that, and (ii) the PartialOrd docs could use some sprucing up (they currently say that PartialOrd is for defining sort orders, which is ...).

I can fix all of my code, by moving all the PartialOrd functionality into the Lattice trait I already have, but I don't see why an Ord implementation should imply a total PartialOrd implementation (you should add a default implementation of cmp that unwraps partial_cmp in that case).

Edit bluss pointed me at #12517 which discusses the naming and stuff, but I couldn't see that the discussion came to a conclusion on whether PartialOrd must be equivalent to Ord when it exists.

@TimNN TimNN added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Apr 13, 2017
@TimNN
Copy link
Contributor

TimNN commented Apr 13, 2017

For the record, the PR that changed this is #39538.

@frankmcsherry
Copy link
Contributor Author

For concreteness, on play.rust-lang.org, the following code succeeds under stable, and panics on nightly. It defines a partially ordered Coord using the product order, and derives the lexicographic Ord total order, which is a refinement of the product order.

use std::cmp::Ordering;

#[derive(Debug, Eq, PartialEq, Ord, Copy, Clone)]
struct Coord {
    x: u32,
    y: u32,
}

impl PartialOrd for Coord {
    fn partial_cmp(&self, other: &Coord) -> Option<Ordering> {
        if self.x == other.x && self.y == other.y { Some(Ordering::Equal) }
        else if self.x <= other.x && self.y <= other.y { Some(Ordering::Less) }
        else if self.x >= other.x && self.y >= other.y { Some(Ordering::Greater) }
        else { None }
    }
}

fn main() {
    let c1 = Coord { x: 1, y: 2 };
    let c2 = Coord { x: 2, y: 1 };

    let mut v1 = vec![c1, c2];
    let mut v2 = vec![c2, c1];

    v1.sort();
    v2.sort();

    assert_eq!(v1, v2);
}

@TimNN
Copy link
Contributor

TimNN commented Apr 13, 2017

Nominating since this can probably be classified as either a breaking change or a regression.

@bluss
Copy link
Member

bluss commented Apr 13, 2017

T: Ord enables the comparison operators on T (while PartialOrd defines the comparison operators) and this design is only reasonable if they describe the same order. In my understanding this is a requirement of Ord and we can observe it being used across std, for example in BinaryHeap, in Iterator::max, in the slice sort methods. It is also used in an assertion in BTreeMap::range. I get the impression that implementations are picking explicit .cmp() or comparison operators depending only on if they need a ternary ordering result or not.

Ord says Trait for types that form a total order. and defines this in terms of the comparison operators. partial_cmp says This method returns an ordering between self and other values if one exists.; and the ordering does exist if the type is Ord.

This relationship should be clearer in the docs.

@frankmcsherry
Copy link
Contributor Author

If this interpretation is correct, this line from the PartialOrd docs is especially weak:

If your type is Ord, you can implement partial_cmp() by using cmp().

For what it is worth, I think it is totally reasonable to allow PartialOrd to be a relaxation of Ord and just ask people who plan on using Ord methods to actually use them rather than the semantically distinct PartialOrd methods. It may not be what you all want, but it is totally reasonable. ;)

Can you say why it is important to unify the two, when both exist? The only reason I can see is that the assumption has existed for a while, and someone would have to audit the code for lt and le mis-use (with some likely perf regressions dropping down to ternary comparisons).

@frankmcsherry
Copy link
Contributor Author

Also, the requirement that PartialOrd be total when an Ord implementation exists seriously weakens the utility of PartialOrd. It seems to mean that non-total partial orders cannot use sorting to do things like deduplication, unless one pops open their guts and uses sort_by(). That seems pretty limiting to me, but I might be non-standard in using partial orders for something other than f32 and f64. ;)

@bluss
Copy link
Member

bluss commented Apr 13, 2017

but it is totally reasonable. ;)

It's also totally reasonable that the assertion in your example program's assertion fails on nightly. PartialOrd's docs say it is “Trait for values that can be compared for a sort-order.” and your vector is being sorted in a way that is consistent with PartialOrd.

It is unreasonable if Ord inherits PartialOrd but does not describe the same order. Using T: Ord gives you access to the < operator, and it would be confusing and wrong if < didn't implement the total order then.

That seems pretty limiting to me, but I might be non-standard in using partial orders for something other than f32 and f64. ;)

Yes, the design doesn't really take anything more than floats into account. I don't think PartialOrd is designed to be useful on its own. It's a crutch for floating point numbers, and a way to separate them out from the regular totally ordered types.

Floats are one example of a type that could implement a total order, different from its partial order, but the trait Ord is not the right trait for that.

@frankmcsherry
Copy link
Contributor Author

frankmcsherry commented Apr 13, 2017

It's also totally reasonable that the assertion in your example program fails on nightly.

Yup, I think both are reasonable. I was reacting to your claim that it is unreasonable any other way. :)

Using T: Ord gives you access to the < operator, and it would be confusing and wrong if < didn't implement the total order then.

I agree that it can be confusing. However, it gives you access to the < operator because Ord requires that there is a partial order, not because Ord provides the operator. To be clear, I'm not trying to insist that you are wrong or anything, just that it is not unreasonable. I also (at the moment) think it is strictly more expressive to decouple the two.

Can I propose an alternate, less confusing design if cmp and partial_cmp should be equivalent (basically what Niko suggested in the discussion linked above): If Ord is imposing laws without changing the interface, it should not require the new method cmp. There is only one semantics that makes sense, which is to unwrap partial_cmp, so the trait can provide that implementation. You could expose it in the trait to allow efficient overrides, but at this point it is on the overriding implementor of Ord to be ensure correctness. In the example above, the bug seems to be that derive(Ord) derives an incorrect implementation (it should derive one equivalent to partial_cmp().unwrap(), and instead derives one based on lexicographic ordering of fields).

@bluss
Copy link
Member

bluss commented Apr 13, 2017

Derives are recipes that are good for common cases but not for every case. It's often wrong to request derive(Ord) when implementing PartialOrd manually — and that could be a warning.

@bluss
Copy link
Member

bluss commented Apr 13, 2017

If we imagine that Ord and PartialOrd implement two different orderings, then the sort function's bound T: Ord has access to both. It should explicitly document which of the two orderings it will use.

@frankmcsherry
Copy link
Contributor Author

I think it should be the Ord implementation always. If they are known to be the same, no worries, but if they are allowed to be different the PartialOrd implementation is not sufficient to guarantee that you can order a collection.

@alexcrichton
Copy link
Member

The libs team discussed this at triage yesterday and we believe that this issue is the same as #40192, which we closed with the intention of "PartialOrd must behave the same as Ord, if implemented". It sounds like though we definitely need to update the docs!

@frankmcsherry out disposition was to stay the current course (while updating docs), but have you found it difficult to update the code you have locally? If that takes awhile we don't mind considering backing out the change temporarily to reevaluate.

One reason we're tempted to continue to require that Ord/PartialOrd are equivalent is along the lines of what @bluss mentioned above, different parts of the standard library will react differently if they behave differently. The intent is that they're used interchangeable on Ord types, where the PartialOrd methods may just be optimizations over Ord::cmp.

@alexcrichton alexcrichton added regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. regression-from-stable-to-beta Performance or correctness regression from stable to beta. and removed I-nominated regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. labels Apr 26, 2017
@frankmcsherry
Copy link
Contributor Author

frankmcsherry commented Apr 26, 2017

I'm planning on removing PartialOrd from my code because there are already methods in std that require Ord but just use the PartialOrd implementations (e.g. cmp::min). I haven't started doing this yet; it will be annoying but at the same time re-assuring given the mystery surrounding the intended semantics of PartialOrd and Ord.

I think the main frustrating thing is going to be auditing code to make sure I don't use le anywhere inappropriate. On the plus side, I can start using it the way you all do, apparently. ;)

Edit: I have no idea if it is a breaking change, but adding a default implementation for Ord::cmp that unwraps partial_cmp would go a long way to clarifying the intent that they should be identical when implemented.

@alexcrichton
Copy link
Member

Ok in that case I believe that the outstanding issue here to be fixed is to clarify the documentation of PartialOrd and Ord especially with respect to the behavior of the two traits if both of them are implemented.

@frankmcsherry if you find it difficult to upgrade though please let us know!

@alexcrichton alexcrichton added the A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools label Apr 26, 2017
@frankmcsherry
Copy link
Contributor Author

I'll let everyone know, not to worry. ;)

@brson brson added E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. P-medium Medium priority labels May 4, 2017
@brson
Copy link
Contributor

brson commented May 4, 2017

Next step here is to update the documentation for both PartialOrd and Ord to indicate that the two must agree on the ordering of any types they are implemented on.

@Rufflewind
Copy link
Contributor

Rufflewind commented May 14, 2017

What about the relationship between Eq and Ord? Is it implied that x.cmp(y) == Equal if and only if x.eq(y)? The fact that Ord has Eq as a supertrait suggests so, but the BinaryHeap example violates it.

Edit: wrong link.

@kennytm
Copy link
Member

kennytm commented May 14, 2017

@Rufflewind Which example? There are a lot of code in that page.

@Rufflewind
Copy link
Contributor

@kennytm My link was incorrect. Here’s the correct one: https://doc.rust-lang.org/std/collections/binary_heap/#examples (“This is a larger example that implements Dijkstra's algorithm to solve the shortest path problem on a directed graph.”)

@ghost
Copy link

ghost commented May 22, 2017

@Rufflewind That's a great find in the BinaryHeap example! :)

Yes, the answer is that Ord, PartialOrd, Eq, and PartialEq must be consistent with each other.
The documentation must clearly indicate that.

More concretely, here's what is I suggest changing:

  1. The docs for Ord, PartialOrd, Eq, and PartialEq all have a section titled "How can I implement X?". Those sections should clearly state that by implementing X care must be taken to make sure that implementations of all four traits are consistent with each other, especially if some of them were derived.
  2. Fix the BinaryHeap example by implementing PartialEq and Eq manually.

Anything else I'm missing here? @brson?

@Rufflewind
Copy link
Contributor

Fix the BinaryHeap example by implementing PartialEq and Eq manually.

I think Eq can still be derived since it has no methods.

Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue May 28, 2017
…ent, r=alexcrichton

Docs: impls of PartialEq/PartialOrd/Ord must agree

Fixes rust-lang#41270.

This PR brings two improvements to the docs:

1. Docs for `PartialEq`, `PartialOrd`, and `Ord` clarify that their implementations must agree.
2. Fixes a subtle bug in the Dijkstra example for `BinaryHeap`, where the impls are inconsistent.
Thanks @Rufflewind for spotting the bug!

r? @alexcrichton
cc @frankmcsherry
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue May 28, 2017
…ent, r=alexcrichton

Docs: impls of PartialEq/PartialOrd/Ord must agree

Fixes rust-lang#41270.

This PR brings two improvements to the docs:

1. Docs for `PartialEq`, `PartialOrd`, and `Ord` clarify that their implementations must agree.
2. Fixes a subtle bug in the Dijkstra example for `BinaryHeap`, where the impls are inconsistent.
Thanks @Rufflewind for spotting the bug!

r? @alexcrichton
cc @frankmcsherry
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. P-medium Medium priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants