Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IComparer and IComparer<T> documentation should note ordering properties required #19571

Closed
dbjorge opened this issue Dec 6, 2016 · 20 comments
Assignees
Labels
area-System.Collections documentation Documentation bug or enhancement, does not impact product or test code
Milestone

Comments

@dbjorge
Copy link

dbjorge commented Dec 6, 2016

The documentation for IComparer (@docs.microsoft.com, @msdn) and IComparer<T> (@docs.microsoft.com, @msdn) do not note whether any particular ordering properties are expected of a valid implementation, but at least List<T>.Sort assumes some (antisymmetry, per this Connect feedback circa .NET 4.5).

The interface documentation should note what sort of ordering properties the rest of the standard library will be assuming/demanding of a Compare implementation. I'd expect this to be the usual properties of a total order.

@CIPop
Copy link
Member

CIPop commented Dec 6, 2016

cc: @rpetrusha, @AArnott

@karelz
Copy link
Member

karelz commented Dec 6, 2016

cc: @mairaw

@JonHanna
Copy link
Contributor

JonHanna commented Dec 6, 2016

Is that a responsibility of IComparer<T> or of the caller of Sort (or OrderBy etc.). If one was going to implement an order that wasn't total (e.g. XmlSchema's rules on comparing datetimes without time zones, would doing so in an IComaprer<T> be disallowed?

@dbjorge
Copy link
Author

dbjorge commented Dec 7, 2016

I would consider it reasonable to either:

  • Explicitly demand a well-defined set of ordering properties as a contract of the IComparer interfaces
  • Explicitly disclaim requiring any particular ordering guarantees as an interface contract, but include an informational warning that IComparer use sites may require certain ordering properties (with examples).

I think either of those would be a big improvement over not documenting it either way. I could be convinced that the former would be too big a breaking change to be acceptable.

In either case, I think it would improve clarity to also document it on the individual IComparer-based framework methods.

@JonHanna
Copy link
Contributor

JonHanna commented Dec 7, 2016

I think the warning is a good idea.
I think the idea of documenting the relevant methods, even more so.

@mairaw
Copy link
Contributor

mairaw commented Dec 7, 2016

I own these docs so I'd be the one making the updates. Just let me know who I should be working with to figure out what's the guidance we want to provide there. /cc @karelz

@karelz
Copy link
Member

karelz commented Dec 7, 2016

@dbjorge @JonHanna can you please suggest which text to change to what? (ideally mention the diff or old & new text) Thanks!

@JonHanna
Copy link
Contributor

JonHanna commented Dec 7, 2016

Could an extra page be added describing total ordering? A concise statement linking to it would probably be clearer and with less text added to the individual pages affected.

@mairaw
Copy link
Contributor

mairaw commented Dec 7, 2016

@JonHanna yes, it's possible to add a separate article explaining the concepts and then link to it from the affected pages.

@JonHanna
Copy link
Contributor

JonHanna commented Dec 10, 2016

Total Order

A total order is a binary relation that has the following properties:

  1. If a <= b && b <= a then a == b
  2. If a <= b && b <= c then a <= c
  3. a <= b || b <= a

In other words:

  1. For any two elements x and y, either we can decide that x should be sorted before y, that y can be sorted before x, or that they are equivalent in sort order.
  2. For any three elements, x, y and z. If we would sort x before y and y before z we would also always sort x before z. The same holds in that if we would consider x to sort equivalently to y and y to sort equivalently to z we would always consider x to sort equivalently to z.

Exceptions to Total Order

It can perhaps be easier to think of total order by considering cases that don’t have total order.

Floating point operators.

Because double.NaN <= double.NaN returns false, double.NaN < 0 returns false and 0 < double.NaN returns false, we cannot use the arithmetic operators to determine a total order on a set of floating point numbers that includes NaN.

XML Schema Datetimes without Time Zones

XML Schema defines some datatypes that are only partially ordered. One example is dateTime which represents a date and time without a timezone (comparable to DateTime in .NET). By the rules of XML Schema, 2001-03-29T12:00:57 is always sorted before 2001-04-23T09:17:39 and 2001-04-23T11:31:09 is always sorted before 2001-04-29T23:09:10 but 2001-04-23T09:17:39 can be considered neither before nor after 2001-04-23T11:31:09.

Not knowing the time zone, it could be e.g. 2001-04-23T09:17:39 in Copenhagen compared to 2001-04-23T11:31:09 in Dublin (earlier), or 2001-04-23T09:17:39 in Montreal compared to 2001-04-23T11:31:09 in Bangalore (later).

To offer a dependable ordering within the international context XML Schema time-related data types consider, one dateTime can only be considered as sorting before another if that holds for any of a possible range of timezones from -14:00 to +14:00 to have full confidence that the ordering would hold.

Importance or Total Order to Sort operations.

Many methods, especially sorting methods, that take an IComparer or IComparer<T> require that the Compare() methods of those objects give a total order. Failure to do so can result in the results not being sorted at all, exceptions, or even infinite loops. Examples include List<T>.Sort(), Array.Sort() and Enumerable.OrderBy().

Other methods may have similar issues, such as those that find minima and maxima, such as Linq's Min() and Max() extension methods.

Providing a Total Order on Partially Ordered values.

If you want to sort values that are not normally totally ordered, you need to define a total ordering for it. For example, as mentioned above double is not totally ordered because NaN cannot be placed into any meaningful position by the <, <=, =, > and >= operators. Comparer<double>.Default, Enumerable.Min() and Enumerable.Max() all solve this problem by adding their own rule that sorts NaN before any other value. This results in the three properties mentioned at the start of this article applying to the ordering.

An approach that works for many cases, is to add a tie-breaker rule onto a partial ordering. This is used here, with most comparisons being the same as with < but the "NaN equal to NaN and before everything else" providing that tie-breaking rule that makes it a total order.

@karelz
Copy link
Member

karelz commented Dec 15, 2016

@ianhays @Priya91 can you please review the text as area experts? (or pull in appropriate folks please)

@ianhays
Copy link
Contributor

ianhays commented Dec 15, 2016

@JonHanna that's a great doc on total order. 👍very concise and informative. The DateTime section is a bit verbose but explains the reasoning behind the exception to total order well.

@JonHanna
Copy link
Contributor

I split up the sentences and hopefully clarified a bit (sometimes attempts at concision end up reading more verbose than otherwise) and added a sentence at the end about using tie-breaking rules in creating total orders out of parial.

@JonHanna
Copy link
Contributor

Also fixed the copy-pasting in the example dates, as I had Montreal and Bangalore in time zones over a week apart 😁

@svick
Copy link
Contributor

svick commented Dec 16, 2016

@JonHanna

EqualityComparer<double>.Default, Enumerable.Min() and Enumerable.Max() all solve this problem by adding their own rule that sorts NaN before any other value.

Did you mean Comparer<double>.Default? EqualityComparer does not define ordering.

@JonHanna
Copy link
Contributor

@svick I did indeed. Fixed.

@ianhays
Copy link
Contributor

ianhays commented Feb 13, 2017

@mairaw did you get a chance to take a look at this?

@mairaw
Copy link
Contributor

mairaw commented Feb 13, 2017

Not yet @ianhays. I'm completely swamped at the moment with the .NET Core release.

@ianhays
Copy link
Contributor

ianhays commented Feb 13, 2017

Gotcha, thanks for the update :)

@mairaw
Copy link
Contributor

mairaw commented Apr 17, 2017

Issue moved to dotnet/docs dotnet/corefx#1939 via ZenHub

@mairaw mairaw closed this as completed Apr 17, 2017
@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 2.0.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Collections documentation Documentation bug or enhancement, does not impact product or test code
Projects
None yet
Development

No branches or pull requests

8 participants