Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List(1.0, -1.0).sorted gives a deprecation warning #11844

Open
eed3si9n opened this issue Jan 2, 2020 · 54 comments · May be fixed by scala/scala#8625
Open

List(1.0, -1.0).sorted gives a deprecation warning #11844

eed3si9n opened this issue Jan 2, 2020 · 54 comments · May be fixed by scala/scala#8625
Labels

Comments

@eed3si9n
Copy link
Member

@eed3si9n eed3si9n commented Jan 2, 2020

Originally posted by @odersky in #10511 (comment)

Ref scala/scala#6323
Ref scala/scala#6410

steps

scala> List(1.0, -1.0).sorted
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res0: List[Double] = List(-1.0, 1.0)

problem

It gives a deprecation warning. I think this is very bad, not to say catastrophic.

expectation

These things should just work. I don't care what ordering is used and whether it is inconsistent or not (we all know IEEE is broken and unfixable). But we should do the right thing for everyday users. As an everyday user I do not care what the ordering does for NaN and -0.0, maybe I do not even know that these weird numbers exist! (and they should not exist, if I go by common mathematical logic). But I do care that I can sort a list of doubles without going into the deeper mysteries of Scala's implicit system. This requirement is now violated.

notes

I think we should revert ASAP. We should keep the old total ordering and offer IEEE as an optional alternative that needs to be imported explicitly.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@NthPortal wrote:

I think we should revert ASAP. We should keep the old total ordering and offer IEEE as an optional alternative that needs to be imported explicitly.

The issue is, the old behaviour was the IEEE ordering, so if we switch to a total ordering by default now, there's no transition period. Are we okay with that? (see also scala/scala#6410 (comment))

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@odersky wrote:

I believe we should simply keep the old behavior. My opinion about NaN and -0.0 is:

  • we are under no obligation to ensure that primitive types and boxed types behave the same
  • we are under no obligation to make the ordering total or not.
  • we should do the same thing that Java does
  • we should not make operations that do not involve NaN and -0.0 harder than before.

My rationale is that these constructs are illogical anyway. No matter what you do, there will be weirdness. So, the best thing to do is not upset the apple cart.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@NthPortal wrote:

FWIW, Java uses a total ordering on primitives:

scala> Array[Double](5.0, 3.0, NaN, 4.0)
res0: Array[Double] = Array(5.0, 3.0, NaN, 4.0)

scala> java.util.Arrays.sort(res0)

scala> res0
res3: Array[Double] = Array(3.0, 4.0, 5.0, NaN)
@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@NthPortal wrote:

Personally, I think we have an obligation to have Orderings be internally consistent (compare(_, _) < 0 iff lt(_, _) == true, etc.), which happens to mean having a total ordering.

Another possibility of course is to have the default ordering just throw an exception if you give it NaN ;P

If we must un-deprecate it, I feel the default should be the total ordering. I don't care particularly about how 0.0/-0.0 compare, as long as it's consistent, but I care that operations involving NaN (which is concerningly easy to summon) behave sanely and consistently. It happens to be that the current (deprecated) default is a total ordering, so we can just remove the annotation and fix up a few tests.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@odersky wrote:

Java uses total ordering right? So why and when did Scala deviate from this to use IEEE ordering?

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@NthPortal wrote:

@odersky it changed in #5104 and scala/scala#76

In essence the issue is that there is an expectation (a valid one) that min and max should comply with the IEEE specification (i.e. with java.lang.Math.{min,max}). Unfortunately, that is not consistent with a total ordering.

I highly recommend reading all of the discussions on scala/scala#6323 and scala/scala#6410 (even though they're long) for the full context of how we decided on the solution we chose.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@dwijnand wrote:

It's a shame that Java is inconsistent between java.util.Arrays.sort and java.lang.Math.{min,max}.

Can we capture this explanation and put it in on a page of docs.scala-lang.org and refer to it in the Scaladoc? Or put the whole explanation in the Scaladoc. Note @som-snytt had some thoughts around those docs in scala/scala#8191 (comment) too.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@odersky wrote:

I have already expressed my opinion. List(1.0, -1.0).sorted just has to work, no excuses allowed.

The fact that it stopped working, despite the fact that a lot of eyes looked at it, is worrying to me. Did anyone think it was acceptable that this would cause an implicit ambiguity message? Or did it just slip through with nobody realizing the consequences?

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@odersky wrote:

Can we capture this explanation and put it in on a page of docs.scala-lang.org and refer to it in the Scaladoc? Or put the whole explanation in the Scaladoc. Note @som-snytt had some thoughts around those docs in scala/scala#8191 (comment) too.

I would be skeptical about this. The fact is: NaN is irredeemably broken. It was introduced as a hack since computer manufacturers in the 70s did not think they could provide precise exceptions and Fortran did not provide exception handling. But it makes no sense, mathematically. So yes, of course things will end up being inconsistent somewhere, and the only sensible strategy is to accept that and ignore the problem (which is what Java did, and I think they did the right thing).

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@dwijnand wrote:

Did anyone think it was acceptable that this would cause an implicit ambiguity message? Or did it just slip through with nobody realizing the consequences?

Speaking only for myself, I thought it was acceptable, given the circumstances.

Can we capture this explanation and put it in on a page of docs.scala-lang.org and refer to it in the Scaladoc? Or put the whole explanation in the Scaladoc. Note @som-snytt had some thoughts around those docs in scala/scala#8191 (comment) too.

I would be skeptical about this. The fact is: NaN is irredeemably broken.

I'll give others a chance to weigh in (particularly Nth, as I expect she's spent a long time looking at all the angles), but I still think that whatever happens (change or no change) it would be good to have some kind of documentation capturing the trade-off chosen.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@odersky wrote:

I agree it's good to document, but any such documentation cannot be a reason to keep the current broken behavior.

For background: I came across this because a professor of mathematics was stumped by the behavior and did not know how to proceed.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

There are 3 pull requests for this by @NthPortal:

@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 2, 2020

The biggest problem here, by far, is that we're abusing deprecations to provide an informational message. If the thing said, always, and only for min/max/gt etc., not compare or sorted

Note: this code assumes that NaN is handled under a "total ordering", where NaN
is considered to be larger than positive infinity in both sorting and min/max.
  If you wish to keep using a total ordering, import scala.math.Ordering.Double.TotalOrdering
  If you prefer to use IEEE754 ordering, import scala.math.Ordering.Double.IeeeOrdering

and on web interfaces, this was reduced to a popup, then there's no problem. But right now, we get a weird cryptic message about deprecations, without showing the deprecation, on an utterly normal-looking piece of code.

On the other hand, which behavior is followed can be critical for the correct operation of numeric code, yet difficult to debug when behavior changes, so I don't think we can just go changing it without giving users some indication that they need to examine the code.


The right solution is to introduce another trait, Extremal, that is used for min and max. Then sorted can use a total ordering, but min and max use the IEEE754 behavior.

This isn't binary compatible, and even then it's inconsistent with 2.13 behavior, but it matches 2.12 behavior and follows the "principle of least surprise" (i.e. things work the same way).


A pragmatic solution might be just to make DeprecatedDoubleOrdering un-deprecated, and restore the internal inconsistency in it where min and max would give the IEEE754 behavior, which isn't consistent with the answer you'd get from compare. Then things would just work, and the danger would be averted, and the afficionados who wanted an internally consistent ordering could go get the one they wanted.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

So here are some proposed criteria of success:

  • don't use deprecation on List(1.0, -1.0).sorted or .max
  • we should do the same thing that Java does
  • principle of least surprise (don't silently break simple looking code from Scala 2.12?)
  • bincompat with Scala 2.13.0
  • internal consistency does not matter

Is there a solution that fit this bill?

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Jan 2, 2020

  1. Un-deprecate DeprecatedOrdering
  2. Change the implementation of x min y and x max y so that the special-case {Float,Double}.DeprecatedOrdering, inn order to apply IEEE behavior.

Disgusting, but it fits the bill except for the fact that sorting behavior is still not the same as 2.12 (but I believe we all agree that the behavior of sorting in 2.12 was utterly broken, and that the new behavior is better).

Better the solution in a future where we can break bin compat: rewrite x min y so that they don't use 5 layers of abstraction that are slow and don't even work, along with all the other operations in RichXyz (I have a branch with that already).

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

summary

  • java.util.Collections.max and java.lang.Math.max do different things

    • Both java.util.Collections.min / max and Scala 2.13.1 select min and max values to be -0.0 and NaN respectively.
    • Scala 2.12.10's Array#min and Array#max values are 0.0 and 3.0 respectively, which seems like an implementation bug.
  • sorting looks ok!

    • java.util.Arrays, java.util.Collections, Scala 2.12.10, and Scala 2.13.1 with or without IEEE ordering all sorts Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0) to be Array(-0.0, 0.0, 1.0, 2.0, 3.0, NaN). << @NthPortal is there a corner case that sorts differently?

Edit:

  • generic min and max operators did in fact change
    • java.lang.Math.min / max uses NaN for both. 2.0 max NaN and 2.0 min NaN both return NaN as well for both Scala 2.12.10 and 2.13.10. However, Scala 2.12.10's typeclassMin(2.0, NaN) returns NaN whereas Scala 2.13.1 returns 2.0.

java.util.Arrays sorting primitive double

scala> import java.lang.Double.NaN
import java.lang.Double.NaN

scala> val xs = Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0); java.util.Arrays.sort(xs); xs
xs: Array[Double] = Array(-0.0, 0.0, 1.0, 2.0, 3.0, NaN)
res0: Array[Double] = Array(-0.0, 0.0, 1.0, 2.0, 3.0, NaN)

java.util.Collections sorting boxed java.lang.Double

scala> val ys = java.util.Arrays.asList(2.0: java.lang.Double, 1.0: java.lang.Double, java.lang.Double.NaN: java.lang.Double, 3.0: java.lang.Double, 0.0: java.lang.Double, -0.0: java.lang.Double); java.util.Collections.sort(ys); ys
ys: java.util.List[Double] = [-0.0, 0.0, 1.0, 2.0, 3.0, NaN]
res4: java.util.List[Double] = [-0.0, 0.0, 1.0, 2.0, 3.0, NaN]

scala> java.util.Collections.max(ys)
res5: Double = NaN

scala> java.util.Collections.min(ys)
res6: Double = -0.0

java.lang.Math

scala> java.lang.Math.max(2.0, NaN)
res8: Double = NaN

scala> java.lang.Math.min(2.0, NaN)
res9: Double = NaN

Scala 2.12.10

[info] Starting scala interpreter...
Welcome to Scala 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_222).
Type in expressions for evaluation. Or try :help.

scala> import java.lang.Double.NaN
import java.lang.Double.NaN

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).sorted
res0: Array[Double] = Array(-0.0, 0.0, 1.0, 2.0, 3.0, NaN)

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).max
res1: Double = 3.0

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).min
res2: Double = 0.0

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).sorted
res3: List[Double] = List(-0.0, 0.0, 1.0, 2.0, 3.0, NaN)

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).max
res4: Double = 3.0

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).min
res5: Double = 0.0

scala> 2.0 max NaN
res6: Double = NaN

scala> 2.0 min NaN
res7: Double = NaN

scala> :paste
// Entering paste mode (ctrl-D to finish)

def typeclassMax[A: Ordering](a: A, aa: A): A = {
  val ord = implicitly[Ordering[A]]
  import ord._
  a max aa
}
def typeclassMin[A: Ordering](a: A, aa: A): A = {
  val ord = implicitly[Ordering[A]]
  import ord._
  a min aa
}

// Exiting paste mode, now interpreting.

typeclassMax: [A](a: A, aa: A)(implicit evidence$1: Ordering[A])A
typeclassMin: [A](a: A, aa: A)(implicit evidence$2: Ordering[A])A

scala> typeclassMax(2.0, NaN)
res0: Double = NaN

scala> typeclassMin(2.0, NaN)
res1: Double = NaN

Scala 2.13.1

Welcome to Scala 2.13.1 (OpenJDK 64-Bit Server VM, Java 1.8.0_222).
Type in expressions for evaluation. Or try :help.

scala> import java.lang.Double.NaN
import java.lang.Double.NaN

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).sorted
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res0: Array[Double] = Array(-0.0, 0.0, 1.0, 2.0, 3.0, NaN)

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).max
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res1: Double = NaN

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).min
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res2: Double = -0.0

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).sorted
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res3: List[Double] = List(-0.0, 0.0, 1.0, 2.0, 3.0, NaN)

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).max
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res4: Double = NaN

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).min
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res5: Double = -0.0

scala> 2.0 max NaN
res6: Double = NaN

scala> 2.0 min NaN
res7: Double = NaN

scala> :paste
// Entering paste mode (ctrl-D to finish)

def typeclassMax[A: Ordering](a: A, aa: A): A = {
  val ord = implicitly[Ordering[A]]
  import ord._
  a max aa
}
def typeclassMin[A: Ordering](a: A, aa: A): A = {
  val ord = implicitly[Ordering[A]]
  import ord._
  a min aa
}

// Exiting paste mode, now interpreting.

typeclassMax: [A](a: A, aa: A)(implicit evidence$1: Ordering[A])A
typeclassMin: [A](a: A, aa: A)(implicit evidence$2: Ordering[A])A

scala> typeclassMax(2.0, NaN)
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res8: Double = NaN

scala> typeclassMin(2.0, NaN)
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res9: Double = 2.0

Scala 2.13.1 + IEEE ordering

Welcome to Scala 2.13.1 (OpenJDK 64-Bit Server VM, Java 1.8.0_222).
Type in expressions for evaluation. Or try :help.

scala> import Ordering.Double.IeeeOrdering
import Ordering.Double.IeeeOrdering

scala> import java.lang.Double.NaN
import java.lang.Double.NaN

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).sorted
res0: Array[Double] = Array(-0.0, 0.0, 1.0, 2.0, 3.0, NaN)

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).max
res1: Double = NaN

scala> Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).min
res2: Double = NaN

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).sorted
res3: List[Double] = List(-0.0, 0.0, 1.0, 2.0, 3.0, NaN)

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).max
res4: Double = NaN

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).min
res5: Double = NaN

scala> 2.0 max NaN
res6: Double = NaN

scala> 2.0 min NaN
res7: Double = NaN

scala> :paste
// Entering paste mode (ctrl-D to finish)

def typeclassMax[A: Ordering](a: A, aa: A): A = {
  val ord = implicitly[Ordering[A]]
  import ord._
  a max aa
}
def typeclassMin[A: Ordering](a: A, aa: A): A = {
  val ord = implicitly[Ordering[A]]
  import ord._
  a min aa
}

// Exiting paste mode, now interpreting.

typeclassMax: [A](a: A, aa: A)(implicit evidence$1: Ordering[A])A
typeclassMin: [A](a: A, aa: A)(implicit evidence$2: Ordering[A])A

scala> typeclassMax(2.0, NaN)
res8: Double = NaN

scala> typeclassMin(2.0, NaN)
res9: Double = NaN
@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@sjrd

  1. Un-deprecate DeprecatedOrdering

I agree that just undeprecating DeprecatedOrdering seems like the way to go.

  1. Change the implementation of x min y and x max y so that the special-case {Float,Double}.DeprecatedOrdering, in order to apply IEEE behavior.

I don't know if that's necessary since RichDouble's def max delegates to scala.math.max, which delegates to java.lang.Math.max.

@Ichoran gave me an example code.

scala> def typeclassMin[A: Ordering](a: A, aa: A): A = {
     |   val ord = implicitly[Ordering[A]]
     |   import ord._
     |   a min aa
     | }
typeclassMin: [A](a: A, aa: A)(implicit evidence$1: Ordering[A])A

scala> typeclassMin(2.0, NaN)
warning: there was one deprecation warning (since 2.13.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
res1: Double = 2.0

Better the solution in a future where we can break bin compat: rewrite x min y so that they don't use 5 layers of abstraction that are slow and don't even work, along with all the other operations in RichXyz (I have a branch with that already).

I think the surprising thing about this is that x max y (IEEE ordering) does not match the result of List(x, y).max, which is consistent with Java and Scala 2.12 but surprising nonetheless. When we can break the bincompat in the future, maybe we should deprecate def max from RichDouble in favor of either calling math.max or Ordering[Double].max. The "math" part hopefully conveys that it's IEEE compliant.

@martijnhoekstra

This comment has been minimized.

Copy link

@martijnhoekstra martijnhoekstra commented Jan 2, 2020

The "math" part hopefully conveys that it's IEEE compliant.

I wouldn't count on it.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 2, 2020

@martijnhoekstra You're right. I guess then rename to def ieeeMax?

@martijnhoekstra

This comment has been minimized.

Copy link

@martijnhoekstra martijnhoekstra commented Jan 2, 2020

@martijnhoekstra You're right. I guess then rename to def ieeeMax?

Personally, I would see more in an ieee.max, but it's somewhat academic when the first opportunity for bincompat breakage is 3.2

@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 2, 2020

Like it or not, the IEEE754 implementation of floating point numbers is ubiquitous. I do not think we can simply ignore it as inconvenient. Weighing down "max" with a bunch of extra letters basically just means that people won't use it and/or can't read math with it, at which point you'd be doing everyone a favor by just removing it entirely.

Every operation a □ b from ℝ² → ℝ has the property in IEEE754 mathematics that it is equivalent to

val oa = if (a.isNaN) None else Some(a)
val ob = if (b.isNaN) None else Some(b)
val oc = for { ra <- oa; rb <- ob } yield ra □ rb
oc.getOrElse(NaN)

IEEE754 doesn't define max explicitly, but given that everything else works this way, it's highly incongruent to have 2.0 min NaN return 2.0; unlike with all other binary operations, it silently fails to propagate error conditions.

IEEE754 does define a total ordering, and Java botches it, as do we. All the NaNs, of all types, go to the end, in random order. However, IEEE754 defines a total ordering where the negative NaNs come first, then the positive ones come after. Within the NaNs, quiet NaN is more extreme than the signaling NaNs. (It also makes -0.0 come before 0.0.) This is, unfortunately, really inconvenient because you now have to consider two different places where NaNs might appear in a sorted list. (This maintains the invariant that xs.map(x => -x).sorted is the same as xs.map(x => -x).reverse.)

People like to complain about IEEE754, but if you actually think carefully through all the issues, keeping in mind the hardware and data type constraints, and that speed and correctness are both very important, it's actually really hard to do any better. The standard isn't there by accident. It is hard to use correctly. This is an intrinsic awkwardness with the reals, compounded by the very necessary constraint to use finite precision.

So I think the correct thing to do is adhere to the standard whenever possible. If Ordering is supposed to be total and internally consistent, then Float and Double can't have one. (This is what Rust does; it only has a PartialOrd<f64>, not an Ordering<f64>.) If we admit partial orderings into Ordering, then the IEEE754 rules are fine as is; or we can allow inconsistency within different methods of Ordering.

@NthPortal

This comment has been minimized.

Copy link

@NthPortal NthPortal commented Jan 3, 2020

java.util.Arrays, java.util.Collections, Scala 2.12.10, and Scala 2.13.1 with or without IEEE ordering all sorts Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0) to be Array(-0.0, 0.0, 1.0, 2.0, 3.0, NaN). << @NthPortal is there a corner case that sorts differently?

not that I'm aware of. I believe all of Scala's sorts end up deferring to j.u.Arrays.sort somewhere, and that (since it uses a Comparator) ends up calling Ordering#compare, which is not broken anywhere (because it requires you to return some Int).

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 3, 2020

@Ichoran

Every operation a □ b from ℝ² → ℝ has the property in IEEE754 mathematics that it is equivalent to

val oa = if (a.isNaN) None else Some(a)
val ob = if (b.isNaN) None else Some(b)
val oc = for { ra <- oa; rb <- ob } yield ra □ rb
oc.getOrElse(NaN)

Thanks for this. If the notion of Double is like Option[DecimalNumber] as you portray, then I don't even know if it's appropriate to use Ordering for both the concept of sorting a list and a min x. In that world, shouldn't Array(2.0, 1.0, NaN, 3.0, 0.0, -0.0).sorted also return Array(NaN) so that its max and min value would match 2.0 max NaN?

Java did the right thing by separating the responsibilities to java.util.Collections and java.lang.Math with this regard, whereas in Scala it's not clear what we mean when we say min or max.

  • a min b should use IEEE ordering, like java.lang.Math, regardless of the Ordering[Double] that's in scope.
  • List#min should return List#sorted.head, like java.util.Collections.
  • Given the different semantics, RichDouble#min and List#min should have different method names, and provide migration path.
@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 3, 2020

@eed3si9n - That's a reasonable strategy that doesn't compromise anything too much.

Given the heavy numeric use of min and max, and the importance of naming consistency, min and max should require a Numeric while some other name (maybe minimum and maximum) can be used to pick out the first and last things in sort-order.

But how do we get there, and what do we do in the short run to make List(1.0, -1.0).sorted not ominously announce a deprecation without actually supplying any actionable information? (And avoid silently changing the meaning of numeric code written for 2.12 and before?)

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 3, 2020

But how do we get there, and what do we do in the short run to make List(1.0, -1.0).sorted not ominously announce a deprecation without actually supplying any actionable information? (And avoid silently changing the meaning of numeric code written for 2.12 and before?)

1. Undeprecate `DeprecatedOrdering` 2. Change the implementation of `x min y` and `x max y` such that in all instances they use IEEE ordering, like `java.lang.Math` (This will match the 2.12 semantics) 3. Add `x minimum y` and `x maximum y` operators that would use the same ordering as sorting order `compare(x, y)` 4. Add `List#minimum` and `List#maximum` that returns the `List#sorted.head` and `List#sorted.last` 5. Deprecate `List#max` and `List#min` after pointing them to `List#maximum` and `List#minimum` (We get to deprecate the methods whose implementation must change from 2.12)
@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 3, 2020

@eed3si9n - That sounds like a good plan for the long run. But several of those changes are not binary compatible, and I don't think we want to be locked into the current situation until 3.1 or whatever. Do we just break bincompat in the interests of getting this right?

Also, do we have < agree with compare or with IEEE754 for the default ordering? (And do we continue to supply both?)

We should also consider switching List#max and List#min to take a Numeric rather than an Ordering. The most common uses should go through just fine (not binary compatible, but it is source compatible); the weird ones can be fixed by hand.

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Jan 3, 2020

I have already proposed a solution at #11844 (comment). Is there any objection to that one, at least for what we can do in a bin compat way?

@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 3, 2020

@sjrd - I don't understand the difference between that one and just using IeeeOrdering

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 3, 2020

@sjrd I think your proposed solution is good. My 2. takes it further so even with TotalOrdering x min y would use IEEE behavior.

@Ichoran

I don't understand the difference between that one and just using IeeeOrdering

By continuing to use DeprecatedOrdering, List#min would return sorted.head:

scala> List(2.0, 1.0, NaN, 3.0, 0.0, -0.0).min
res5: Double = -0.0
@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 3, 2020

@eed3si9n - That's dependent on the implementation of min, but it's true, that's potentially different.

Why is that preferable to just using IeeeOrdering?

At least IeeeOrdering gives the same answer for < regardless of whether it's a generic context.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 4, 2020

@Ichoran

Why is that preferable to just using IeeeOrdering?

  1. In my opinion, IeeeOrdering does not qualify as an Ordering[Double] since it does not implement sorting order.
  2. In my Math-Collection power-sharing world view, < sits in the NaN-propagating IEEE world whereas I expect List#min (renamed to List#minimum) to return the sorted.head.

Also, do we have < agree with compare or with IEEE754 for the default ordering? (And do we continue to supply both?)

We should treat < as a mathematical operator. NaN < 2.0 is obligated to return false by IEEE and tradition.

@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 4, 2020

@eed3si9n - IeeeOrdering implements sorting order with compare, just not with other methods.

I agree that it doesn't maintain the kinds of invariants you'd generally want an ordering to obey. But transiently it retains the 2.12 behavior; having an un-alerted change to 2.13 code is, I think, less scary than changing all of 2.12 and before w.r.t. anything that Ordering was used for.

So for a temporary measure, I'm not sure it's the way to go. At least I'm not able to articulate what principle we're following in order to make it the best choice, given that we're already committed to no deprecation messages, and are trying not to silently change the behavior of older code.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 5, 2020

@dwijnand

I still think that whatever happens (change or no change) it would be good to have some kind of documentation capturing the trade-off chosen.

TotalOrdering IeeeOrdering
Sort order (aka compare) Delegates to java.lang.Double.compare. this is consistent with java.util.Arrays.sort. "IEEE ordering" using Double#< makes it impossible to implement usable compare, so it delegates to java.lang.Double.compare.
Internal consistency All methods (lt, equiv, etc) are internally consistent with compare. lt, equiv etc delegate to primitive comparisons (Double#<, Double#==, etc). This makes them internally inconsistent with compare when NaN is involved, meaning you'd get different results if you called lt or compare to order things.
Predef consisntency lt is inconsistent with Double#< when NaN is involved.

equiv delegates to compare, so equiv(-0.0, 0.0) returns false, which is different from -0.0 == 0.0; and equiv(NaN, NaN) returns true, which is different from NaN == NaN.

lt delegates to Double#<, which is consistent with IEEE ordering. equiv delegates to ==, so equiv(-0.0, 0.0) returns true and equiv(NaN, NaN) returns false.

compare is inconsistent with Double#< and Double#==.

OrderingOps#< vs Double#< OrderingOps#< delegates to lt so it's inconsistent with Double#< when NaN is involved. OrderingOps#< is consistent with Double#<.
List#min List#min would be the first item on sorted, which is consistent with java.util.Collections.min. List#min would return NaN if the list contains NaN, which is consistent with IEEE ordering and java.lang.Math.min.
List#max List#max would be the last item on sorted, which is consistent with java.util.Collections.max. List#max would return NaN if the list contains NaN, which is consistent with IEEE ordering and java.lang.Math.max.
Compatibility with Scala 2.12.x The behavior of lt has changed when NaN is involved. This would affect OrderingOps#< and OrderingOps#min too. The behavior is same as Scala 2.12.x (except List#min and List#max that were buggy).
@odersky

This comment has been minimized.

Copy link

@odersky odersky commented Jan 5, 2020

@NthPortal

This comment has been minimized.

Copy link

@NthPortal NthPortal commented Jan 5, 2020

a few comments on this table (specifically the rows):

  • OrderingOps#< should always forward to Ordering#lt, so there's no reason to include it separately in the table
  • "lt vs compare" and "lt vs Double#<" should be combined, as the their behaviors are consistent (if lt behaves like compare, it doesn't behave like Double#<, and vice versa)
  • "equiv vs ==" should align with "lt vs Double#<", so I would again group them (and all the other comparison ops)
  • even though max behaves the same for both of them, it's a bit confusing not to include it

The best solution (for the behavior of DeprecatedDoubleOrdering, no longer deprecated) as I see it now (as suggested by @sjrd, and indicated by s):

Operation TotalOrdering IeeeOrdering
sort order (aka compare) Delegates to java.lang.Double.compare. Mostly delegates to java.lang.Double.compare, except that 0.0 and -0.0 are considered equivalent [still pending].
lt, gt, equiv, gteq, lteq Always consistent with compare. Delegate to primitive comparisons Double#<, Double#==, etc.). This makes it inconsistent with compare when NaN is involved, meaning you'd get different results depending if you called lt or compare to order things.
min Equivalent to the first element when sorted; consistent with java.util.Collections.min. Returns NaN if either argument is NaN; consistent with IEEE ordering and java.lang.Math.min.
compatibility with Scala 2.12.x The behavior of lt, gt, equiv, lteq, gteq and min have changed when NaN is involved. This affects OrderingOps#< and OrderingOps#min as well. The behavior is same as in Scala 2.12.x.
@NthPortal

This comment has been minimized.

Copy link

@NthPortal NthPortal commented Jan 5, 2020

@Ichoran

@‌sjrd I don't understand the difference between that one and just using IeeeOrdering

with Sébastien's solution, lt, equiv, etc. remain consistent with compare, which means you don't need to worry about whether you used "the right method" when sorting

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 5, 2020

@NthPortal Thanks for the feedback.

  • OrderingOps#< should always forward to Ordering#lt, so there's no reason to include it separately in the table

The implication towards generically introduced infix operators OrderingOps#< was not immediately obvious to me when I started this journey, so I thought it's helpful to call this out. First, it might highlight the difficulty on evaluating whether some Scala 2.12 code is affected just by looking at it. Part of my brain says "yea < should match Ord obv" and the other part also says "in all languages < is NaN-aware."

  • "lt vs compare" and "lt vs Double#<" should be combined, as the their behaviors are consistent (if lt behaves like compare, it doesn't behave like Double#<, and vice versa)

"lt vs compare" is about internal consistency as an Ordering instance, whereas "lt vs Double#<" is about consistency with the Predef world. Maybe I can rename to these to clarify my point.

  • "equiv vs ==" should align with "lt vs Double#<", so I would again group them (and all the other comparison ops)

Yea. This could be gouped, but I also wanted to point that out as a potential value judgement points.

  • even though max behaves the same for both of them, it's a bit confusing not to include it

I will include max.

@dwijnand

This comment has been minimized.

Copy link
Member

@dwijnand dwijnand commented Jan 6, 2020

@eed3si9n In "Predef consistency" (which I wonder whether it should be "inherent methods consistency": consistency with the inherent methods of Double) it's good to mention equiv(-0.0, 0.0), but could you still mention equiv(NaN, NaN)?

scala> TotalOrdering.equiv(Double.NaN, Double.NaN)
res16: Boolean = true

scala> IeeeOrdering.equiv(Double.NaN, Double.NaN)
res17: Boolean = false

scala> Double.NaN == Double.NaN
res18: Boolean = false
@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 6, 2020

In this thread I've been taking similar position as @sjrd's #11844 (comment) (Undeprecate DeprecatedOrdering, tweak infix-OrderingOps#min?). I've also tried implementing it locally:

  implicit object DeprecatedDoubleOrdering extends Double.TotalOrdering {
    class IeeeOrderingOps(lhs: Double) extends OrderingOps(lhs) {
      // uncomment to extend the tweak to all ops
      // override def <(rhs: Double): Boolean = Double.IeeeOrdering.lt(lhs, rhs)
      // override def <=(rhs: Double): Boolean = Double.IeeeOrdering.lteq(lhs, rhs)
      // override def >(rhs: Double): Boolean = Double.IeeeOrdering.gt(lhs, rhs)
      // override def >=(rhs: Double): Boolean = Double.IeeeOrdering.gteq(lhs, rhs)
      // override def equiv(rhs: Double): Boolean = Double.IeeeOrdering.equiv(lhs, rhs)
      override def max(rhs: Double): Double = Double.IeeeOrdering.max(lhs, rhs)
      override def min(rhs: Double): Double = Double.IeeeOrdering.min(lhs, rhs)
    }
    override implicit def mkOrderingOps(lhs: Double): OrderingOps = new IeeeOrderingOps(lhs)
  }

and I don't think this would work. The purpose of this tweak supposedly is to fix the compatibility with Scala 2.12.x, but it just makes the situation more murky by making the a max b not correspond with ord.max(a, b). Someone could have just called implicitly[Ordering[Double]].max(x, NaN). The goes for lt, equiv etc.

I think @Ichoran's idea of going with IeeeOrdering is the safe answer considering both the Predef consistency and 2.12.x compatibility.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 6, 2020

@dwijnand Done.

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Jan 6, 2020

I find it worrying that we would intentionally settle for IeeeOrdering as the default Ordering[Double], given that IeeeOrdering is not even a valid Ordering[Double]. It is not internally consistent, never mind the consistency with other aspects of the API.

@NthPortal

This comment has been minimized.

Copy link

@NthPortal NthPortal commented Jan 6, 2020

@eed3si9n we don't need to tweak the Ops; just override Ordering#min and Ordering#max. at least that was my understanding of the proposal

@NthPortal

This comment has been minimized.

Copy link

@NthPortal NthPortal commented Jan 6, 2020

fun fact: this issue also applies to implicit Equiv instances

@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 6, 2020

@sjrd - Having an ordering that is not even an ordering is pretty bad, I agree.

However, the proposed alternative is silently breaking existing numeric code without any warning and being inconsistent when moving from a concrete to generic context. Can you explain why this is better?

@NthPortal - What about the issue of silently changing the function of existing code (for lt etc., if not max)?

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 6, 2020

@Jasper-M

Couldn't the IeeeOrdering cause problems for things like CollisionProofHashMap?

The Scaladoc of CollisionProofHashMap says

Equality as determined by the Ordering has to be consistent with equals and hashCode.

and inside of the code it seems to call compare:

  @`inline` private def compare[K, V](key: K, hash: Int, node: LLNode[K, V])(implicit ord: Ordering[K]): Int = {
    val i = hash - node.hash
    if(i != 0) i else ord.compare(key, node.key)
  }

  @`inline` private def compare[K, V](key: K, hash: Int, node: RBNode[K, V])(implicit ord: Ordering[K]): Int = {
    /*val i = hash - node.hash
    if(i != 0) i else*/ ord.compare(key, node.key)
  }

Both TotalOrdering and IeeeOrdering use java.lang.Double.compare, which is totally ordered but also inconsistent with Double#==. So they might both be broken in the identical way.

@NthPortal

This comment has been minimized.

Copy link

@NthPortal NthPortal commented Jan 14, 2020

How do people feel about having the default Ordering throw an exception for NaN, and otherwise be the same as in 2.12? that way, we're not silently breaking things, but we're also not allowing major inconsistency by default?

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 14, 2020

@NthPortal Would you throw exception on compare, lt or both?

Based on the proposed criteria of success, which I summarized in #11844 (comment), the internal consistency around NaN treatment does not matter, and we should try to do what Java does for NaN so throwing an exception would be doubly bad.

@NthPortal

This comment has been minimized.

Copy link

@NthPortal NthPortal commented Jan 14, 2020

@eed3si9n I guess I'll have to disagree with your criteria then.

  • don't use deprecation on List(1.0, -1.0).sorted or .max
  • we should do the same thing that Java does
  • principle of least surprise (don't silently break simple looking code from Scala 2.12?)
  • bincompat with Scala 2.13.0

2.12 does not do the same thing Java does. Thus, in order to do what Java does, we need to break code from 2.12 loudly. Our options for that are:

  • It doesn't compile
    • ambiguous implicits [ruled out by point 4 (bincompat)]
    • implicit not found (via making it not implicit)
      • I think this would be binary compatible
      • If we do this, it now fails to compile with absolutely no helpful message. I would consider that significantly worse than the current behaviour.
  • It warns
    • The only way I can think to do this is a deprecation warning, which is the current behaviour [ruled out by point 1 (no deprecation)]
  • It throws an exception in the cases where behaviour is different
  • internal consistency does not matter

I very strongly disagree with this. internal consistency was the reason I opened the original issue; not having it leads to extremely subtle and confusing bugs. For example, 2.12's choice to use Ordering#lteq to implement TraversableOnce#min (via reduceLeft) is actually wrong and gives the wrong answer, because lteq and min are not consistent!!

Welcome to Scala 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_232).
Type in expressions for evaluation. Or try :help.

scala> List(2.0, Double.NaN, 5.0).min // oops
res0: Double = 5.0

You can have similar problems if you're trying to implement a sort or something using lt instead of compare.

We fixed IterableOnceOps#min in 2.13 by making it call min, but you shouldn't need to. It shouldn't matter whether you implement it using compare or lteq or gt or min; they should all be "right" and give the same result, or at the very least a sane one.

I'm willing to compromise slightly on internal consistency by having min behave differently from a total ordering and instead be consistent with math.min, because at least it behaves sanely in the above case.


If you don't care about internal consistency, you can just revert to 2.12's behaviour. Except that's not really what Java does for Comparator/Comparable/sorting stuff. And also I do care about internal consistency. It does matter. Which seems to just leave throwing an exception.


Would you throw exception on compare, lt or both?

probably both, to be consistent. I could likely be convinced out of throwing it from compare though

@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 14, 2020

Exceptions erupting from the middle of safe-seeming code also sounds pretty bad to me. It's really quite a mess.

@eed3si9n

This comment has been minimized.

Copy link
Member Author

@eed3si9n eed3si9n commented Jan 14, 2020

FWIW "internal consistency does not matter" is more of my rephrasing of #10511 (comment) by Martin:

These things should just work. I don't care what ordering is used and whether it is inconsistent or not (we all know IEEE is broken and unfixable). But we should do the right thing for everyday users. As an everyday user I do not care what the ordering does for NaN and -0.0, maybe I do not even know that these weird numbers exist!

I personally do wish we could somehow achieve internal consistency, but my weight on it is below "Compatibility with Scala 2.12.x", "Predef consistency", and "Java consistency".

And here are some areas of "we should do the same thing that Java does"

  1. Sort using java.lang.Double.compare, NaN goes to the end when sorting
  2. < and > both return false when NaN gets involved
  3. min and max both return NaN when NaN gets involved
  4. List#min and List#max return List#sorted.head and List#sorted.last; returning NaN also might be valid

Because of the nature of NaN, I don't think the consistency is achievable here. How can a number be not greater and not less than 1.0? Yet that's how Double#< is implemented in most languages.

@odersky

This comment has been minimized.

Copy link

@odersky odersky commented Jan 14, 2020

@dwijnand

This comment has been minimized.

Copy link
Member

@dwijnand dwijnand commented Jan 14, 2020

Among differences of opinion, is everyone in agreement that the following shouldn't happen?

Welcome to Scala 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_232).
Type in expressions for evaluation. Or try :help.

scala> List(2.0, Double.NaN, 5.0).min // oops
res0: Double = 5.0

Whether it returns 2.0 or NaN I'm sure could be argued, but the fact that it returns 5.0 over either of those I hope everyone agrees is not "the right thing for everyday users".

@Ichoran

This comment has been minimized.

Copy link

@Ichoran Ichoran commented Jan 14, 2020

@dwijnand - That's absolutely the wrong behavior, but it is the result of the implementation of min on collections which neither deferred to the min of Ordering, nor took into account that Double is only partially ordered via <. Regardless of our choice here, that shouldn't happen (and won't, with the new implementation).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.