bug#10511 Add total orderings for Float and Double #6410

Merged: 1 commit, Jun 4, 2018


NthPortal
Contributor

Add total orderings for Float and Double, so that there
are two implicit orderings for each in scope: one
consistent with a total ordering, and one consistent with
IEEE spec.

Fixes scala/bug#10511
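
A minimal sketch of how the two orderings differ on `NaN`, assuming the names that shipped in Scala 2.13 (`Ordering.Double.TotalOrdering` and `Ordering.Double.IeeeOrdering`); the object name `NanComparisons` is only illustrative:

```scala
object NanComparisons {
  val total: Ordering[Double] = Ordering.Double.TotalOrdering
  val ieee: Ordering[Double]  = Ordering.Double.IeeeOrdering
  val nan: Double             = Double.NaN

  // Total ordering: NaN is treated as larger than every other value
  // and as equivalent to itself, so any two values are comparable.
  val totalLteq: Boolean  = total.lteq(1.0, nan)
  val totalEquiv: Boolean = total.equiv(nan, nan)

  // IEEE ordering: every comparison involving NaN is false,
  // so neither "1.0 <= NaN" nor "NaN <= 1.0" holds.
  val ieeeLteq: Boolean        = ieee.lteq(1.0, nan)
  val ieeeLteqFlipped: Boolean = ieee.lteq(nan, 1.0)
  val ieeeEquiv: Boolean       = ieee.equiv(nan, nan)
}
```

The IEEE instance follows the behaviour of the primitive operators (`<=`, `==`), which is why it cannot be a total ordering once `NaN` is involved.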

@NthPortal
Contributor Author

review @Ichoran

@NthPortal
Contributor Author

@SethTisue could you run a community build on this, to see how much code assumes that there is a non-ambiguous implicit Ordering for Float and Double in scope? Thanks :)

@NthPortal NthPortal force-pushed the bug#10511/R6 branch 3 times, most recently from 0a05589 to b69d410 Compare March 12, 2018 07:54
@SethTisue
Member

queued: https://scala-ci.typesafe.com/job/scala-2.13.x-integrate-community-build/993/ (404 til Jenkins catches up)

@NthPortal
Contributor Author

NthPortal commented Mar 14, 2018

Problems:

  • scalaz (tests only)
  • scala-js
  • singleton-ops (but doesn't seem to be from this)
  • scoverage
  • scala-java8-compat (tests only)

scala-js is the one I'm most concerned about

@dwijnand
Member

is two orderings (of the same type) in implicit scope better than no orderings in implicit scope?

@sjrd
Member

sjrd commented Mar 23, 2018

> scala-js is the one I'm most concerned about

[scala-js] [error] /home/jenkins/workspace/scala-2.13.x-integrate-community-build/target-0.9.11/project-builds/scala-js-9a26f4316928c442ef5f745301eae97d16ffe241/javalib/src/main/scala/java/util/Arrays.scala:52: error: ambiguous implicit values:
[scala-js] [error]  both object FloatTotalOrdering in object Ordering of type scala.math.Ordering.FloatTotalOrdering.type
[scala-js] [error]  and object FloatIeeeOrdering in object Ordering of type scala.math.Ordering.FloatIeeeOrdering.type
[scala-js] [error]  match expected type Ordering[Float]
[scala-js] [error]     sortImpl(a)
[scala-js] [error]             ^
[scala-js] [error] /home/jenkins/workspace/scala-2.13.x-integrate-community-build/target-0.9.11/project-builds/scala-js-9a26f4316928c442ef5f745301eae97d16ffe241/javalib/src/main/scala/java/util/Arrays.scala:55: error: ambiguous implicit values:
[scala-js] [error]  both object FloatTotalOrdering in object Ordering of type scala.math.Ordering.FloatTotalOrdering.type
[scala-js] [error]  and object FloatIeeeOrdering in object Ordering of type scala.math.Ordering.FloatIeeeOrdering.type
[scala-js] [error]  match expected type Ordering[Float]
[scala-js] [error]     sortRangeImpl[Float](a, fromIndex, toIndex)
[scala-js] [error]                         ^
[scala-js] [error] /home/jenkins/workspace/scala-2.13.x-integrate-community-build/target-0.9.11/project-builds/scala-js-9a26f4316928c442ef5f745301eae97d16ffe241/javalib/src/main/scala/java/util/Arrays.scala:58: error: ambiguous implicit values:
[scala-js] [error]  both object DoubleTotalOrdering in object Ordering of type scala.math.Ordering.DoubleTotalOrdering.type
[scala-js] [error]  and object DoubleIeeeOrdering in object Ordering of type scala.math.Ordering.DoubleIeeeOrdering.type
[scala-js] [error]  match expected type Ordering[Double]
[scala-js] [error]     sortImpl(a)
[scala-js] [error]             ^
[scala-js] [error] /home/jenkins/workspace/scala-2.13.x-integrate-community-build/target-0.9.11/project-builds/scala-js-9a26f4316928c442ef5f745301eae97d16ffe241/javalib/src/main/scala/java/util/Arrays.scala:61: error: ambiguous implicit values:
[scala-js] [error]  both object DoubleTotalOrdering in object Ordering of type scala.math.Ordering.DoubleTotalOrdering.type
[scala-js] [error]  and object DoubleIeeeOrdering in object Ordering of type scala.math.Ordering.DoubleIeeeOrdering.type
[scala-js] [error]  match expected type Ordering[Double]
[scala-js] [error]     sortRangeImpl[Double](a, fromIndex, toIndex)
[scala-js] [error]                          ^
[scala-js] [error] four errors found

That can be addressed by adding the explicit Ordering that corresponds to the old behavior. Nothing to significantly worry about.

@NthPortal
Contributor Author

@dwijnand having them both in scope means that users know what they are called and thus where their documentation is (hopefully helping them choose); if none were in scope, they wouldn't know where to start, and would just be confused why there's no Ordering for Double/Float

@NthPortal
Contributor Author

NthPortal commented Mar 23, 2018

@sjrd Because the sorts use Ordering.lteq, I'm going to recommend FloatTotalOrdering and DoubleTotalOrdering.

The behaviour in 2.12 is that of FloatIeeeOrdering and DoubleIeeeOrdering, but it is broken and leads to non-deterministic/wrong sorts if NaN is present (which is the primary reason for this PR).
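
To illustrate why a sort built on `Ordering.lteq` misbehaves with the IEEE instance, here is a sketch using a hypothetical `insertionSort` helper (not the scala-js `sortImpl`), again assuming the Scala 2.13 names `Ordering.Double.TotalOrdering` and `Ordering.Double.IeeeOrdering`:

```scala
object LteqSortDemo {
  // Hypothetical insertion sort that relies only on Ordering.lteq:
  // each element is inserted after the leading run of elements that are <= it.
  def insertionSort[T](xs: Vector[T])(implicit ord: Ordering[T]): Vector[T] =
    xs.foldLeft(Vector.empty[T]) { (sorted, x) =>
      val (notGreater, rest) = sorted.span(ord.lteq(_, x))
      (notGreater :+ x) ++ rest
    }

  val input: Vector[Double] = Vector(1.0, Double.NaN, 3.0, 2.0)

  // Total ordering: NaN ends up last and the other values are sorted
  // (1.0, 2.0, 3.0, NaN).
  val withTotal: Vector[Double] = insertionSort(input)(Ordering.Double.TotalOrdering)

  // IEEE ordering: lteq is false whenever NaN is involved, so with this helper
  // the non-NaN values come out unsorted (2.0, 3.0, NaN, 1.0).
  val withIeee: Vector[Double] = insertionSort(input)(Ordering.Double.IeeeOrdering)
}
```

The exact wrong result depends on the algorithm and the input order; the point is only that `lteq` stops being a reliable basis for sorting once `NaN` is present.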

@dwijnand
Member

@NthPortal interesting.. you're right.

Contributor

@Ichoran Ichoran left a comment

LGTM! I'm not sure if there's a way to get a custom message in there in case of the collision (that would be ideal), but I think it's a good compromise between discoverability and safety.

All the details look fine to me.

@hrhino
Member

hrhino commented Apr 1, 2018

> if there's a way to get a custom message in there in case of the collision

wouldn't @implicitAmbiguous work here?

@Ichoran
Contributor

Ichoran commented Apr 1, 2018

@hrhino - Oh, yes! It would. I hadn't noticed that one. @NthPortal - Can you provide a more solution-oriented message using the annotation? (E.g. suggest passing the ordering directly, or explicitly importing the one that they mean in the context where they mean it?)

@NthPortal
Contributor Author

I did not even know that annotation existed!

I will absolutely work on coming up with a custom message for that

@NthPortal
Contributor Author

Is the following message too long?

"The behaviour of Double specified by IEEE is not consistent with a total ordering when dealing with NaN, so there are two orderings defined for Double: DoubleTotalOrdering, which is consistent with a total ordering, and DoubleIeeeOrdering, which is consistent as much as possible with IEEE spec and floating point operations defined in scala.math"

@Ichoran
Contributor

Ichoran commented Apr 2, 2018

@NthPortal - Yeah, that does seem too long, and it is missing the critical section that tells the receiver of the message what to do. The scaladoc on the instances should have the wordier explanation of the difference between the orderings. If you could come up with something more brief that would be great. The most important thing is to mention that the user needs to select (by import, assignment, or explicit passing) one of the two orderings. This may be the first intentional collision between implicits that the user sees, so we should walk them through solving it.

@NthPortal
Contributor Author

"The behaviour of IEEE Doubles is not always consistent with a total ordering, so there are two orderings defined for Double. You can import the one which best suits your needs, or pass it explicitly as an argument."

@Ichoran thoughts?

I can add the names of the two instances to the message, but the compiler should list them anyway as part of the error message.

@NthPortal
Contributor Author

Thinking of possibly holding off on merging this until #6468 is merged, as it will simplify the changes needed for this PR

@dwijnand
Member

dwijnand commented Apr 3, 2018

I'd say terser is better (imagine a few pages of this ambiguity in a build log somewhere), favouring info about how to resolve and deferring further context to another source.

so perhaps something like:

Import, or pass explicitly, the instance of Ordering according to whether total ordering or IEEE compliance is of greater importance. See xxx for more info.

maybe?

</bikeshed>

@Ichoran
Contributor

Ichoran commented Apr 3, 2018

I like the heads-up about the problem, but I agree that terseness is a virtue. Maybe

There is more than one way to order Doubles!  Specify one by using a local import,
assigning an implicit val, or passing the ordering explicitly.  See the documentation
for details.

(And the corresponding one for Float.)

Then the documentation would give examples of each, and discuss the problem at more length.
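
The three ways of selecting an ordering mentioned in the proposed message could look roughly like this. It is a sketch that assumes the Scala 2.13 names `Ordering.Double.TotalOrdering` and `Ordering.Double.IeeeOrdering`; the object and method names are made up for illustration:

```scala
object ChoosingAnOrdering {
  val xs: List[Double] = List(3.0, 1.0, 2.0)

  // 1. Local import: the imported implicit is found in lexical scope,
  //    before the instances in the Ordering companion are considered.
  def viaImport: List[Double] = {
    import Ordering.Double.TotalOrdering
    xs.sorted
  }

  // 2. Assigning an implicit val in the scope where the choice applies.
  def viaImplicitVal: List[Double] = {
    implicit val ord: Ordering[Double] = Ordering.Double.IeeeOrdering
    xs.sorted
  }

  // 3. Passing the ordering explicitly at the call site.
  def viaExplicitArgument: List[Double] =
    xs.sorted(Ordering.Double.TotalOrdering)
}
```

Each variant keeps the choice local, so different parts of a code base can pick different orderings without affecting one another.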

@NthPortal
Contributor Author

@dwijnand I have literally requested bikeshedding at this point xD

@NthPortal
Contributor Author

I was mistaken - using @implicitAmbiguous entirely overrides the original error message, so listing the implicits' names is required.

@NthPortal
Contributor Author

The message used in the latest commit is:

"There are multiple ways to order Doubles (Ordering.DoubleTotalOrdering, Ordering.DoubleIeeeOrdering). Specify one by using a local import, assigning an implicit val, or passing it explicitly. See the documentation for details."

If you can think of any reasonable way to shorten that further, I'm happy to do so, but I can't think of a good way to do so without losing important information.
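
For readers unfamiliar with `scala.annotation.implicitAmbiguous`, the mechanism can be sketched on a small stand-in type class. Everything here (`Ranking`, `byLength`, `alphabetical`, `smaller`) is hypothetical and is not the code from this PR:

```scala
import scala.annotation.implicitAmbiguous

object AmbiguityMessageDemo {
  // Hypothetical type class standing in for Ordering[Double].
  trait Ranking[A] { def lteq(x: A, y: A): Boolean }

  object Ranking {
    // The annotation replaces the compiler's default "ambiguous implicit values"
    // text, so the message has to name the candidates itself.
    @implicitAmbiguous("There are multiple ways to rank Strings (Ranking.byLength, Ranking.alphabetical). Import one or pass it explicitly.")
    implicit def byLength: Ranking[String] = new Ranking[String] {
      def lteq(x: String, y: String): Boolean = x.length <= y.length
    }

    implicit def alphabetical: Ranking[String] = new Ranking[String] {
      def lteq(x: String, y: String): Boolean = x.compareTo(y) <= 0
    }
  }

  def smaller[A](x: A, y: A)(implicit r: Ranking[A]): A =
    if (r.lteq(x, y)) x else y

  // smaller("kiwi", "apple")  // expected not to compile: both instances match

  // Choosing an instance explicitly resolves the ambiguity.
  val shortest: String = smaller("kiwi", "apple")(Ranking.byLength)
  val firstAlphabetically: String = smaller("kiwi", "apple")(Ranking.alphabetical)
}
```

The commented-out call is left in only to mark where the custom message would be expected to appear.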

@dwijnand
Member

dwijnand commented Apr 3, 2018

lgtm!

@lrytz
Member

lrytz commented May 28, 2018

It is a bit of a misuse of @deprecated, but for me it's fine. The REPL seems to show deprecations by default. There's also @migration, but there you only get a warning if you use -Xmigration.

@NthPortal
Contributor Author

@martijnhoekstra I would also like some warning annotations as alternatives to @deprecated, but they don't currently exist

@lrytz
Member

lrytz commented May 28, 2018

I'm not at all against, but don't think it needs to hold up this PR.

@NthPortal
Contributor Author

squashed

Member

@lrytz lrytz left a comment

🎈

@lrytz lrytz merged commit b6982c8 into scala:2.13.x Jun 4, 2018
@NthPortal NthPortal deleted the bug#10511/R6 branch June 4, 2018 12:28
@SethTisue SethTisue added the release-notes worth highlighting in next release notes label Jun 6, 2018
@SethTisue
Member

SethTisue commented Jun 6, 2018

just to show anyone interested what we actually ended up with:

scala> List(3.0, 2.0, 1.0).sorted
warning: object DeprecatedDoubleOrdering in object Ordering is deprecated (since 2.13.0): There are multiple ways to order Doubles (Ordering.Double.TotalOrdering, Ordering.Double.IeeeOrdering). Specify one by using a local import, assigning an implicit val, or passing it explicitly. See the documentation for details.

@Ichoran
Contributor

Ichoran commented Jun 7, 2018

I like that IeeeOrdering sounds like a wail of despair, which is generally what happens to anyone who uses floating point for too long.

What do you think about the usability of this lengthy error message as the default interaction with sorting, Seth? It still feels nonideal to me, but having undetected wrong assumptions also seems quite nonideal.

@dwijnand
Member

dwijnand commented Jun 7, 2018

IMO "See the documentation for details" could use some precision - which documentation? The documentation of {Float,Double}.{TotalOrdering,IeeeOrdering} it turns out. So maybe "See their documentation for details"?

@SethTisue
Member

> What do you think about the usability of this lengthy error message as the default interaction with sorting, Seth? It still feels nonideal to me, but having undetected wrong assumptions also seems quite nonideal.

idk... there is no ideal solution to this. if it had been just up to me, I would probably have just provided the additional orderings, updated the documentation to refer to them, not deprecated anything, called it good, and let those with higher standards use a linter to enforce them.

but perhaps I am insufficiently devoted to lawfulness and correctness.

@NthPortal
Contributor Author

> What do you think about the usability of this lengthy error message

I would love to cut down on the length of the error message. If anyone has ideas of how to do so without losing significant information, I would be 100% behind that.

> So maybe "See their documentation for details"

@dwijnand sounds good to me. I'll wait a little to see if anyone has thoughts on shortening the message in general before PRing the change though.

@nafg
Contributor

nafg commented Aug 10, 2018

I think it's not necessary to say "Specify one by using a local import, assigning an implicit val, or passing it explicitly." That's general-purpose scala knowledge. Especially for a warning which may occur a lot of times.

How about,
"To avoid this warning, use Ordering.Double.TotalOrdering or Ordering.Double.IeeeOrdering instead. Their documentation has more info."

If it's not obvious enough from the context that they're implicits, how about this:
"Instead, use the Ordering.Double.TotalOrdering or Ordering.Double.IeeeOrdering implicit."

Anyway, a newbie has been told to see their docs. Perhaps a (shortened?) link could be included. Anyway the docs would have all the information a newbie needs. And for everyone else it should be self-explanatory.

@nafg
Contributor

nafg commented Aug 10, 2018

(the docs could remind the reader of the various ways of using implicit Orderings)

eed3si9n added a commit to eed3si9n/scala that referenced this pull request Feb 15, 2020
Fixes scala/bug#11844
Ref scala/bug#10511
Ref scala#6410
Ref scala#76

This changes the deprecation of `DeprecatedDoubleOrdering` to a migration warning instead, to avoid `List(1.0, -1.0).sorted` giving a deprecation warning.

This also provides some documentation on the ordering instances in Scaladoc.
eed3si9n added a commit to eed3si9n/scala that referenced this pull request Feb 18, 2020
Fixes scala/bug#11844
Ref scala/bug#10511
Ref scala#6410
Ref scala#76

This changes the deprecation of `DeprecatedDoubleOrdering` to a migration warning instead, to avoid `List(1.0, -1.0).sorted` giving a deprecation warning.

This also provides some documentation on the ordering instances in Scaladoc.
HeartSaVioR pushed a commit to apache/spark that referenced this pull request Dec 1, 2020
…d if built with Scala 2.13

### What changes were proposed in this pull request?

This PR fixes an issue where the histogram and timeline aren't rendered in the `Streaming Query Statistics` page when Spark is built with Scala 2.13.

![before-fix-the-issue](https://user-images.githubusercontent.com/4736016/100612855-f543d700-3356-11eb-90d9-ede57b8b3f4f.png)
![NaN_Error](https://user-images.githubusercontent.com/4736016/100612879-00970280-3357-11eb-97cf-43978bbe2d3a.png)

The reason is [`maxRecordRate` can be `NaN`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala#L371) for Scala 2.13.

The `NaN` is the result of [`query.recentProgress.map(_.inputRowsPerSecond).max`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala#L372) when the first element of `query.recentProgress.map(_.inputRowsPerSecond)` is `NaN`.
Actually, the comparison logic for `Double` type was changed in Scala 2.13.
scala/bug#12107
scala/scala#6410

So this issue happens as of Scala 2.13.

The root cause of the `NaN` is [here](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala#L164).
This `NaN` seems to be an initial value of `inputTimeSec` so I think `Double.PositiveInfinity` is suitable rather than `NaN` and this change can resolve this issue.
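
The effect on `max` can be reproduced in isolation. The sketch below assumes Scala 2.13 and passes `Ordering.Double.TotalOrdering` explicitly; `maxIgnoringNaN` is a hypothetical guard shown for comparison and is a different technique from the `Double.PositiveInfinity` change described above:

```scala
object NanMaxDemo {
  val rates: Seq[Double] = Seq(Double.NaN, 1.0, 2.0)

  // Under the total ordering NaN compares greater than every other value,
  // so it is returned as the maximum.
  val rawMax: Double = rates.max(Ordering.Double.TotalOrdering)

  // Hypothetical guard: drop NaN values before taking the maximum.
  // maxOption returns None when nothing is left.
  def maxIgnoringNaN(xs: Seq[Double]): Option[Double] =
    xs.filterNot(_.isNaN).maxOption(Ordering.Double.TotalOrdering)
}
```

For example, `NanMaxDemo.maxIgnoringNaN(NanMaxDemo.rates)` yields `Some(2.0)`, whereas `rawMax` is `NaN`.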

### Why are the changes needed?

To make sure we can use the histogram/timeline with Scala 2.13.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

First, I built with the following commands.
```
$ ./dev/change-scala-version.sh 2.13
$ build/sbt -Phive -Phive-thriftserver -Pscala-2.13 package
```

Then, I ran the following query (taken from #30427).
```
import org.apache.spark.sql.streaming.Trigger
val query = spark
  .readStream
  .format("rate")
  .option("rowsPerSecond", 1000)
  .option("rampUpTime", "10s")
  .load()
  .selectExpr("*", "CAST(CAST(timestamp AS BIGINT) - CAST((RAND() * 100000) AS BIGINT) AS TIMESTAMP) AS tsMod")
  .selectExpr("tsMod", "mod(value, 100) as mod", "value")
  .withWatermark("tsMod", "10 seconds")
  .groupBy(window($"tsMod", "1 minute", "10 seconds"), $"mod")
  .agg(max("value").as("max_value"), min("value").as("min_value"), avg("value").as("avg_value"))
  .writeStream
  .format("console")
  .trigger(Trigger.ProcessingTime("5 seconds"))
  .outputMode("append")
  .start()
```

Finally, I confirmed that the timeline and histogram are rendered.
![after-fix-the-issue](https://user-images.githubusercontent.com/4736016/100612736-c9285600-3356-11eb-856d-7e53cc656c36.png)


Closes #30546 from sarutak/ss-nan.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>