Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ugh signed zeros #187

Closed
travisbrown opened this issue Feb 2, 2016 · 4 comments
Closed

Ugh signed zeros #187

travisbrown opened this issue Feb 2, 2016 · 4 comments

Comments

@travisbrown
Copy link
Member

Let me start by saying that I spent my entire life up until about an hour ago without realizing what a shitshow signed zeros are, and I wish I could go back to a state of innocence, so you might want to close the tab and move on.

First for the basics:

scala> 0.0 == -0.0
res0: Boolean = true

scala> Double.box(0.0) == Double.box(-0.0)
res1: Boolean = false

scala> java.lang.Double.compare(0.0, -0.0) == 0
res2: Boolean = false

Hurray, that makes no sense (but at least you might have seen it before).

Cats and Scalaz also disagree:

scala> import scalaz.Equal, scalaz.std.anyVal._
import scalaz.Equal
import scalaz.std.anyVal._

scala> Equal[Double].equal(0.0, -0.0)
res3: Boolean = true

scala> import cats.Eq, cats.std.double._
import cats.Eq
import cats.std.double._

scala> Eq[Double].eqv(0.0, -0.0)
res4: Boolean = false

Apparently IEEE 754 says something like "positive and negative zeros are distinct but equal", and that's why Java's primitive floating point types make them equal. Cats (and Algebra, although each defines its own instance) seems to be taking a more principled stand than Scalaz here on "distinct but equal" not making any sense (I don't know what the actual motivations for the choices here are, but I'm curious).

Argonaut says all zero JSON numbers are equal:

scala> import argonaut._, Argonaut._, scalaz._, Scalaz._
import argonaut._
import Argonaut._
import scalaz._
import Scalaz._

scala> Parse.parse("0.0") === Parse.parse("-0.0")
res0: Boolean = true

scala> Parse.parse("0") === Parse.parse("-0")
res1: Boolean = true

But it's perfectly happy to round-trip negative zeros if they have a fractional part:

scala> Parse.parse("-0").map(_.nospaces)
res2: scalaz.\/[String,String] = \/-(0)

scala> Parse.parse("-0.0").map(_.nospaces)
res3: scalaz.\/[String,String] = \/-(-0.0)

This seems really wrong to me—if two things serialize differently, the scalaz.Equal instance shouldn't say they're the same.

circe currently round-trips both cases (at least with Jawn as the parser—other parsers may behave in such a way that this isn't possible), but also says they're all equal:

scala> import io.circe._, io.circe.jawn._, cats.syntax.eq._
import io.circe._
import io.circe.jawn._
import cats.syntax.eq._

scala> parse("-0.0").map(_.noSpaces)
res0: cats.data.Xor[io.circe.ParsingFailure,String] = Right(-0.0)

scala> parse("-0").map(_.noSpaces)
res1: cats.data.Xor[io.circe.ParsingFailure,String] = Right(-0)

scala> parse("0.0") === parse("-0.0")
res2: Boolean = true

scala> parse("0") === parse("-0")
res3: Boolean = true

I think I'd like to keep the current round-tripping behavior (I want circe to be able to distinguish distinct JSON values in as many cases as is reasonably possible), but to make positive and negative values not equal whether or not there's a fractional part.

Any objections?

@ceedubs
Copy link
Collaborator

ceedubs commented Feb 2, 2016

I don't have any helpful input on your suggestion, but I just want to 👏 for an excellent writeup with an introductory paragraph that guaranteed I would read on.

@non
Copy link
Contributor

non commented Feb 2, 2016

So, first of all, welcome to the hell that is IEEE-754 on the JVM.

We're talking about floating-point, so obviously things are going to be terrible no matter what we do. I'm going to lay out my own feelings on the "right thing" to do, with the proviso that there are obviously problems.

First of all, I have accepted that type classes parameterized on Double or Float are usually not law-abiding. This is already true for associativity and Semigroup[Double]. Similarly, NaN violates the identity/inverse interaction for Group[Double] (we'd like for (x |+| x.inverse) = empty but try it with NaN).

Given this background, I think Algebra's behavior is wrong and Scalaz's is correct. eqv(x, y) and x == y should produce the same results, especially since == 0.0 checks will be very common. I feel strongly like a generic version of direct code shouldn't change the meaning. This does create an inconsistency between eqv and compare: the former will report equal, the latter will return a non-equal ordering.

The reason I mentioned floating-point type class instances not being law-abiding is that Scalaz's instance (like Spire's) breaks the law that eqv(x, y) = compare(x, y) == 0. Algebra's implementation of eqv for Eq[Double] defaulted to compare(x, y) == 0 which is why it deviates from the other implementations.

The other "advantage" of this view is that it keeps things like eqv(NaN, NaN) doing the same thing as NaN == NaN. Again, we can argue about whether IEEE-754 is crazy or not, but changing the behavior between direct and generic code seems really bad from my POV. If someone wants to call compare (or java.lang.Double.compare) then that's their problem, but using operators like === or < in generic code should be consistent with operators like == and < in direct code.

So what should Circe do? I propose one of these:

  1. Normalize the JSON value -0.0 as the Double value 0.0. This won't break any code that isn't specifically relying on signed-zero behavior. Since JSON doesn't guarantee IEEE-754 semantics I think this is probably fine. Given that JSON does not allow infinities or NaNs, this frees you from the last of the weird sentinel value warts that come with Double. I think this is probably the best option.
  2. Pass it through and let someone else worry about it. If you really want to support someone who wants to encode signed zeros in JSON, then just leave these values alone and let someone else worry about it. Given that signed zeros are mostly designed to help deal with certain kinds of symmetry errors in numerical analysis, I think it's relatively unlikely that they will crop up in JSON values, but if they do, we can just kick the can down the road and let someone else worry about them.

Obviously my plan embraces inconsistency, but due to the interaction of IEEE and Java there is fundamental inconsistency between double and java.lang.Double which is literally impossible to fix. All we can do is to try to minimize the pain in particular cases.

@travisbrown
Copy link
Member Author

Thanks for the detailed answer, @non!

I still think I want to go with your second option. If someone said they want to distinguish between "1000" and "1e3" as JSON values, I'd tell them they're out of luck, but if there's any vaguely reasonable context in which it makes sense to distinguish between different JSON expressions (and the spec doesn't explicitly say they're equivalent), I want to support that.

@travisbrown
Copy link
Member Author

This is fixed in #189, which distinguishes positive and negative zero values as JSON numbers.

julienrf pushed a commit to scalacenter/circe that referenced this issue May 13, 2021
Update sbt-mima-plugin to 0.6.1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants