Loosing parsing of float #20

Closed
plokhotnyuk opened this issue May 9, 2019 · 1 comment

@plokhotnyuk
Contributor

plokhotnyuk commented May 9, 2019

The current code on master parses floats with a rounding error of ~1 ULP, which is greater than the ~0.5 ULP expected from java.lang.Float.valueOf().

The rounding error is easy to reproduce by parsing the string representation of values that have more digits than are usually used for floats.

scala> "1.00000017881393432617187499".toFloat
res0: Float = 1.0000001

scala> "1.00000017881393432617187499".toDouble.toFloat
res1: Float = 1.0000002
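
To make the double rounding visible, here is a minimal sketch (the object and method names are made up for illustration). It reproduces the two REPL results above and checks, with exact decimal arithmetic, that the input lies just below the midpoint between two adjacent Float values, while the intermediate Double lands exactly on that midpoint. The second rounding step then resolves the tie with round-half-to-even and picks the upper neighbour.

```scala
object FloatRounding {
  // Parse directly to Float: a single, correctly rounded step (at most 0.5 ULP).
  def direct(s: String): Float = java.lang.Float.parseFloat(s)

  // Parse to Double first, then narrow to Float: two rounding steps.
  def viaDouble(s: String): Float = java.lang.Double.parseDouble(s).toFloat

  // Exact decimal midpoint between a Float and its next larger neighbour.
  def midpointAbove(f: Float): java.math.BigDecimal = {
    val lo = new java.math.BigDecimal(f.toDouble)
    val hi = new java.math.BigDecimal(java.lang.Math.nextUp(f).toDouble)
    lo.add(hi).divide(new java.math.BigDecimal(2))
  }

  def main(args: Array[String]): Unit = {
    val input = "1.00000017881393432617187499"
    val mid   = midpointAbove(direct(input))

    val inputBelowMid = new java.math.BigDecimal(input).compareTo(mid) < 0
    val doubleIsMid   = new java.math.BigDecimal(input.toDouble).compareTo(mid) == 0

    println(s"direct = ${direct(input)}, viaDouble = ${viaDouble(input)}")
    println(s"midpoint = ${mid.toPlainString}")
    println(s"input below midpoint: $inputBelowMid")
    println(s"intermediate Double equals midpoint: $doubleIsMid")
  }
}
```

The trailing digits `...499` are what keep the input below the midpoint; they are lost once the value has been rounded to a Double.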

The detailed explanation is in this comment

The following code prints many such numbers once the number of iterations is increased:

scala> :paste
// Entering paste mode (ctrl-D to finish)

  (1 to 1000).foreach { _ =>
    def checkAndPrint(input: String): Unit = {
      val actualOutput = io.bullet.borer.Json.decode(input.getBytes).to[Float].value
      val expectedOutput = input.toFloat
      if (actualOutput != expectedOutput) {
        println(s"input = $input, expectedOutput =$expectedOutput, actualOutput = $actualOutput")
      }
    }

    val n = java.util.concurrent.ThreadLocalRandom.current().nextLong()
    val x = java.lang.Double.longBitsToDouble(n & ~0xFFFFFFFL)
    if (java.lang.Float.isFinite(x.toFloat) && x.toFloat != 0.0) checkAndPrint(x.toString)
  }

// Exiting paste mode, now interpreting.

input = 5.726278970996646E-7, expectedOutput =5.726279E-7, actualOutput = 5.7262787E-7
input = -1.502594932252865E35, expectedOutput =-1.5025949E35, actualOutput = -1.502595E35
input = -2.053233379325646E20, expectedOutput =-2.0532335E20, actualOutput = -2.0532333E20
input = 1.511814678325955E20, expectedOutput =1.5118146E20, actualOutput = 1.5118148E20
input = 4.355016683746335E19, expectedOutput =4.3550165E19, actualOutput = 4.355017E19
input = 1.704752798329881E25, expectedOutput =1.7047527E25, actualOutput = 1.7047529E25
input = -5.250234028812652E31, expectedOutput =-5.250234E31, actualOutput = -5.2502343E31
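
To measure how far apart the expected and actual values in such a log are, a small helper can count the representable Float values between them. This is only a sketch: `UlpDistance`, `ordered`, and `ulpDistance` are hypothetical names, and NaN inputs are not handled.

```scala
object UlpDistance {
  // Map the sign-magnitude bit pattern of a Float to an Int that is
  // monotonically ordered like the Float values themselves (-0.0 and 0.0 both map to 0).
  private def ordered(f: Float): Int = {
    val bits = java.lang.Float.floatToIntBits(f)
    if (bits < 0) Int.MinValue - bits else bits
  }

  // Number of representable Float steps between a and b (NaN not supported).
  def ulpDistance(a: Float, b: Float): Long =
    math.abs(ordered(a).toLong - ordered(b).toLong)

  def main(args: Array[String]): Unit = {
    // A few (expectedOutput, actualOutput) pairs copied from the log above.
    val pairs = Seq(
      (5.726279e-7f, 5.7262787e-7f),
      (-1.5025949e35f, -1.502595e35f),
      (4.3550165e19f, 4.355017e19f)
    )
    pairs.foreach { case (expected, actual) =>
      println(s"expected = $expected, actual = $actual, distance = ${ulpDistance(expected, actual)}")
    }
  }
}
```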

I think this should be documented if no other option is available.

@sirthias
Owner

sirthias commented May 9, 2019

Thank you for reporting!
Since BORER's API differs from what Jsoniter offers (essentially it's a hybrid between AST/DOM-based parsing and Jsoniter's direct, zero look-ahead parsing approach), its JSON parser does not know, at the time it comes across a JSON number literal, what model type this number is eventually going to be converted to.
Converting certain "easy" numbers to Double on the spot is no problem and much faster than simply always parsing decimal numbers as a String and offloading the actual number parsing to java.lang.Double.parseDouble. However, for 32-bit Float values this can indeed result in the small inaccuracy you are rightfully describing in this ticket.

IMHO the best solution here is to add a config parameter readDecimalNumbersOnlyAsNumberStrings, defaulting to false, which can be enabled if this small inaccuracy for 32-bit Float values becomes a problem in the specific use case.

By default the parser will therefore stay in the normal, faster parsing mode with Double as intermediate carrier for Float.
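
The trade-off can be sketched roughly as follows. This is not borer's actual implementation: `Config`, `NumberToken`, `readNumber`, and `toFloat` are hypothetical names, and the fast path is simplified to always go through Double, whereas the real parser only does so for "easy" numbers.

```scala
object NumberCarrierSketch {
  // Hypothetical stand-in for the proposed setting; not borer's real config type.
  final case class Config(readDecimalNumbersOnlyAsNumberStrings: Boolean = false)

  // What the parser hands to the decoder for a decimal number literal.
  sealed trait NumberToken
  final case class DoubleToken(value: Double)     extends NumberToken
  final case class NumberStringToken(text: String) extends NumberToken

  // Parser side: the target type is not known yet.
  def readNumber(literal: String, config: Config): NumberToken =
    if (config.readDecimalNumbersOnlyAsNumberStrings) NumberStringToken(literal) // slower, exact text kept
    else DoubleToken(literal.toDouble)                                           // faster, already rounded once

  // Decoder side: the target type (Float) is known only here.
  def toFloat(token: NumberToken): Float = token match {
    case DoubleToken(d)       => d.toFloat // second rounding step, up to ~1 ULP off in total
    case NumberStringToken(s) => s.toFloat // single rounding step
  }

  def main(args: Array[String]): Unit = {
    val literal = "1.00000017881393432617187499"
    println(toFloat(readNumber(literal, Config())))
    println(toFloat(readNumber(literal, Config(readDecimalNumbersOnlyAsNumberStrings = true))))
  }
}
```

With the default setting the Float result inherits the Double rounding; with the flag enabled the original digits survive until the target type is known.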
