Loose parsing of float #20

plokhotnyuk · 2019-05-09T06:14:25Z

The current code on master parses floats with a rounding error of ~1 ULP, which is greater than the ~0.5 ULP expected from java.lang.Float.valueOf().

The rounding error is easy to reproduce by parsing the string representation of values that have more digits than are usually used for floats.

scala> "1.00000017881393432617187499".toFloat
res0: Float = 1.0000001

scala> "1.00000017881393432617187499".toDouble.toFloat
res1: Float = 1.0000002

The detailed explanation is in this comment
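
The two REPL results above look like a case of double rounding, and this can be checked with exact decimal arithmetic. The following is a minimal sketch that uses only the JDK; the object name DoubleRoundingDemo and its field names are made up for illustration. It assumes the input lies just below the midpoint of two adjacent Float values, so that rounding to Double first lands exactly on that midpoint and the second rounding to Float then breaks the tie towards the even mantissa.

```scala
import java.math.{BigDecimal => JBigDecimal}

object DoubleRoundingDemo {
  val input = "1.00000017881393432617187499"

  // The two adjacent Float values that bracket the input.
  val lower: Float = 1.0000001f
  val upper: Float = Math.nextUp(lower)

  // Exact decimal expansions of both neighbours and of their midpoint.
  val exactLower = new JBigDecimal(lower.toDouble)
  val exactUpper = new JBigDecimal(upper.toDouble)
  val midpoint   = exactLower.add(exactUpper).divide(new JBigDecimal(2))

  // The input as an exact decimal, for comparison with the midpoint.
  val exactInput = new JBigDecimal(input)

  // One rounding step: decimal text straight to Float.
  val direct: Float = input.toFloat

  // Two rounding steps: decimal text to Double, then Double to Float.
  val asDouble: Double = input.toDouble
  val viaDouble: Float = asDouble.toFloat

  def main(args: Array[String]): Unit = {
    println(s"lower neighbour = $exactLower")
    println(s"midpoint        = $midpoint")
    println(s"upper neighbour = $exactUpper")
    println(s"input           = $exactInput")
    println(s"as Double       = ${new JBigDecimal(asDouble)}")
    println(s"direct = $direct, viaDouble = $viaDouble")
  }
}
```

Comparing the printed input and midpoint shows which Float neighbour is actually closer, while the "as Double" line shows what the intermediate Double has already rounded away.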

The following code prints many such numbers once the number of iterations is increased:

scala> :paste
// Entering paste mode (ctrl-D to finish)

  (1 to 1000).foreach { _ =>
    def checkAndPrint(input: String): Unit = {
      val actualOutput = io.bullet.borer.Json.decode(input.getBytes).to[Float].value
      val expectedOutput = input.toFloat
      if (actualOutput != expectedOutput) {
        println(s"input = $input, expectedOutput =$expectedOutput, actualOutput = $actualOutput")
      }
    }

    val n = java.util.concurrent.ThreadLocalRandom.current().nextLong()
    val x = java.lang.Double.longBitsToDouble(n & ~0xFFFFFFFL)
    if (java.lang.Float.isFinite(x.toFloat) && x.toFloat != 0.0) checkAndPrint(x.toString)
  }

// Exiting paste mode, now interpreting.

input = 5.726278970996646E-7, expectedOutput =5.726279E-7, actualOutput = 5.7262787E-7
input = -1.502594932252865E35, expectedOutput =-1.5025949E35, actualOutput = -1.502595E35
input = -2.053233379325646E20, expectedOutput =-2.0532335E20, actualOutput = -2.0532333E20
input = 1.511814678325955E20, expectedOutput =1.5118146E20, actualOutput = 1.5118148E20
input = 4.355016683746335E19, expectedOutput =4.3550165E19, actualOutput = 4.355017E19
input = 1.704752798329881E25, expectedOutput =1.7047527E25, actualOutput = 1.7047529E25
input = -5.250234028812652E31, expectedOutput =-5.250234E31, actualOutput = -5.2502343E31

I think it should be documented if no other option is available.


sirthias · 2019-05-09T09:09:10Z

Thank you for reporting!
Since BORER's API differs from what Jsoniter offers (essentially it's a hybrid between AST/DOM-based parsing and Jsoniter's direct, zero look-ahead parsing approach), its JSON parser does not know, at the time it comes across a JSON number literal, what model type this number is eventually going to be converted to.
Converting certain "easy" numbers to Double on the spot is no problem and much faster than always parsing decimal numbers as a String and offloading the actual number parsing to java.lang.Double.parseDouble. However, for 32-bit Float values this can indeed result in the small inaccuracy you are rightly describing in this ticket.

IMHO the best solution here is to add a config parameter readDecimalNumbersOnlyAsNumberStrings, defaulting to false, which can be enabled if this small inaccuracy for 32-bit Float values becomes a problem in the specific use case.

By default the parser will therefore stay in the normal, faster parsing mode with Double as intermediate carrier for Float.
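
To make the trade-off concrete, here is a minimal, self-contained sketch of the two modes. It is not borer's implementation or API: FloatReadSketch, Config and readFloat are hypothetical names, and only the flag name is taken from the proposal above. java.lang.Double.parseDouble stands in for the parser's fast on-the-spot conversion, and the sketch assumes that keeping the literal as text lets the Float be produced with a single rounding.

```scala
// Hypothetical illustration only: none of these names are part of borer.
object FloatReadSketch {

  // Mirrors the proposed switch; false keeps the faster default mode.
  final case class Config(readDecimalNumbersOnlyAsNumberStrings: Boolean = false)

  def readFloat(literal: String, config: Config): Float =
    if (config.readDecimalNumbersOnlyAsNumberStrings)
      // Literal kept as text until the target type is known: one rounding.
      java.lang.Float.parseFloat(literal)
    else
      // Double as intermediate carrier: rounded to Double, then to Float.
      java.lang.Double.parseDouble(literal).toFloat

  def main(args: Array[String]): Unit = {
    val literal = "1.00000017881393432617187499"
    println(readFloat(literal, Config()))
    println(readFloat(literal, Config(readDecimalNumbersOnlyAsNumberStrings = true)))
  }
}
```

For the literal from the report, the default mode should reproduce the .toDouble.toFloat result and the strict mode the .toFloat result.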

sirthias closed this as completed in f321c48 May 9, 2019
