Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

round function in Math library sometimes doesn't work #9082

Closed
mkanenobu opened this issue Sep 27, 2018 · 12 comments
Closed

round function in Math library sometimes doesn't work #9082

mkanenobu opened this issue Sep 27, 2018 · 12 comments

Comments

@mkanenobu
Copy link

mkanenobu commented Sep 27, 2018

for example

import math
var
  n = 728278982257.0 * 1.08

echo n
# -> 786541300837.5601
echo round(n, 2)
# -> 786541300837.5601
@Bennyelg
Copy link
Contributor

Bennyelg commented Sep 27, 2018

Yep,
round(n, 2) not working while round(n, 1) works.
Looks like it because the number we trying to round is long
round(786541300837.5601, 2) wont work but round(78654130087.5601, 2) will work # removed only 1 digit.

krux02 added a commit to krux02/Nim that referenced this issue Sep 27, 2018
@krux02
Copy link
Contributor

krux02 commented Sep 27, 2018

Surprise, round with places bigger than 0 never worked. It is just that the default printing of floating point numbers does some rounding as well, so that in some cases it isn't visible that the rounding doesn't work.

@skilchen
Copy link
Contributor

These are the wonders of the surprising world of floating-point arithmetic.

Numbers as large as your 786541300837.5601 are no longer precise to the 4th decimal place. In fact 786541300837.56 has exactly the same binary representation as 786541300837.5601.

If you echo 786541300837.56 you also will get 786541300837.5601. And as another surprise echo 786541300837.5601 - 0.0001 will give you 786541300837.5599. No need to open an issue about subtraction sometimes not working...

You will also get surprising results from much smaller numbers:

import math, strformat

for i in 1..9:
  let j = float(i) + 0.015
  echo fmt"{j:4g} {j:.16f} {round(j, 2):.2f} {j:.2f}"

produces:

1.015 1.0149999999999999 1.01 1.01
2.015 2.0150000000000001 2.02 2.02
3.015 3.0150000000000001 3.02 3.02
4.015 4.0149999999999997 4.01 4.01
5.015 5.0149999999999997 5.01 5.01
6.015 6.0149999999999997 6.02 6.01
7.015 7.0149999999999997 7.02 7.01
8.015 8.0150000000000006 8.02 8.02
9.015 9.0150000000000006 9.02 9.02

Most people expect that all numbers x in the first column would be rounded to x.02. But thats not the case and this is not due to an error in the rounding procedure, but a consequence of the binary floating-point implementation (IEEE-754) used by most current computers and programming languages.

As an excercise you might try to explain the difference in the last two columns, or eg. why 7.015 is rounded to 7.02 although the binary representation of 7.015 is just as slightly smaller as in the case of 4.015 which is rounded to 4.01.

This also shows you one reason why the PR from @krux02 isn't really a good idea. It only adds some more surprises. There is really no need to deprecate proc round*[T: float32|float64](x: T, places: int = 0): T. It works as expected and mostly like similar procedures in e.g. Python (uses round to even) and Ruby.

@krux02
Copy link
Contributor

krux02 commented Sep 27, 2018

@skilchen I only deprecate the version with two arguments, because that is the one that can't be implemented precisely. The reason for that is rounding is normally done only for printing values. And for printing the rounding to decimal digits can be done precisely. If people still really really want rounding for non printing, well it's not that hard to implement.

@skilchen
Copy link
Contributor

No @krux02 the round to decimal digits can't be done precisely for printing as shown in my little example above. Only very few people expect that echo formatFloat(7.015, ffDecimal, 2) produces 7.01.

@mkanenobu
Copy link
Author

Thanks @skilchen and @krux02 . I'll use formatFloat function when I want to print rounded floating number. I think this is not round function's problem, so I close this issue.

@krux02
Copy link
Contributor

krux02 commented Sep 28, 2018

@skilchen well the important part is that formatFloat(x, ffDecimal, 2) produces a correctly rounded number with exactly two digits after the decimal point. In contrast to $(round(x,2)) which only works in some cases.

@skilchen
Copy link
Contributor

skilchen commented Sep 28, 2018

@krux02 this is more a problem of Nim's default stringification of floats than of the 2-argument rounding procedure. And i repeat for the last time that most people are surprised when you tell them that 7.015 "correctly rounded" to 2 decimal digits gives 7.01 and not 7.02 as they learned in school. This is just an artifact of IEEE-754 and has nothing to do with correctness in general.

I just found out, that Python indeed does a roundtrip float -> dtoa -> float in its rounding function.
Here is a toy with a rounding procedure that does this same roundtrip, although using C's [fe]cvt for the rounding and @LemonBoy's new dtoa for the stringification instead of D. Gay's dtoa:

import strutils, math
    
when defined(js):
  proc parseFloat(str: string): float =
    let cstr = cstring(str)
    {.emit: "`result` = parseFloat(`cstr`);".}
    
  proc parseInt(str: string): int =
    let cstr = cstring(str)
    {.emit: "`result` = parseInt(`cstr`);".}

  proc round(f:float, ndigits: int): float =
    case classify(f)
    of fcZero, fcNegZero, fcInf, fcNegInf, fcNan:
      return f
    else:
      discard

    {.emit: "`result` = parseFloat(`f`.toFixed(`ndigits`));".}
    if result == 0.0:
      {.emit: "`result` = parseFloat(`f`.toPrecision(`ndigits`));".}
  
  proc dtoa(f: float): string =
    case classify(f)
    of fcZero: return "0.0"
    of fcNegZero: return "-0.0"
    of fcInf: return "inf"
    of fcNegInf: return "-inf"
    of fcNan: return "nan"
    else:
      discard
  
    var r: cstring
    {.emit: """
      `r` = `f`.toString();
    """
    .}
    result = $r
    
else:
  import dtoa
  
  proc fcvt(f: float, ndigits: int, decpt: ptr cint, sig: ptr cint): cstring 
    {.importc: "fcvt", header: "<stdlib.h>".}

  proc ecvt(f: float, ndigits: int, decpt: ptr cint, sig: ptr cint): cstring 
    {.importc: "ecvt", header: "<stdlib.h>".}

  proc round(f: float, ndigits: int): float =
    case classify(f)
    of fcZero, fcNegZero, fcInf, fcNegInf, fcNan:
      return f
    else:
      var decpt: cint
      var sig: cint
      var s: string
      s = $fcvt(f, ndigits, addr decpt, addr sig)
      if s == "":
        s = $ecvt(f, ndigits, addr decpt, addr sig)
      if decpt <= 0:
        for i in decpt .. 0:
          s.insert("0", 0)
        s.insert(".", 1)
      else:
        s.insert(".", decpt)
      if sig != 0:
        s.insert("-", 0)
      result = parseFloat(s)
  
when defined(test):  
  import times
  proc `$`(dt: DateTime): string =
    dt.format("uuuu-MM-dd HH:mm:ss'.'fff")
  proc `$`(dur: Duration): string =
    $dur.seconds & "." & intToStr(dur.milliseconds, 3)
  proc test() =
    var t0 = now()
    var t1 = t0
    for i in 1..1_000_000_000:
      if i mod 1_000_000 == 0:
        let t2 = now()
        echo t1, " ", align($i, 10), " ", t2 - t1, " ", t2 - t0
        t1 = t2
      let j = float(i) + 0.015
      let s1 = dtoa(round(j, 2))
      let s2 = formatFloat(j, ffDecimal, 2)
      
      doAssert s1 == s2, $s1 & " != " & $s2
  
when isMainModule:
  when defined(js):
    proc paramStr(n: int): string =
      var arg: cstring
      {.emit: "`arg` = process.argv[`n` + 1];".}
      return $arg
  else:
    import os
  let number = parseFloat(paramStr(1))
  let ndigits = parseInt(paramStr(2))
  echo "number: ", number, " ndigits: ", ndigits
  
  let rounded = round(number, ndigits)
  echo rounded, " ", dtoa(rounded), " ", formatFloat(rounded, ffDecimal, min(32, ndigits))  
  
  when defined(test):
    import times, strformat
    test()

One interesting anecdotal fact is that the test procedure runs faster on the js backend than on the c backend... For one part this is due to my inability to write fast Nim code but on the other hand it underlines the fact that NodeJs is really good at dealing with floating point numbers.

@LemonBoy's dtoa needs some better handling of the special float values, i added this to the top of his dtoa procedure:

case classify(value)
  of fcZero: return "0.0"
  of fcNegZero: return "-0.0"
  of fcInf: return "inf"
  of fcNegInf: return "-inf"
  of fcNan: return "nan"
  else:
    discard

@krux02
Copy link
Contributor

krux02 commented Sep 28, 2018

It doesn't matter if people are surprised if they see echo formatFloat(7.015, ffDecimal, 2) produces 7.01, because there is no true 7.015 as an exact binary floating point number in the first place. You even explained that fact in detail. I am not sure what that huge example is supposed to show me.

@krux02
Copy link
Contributor

krux02 commented Oct 5, 2018

@skilchen People will also be surprised if they see that the following code and that it outputs "false":

var a: float32 = 10.1 
echo a == 10.1

We can't put people on a cloud where they can spare themself to learn about how floating point numbers are stored in the computer.

@skilchen
Copy link
Contributor

skilchen commented Oct 5, 2018

That exact equality comparisons of floats are a thing to avoid is one of the first things people learn about floating-point and it is not that surprising that you get better approximations of numbers that are not exactly representable as binary fractions if you have more bits available.

My goals were:

  1. Convince you to close your PR deprecate decimal rounding in binary floating point math libray #9089 yourself.
  2. Contest your claim:

    And for printing the rounding to decimal digits can be done precisely

In reality printing is affected by the exact same surprising properties of binary floating-point arithmetic as is rounding to zero or more decimal digits.

Python and my proc round(f: float, ndigits: int) above show that rounding a float to a number of decimal digits can be done just as "precisely" as in printing, if this is something that matters to you.

Here "precisely" has to be understood in terms of IEEE-754.

Its not me who wants to deprecate things just because one has to know something about binary floating-point arithmetic when using them.

Just for fun, one of the most inaccurate results you can get from floating-point arithmetic:

import math, strutils
var x = 0.3 mod 0.1
echo formatFloat(x, ffDefault, 1)

prints a "precisely" rounded result of 0.1 which is rather far away from the true value of 0.0. I don't think that we therefore should deprecate mod on floats.

@krux02
Copy link
Contributor

krux02 commented Oct 7, 2018

If you want decimal rounding in the floating point math module, it should be called roundDecimal, not round. The simple name round should be reserved for functions that respect the internal representation of round.

In your example the rounding works flawlessly. The modulo division works flawlessly as well. The problem is that there is no 0.3 and no 0.1 in floating point numbers in the first place, so 0.3 mod 0.1 is never actually calculated, because in floating point numbers there is neither a representation of 0.3 nor of 0.1. The error is already at the input not in the rounding or modulo operation

import math, strutils

import math, strutils

echo formatFloat(0.3, ffDefault, 32)
echo formatFloat(0.1, ffDefault, 32)
echo formatFloat(0.3 mod 0.1, ffDefault, 32)

#0.29999999999999998889776975374843
#0.10000000000000000555111512312578
#0.099999999999999977795539507496869

Nobody is surprised that 0.29999999999999998889776975374843 mod 0.10000000000000000555111512312578``` is 0.099999999999999977795539507496869, and nobody is surprised that 0.099999999999999977795539507496869is rounded0.1``. You can't blame mod or formatFloat, if the input numbers are already wrong.

And of course you can do exact equality comparison on floating point numbers, but only if you know what you are doing. Generally it is a good advice to avoid them, but if you know what you are doing, it is possible.

here are some examples that do work precisely:
2.0'f32 == 2.0'f64
1234567'f32 == 1234567'f64
0.5'f32 == 0.5'f64
0.125'f32 == 0.125'f32

here are some example that do not work precisely:
0.1'f32 == 0.1'f64
0.3'f32 == 0.3'f64
16777217'f32 == 1234567'f64

And for the second claim: And for printing the rounding to decimal digits can be done precisely. Yes this is true, binary can be converted lossless into a decimal floating point representation. In this representation decimal rounding can be done exactly. And there will be an exact representation of 0.3 and 0.1. This rounded number can then be printed. I am not saything this is the most efficient way to do it (it's not), but it is exact.

To sum it up: No I am not convinced to close anything, and I still hold to my claims.

Araq pushed a commit that referenced this issue Oct 9, 2018
krux02 added a commit to krux02/Nim that referenced this issue Oct 15, 2018
narimiran pushed a commit to narimiran/Nim that referenced this issue Oct 31, 2018
narimiran pushed a commit to narimiran/Nim that referenced this issue Nov 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants