partly wrong rounding when using printf with "%f" format #6

Open
jghub opened this issue Mar 12, 2020 · 14 comments
Comments

@jghub
Member

jghub commented Mar 12, 2020

I accidentally noted the following behaviour:

 printf '%.1f\n' 0.19  # --> 0.2  good
 printf '%.2f\n' 0.019  # --> 0.01  BAD
 printf '%.1g\n' 0.019  # --> 0.02 good

so %f format seems partly broken...
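As a cross-check outside ksh, here is a minimal sketch in Python, whose % formatting rounds the exact binary value of the double. It only shows what the three calls above would be expected to print; it says nothing about how ksh's printf is implemented.

```python
# Reference results from a printf-style formatter that rounds the exact
# binary value of the double. The inputs are the three from the report above.
cases = [
    ("%.1f", 0.19),
    ("%.2f", 0.019),
    ("%.1g", 0.019),
]

results = [fmt % value for fmt, value in cases]
for (fmt, value), text in zip(cases, results):
    print(fmt, value, "->", text)
```

All three come out rounded up here, so the 0.01 from ksh is the odd one out.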

@hyenias

hyenias commented Mar 20, 2020

@jghub I have been researching this issue for several days now. At present, it seems that if the float is less than one and has one or more zeros immediately after the decimal separator, rounding fails unless enough additional precision is given.

testbox:~$ printf '%#.01f\n' 0.19
0.2
testbox:~$ printf '%#.01f\n' 1.19
1.2
testbox:~$ printf '%#.02f\n' 0.019
0.01
testbox:~$ printf '%#.02f\n' 1.019
1.02
testbox:~$ printf '%#.02f\n' 0.01950
0.01
testbox:~$ printf '%#.02f\n' 0.01951
0.02
testbox:~$ printf '%#.03f\n' 0.0019
0.001
testbox:~$ printf '%#.03f\n' 0.1019
0.102
testbox:~$ printf '%#.03f\n' 1.0019
1.002
testbox:~$ printf '%#.03f\n' 0.0119
0.011

As a workaround, you can apply a precision to a float variable via typeset -F <precision> <variable>. Once a precision is applied, rounding occurs whenever the variable is expanded. Note that the float's stored value remains intact even when a precision is applied.

testbox:~$ typeset -F test=2/3.0
testbox:~$ echo $test
0.6666666667
testbox:~$ typeset -F 2 test
testbox:~$ echo $test
0.67
testbox:~$ typeset -F 4 test
testbox:~$ echo $test
0.6667
testbox:~$ typeset -F test
testbox:~$ echo $test
0.6666666667

Here is an example comparing printf given the float variable itself with printf given its expansion:

testbox:~$ typeset -F 2 ftest=0.019
testbox:~$ printf 'bad: %.2f, good: %.2f\n' ftest $ftest
bad: 0.01, good: 0.02

I believe the problem is even more complex than I have described at present. I am now creating some scripts to help provide more insights into this rounding with floats and precision issue.

@jghub
Member Author

jghub commented Mar 20, 2020

@hyenias: thanks for this information. I agree that the situation is rather confusing. it would be good if you can manage to get a better grasp of what is going on here :).

@saper

saper commented Mar 23, 2020

Interesting:

printf '%.2f\n' 0,0195
0,01
printf '%.2f\n' 0,0196
0,02

off by one error?

@jghub
Member Author

jghub commented Mar 23, 2020

Why/how could that be an off-by-one error? I am too slow... I mean, in your example every number in the semi-open interval [0.0150, 0.0250) should round to 0.02, but actually the numbers in the open interval (0.0195, 0.0295) are "rounded" to 0.02... so the range is shifted by 0.0045 in this example and, additionally, the lower boundary is not included (open rather than semi-open interval).
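The expected interval can be checked with exact decimal arithmetic. This is a minimal sketch in Python, assuming round-half-up on the decimal text as typed; round_half_up is a hypothetical helper for illustration, not anything from ksh.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(text, places=2):
    """Round a decimal literal given as text, with ties going away from zero."""
    quantum = Decimal(1).scaleb(-places)  # Decimal("0.01") when places is 2
    return str(Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP))

# Boundaries of the semi-open interval [0.0150, 0.0250), plus the value 0.0195.
for text in ("0.0149", "0.0150", "0.0195", "0.0249", "0.0250"):
    print(text, "->", round_half_up(text))
```

Because Decimal works on the typed digits, no binary representation error enters this check.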

@saper

saper commented Mar 23, 2020

I mean that while 0.019 is rounded incorrectly, things start to come out right if we supply another digit. Maybe just an "n+1" digit factor is wrong somewhere. As @hyenias noticed, all is fine with 1.019, so it looks like only the first non-zero digit does not get rounded up or down properly.

@hyenias

hyenias commented Mar 24, 2020

I am still working on some testing script(s) to help identify the extent of the problem as well as provide workaround(s). In particular, I thought I had a good working ksh rounding function but when I tried it out on a 32bit system instead of my normal 64bit it failed.

Here are some of my current observations:

  • both ksh rounding options, printf and typeset float precision, fail when rounding away from zero
  • providing text as input produces more rounding errors than using floats
  • 32-bit machines (maybe OSes) produce significantly fewer rounding errors
  • more rounding errors occur as values approach zero
  • most rounding errors occur in one of two intervals: 5s, 5-9s
  • initial testing on Intel 64-bit CPUs produced a failure rate of nearly 10% for values less than one: (490981 * 1 + 99953 * 5 + 47 * 4) / 1e7 = 0.0990934
testbox1:~$ ksh test_float_rounding.ksh
count: 1.0000001, precision: 6, start: 0.0000000, stop: 1.0000000, increment: 0.0000001
typeset -A -i intervals=([5:5]=490981 [5:9]=99953 [6:9]=47)
  • on a 32-bit box all 5-9 interval errors disappeared, leaving only 5s errors. Of the 8979 hits in the [5:5] interval, most were false positives: my ksh rounding function produced wrong results while both printf and float precision produced correct values. I am trying to find another method/formula that is 100% accurate to check against.
64bitbox:~$ ksh test_float_rounding.ksh
count: 2, precision: 4, start: 0.00000, stop: 2.00000, increment: 0.00001
typeset -A -i intervals=([5:5]=11050 [5:9]=1000)
...
32bitbox:~$ ksh test_float_rounding.ksh
count: 2, precision: 4, start: 0.00000, stop: 2.00000, increment: 0.00001
typeset -A -i intervals=([5:5]=8979)
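
The failure-rate figure for the first run above can be reproduced from its interval counts. This is a minimal sketch in Python, assuming the multipliers 1, 5 and 4 in the formula are the number of trailing digits each interval covers (5 only, 5 through 9, 6 through 9); that reading is inferred from the formula and is not stated by the test script.

```python
# Interval counts copied from the first run (precision 6, values 0.0 to 1.0).
intervals = {(5, 5): 490981, (5, 9): 99953, (6, 9): 47}
total_values = 10_000_000  # the 1e7 in the formula above

# Weight each count by the number of digits its interval spans (assumption).
failures = sum(count * (hi - lo + 1) for (lo, hi), count in intervals.items())
failure_rate = failures / total_values
print(failures, failure_rate)
```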

More to come later...

@hyenias

hyenias commented Mar 26, 2020

I now have my test script running correctly on an ARMv7 box with an adjustment in my formula. The 32bitbox numbers supplied previously were run using text as input. Here are the corrected results.

Corrected:

32bitbox:~$ ksh test_float_rounding.ksh
dtype: t, count: 2, precision: 4, start: 0.00000, stop: 2.00000, increment: 0.00001
typeset -A -i intervals=([5:5]=9908)

Next up for me is to see if my test script holds up on 64bit CPU using 32bit OS.

@McDutchie

McDutchie commented Jun 10, 2020

Hey @hyenias, since ksh-community is not going anywhere fast (see discussion in #11), I'd like to invite you to submit a pull request to the 93u+m branch at https://github.com/ksh93/ksh when you've figured out a fix. Floating point math is well over my head, so your contribution and expertise would be an asset.

@hyenias

hyenias commented Jun 13, 2020

I was able to get my test rounding script to work on my macOS, various Linux, and FreeBSD versions back in late April. Interestingly, ksh93's floating point behavior differed the most on my 32-bit FreeBSD box, with its floating point responses being more in line with what I expected.

Up until this issue, I did not remember the floating point math material from long ago. During my research, I found 0.30000000000000004.com as the most enlightening.

freebsd32-intel64:
$ typeset -F a b c
$ a=.1
$ b=.2
$ c=a+b
$ typeset -p a b c
typeset -F a=0.1000000000
typeset -F b=0.2000000000
typeset -F c=0.3000000000
$
$ printf '%.17f\n' c
0.30000000000000004
$ printf '%.17f\n' a+b
0.30000000000000004
$ printf '%.17f\n' .1+.2
0.30000000000000004
$ printf '%.17f\n' $((.1+.2))
0.30000000000000004
$ printf '%.17f\n' $((a+b))
0.30000000000000004
$ printf '%.17f\n' $((c))
0.30000000000000004
$
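
The same values can be reproduced with any IEEE 754 double implementation. This is a minimal sketch in Python, assuming the 10-digit form matches the default shown by typeset -p above and the 17-digit form matches printf '%.17f'.

```python
a, b = 0.1, 0.2
c = a + b

short = "%.10f" % c   # ten digits, like the typeset -p output above
full = "%.17f" % c    # seventeen digits, like printf '%.17f'
print(short, full, c == 0.3)
```

The short form hides the representation error that the long form exposes, and c does not compare equal to the literal 0.3.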

@McDutchie Thank you for the invite and I will be pleased to submit a pull request to your ksh93 branch. When all of my other ksh93 installations produced different results than the above along with the continuing kickoff efforts for ksh-community, I started learning other things. With all of your ksh93 branch activity, I am energized to continue working on this issue.

@hyenias

hyenias commented Jul 12, 2020

Well, I have stared at the numbers long enough to identify the following three problems with ksh's rounding in printf and in float precision expansion.

  1. Floats that end in 5 are represented as 49... [9 repeating], which is normal for floating point math. The problem is that ksh acts on a string representation of the float/double/etc. and performs the rounding on characters: it adds 5 at the digit one position past the requested precision, then works backwards along the string buffer towards the beginning to propagate any carry. A '4' char there becomes '9' with no carry, which is a problem if 1.015's float-to-text conversion yields 1.0149999999999999.
$ typeset -F2 x=1.015
$ printf '%.2f %s %s\n' x $x $((x))
1.01 1.01 1.0149999999999999
$ # expected rounding should be 1.02
  2. For denormal or subnormal numbers in the range 0.0 < x < 0.1, rounding fails for 5-9s, probably due to some performance technique to avoid denormals, such as scaling the values, since floating point operations used to be very slow compared to integer operations.
  3. Depending on OS and CPU, and maybe even compiler option(s), rounding is constrained to a maximum precision of 6 or 10, when the maximum for the particular platform and OS should be 15-17 digits for a double precision floating point number. My guess is that some system-defined float default has changed in some way.
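
To make the first problem concrete, here is a minimal sketch in Python of rounding done on a digit string. The helper round_digit_string is hypothetical and only imitates the approach as described above (add 5 one digit past the requested precision, then carry leftwards); it is not the code in sfcvt.c.

```python
from decimal import Decimal

def round_digit_string(text, places):
    """Round a positive decimal string by adding 5 one digit past `places`."""
    whole, _, frac = text.partition(".")
    frac = frac.ljust(places + 1, "0")[:places + 1]
    digits = [0] + [int(ch) for ch in whole + frac]  # leading 0 absorbs a final carry
    digits[-1] += 5
    # Propagate carries from the last digit back towards the front.
    for i in range(len(digits) - 1, 0, -1):
        if digits[i] > 9:
            digits[i] -= 10
            digits[i - 1] += 1
    digits.pop()  # drop the guard digit that received the +5
    split = len(digits) - places
    whole_out = "".join(map(str, digits[:split])).lstrip("0") or "0"
    frac_out = "".join(map(str, digits[split:]))
    return whole_out + ("." + frac_out if places else "")

x = 1.015
long_text = format(x, ".17g")  # 17 significant digits
short_text = repr(x)           # shortest string that round-trips
print(Decimal(x))              # exact value stored in the double
print(round_digit_string(long_text, 2), round_digit_string(short_text, 2))
```

The stored double is slightly below 1.015, so a formatter that rounds the exact binary value also prints 1.01 for '%.2f'. In this sketch the digit-string method yields 1.02 only when it starts from the shortest round-trip text, so the choice of float-to-text conversion decides the result.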

@hyenias

hyenias commented Jul 12, 2020

I will now work on some code fixes for the above. Primary culprit is src/lib/libast/sfio/sfcvt.c.

@hyenias

hyenias commented Aug 23, 2020

Update: I have made good progress on problems 1 and 2 above. I still need to validate against the other float formats, SFFMT_EFORMAT and SFFMT_AFORMAT, as I have concentrated only on plain floats and not on floats in scientific or hexadecimal notation.

@jelmd
Member

jelmd commented Aug 28, 2020

https://members.loria.fr/PZimmermann/papers/accuracy.pdf - interesting as well ...

@hyenias

hyenias commented Sep 13, 2020

I have submitted corrections for sfcvt.c that fix the original problem identified, which is my 2nd bullet above about subnormal numbers. Still working on my items 1 and 3. I have also found another problem with rounding to whole numbers via a precision of 0.

ksh93/ksh#131 (comment)
