Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sprintf %f bogus rounding #5519

Open
p6rt opened this issue Aug 2, 2016 · 9 comments
Open

sprintf %f bogus rounding #5519

p6rt opened this issue Aug 2, 2016 · 9 comments
Labels
Bug

Comments

@p6rt
Copy link

@p6rt p6rt commented Aug 2, 2016

Migrated from rt.perl.org#128818 (status was 'open')

Searchable as RT128818$

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 2, 2016

From zefram@fysh.org

(1180591620717411303424.0e0).Int
1180591620717411303424
sprintf("%f", 1180591620717411303424.0e0)
1180591620717410000000.000000

sprintf %f is not showing the true value of this Num, which it should.
The .Int coercion is correct, and shows that the true value is actually
available.

-zefram

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 2, 2016

From @pmichaud

On Tue, Aug 02, 2016 at 09​:56​:38AM -0700, Zefram wrote​:

(1180591620717411303424.0e0).Int
1180591620717411303424
sprintf("%f", 1180591620717411303424.0e0)
1180591620717410000000.000000

sprintf %f is not showing the true value of this Num, which it should.
The .Int coercion is correct, and shows that the true value is actually
available.

Nums hold only 15-17 digits of precision, so sprintf("%f")
likely has this one correct.

In particular, the true value is *not* always available, as illustrated by​:

  > (1180591620717411303424.0e0).Int
  1180591620717411303424
  > (1180591620717411303434.0e0).Int
  1180591620717411303424

Pm

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 2, 2016

The RT System itself - Status changed from 'new' to 'open'

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 2, 2016

From zefram@fysh.org

Patrick R. Michaud via RT wrote​:

Nums hold only 15-17 digits of precision, so sprintf("%f")
likely has this one correct.

No, it does not have this correct. The value represented in the Num is
exactly 2**70, and the .Int accurately reflects that. It is the sprintf
that is failing to show the Num's value.

2**70
1180591620717411303424
(2e0**70).Int
1180591620717411303424
sprintf("%f", 2e0**70)
1180591620717410000000.000000

In particular, the true value is *not* always available,

By "true value" I meant the value represented in floating point.
I did not mean the value directly represented by the decimal literal.
Your example here shows unavoidable (and correctly performed) rounding
in the decimal->float conversion. In my example I deliberately used a
value that could be exactly represented in floating point, so that this
decimal->float rounding would not arise. It is not the issue here.

It's sounding, in this and [perl #​128817], rather as though you're
thinking about floating point precision entirely in decimal terms.
It is as if you think Num implements decimal floating point, though
empirically it clearly exhibits the arithmetic behaviour of ordinary
binary floating point. Is there perhaps some underlying design decision
for decimal floating point that hasn't been carried through?

-zefram

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 2, 2016

From @pmichaud

On Tue, Aug 02, 2016 at 07​:46​:20PM +0100, Zefram wrote​:

sprintf("%f", 2e0**70)
1180591620717410000000.000000

In particular, the true value is *not* always available,

By "true value" I meant the value represented in floating point.

My apologies, I did not catch this meaning of "true value" from
your posts. You've correctly deduced that I've been referring to
the mathematically pure value represented by the decimal literal in
source code, and not the value that results from its conversion to
floating point.

I agree that having sprintf("%f") automatically truncate with zeros
after 15 digits of precision is perhaps not the best approach here.
I think it should perhaps be available to at least 17 places, and
possibly even more in examples like 2**70. But that's a language
specification call that someone outside of Rakudo probably needs
to make.

Pm

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 2, 2016

From zefram@fysh.org

Patrick R. Michaud wrote​:

                                       But that's a language

specification call that someone outside of Rakudo probably needs
to make.

doc/Type/Str.pod (which I checked before reporting the bug) says it's
"mostly identical to the C library sprintf function", and doesn't describe
any relevant derogation from the C definition. C practice is to show
the value actually represented by the floating-point value, correctly
rounded to the output precision. If you want the formal definition, the
most up-to-date version of the C standard that I have (not very recent,
but I don't expect this to have changed) says

# f,F A double argument representing a (finite) floating-point number
# is converted to decimal notation in the style "[-]ddd.ddd",
# where the number of digits after the decimal-point character
# is equal to the precision specification. If the precision is
# missing, it is taken as 6; if the precision is zero and the
# "#" flag is not specified, no decimal-point character appears.
# If a decimal-point character appears, at least one digit
# appears before it. The value is rounded to the appropriate
# number of digits.

This is slightly less clear than I expected​: there's a tiny bit of room
to argue about what the "appropriate number of digits" is. In context
I think it's clear enough that the appropriate number is the number of
digits following the decimal point, which the standard has just spent
half the paragraph describing. The standard does go on to make clear
that it doesn't require the rounding to be correct, though.

-zefram

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 3, 2016

From zefram@fysh.org

A related case, now that I've actually looked at the implementation​:

(2251799813685577e0 * 2e0**-51) - 1
1.46105350040671e-13
sprintf("%.13f", 2251799813685577e0 * 2e0**-51)
1.0000000000002

The Num value in this case is exactly 2251799813685577/2251799813685248,
or 1.000000000000146105350040670600719749927520751953125. (Subtracting
1 correctly shows a few of these digits.) The correct rounding to 13
decimal places is 1.0000000000001. There is plenty of precision left
in the float format after these decimal places​: one unit in that last
output place corresponds to over 450 ulp in the float format.

What's going on is that the implementation is first converting to
decimal with its default of 15 total digits, getting a correctly-rounded
1.00000000000015, and then rounding *that* to the output length,
getting 1.0000000000002. This is double rounding, a classic type of
mistake in numerical computation. You need to perform rounding only
once, straight to the output length​: either during the binary->decimal
conversion, or after a binary->decimal conversion that produced all the
digits necessary to represent the value exactly.

-zefram

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 12, 2016

From zefram@fysh.org

Attached is a straightforward implementation of the floating-point
printf conversions. I believe it to be correct. In addition to the
decimal %e, %f, and %g, I did the rather useful hexadecimal %a, thus
far entirely missing from Rakudo's sprintf.

I needed some supporting floating-point utilities, for which I translated
most of my Perl 5 module Data​::Float (including some parts not actually
used by the formatting code). Many of these constants and functions
really ought to be made available in Rakudo in some form, particularly the
IEEE 754r standard functions such as nextafter(). All of them would gain
a lot of performance from a C-level implementation that knows the exact
layout of the floating-point type; the code here is clunky but portable.

An observation​: testing floating-point code is substantially impeded by
not having reliable means of inputting and outputting floating-point
values. In this case, working on output code, I was impeded by
some problems with Rakudo's decimal->float conversions on input,
hence the handful of recent bug reports from me on that subject.
(This work also produced a slew of bug reports about Perl 5's printf
%a.) In investigating and reporting those bugs I was impeded by
the lack of good floating-point output facilities in Rakudo, hence
the less-than-obvious ways of examining Num values in those reports.
Input and output facilities support each other.

A consequent suggestion​: rather than working on decimal floating-point
I/O straight away, as a matter of priority you should add hexadecimal
floating-point input and output. Currently you have no hex output,
and only some unadvertised and limited input (no exponent handling)
in Str.Num. The hex is much easier to get right than the decimal,
because the radices are commensurable. Once hex I/O is available, it's
good for all kinds of debugging purposes, for injecting test values and
examining results. Having straightforward and reliable I/O makes all
other floating-point issues much easier to approach. Hex would also be
an easier way to solve [perl #​128819] than anything relying on decimal.

-zefram

@p6rt

This comment has been minimized.

Copy link
Author

@p6rt p6rt commented Aug 12, 2016

From zefram@fysh.org

# writable scalar references

my sub ref_to(Mu $a is rw) { my @​s; @​s[0] := $a; @​s }
my sub read_ref(@​s) { @​s[0] }
my sub write_ref(@​s, Mu $nv) { @​s[0] = $nv }

# floating-point utilities (imitating most of Perl 5 Data​::Float)

my @​powtwo = (2e0,);
my @​powhalf = (0.5e0,);

sub mult_pow2(Num​:D $value is copy, Int​:D $exp is copy) {
  my @​powa := @​powtwo;
  if $exp < 0 {
  @​powa := @​powhalf;
  $exp = -$exp;
  }
  loop (my $i = 0; $i != @​powa.end && $exp != 0; $i++) {
  $value *= @​powa[$i] if $exp +& 1;
  $exp +>= 1;
  }
  for ^$exp { $value *= @​powa[*-1]; }
  return $value;
}

my $min_finite_exp;
my $max_finite_exp;
my $max_finite_pow2;
my $min_finite;

my @​directions = (
  {
  expsign => -1,
  powa => @​powhalf,
  xexp => ref_to($min_finite_exp),
  xpower => ref_to($min_finite),
  },
  {
  expsign => +1,
  powa => @​powtwo,
  xexp => ref_to($max_finite_exp),
  xpower => ref_to($max_finite_pow2),
  },
);

while !@​directions[0]<done> || !@​directions[1]<done> {
  for @​directions -> $direction {
  next if $direction<done>;
  my $lastpow = $direction<powa>[*-1];
  my $nextpow = $lastpow * $lastpow;
  unless mult_pow2($nextpow, -$direction<expsign> *
  (1 +< $direction<powa>.end)) == $lastpow {
  $direction<done> = True;
  next;
  }
  $direction<powa>.push​: $nextpow;
  }
}

for @​directions -> $direction {
  my $expsign = $direction<expsign>;
  my $xexp = 1 +< $direction<powa>.end;
  my $extremum = $direction<powa>[*-1];
  loop (my $addexp = $xexp; $addexp +>= 1; ) {
  my $nx = mult_pow2($extremum, $expsign * $addexp);
  if mult_pow2($nx, -$expsign * $addexp) == $extremum {
  $xexp += $addexp;
  $extremum = $nx;
  }
  }
  write_ref($direction<xexp>, $expsign * $xexp);
  write_ref($direction<xpower>, $extremum);
}

sub pow2(Int​:D $exp where $exp >= $min_finite_exp && $exp <= $max_finite_exp) {
  mult_pow2(1e0, $exp)
}

my $significand_bits;
my $significand_step;
{
  my $i;
  loop ($i = 1; ; $i++) {
  my $tryeps = @​powhalf[$i];
  last unless (1e0 + $tryeps) - 1e0 == $tryeps;
  }
  $i--;
  $significand_bits = 1 +< $i;
  $significand_step = @​powhalf[$i];
  while $i-- {
  my $tryeps = $significand_step * @​powhalf[$i];
  if (1e0 + $tryeps) - 1e0 == $tryeps {
  $significand_bits += 1 +< $i;
  $significand_step = $tryeps;
  }
  }
}

my $max_finite = $max_finite_pow2 -
  pow2($max_finite_exp - $significand_bits - 1);
$max_finite += $max_finite;

my $max_integer = pow2($significand_bits + 1);

my $have_subnormal;
{
  my $testval = $min_finite * 1.5e0;
  $have_subnormal = $testval == $min_finite ||
  $testval == ($min_finite + $min_finite);
}

my $min_normal_exp = $have_subnormal ?? $min_finite_exp + $significand_bits !!
  $min_finite_exp;
my $min_normal = $have_subnormal ?? mult_pow2($min_finite, $significand_bits) !!
  $min_finite;

my $have_signed_zero = /^\-/.ACCEPTS((-0e0).Str).Bool;

enum FloatClass <Normal Subnormal Zero Infinite Nan>;

sub float_class(Num​:D $val is copy) {
  return Zero if $val == 0e0;
  return Nan if $val != $val;
  $val = -$val if $val < 0e0;
  return Infinite if $val == Inf;
  return $have_subnormal && $val < $min_normal ?? Subnormal !! Normal;
}

sub float_is_normal(Num​:D $val) { float_class($val) == Normal }

sub float_is_subnormal(Num​:D $val) { float_class($val) == Subnormal }

sub float_is_infinite(Num​:D $val) { $val == Inf || $val == -Inf }

sub float_is_nzfinite(Num​:D $val) {
  $val != 0e0 && $val == $val && !float_is_infinite($val)
}

sub float_is_zero(Num​:D $val) { $val == 0e0 }

sub float_is_finite(Num​:D $val) { $val == $val && !float_is_infinite($val) }

sub float_is_nan(Num​:D $val) { $val != $val }

sub signbit(Num​:D $val) {
  ($have_signed_zero && $val == 0e0 ??
  /^\-/.ACCEPTS($val.Str).Bool !! $val < 0e0) ?? 1 !! 0;
}

enum FloatSign <Positive Negative>;

sub float_sign(Num​:D $val where !float_is_nan($val)) {
  signbit($val) ?? Negative !! Positive
}

sub float_parts(Num​:D $val is copy where float_is_nzfinite($val)) {
  my $sign = Positive;
  if $val < 0e0 {
  $sign = Negative;
  $val = -$val;
  }
  if $have_subnormal && $val < $min_normal {
  return ($sign, $min_normal_exp,
  mult_pow2($val, -$min_normal_exp));
  }
  my $exp = 0;
  if $val < 1e0 {
  loop (my $i = @​powhalf.elems; $i--; ) {
  $exp +<= 1;
  if $val < @​powhalf[$i] {
  $exp +|= 1;
  $val = mult_pow2($val, 1 +< $i);
  }
  }
  $val *= 2e0;
  $exp = -1 - $exp;
  } elsif $val >= 2e0 {
  loop (my $i = @​powtwo.elems; $i--; ) {
  $exp +<= 1;
  if $val >= @​powtwo[$i] {
  $exp +|= 1;
  $val = mult_pow2($val, -(1 +< $i));
  }
  }
  }
  return ($sign, $exp, $val);
}

sub copysign(Num​:D $val, Num​:D $signfrom) {
  signbit($val) == signbit($signfrom) ?? $val !! -$val
}

sub nextup(Num​:D $val) {
  return $val if $val != $val || $val == Inf;
  return -$max_finite if $val == -Inf;
  return $min_finite if $val == 0e0;
  (my $sign, my $exp, my $significand) = float_parts($val);
  if $sign == Positive {
  $significand += $significand_step;
  if $significand == 2e0 {
  return Inf if $exp == $max_finite_exp;
  $significand = 1e0;
  $exp++;
  }
  } else {
  if $significand == 1e0 && $exp != $min_normal_exp {
  $significand = 2e0;
  $exp--;
  }
  $significand -= $significand_step;
  }
  return copysign(mult_pow2($significand, $exp), $val);
}

sub nextdown(Num​:D $val) { -nextup(-$val) }

sub nextafter(Num​:D $val, Num​:D $dir) {
  return $dir if $dir != $dir || $val == $dir;
  return $dir > $val ?? nextup($val) !! nextdown($val);
}

# floating-point output

my sub float_bin_scale(Num​:D $val where float_is_finite($val)) {
  return (float_sign($val), 0, 0) if float_is_zero($val);
  (my $sign, my $exp, my $significand) = float_parts($val);
  if $significand < 1e0 {
  (my $, my $addexp, $significand) =
  float_parts($significand);
  $exp += $addexp;
  }
  return ($sign, $exp, mult_pow2($significand, $significand_bits).Int);
}

my sub float_dec_scale(Num​:D $val where float_is_finite($val)) {
  (my $sign, my $exp, my $sclsig) = float_bin_scale($val);
  my $nfrac = $significand_bits - $exp;
  if $nfrac >= 0 {
  return ($sign, $nfrac, $sclsig * 5 ** $nfrac);
  } else {
  return ($sign, 0, $sclsig * 2 ** -$nfrac);
  }
}

my sub floor_log10(Int​:D $val is copy where $val >= 1) {
  state @​powten = (10,);
  while $val > @​powten[*-1] {
  my $t = @​powten[*-1];
  @​powten.push​: $t * $t;
  }
  my $exp = 0;
  loop (my $i = @​powten.elems; $i--; ) {
  $exp +<= 1;
  if $val >= @​powten[$i] {
  $exp +|= 1;
  $val div= @​powten[$i];
  }
  }
  return $exp;
}

my sub round2(Int​:D $val where $val >= 0, Int​:D $losebits) {
  if $losebits > 0 {
  my $lbit = 1 +< $losebits;
  my $hbit = $lbit +> 1;
  my $lval = $val +& ($lbit - 1);
  my $rval = $val +> $losebits;
  $rval++ if $lval == $hbit ?? $rval +& 1 !! $lval > $hbit;
  return $rval;
  } else {
  return $val +< -$losebits;
  }
}

my sub round10(Int​:D $val where $val >= 0, Int​:D $losedigits) {
  if $losedigits > 0 {
  my $lone = 10 ** $losedigits;
  my $hone = $lone +> 1;
  my $lval = $val % $lone;
  my $rval = $val div $lone;
  $rval++ if $lval == $hone ?? $rval +& 1 !! $lval > $hone;
  return $rval;
  } else {
  return $val * 10 ** -$losedigits;
  }
}

sub float_sprintf(
  Num​:D $val,
  Str​:D $style where /^<[aefg]>$/.ACCEPTS($style),
  Int​:D :$width where $width >= 0 = 0,
  Int​:D :$precision is copy where $precision >= -1 = -1,
  Bool​:D :$upper = False, Bool​:D :$left = False, Bool​:D :$plus = False,
  Bool​:D :$space = False, Bool​:D :$alternate is copy = False,
  Bool​:D :$zero is copy = False,
) {
  my $sign;
  my $prefix;
  my $digits;
  my $suffix;
  if float_is_nan($val) || float_is_infinite($val) {
  if float_is_nan($val) {
  $sign = Positive;
  $suffix = "nan";
  } else {
  $sign = float_sign($val);
  $suffix = "inf";
  }
  $prefix = "";
  $digits = "";
  $precision = 0;
  $alternate = False;
  $zero = False;
  } elsif $style eq "a" {
  ($sign, my $exp, my $sclsig) = float_bin_scale($val);
  my $nfrac = ($significand_bits + 3) div 4;
  $sclsig +<= ($nfrac * 4) - $significand_bits;
  if $precision == -1 {
  $precision = $nfrac;
  while $precision != 0 && ($sclsig +& 0xf) == 0 {
  $precision--;
  $sclsig +>= 4;
  }
  } else {
  $sclsig = round2($sclsig, ($nfrac - $precision) * 4);
  if $sclsig == (2 +< ($precision * 4)) {
  $sclsig +>= 1;
  $exp++;
  }
  }
  $prefix = "0x";
  $digits = sprintf("%0*x", $precision + 1, $sclsig);
  $suffix = sprintf("p%+d", $exp);
  } elsif $style eq "f" {
  $precision = 6 if $precision == -1;
  ($sign, my $nfrac, my $sclsig) = float_dec_scale($val);
  $sclsig = round10($sclsig, $nfrac - $precision);
  $prefix = "";
  $digits = sprintf("%0*d", $precision + 1, $sclsig);
  $suffix = "";
  } else {
  $precision = 6 if $precision == -1;
  $precision-- if $style eq "g" && $precision != 0;
  ($sign, my $nfrac, my $sclsig) = float_dec_scale($val);
  my $ndig = $sclsig == 0 ?? $nfrac !! floor_log10($sclsig);
  my $exp = $ndig - $nfrac;
  $sclsig = round10($sclsig, $ndig - $precision);
  if $sclsig == 10 ** ($precision + 1) {
  $sclsig div= 10;
  $exp++;
  }
  my $use_e = $style eq "e" || $exp < -4 || $exp > $precision;
  $precision -= $exp unless $use_e;
  if $style eq "g" && !$alternate {
  while $precision != 0 && $sclsig %% 10 {
  $precision--;
  $sclsig div= 10;
  }
  }
  $prefix = "";
  $digits = sprintf("%0*d", $precision + 1, $sclsig);
  $suffix = $use_e ?? sprintf("e%+03d", $exp) !! "";
  }
  my $mainstr = substr($digits, 0, * - $precision) ~
  ($alternate || $precision != 0 ?? "." !! "") ~
  substr($digits, * - $precision) ~ $suffix;
  $prefix = ($sign == Negative ?? "-" !! $plus ?? "+" !!
  $space ?? " " !! "") ~ $prefix;
  my $deficit = max($width - ($prefix.chars + $mainstr.chars), 0);
  my $str = $left ?? $prefix ~ $mainstr ~ (" " x $deficit) !!
  $zero ?? $prefix ~ ("0" x $deficit) ~ $mainstr !!
  (" " x $deficit) ~ $prefix ~ $mainstr;
  $str .= uc if $upper;
  return $str;
}

# main program for test

my $fmt = @​*ARGS.shift;
/^\%(<[\-\+\x[20]\#​0]>*)(<[123456789]><[0123456789]>*)?
  (\.(<[0123456789]>*))?(<[aAeEfFgG]>)
$/.ACCEPTS($fmt) or die;
my $flags = $0.Str;
my $width = $1 ?? $1.Int !! 0;
my $precision = $2 ?? ($2[0] eq "" ?? 0 !! $2[0].Int) !! -1;
my $conv = $3.Str;
my $style = lc($conv);
my $upper = /^<[A..Z]>$/.ACCEPTS($conv).Bool;
my $left = /\-/.ACCEPTS($flags).Bool;
my $plus = /\+/.ACCEPTS($flags).Bool;
my $space = /\x[20]/.ACCEPTS($flags).Bool;
my $alternate = /\#/.ACCEPTS($flags).Bool;
my $zero = /0/.ACCEPTS($flags).Bool;

for @​*ARGS {
  my $num = /^\-(0+(\.0+)?|\.0+)(<[eE]><[\-\+]>?<[0123456789]>+)?$/\
  .ACCEPTS($_) ?? -0e0 !! $_.Num;
  say $_, " -> ",
  float_sprintf($num, $style, :$width, :$precision, :$upper,
  :$left, :$plus, :$space, :$alternate, :$zero).perl;
}

@p6rt p6rt added the Bug label Jan 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.