support hexadecimal floats #13966
Comments
From @jhi
[resubmitting since I think the grues ate my first attempt]
Perl could support hexadecimal floats:
* literals: 0xh.hhhp[+-]?NNN, e.g. 0x1.47ae147ae147bp-7 is 0.1
Lack of %a noted by Dan Kogai: https://groups.google.com/d/msg/perl.perl5.porters/c84JU0olnbQ/YwQczyrqE2YJ
Possibly useful resource: http://www.exploringbinary.com/hexadecimal-floating-point-constants/ found by quick googling.
Ruby does support the %a %A as noted by Dan, and Python has float.hex() and float.fromhex(). |
From @jhi
Oops, 0.01 |
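A minimal sketch of the corrected claim, assuming the platform's strtod() accepts C99 hexadecimal floats (the thread later shows that not every system does). The function name hexfloat_value is made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: convert hexfloat text such as
 * "0x1.47ae147ae147bp-7" to a double with the C library.
 * Assumes a C99-conforming strtod(). */
double hexfloat_value(const char *text)
{
    return strtod(text, NULL);
}
```

On an IEEE 754 double platform, hexfloat_value("0x1.47ae147ae147bp-7") compares equal to 0.01, while 0.1 corresponds to 0x1.999999999999ap-4, the value used in the patch tests further down.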
From @iabyn
On Wed, Jul 02, 2014 at 07:49:46PM -0700, Jarkko Hietaniemi wrote:
Wouldn't that change the meaning of existing legal syntax: e.g. print 0x1.10; -- |
The RT System itself - Status changed from 'new' to 'open' |
From @ilmari
Dave Mitchell <davem@iabyn.com> writes:
$ perl -e 'print 0x1.10p+0' -- |
From @jhi
On Thursday-201407-03, 8:26, Dagfinn Ilmari Mannsåker via RT wrote:
Yeah, I think the 'p' (hmm, is that 'P' with %A?) is a mandatory part of |
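The point about the mandatory 'p' can be sketched as a small recognizer. This is a hypothetical helper written for illustration, not the toke.c logic: it accepts a token only when the binary exponent is present, so that text like 0x1.10 is left alone and keeps its existing meaning.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper, not the toke.c code: returns true only when `s`
 * is "0x", hex digits with an optional '.', and a mandatory binary
 * exponent ('p' or 'P', optional sign, at least one decimal digit). */
bool looks_like_hexfloat(const char *s)
{
    int digits = 0;

    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X'))
        return false;
    s += 2;
    while (isxdigit((unsigned char)*s)) { s++; digits++; }
    if (*s == '.') {
        s++;
        while (isxdigit((unsigned char)*s)) { s++; digits++; }
    }
    if (digits == 0)
        return false;        /* no hex digits at all, e.g. "0x.p0" */
    if (*s != 'p' && *s != 'P')
        return false;        /* no exponent: "0x1.10" is not a hexfloat */
    s++;
    if (*s == '+' || *s == '-')
        s++;
    if (!isdigit((unsigned char)*s))
        return false;        /* exponent needs at least one digit */
    while (isdigit((unsigned char)*s))
        s++;
    return *s == '\0';       /* nothing may follow the exponent */
}
```

Underscores inside the digits, which the later patch allows, are deliberately left out of this sketch.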
From @Hugmeir
On Thu, Jul 3, 2014 at 2:34 PM, Jarkko Hietaniemi <jhi@iki.fi> wrote:
sub deadbeefp () {3} Personally, I think adding the construct + a deprecation warning for |
From @iabyn
On Thu, Jul 03, 2014 at 01:25:54PM +0100, Dagfinn Ilmari Mannsåker wrote:
Ah sorry, didn't spot the p. -- |
From @jhi
You have a twisted mind, and this is a compliment.
Based on |
From @jhi
So I did some hacking to get this working for at least *printf and literals, and two patches are attached.
However: the "hexadecimal floats" support seems to be quite... interesting. As in "interesting times" interesting.
So it's a C99 feature. Output with sprintf %a %A, input with strtod (or strtold). In theory.
The attached patches (and their tests) work with: OSX x86 (I *think* the output side at least did work in win32, but the win32 smoker must be overwhelmed or something, I seem to get no results)
But cracks start to appear... OS X x86 with -Duselongdouble has differences in the *printf output
On the output side differences are easy since we are talking about floats: the exponent may float. But even what the basic %a means seems to be up to interpretation:
But if strtod is not working, I don't feel like rewriting David Gay's dtoa.c (which is the canonical strtod source for many operating systems, like BSD, or other OSS projects use): http://www.netlib.org/fp/dtoa.c
If output is not working (or needs to be standardized), we need to dig into the fp bits ourselves. I found this from the NetBSD: https://github.com/rumpkernel/netbsd-userspace-src/blob/master/lib/libc/gdtoa/hdtoa.c |
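The "output with %a, input with strtod" pairing can be sketched as a round trip. This assumes a C library with C99 support for both directions; the helper name is hypothetical. Since the text produced by %a is implementation-defined (the exponent "may float"), only the value read back is compared, never the string.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format `value` with the system's %a into `buf`
 * and read it back with strtod(). The text in `buf` may differ between
 * C libraries; the value returned should not. */
double hexfloat_roundtrip(double value, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%a", value);
    return strtod(buf, NULL);
}
```

The same value can be spelled several ways: "0x1p+0" and "0x8p-3" both denote 1.0, which is why the patch tests below need separate expectations for 8-byte and 16-byte NVs.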
From @jhi
0001-Hexfloat-sprintf-a-A-part-of-perl-122219.patch
From aab62f78c4f785265ec874e220e45ec4a0653b06 Mon Sep 17 00:00:00 2001
From: Jarkko Hietaniemi <jhi@iki.fi>
Date: Wed, 30 Jul 2014 21:59:57 -0400
Subject: [PATCH 1/2] Hexfloat sprintf %a/%A, part of perl #122219
Just punt the task to system printf, do whatever it does
for %a/%A (%[efgEFG] are handled likewise).
Let me count the ways this can go wrong:
(1) long doubles
(2) no %a (it's C99)
(2) different implementations of %a
(3) broken implementations of %a
(5) IEEE 754 does not define endianness (big, little, mixed (some arms))
(6) non-IEEE-754 formats (vax, cray, ibm, ...)
---
pod/perlfunc.pod | 7 +++++++
sv.c | 38 ++++++++++++++++++++++++++++++--------
t/op/sprintf.t | 4 ++--
t/op/sprintf2.t | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
4 files changed, 94 insertions(+), 11 deletions(-)
diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 173615b..877dc71 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -7109,6 +7109,8 @@ In addition, Perl permits the following widely-supported conversions:
%p a pointer (outputs the Perl value's address in hexadecimal)
%n special: *stores* the number of characters output so far
into the next argument in the parameter list
+ %a hexadecimal floats
+ %A like %a, but using upper-case letters
Finally, for backward (and we do mean "backward") compatibility, Perl
permits these unnecessary but widely-supported conversions:
@@ -7125,6 +7127,11 @@ exponent less than 100 is system-dependent: it may be three or less
(zero-padded as necessary). In other words, 1.23 times ten to the
99th may be either "1.23e99" or "1.23e099".
+Note that the hexadecimal digits produced by C<%a> and C<%A> are
+system-dependent: most machines use the 64-bit IEEE 754 double
+precision floating point, but some do not. Watch out especially
+for the C<uselongdouble> Perl configuration option.
+
Between the C<%> and the format letter, you may specify several
additional attributes controlling the interpretation of the format.
In order, these are:
diff --git a/sv.c b/sv.c
index afd4376..df2f54c 100644
--- a/sv.c
+++ b/sv.c
@@ -11376,6 +11376,7 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
case 'e': case 'E':
case 'f':
case 'g': case 'G':
+ case 'a': case 'A':
if (vectorize)
goto unknown;
@@ -11428,14 +11429,30 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
/* nv * 0 will be NaN for NaN, +Inf and -Inf, and 0 for anything
else. frexp() has some unspecified behaviour for those three */
if (c != 'e' && c != 'E' && (nv * 0) == 0) {
- i = PERL_INT_MIN;
- /* FIXME: if HAS_LONG_DOUBLE but not USE_LONG_DOUBLE this
- will cast our (long double) to (double) */
- (void)Perl_frexp(nv, &i);
- if (i == PERL_INT_MIN)
- Perl_die(aTHX_ "panic: frexp");
- if (i > 0)
- need = BIT_DIGITS(i);
+ i = PERL_INT_MIN;
+ /* FIXME: if HAS_LONG_DOUBLE but not USE_LONG_DOUBLE this
+ will cast our (long double) to (double) */
+ (void)Perl_frexp(nv, &i);
+ if (i == PERL_INT_MIN)
+ Perl_die(aTHX_ "panic: frexp");
+ if (c == 'a' || c == 'A') {
+ /* This computation probably overshoots,
+ * but that is better than undershooting. */
+ need +=
+ (nv < 0) + /* possible unary minus */
+ 2 + /* "0x" */
+ 2 + /* "1." */
+ /* We want one byte per each 4 bits in the
+ * mantissa. This works out to about 0.83
+ * bytes per NV decimal digit (of 4 bits):
+ * (NV_DIG * log(10)/log(2)) / 4 */
+ ((NV_DIG * 5) / 6 + 1) +
+ 2 + /* "p+" */
+ (i >= 0 ? BIT_DIGITS(i) : 1 + BIT_DIGITS(-i)) +
+ 1; /* \0 */
+ } else if (i > 0) {
+ need = BIT_DIGITS(i);
+ } /* if i < 0, the number of digits is hard to predict. */
}
need += has_precis ? precis : 6; /* known default */
@@ -11573,6 +11590,11 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p
STORE_LC_NUMERIC_SET_TO_NEEDED();
+ /* XXX Configure test for sprintf %a/%A support.
+ * It is a C99 feature, but might be implemented elsewhere.
+ * The bad news is that if there is no support,
+ * we would need to implement %a/%A ourselves. */
+
/* hopefully the above makes ptr a very constrained format
* that is safe to use, even though it's not literal */
GCC_DIAG_IGNORE(-Wformat-nonliteral);
diff --git a/t/op/sprintf.t b/t/op/sprintf.t
index 4c41b16..234a7d6 100644
--- a/t/op/sprintf.t
+++ b/t/op/sprintf.t
@@ -179,7 +179,7 @@ __END__
>%6. 6s< >''< >%6. 6s INVALID REDUNDANT< >(See use of $w in code above)<
>%6 .6s< >''< >%6 .6s INVALID REDUNDANT<
>%6.6 s< >''< >%6.6 s INVALID REDUNDANT<
->%A< >''< >%A INVALID REDUNDANT<
+>%A< >0< >< >tested in sprintf2.t skip: all<
>%B< >2**32-1< >11111111111111111111111111111111<
>%+B< >2**32-1< >11111111111111111111111111111111<
>%#B< >2**32-1< >0B11111111111111111111111111111111<
@@ -213,7 +213,7 @@ __END__
>%#X< >2**32-1< >0XFFFFFFFF<
>%Y< >''< >%Y INVALID REDUNDANT<
>%Z< >''< >%Z INVALID REDUNDANT<
->%a< >''< >%a INVALID REDUNDANT<
+>%a< >0< >< >tested in sprintf2.t skip: all<
>%b< >2**32-1< >11111111111111111111111111111111<
>%+b< >2**32-1< >11111111111111111111111111111111<
>%#b< >2**32-1< >0b11111111111111111111111111111111<
diff --git a/t/op/sprintf2.t b/t/op/sprintf2.t
index 6fd0bde..72bde57 100644
--- a/t/op/sprintf2.t
+++ b/t/op/sprintf2.t
@@ -12,7 +12,54 @@ BEGIN {
eval { my $q = pack "q", 0 };
my $Q = $@ eq '';
-plan tests => 1406 + ($Q ? 0 : 12);
+# %a and %A depend on the floating point config
+# This totally doesn't test non-IEEE-754 float formats.
+my @hexfloat;
+if ($Config{nvsize} == 8) { # IEEE-754, we hope, the most common out there
+ @hexfloat = (
+ [ '%a', '0', '0x0p+0' ],
+ [ '%a', '1', '0x1p+0' ],
+ [ '%a', '1.0', '0x1p+0' ],
+ [ '%a', '3.14', '0x1.91eb851eb851fp+1' ],
+ [ '%a', '-1.0', '-0x1p+0' ],
+ [ '%a', '-3.14', '-0x1.91eb851eb851fp+1' ],
+ [ '%a', '0.1', '0x1.999999999999ap-4' ],
+ [ '%a', '2**-10', '0x1p-10' ],
+ [ '%a', '2**10', '0x1p+10' ],
+ [ '%a', '1e-9', '0x1.12e0be826d695p-30' ],
+ [ '%a', '1e9', '0x1.dcd65p+29' ],
+ [ '%13a', '3.14', '0x1.91eb851eb851fp+1' ],
+ [ '%.7a', '3.14', '0x1.91eb852p+1' ],
+ [ '%.8a', '3.14', '0x1.91eb851fp+1' ],
+ [ '%.20a', '3.14', '0x1.91eb851eb851f0000000p+1' ],
+ [ '%20.10a', '3.14', ' 0x1.91eb851eb8p+1' ],
+ [ '%20.15a', '3.14', '0x1.91eb851eb851f00p+1' ],
+ [ '%A', '3.14', '0X1.91EB851EB851FP+1' ],
+ );
+} elsif ($Config{nvsize} == 16) { # x86 long double, at least
+ @hexfloat = (
+ [ '%a', '0', '0x0p+0' ],
+ [ '%a', '1', '0x8p-3' ],
+ [ '%a', '1.0', '0x8p-3' ],
+ [ '%a', '3.14', '0xc.8f5c28f5c28f5c3p-2' ],
+ [ '%a', '-1.0', '-0x8p-3' ],
+ [ '%a', '-3.14', '-0xc.8f5c28f5c28f5c3p-2' ],
+ [ '%a', '0.1', '0xc.ccccccccccccccdp-7' ],
+ [ '%a', '2**-10', '0x8p-13' ],
+ [ '%a', '2**10', '0x8p+7' ],
+ [ '%a', '1e-9', '0x8.9705f4136b4a597p-33' ],
+ [ '%a', '1e9', '0xe.e6b28p+26' ],
+ [ '%13a', '3.14', '0xc.8f5c28f5c28f5c3p-2' ],
+ [ '%.7a', '3.14', '0xc.8f5c28fp-2' ],
+ [ '%.8a', '3.14', '0xc.8f5c28f6p-2' ],
+ [ '%.20a', '3.14', '0xc.8f5c28f5c28f5c300000p-2' ],
+ [ '%20.10a', '3.14', ' 0xc.8f5c28f5c3p-2' ],
+ [ '%20.15a', '3.14', '0xc.8f5c28f5c28f5c3p-2' ],
+ [ '%A', '3.14', '0XC.8F5C28F5C28F5C3P-2' ],
+ );
+}
+
+plan tests => 1406 + ($Q ? 0 : 12) + @hexfloat;
use strict;
use Config;
@@ -336,3 +383,10 @@ is $o::count, '1', 'sprinf %1s overload count';
$o::count = 0;
() = sprintf "%.1s", $o;
is $o::count, '1', 'sprinf %.1s overload count';
+
+for my $t (@hexfloat) {
+ my ($format, $arg, $expected) = @$t;
+ $arg = eval $arg;
+ my $result = sprintf($format, $arg);
+ is($result, $expected, "'$format' '$arg' -> '$result' cf '$expected'");
+}
--
1.8.5.2 (Apple Git-48)
|
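The buffer-size comment in the sv.c hunk above estimates the mantissa's hex digits from the decimal precision, using roughly 0.83 hex digits per decimal digit (log(10)/log(2) divided by 4). A small sketch of that arithmetic, with DBL_DIG and DBL_MANT_DIG standing in for perl's NV_DIG and the NV mantissa width; both function names are made up for illustration.

```c
#include <float.h>

/* The patch's estimate of hex digits needed for the mantissa, given the
 * decimal digits of precision (NV_DIG in perl; DBL_DIG is the stand-in
 * for a plain double here). */
int hexdigits_estimate(int decimal_digits)
{
    return (decimal_digits * 5) / 6 + 1;
}

/* Exact count of hex digits after the "1." for a binary mantissa of
 * `mant_bits` bits, counting the leading bit: ceil((mant_bits - 1) / 4). */
int hexdigits_exact(int mant_bits)
{
    return (mant_bits - 1 + 3) / 4;
}
```

For an IEEE 754 double (15 decimal digits, 53 mantissa bits) both functions give 13, matching the 13 fraction digits in expectations such as 0x1.91eb851eb851fp+1. The slack the comment mentions comes from the other terms added to the total.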
From @jhi
0002-Hexfloat-literals-part-of-perl-122219.patch
From 4d7069f0e1cf210e0cf8a3385cfb5e5716a5303b Mon Sep 17 00:00:00 2001
From: Jarkko Hietaniemi <jhi@iki.fi>
Date: Thu, 31 Jul 2014 12:37:58 -0400
Subject: [PATCH 2/2] Hexfloat literals, part of perl #122219
Punt to strtod/strtold, just like with decimal floats.
The hexfloat support is a C99 feature, like its converse %a/%A.
---
MANIFEST | 1 +
pod/perldata.pod | 8 +++++
pod/perldiag.pod | 17 ++++++++++
t/op/hexfloat.t | 78 +++++++++++++++++++++++++++++++++++++++++++
t/op/sprintf2.t | 8 +++++
toke.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
6 files changed, 203 insertions(+), 9 deletions(-)
create mode 100644 t/op/hexfloat.t
diff --git a/MANIFEST b/MANIFEST
index 54c5bea..5b99b16 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -5086,6 +5086,7 @@ t/op/hash-rt85026.t See if hash iteration/deletion works
t/op/hash.t See if the complexity attackers are repelled
t/op/hashwarn.t See if warnings for bad hash assignments work
t/op/heredoc.t See if heredoc edge and corner cases work
+t/op/hexfloat.t See if hexadecimal float literals work
t/op/inccode.t See if coderefs work in @INC
t/op/inccode-tie.t See if tie to @INC works
t/op/incfilter.t See if the source filters in coderef-in-@INC work
diff --git a/pod/perldata.pod b/pod/perldata.pod
index d8edfe9..40d3336 100644
--- a/pod/perldata.pod
+++ b/pod/perldata.pod
@@ -402,6 +402,7 @@ integer formats:
0xdead_beef # more hex
0377 # octal (only numbers, begins with 0)
0b011011 # binary
+ 0x1.999ap-4 # hexadecimal floating point
You are allowed to use underscores (underbars) in numeric literals
between digits for legibility (but not multiple underscores in a row:
@@ -425,6 +426,13 @@ Hexadecimal, octal, or binary, representations in string literals
representation. The hex() and oct() functions make these conversions
for you. See L<perlfunc/hex> and L<perlfunc/oct> for more details.
+Hexadecimal floating point is useful for accurately presenting
+floating point values, avoiding conversions to or from decimal floating
+point, and therefore avoiding possible loss in precision. Notice
+that while most current platforms use 64-bit IEEE 754 floating point,
+not all do. For example x86 platforms can be configured with "long doubles",
+which are not compatible with normal "doubles".
+
You can also embed newlines directly in your strings, i.e., they can end
on a different line than they begin. This is nice, but if you forget
your trailing quote, the error will not be reported until Perl finds
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index e41c8cc..d3553bd 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -2172,6 +2172,23 @@ created on an emergency basis to prevent a core dump.
(F) The parser has given up trying to parse the program after 10 errors.
Further error messages would likely be uninformative.
+=item Hexadecimal float malformed: '%s'
+
+(W syntax) The hexadecimal float literal (like 0x12.34p5) could not be
+parsed completely by the system's strtod() or strtold().
+
+=item Hexadecimal float overflow: '%s'
+
+(W syntax) Hexadecimal float literal overflowed.
+
+=item Hexadecimal float underflow: '%s'
+
+(W syntax) Hexadecimal float literal underflowed.
+
+=item Hexadecimal float unsupported: '%s'
+
+(F) Hexadecimal float literals (like 0x12.34p5) are unsupported in this system.
+
=item Hexadecimal number > 0xffffffff non-portable
(W portable) The hexadecimal number you specified is larger than 2**32-1
diff --git a/t/op/hexfloat.t b/t/op/hexfloat.t
new file mode 100644
index 0000000..eb8f6bb
--- /dev/null
+++ b/t/op/hexfloat.t
@@ -0,0 +1,78 @@
+#!./perl
+
+use strict;
+
+BEGIN {
+ chdir 't' if -d 't';
+ require './test.pl';
+}
+
+plan(tests => 38);
+
+# Test hexfloat literals.
+
+is(0x1p0, 1);
+is(0x1.p0, 1);
+is(0x1.0p0, 1);
+
+is(0x1p1, 2);
+is(0x1.p1, 2);
+is(0x1.0p1, 2);
+
+is(0x.1p0, 0.0625);
+is(0x0.1p0, 0.0625);
+
+# Positive exponents.
+is(0x1p2, 4);
+is(0x1p+2, 4);
+
+# Negative exponents.
+is(0x1p-1, 0.5);
+is(0x1.p-1, 0.5);
+is(0x1.0p-1, 0.5);
+
+is(0x1p+2, 4);
+is(0x1p-2, 0.25);
+
+is(0x3p+2, 12);
+is(0x3p-2, 0.75);
+
+# Negative sign.
+is(-0x1p+2, -4);
+is(-0x1p-2, -0.25);
+
+is(0x0.10p0, 0.0625);
+is(0x0.1p0, 0.0625);
+is(0x.1p0, 0.0625);
+
+is(0x12p+3, 144);
+is(0x12p-3, 2.25);
+
+# Hexdigits (lowercase).
+is(0x9p+0, 9);
+is(0xap+0, 10);
+is(0xfp+0, 15);
+is(0x10p+0, 16);
+is(0x11p+0, 17);
+is(0xabp+0, 171);
+is(0xab.cdp+0, 171.80078125);
+
+# Uppercase hexdigits and exponent prefix.
+is(0xAp+0, 10);
+is(0xFp+0, 15);
+is(0xABP+0, 171);
+is(0xAB.CDP+0, 171.80078125);
+
+# Underbars.
+is(0xa_b.c_dp+0, 171.80078125);
+
+# Note that the hexfloat representation is not unique
+# since the exponent can be shifted: no different from
+# 3e4 cf 30e3 cf 30000.
+
+# Needs to use within because of long doubles.
+within(0x1.999999999999ap-4, 0.1, 1e-9);
+within(0xc.ccccccccccccccdp-7, 0.1, 1e-9);
+
+# sprintf %a/%A testing is done in sprintf2.t,
+# trickier than necessary because of long doubles.
diff --git a/t/op/sprintf2.t b/t/op/sprintf2.t
index 72bde57..824c06a 100644
--- a/t/op/sprintf2.t
+++ b/t/op/sprintf2.t
@@ -34,7 +34,11 @@ if ($Config{nvsize} == 8) { # IEEE-754, we hope, the most common out there
[ '%.20a', '3.14', '0x1.91eb851eb851f0000000p+1' ],
[ '%20.10a', '3.14', ' 0x1.91eb851eb8p+1' ],
[ '%20.15a', '3.14', '0x1.91eb851eb851f00p+1' ],
+
[ '%A', '3.14', '0X1.91EB851EB851FP+1' ],
+
+ [ '%a', 0x12.34p5, '0x1.234p+9' ],
+ [ '%a', 0x1_2.3_4p5, '0x1.234p+9' ],
);
} elsif ($Config{nvsize} == 16) { # x86 long double, at least
@hexfloat = (
@@ -55,7 +59,11 @@ if ($Config{nvsize} == 8) { # IEEE-754, we hope, the most common out there
[ '%.20a', '3.14', '0xc.8f5c28f5c28f5c300000p-2' ],
[ '%20.10a', '3.14', ' 0xc.8f5c28f5c3p-2' ],
[ '%20.15a', '3.14', '0xc.8f5c28f5c28f5c3p-2' ],
+
[ '%A', '3.14', '0XC.8F5C28F5C28F5C3P-2' ],
+
+ [ '%a', 0x12.34p5, '0x9.1ap+6' ],
+ [ '%a', 0x1_2.3_4p5, '0x9.1ap+6' ],
);
}
diff --git a/toke.c b/toke.c
index b0997ef..8454d6f 100644
--- a/toke.c
+++ b/toke.c
@@ -9796,6 +9796,7 @@ Perl_scan_num(pTHX_ const char *start, YYSTYPE* lvalp)
bool floatit; /* boolean: int or float? */
const char *lastub = NULL; /* position of last underbar */
static const char* const number_too_long = "Number too long";
+ bool hexfloat = FALSE;
PERL_ARGS_ASSERT_SCAN_NUM;
@@ -9909,6 +9910,14 @@ Perl_scan_num(pTHX_ const char *start, YYSTYPE* lvalp)
/* make sure they said 0x */
if (shift != 4)
goto out;
+
+ if (s[1] == '.' &&
+ /* hexfloat? peekahead to avoid matching ".." */
+ (isXDIGIT(s[2]) || s[2] == 'p' || s[2] == 'P')) {
+ s++;
+ goto out;
+ }
+
b = (*s++ & 7) + 9;
/* Prepare to put the digit we have onto the end
@@ -9977,6 +9986,25 @@ Perl_scan_num(pTHX_ const char *start, YYSTYPE* lvalp)
sv, NULL, NULL, 0);
else if (PL_hints & HINT_NEW_BINARY)
sv = new_constant(start, s - start, "binary", sv, NULL, NULL, 0);
+ if (*s == '.' || *s == 'p' || *s == 'P') {
+ /* sloppy (on the underbars) but quick detection of
+ * hexfloats, the decimal detection will be more
+ * thorough. */
+ const char* h = s;
+ if (*h == '.') {
+ h++;
+ while (isXDIGIT(*h) || *h == '_') h++;
+ }
+ if (*h == 'p' || *h == 'P') {
+ h++;
+ if (*h == '+' || *h == '-')
+ h++;
+ if (isDIGIT(*h)) {
+ hexfloat = TRUE;
+ goto decimal;
+ }
+ }
+ }
}
break;
@@ -9989,10 +10017,16 @@ Perl_scan_num(pTHX_ const char *start, YYSTYPE* lvalp)
decimal:
d = PL_tokenbuf;
e = PL_tokenbuf + sizeof PL_tokenbuf - 6; /* room for various punctuation */
- floatit = FALSE;
+ floatit = FALSE;
+ if (hexfloat) {
+ floatit = TRUE;
+ *d++ = '0';
+ *d++ = 'x';
+ s = start + 2;
+ }
/* read next group of digits and _ and copy into d */
- while (isDIGIT(*s) || *s == '_') {
+ while (isDIGIT(*s) || (hexfloat && isXDIGIT(*s)) || *s == '_') {
/* skip underscores, checking for misplaced ones
if -w is on
*/
@@ -10032,7 +10066,8 @@ Perl_scan_num(pTHX_ const char *start, YYSTYPE* lvalp)
/* copy, ignoring underbars, until we run out of digits.
*/
- for (; isDIGIT(*s) || *s == '_'; s++) {
+ for (; isDIGIT(*s) || (hexfloat && isXDIGIT(*s)) ||
+ *s == '_'; s++) {
/* fixed length buffer check */
if (d >= e)
Perl_croak(aTHX_ "%s", number_too_long);
@@ -10058,12 +10093,21 @@ Perl_scan_num(pTHX_ const char *start, YYSTYPE* lvalp)
}
/* read exponent part, if present */
- if ((*s == 'e' || *s == 'E') && strchr("+-0123456789_", s[1])) {
- floatit = TRUE;
+ if (((*s == 'e' || *s == 'E') || (*s == 'p' || *s == 'P')) &&
+ strchr("+-0123456789_", s[1])) {
+ floatit = TRUE;
+
+ /* regardless of whether user said 3E5 or 3e5, use lower 'e',
+ ditto for p (hexfloats) */
+ if ((*s == 'e' || *s == 'E')) {
+ /* At least some Mach atof()s don't grok 'E' */
+ *d++ = 'e';
+ } else if ((*s == 'p' || *s == 'P')) {
+ *d++ = 'p';
+ }
+
s++;
- /* regardless of whether user said 3E5 or 3e5, use lower 'e' */
- *d++ = 'e'; /* At least some Mach atof()s don't grok 'E' */
/* stray preinitial _ */
if (*s == '_') {
@@ -10127,9 +10171,47 @@ Perl_scan_num(pTHX_ const char *start, YYSTYPE* lvalp)
STORE_NUMERIC_LOCAL_SET_STANDARD();
/* terminate the string */
*d = '\0';
- nv = Atof(PL_tokenbuf);
+ if (hexfloat) {
+ /* for hexfloats, punt to strtod/strtold, or die. */
+ /* XXX Configure test for strtod/strtold hexfloat support.
+ * It is a C99 feature, but might be implemented elsewhere. */
+ char* endp = PL_tokenbuf;
+ dSAVE_ERRNO;
+ SETERRNO(0,0);
+#if defined(USE_LONG_DOUBLE) && defined(HAS_STRTOLD)
+ nv = strtold(PL_tokenbuf, &endp);
+#elif defined(HAS_STRTOD)
+ nv = strtod(PL_tokenbuf, &endp);
+#else
+ Perl_croak(aTHX_
+ "Hexadecimal float unsupported: '%s'",
+ PL_tokenbuf);
+#endif
+ /* XXX test these warnings */
+ /* errno is ERANGE, commonly, but any non-zero
+ * errno should indicate failure (note that the
+ * scope above is intentionally tight: set errno
+ * to zero, call strtod or strtold, inspect errno.) */
+ if (errno) {
+ if (nv == NV_INF || nv == -NV_INF)
+ Perl_ck_warner(aTHX_ packWARN(WARN_SYNTAX),
+ "Hexadecimal float overflow: '%s'",
+ PL_tokenbuf);
+ else if (nv == 0.0)
+ Perl_ck_warner(aTHX_ packWARN(WARN_SYNTAX),
+ "Hexadecimal float underflow: '%s'",
+ PL_tokenbuf);
+ }
+ if (endp == NULL || endp == PL_tokenbuf || *endp)
+ Perl_ck_warner(aTHX_ packWARN(WARN_SYNTAX),
+ "Hexadecimal float malformed: '%s'",
+ PL_tokenbuf);
+ RESTORE_ERRNO;
+ } else {
+ nv = Atof(PL_tokenbuf);
+ }
RESTORE_NUMERIC_LOCAL();
- sv = newSVnv(nv);
+ sv = newSVnv(nv);
}
if ( floatit
--
1.8.5.2 (Apple Git-48)
|
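The errno handling in the toke.c hunk above can be sketched as a standalone function. This is an illustration under assumptions, not the patch itself: it uses plain strtod() (so it needs C99 hexfloat support in the C library), the enum and function names are hypothetical, and unlike the patch it checks for malformed input before looking at errno.

```c
#include <errno.h>
#include <float.h>
#include <stdlib.h>

enum hexfloat_status {
    HEXFLOAT_OK,
    HEXFLOAT_OVERFLOW,
    HEXFLOAT_UNDERFLOW,
    HEXFLOAT_MALFORMED
};

/* Hypothetical helper: parse `text` and classify the outcome.
 * errno is cleared right before strtod() and restored afterwards,
 * keeping the "set to zero, call, inspect" window tight. */
enum hexfloat_status parse_hexfloat_checked(const char *text, double *out)
{
    char *end = NULL;
    int saved_errno = errno;
    int parse_errno;
    double value;

    errno = 0;
    value = strtod(text, &end);
    parse_errno = errno;
    errno = saved_errno;

    *out = value;
    if (end == text || *end != '\0')
        return HEXFLOAT_MALFORMED;      /* nothing parsed, or trailing junk */
    if (parse_errno != 0) {
        if (value > DBL_MAX || value < -DBL_MAX)
            return HEXFLOAT_OVERFLOW;   /* result is +/- infinity */
        return HEXFLOAT_UNDERFLOW;      /* finite result with errno set */
    }
    return HEXFLOAT_OK;
}
```

With a C99 strtod(), "0x12.34p5" parses cleanly to 582.5, the same value the sprintf2.t additions expect to print as 0x1.234p+9 on an 8-byte NV.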
From @arc
Jarkko Hietaniemi via RT <perlbug-comment@perl.org> wrote:
Excellent — thanks!
According to this page: http://msdn.microsoft.com/en-us/library/hf4y5e3w(v=vs.71).aspx the compiler in Visual Studio 2003 doesn't support %a formats in Corrections welcome from anyone who knows anything about win32.
I think that's not terribly unreasonable. An IEEE double has 53 bits But I take your point that it's somewhat vexing for these purposes.
That's undeniably a fairly cruddy %a implementation (in the sense that
As far as I know, it's possible to implement hex float I/O without What would happen if we borrowed one of the other implementations -- |
From @jhi
I should have included more examples, I think Solaris provided those...
For example: what is the '%a' supposed to "optimize for"? As few
Indeed. (Which reminds me that our inf/nan support is still a bit dubious.)
BSD licensed code is no problem, we have historically borrowed used For the netlib code, somebody with legal chops would have to take a look |
From @jhi
Now did. Ugh. In Solaris 10, strtod must be in "c99 mode" for the hexfloats to be recognized (strtold is always in this mode). The "c99 mode" is achieved by using "c99" as the Solaris Studio compiler (driver), instead of "cc". In Solaris 9 (or earlier), there is no support for hexfloats. (Not blaming Solaris in particular: I'm pretty certain many older OS releases will be similarly C99-unsupportive.) If one is not using Solaris Studio cc (something beginning with g, maybe), one can live dangerously and explicitly link in either of /usr/lib/{32,64}/values-xpg6.o and get the "c99 strtod". Dangerous living because probably many other things get "upgraded", too. Executive summary: using the netlib dtoa.c (*) is starting to sound even more siren-like. (*) an odd name, given that it's a strtod implementation... |
From @jhi
Good news, everyone... the netlib dtoa.c contains *both* strtod() and dtoa(), the latter useable for sprintfing. It is quite widely used: Python, PHP, and *Java*; and Chrome, Firefox, and Safari. More useful reading: http://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/ |
From @jhi
From https://rt-archive.perl.org/perl5/Ticket/Display.html?id=122482:
On the hexadecimal output the killer wording in the C99 seems to be that trailing zeros *may* be printed. And this is what Solaris does, but glibc (Linux), and whatever is used in OS X, do not. |
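Since C99 lets an implementation print trailing zeros or omit them, comparing %a output as strings across platforms needs some normalization. One possible approach, sketched with a hypothetical helper (this is not what perl ended up doing), is to strip trailing zeros from the fraction before comparing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy hexfloat text `in` to `out`, dropping
 * trailing zeros of the fraction (and a then-dangling '.'), so that
 * "0x1.8000p+1" and "0x1.8p+1" compare equal as strings.
 * Returns 0 on success, -1 if `in` has no exponent or `out` is too small. */
int hexfloat_strip_zeros(const char *in, char *out, size_t outlen)
{
    const char *p = strpbrk(in, "pP");      /* start of the exponent */
    const char *dot = strchr(in, '.');
    const char *mant_end;
    size_t mant_len, exp_len;

    if (p == NULL)
        return -1;
    mant_end = p;
    if (dot != NULL && dot < p) {
        while (mant_end > dot + 1 && mant_end[-1] == '0')
            mant_end--;
        if (mant_end == dot + 1)
            mant_end = dot;                 /* nothing left after the '.' */
    }
    mant_len = (size_t)(mant_end - in);
    exp_len = strlen(p);
    if (mant_len + exp_len + 1 > outlen)
        return -1;
    memcpy(out, in, mant_len);
    memcpy(out + mant_len, p, exp_len + 1); /* copies the '\0' as well */
    return 0;
}
```

This only handles trailing zeros. It does not reconcile outputs that choose a different leading digit and exponent for the same value.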
From @cpansprout
On Thu Jul 03 08:18:01 2014, jhi wrote:
This came up on the list a couple of years ago. At the time I think the consensus was to allow parser plugins to extend the syntax, instead of hard-coding one of them into toke.c. When we first tried to reserve this syntax (or something similar) by deprecating 0xf00 followed by a dot, several cases showed up in the perl tests themselves. I think they got changed, masking the fact that such syntax already occurs in real life. Now this is all from memory without actually looking anything up.... -- Father Chrysostomos |
From @jhi
Having looked at the toke.c now for a while, I think the plugin plan is wishful thinking unless something drastic happens first.
I would find that surprising... the "pEXPONENT" part is currently a syntax error. |
From @jhi
For better or worse, I have now submitted http://perl5.git.perl.org/perl.git/commit/dc91db6 Configure scan for the kind of long double we have which implement hexadecimal floats, without depending on C99 or using system printf/strtod. The dc91db6 will probably contain many bad guesses for the non-Configure platforms. Only smokes will tell. |
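To give a feel for what parsing hexfloats "without using system strtod" involves, here is a deliberately limited sketch. It is not the code from the commit above: the function name is hypothetical, it accepts at most 16 hex digits and no underscores, and it does no overflow or underflow handling.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Sketch only: parse "0xH.HHHp[+-]N" by accumulating the hex digits
 * in a 64-bit integer and then scaling by powers of two.
 * Returns 0 on success, -1 on input this sketch does not handle. */
int parse_hexfloat_by_hand(const char *s, double *out)
{
    uint64_t mant = 0;
    int ndigits = 0, frac_digits = 0, seen_dot = 0;
    const char *expdigits;
    char *end;
    long exp2;
    double v;

    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X'))
        return -1;
    for (s += 2; *s != '\0' && *s != 'p' && *s != 'P'; s++) {
        int c = (unsigned char)*s;
        if (c == '.' && !seen_dot) { seen_dot = 1; continue; }
        if (!isxdigit(c) || ndigits == 16)
            return -1;
        mant = (mant << 4)
             | (uint64_t)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
        ndigits++;
        frac_digits += seen_dot;        /* count digits after the '.' */
    }
    if (ndigits == 0 || *s == '\0')
        return -1;                      /* need digits and a 'p' exponent */
    s++;                                /* skip 'p' or 'P' */
    expdigits = (*s == '+' || *s == '-') ? s + 1 : s;
    if (!isdigit((unsigned char)*expdigits))
        return -1;
    exp2 = strtol(s, &end, 10);
    if (*end != '\0')
        return -1;
    exp2 -= 4L * frac_digits;           /* each fraction digit is 2**-4 */

    v = (double)mant;                   /* rounds if over 53 bits are set */
    for (; exp2 > 0; exp2--) v *= 2.0;
    for (; exp2 < 0; exp2++) v /= 2.0;
    *out = v;
    return 0;
}
```

Scaling by repeated doubling or halving is exact for normal doubles, which keeps the sketch free of libm. A real implementation would also have to deal with long doubles, rounding of excess digits, and subnormals.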
From @craigberry
On Thu, Aug 14, 2014 at 6:56 AM, Jarkko Hietaniemi via RT
Would there be any advantage in toke.c to using Uquad_t or U64TYPE |
From @jhi
On Thursday-201408-14, 8:51, Craig A. Berry wrote:
Ah, good point. As a matter of fact, I use that very fact in sv.c (I also need to think more carefully what happens/should happen at |
From @arc
Jarkko Hietaniemi via RT <perlbug-comment@perl.org> wrote:
Hurrah! Thanks very much for this. Earlier in this ticket, Brian Fraser pointed out the existence of sub ap1 { 'z' } Jarkko reports having found no such affected code using grep.cpan.me, Any thoughts? Am I worrying unnecessarily? -- |
From @jhi
On Thursday-201408-14, 9:13, Aaron Crane wrote:
I would wait for Andreas' CPAN smokes. |
From @jhi
On Friday, August 15, 2014, Craig A. Berry <craig.a.berry@gmail.com> wrote:
Neither do I, I just recently wrote it... That means that for some reason v (the pointer for the hexdigits (really -- |
From @sisyphus
-----Original Message-----
At least that one works correctly for me on (debian wheezy) powerpc64
Here's some values that don't look right, however:
For 1e-298, the 2 doubles (most significant first) are 0210be08d0527e1d and
If I do 'printf "%A", 1e-298;' then I get:
Those 4 zeroes in the middle are wrong - they should appear at the end.
(In the Data::Float::DoubleDouble representation, I opted to have the first
Another value I looked at was 193e-3. Again, the prefix looks wrong - most significant 13 bits are 1100010110100.
Data::Float::DoubleDouble says +0x1.8b4395810624dd2f1a9fbe76c8cp-3 (and I'll
I also looked at 2 ** 200. That came out as 0X0P+0.
The fourth value I looked at was 2 ** 0.5. As with 193e-3, the least
The actual script I ran is attached (try.pl), but to run it you'll need to
Btw, I've just checked that the above Data::Float::DoubleDouble values agree
Thanks for taking this on, Jarkko. Apologies that I haven't come up with
Cheers, |
From @sisyphus1000010111110000010001101000001010010011111100001110101101001110001001011011 1100010110100001110010101100000010000011000100100110111010010111100011010100 1000000000000000000000000000000000000000000000000000000000000000000000000000 1011010100000100111100110011001111111001110111100110010010000100010110010111 |
From @sisyphus
-----Original Message-----
I don't think this is central to this thread. The setting of the last hex char to "c" arises from the (known) perl bug As regards 193e-3, instead of assigning correct doubles (3fc8b4395810624e We can force perl to assign the correct double-double representation (and use Math::NV qw(:all); If we do that then the correct representation of I suppose D:F:DD could strive to detect and correct perl's mistakes, but Cheers, |
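The 16-hex-digit values quoted in these messages (such as 3fc8b4395810624e) are raw bit patterns of individual doubles. A minimal sketch for inspecting such a pattern, assuming `double` is a 64-bit IEEE 754 binary64; it looks at one plain double only and says nothing about how a double-double pair is laid out. The function name is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw 64-bit pattern of a double.
 * Assumes sizeof(double) == 8 and IEEE 754 binary64 encoding. */
uint64_t double_bits(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);   /* well-defined type punning */
    return bits;
}
```

Printing the result with "%016llx" (after a cast to unsigned long long) gives the same kind of string as in the messages above. The nearest double to 0.193 comes out as 3fc8b4395810624e, which is +0x1.8b4395810624dd2f...p-3 rounded to 53 bits.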
From @jhi
On Sunday-201408-17, 6:40, sisyphus1@optusnet.com.au wrote:
The currently-in-blead version is all sorts of wrong for IEEE 754 128 |
From @jhi
Get thee the http://perl5.git.perl.org/perl.git and retry. It's probably still quite wrong for double-doubles, but at least it |
From @sisyphus
-----Original Message-----
The value expressed for 2 ** 200 is a big improver ;-) Of the other values I looked at last night, they seem to have changed only For example, yesterday's blead presented 1e-298 as: Today's blead presents it as: And the correct rendition is: Even for an easily representable float such as 128.625 (where the entire Anyway - good luck with it. (It would be nice to see this up and running Is it not possible for you to achieve the desired result via C's %La/%LA Cheers, |
From @jhi
If you could do: grep longdblkind config.sh I'll also email you a test code, the output of which would be of interest.
That would leave us dependent on the vendors' implementations of C99. (1) C99 - which we do not require, and enabling of which requires often (2) there's wiggle room in the spec, which inevitably leads into
|
From @jhi
This is way, way implemented already. |
@jhi - Status changed from 'open' to 'resolved' |
From @pjacklam
I'm trying to implement hexadecimal/octal/binary floats for bignum and the
Case 1: This looks OK:
$ perl -wle 'print 0x0.1p+0'
but what's with the following output:
$ perl -wle 'print 0x0.10000000000000001p+0'
If I do a similar case with decimal floats, the added ...0001 at the end
$ perl -wle 'print 0.1e0'
$ perl -wle 'print 0.10000000000000000000000000001e0'
Case 2: The following gives me a warning about an invalid octal digit, as expected:
$ perl -wle 'print 018p0'
But if the invalid digit is after the dot, I get no warning:
$ perl -wle 'print 01.8p0'
Ditto with binary numbers:
$ perl -wle 'print 0b2.1p0'
But no warning if the digit 2 is after the dot:
$ perl -wle 'print 0b1.2p0'
Peter |
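One way to reason about Case 1 is to count how many bits the hex mantissa actually spans. An IEEE double carries 53 significant bits, so a literal whose set bits span more than that cannot be stored exactly. The helper below is hypothetical and only measures the literal's text; it makes no claim about what perl prints.

```c
#include <ctype.h>

/* Hypothetical helper: number of bits from the first set bit to the
 * last set bit of the hex mantissa in `text` (the digits between "0x"
 * and the 'p' exponent, ignoring '.'). Returns 0 for a zero mantissa. */
int hex_mantissa_span_bits(const char *text)
{
    int pos = 0;                 /* bit position of the nibble's top bit */
    int first = -1, last = -1;

    if (text[0] == '0' && (text[1] == 'x' || text[1] == 'X'))
        text += 2;
    for (; *text != '\0' && *text != 'p' && *text != 'P'; text++) {
        int c = (unsigned char)*text, nibble, bit;
        if (c == '.')
            continue;
        if (!isxdigit(c))
            break;
        nibble = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
        for (bit = 0; bit < 4; bit++) {
            if (nibble & (8 >> bit)) {
                if (first < 0)
                    first = pos + bit;
                last = pos + bit;
            }
        }
        pos += 4;
    }
    return first < 0 ? 0 : last - first + 1;
}
```

For 0x0.1p+0 the span is a single bit. A fraction of "1", fifteen zeros, and another "1" spans 65 bits, well past the 53 available, so the trailing digit has to be rounded away or otherwise handled by the parser.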
From @pjacklam
(Sorry I forgot to add perl5-porters to my previous message.)
Here is another odd case. This looks OK:
$ perl -wle 'print 0 + 0x0.p0'
But what's with the following?
$ perl -wle 'print 0 + 0x.p0'
Peter
2015-11-05 15:15 GMT+01:00 Peter John Acklam <pjacklam@gmail.com>:
|
From @jhi
On Thu, Nov 5, 2015 at 9:15 AM, Peter John Acklam via RT
Well, that is strange, too: I have built it fine in x86 Solaris. (Also in sparc solaris, but there I think is only gcc, not Sun
The hexadecimal float parsing code knows exactly when the added digits
Sorry to state the obvious but hexadecimal floats are about ... In most cases when you start mixing dots and digits (or hexdigits), I tried to carve a very strict path through the lexer which allows
-- |
From @Abigail
On Thu, Nov 05, 2015 at 03:24:05PM +0100, Peter John Acklam wrote:
The 0x.p0 is parsed as 0 . 'p0', and it gives an error under strict I understand the C<< . 'p0' >> part, but what I did not expect is $ perl -Mstrict -wE 'say 0x' C<< 0x_ >> is valid as well, but it warns about a misplaced _ in $ perl -Mstrict -wE 'say 0x_' But it's great for JAPHs. Abigail |
From @jhi
... but I think it's the right thing to do, since one of the major -- |
From @jhi
http://perl5.git.perl.org/perl.git/blob/HEAD:/t/op/hexfp.t may be illuminating here. There I parse valid and invalid hexfp. The tests after the comment "Test certain things that are not -- |
From @khwilliamson
On 11/05/2015 08:13 AM, Jarkko Hietaniemi wrote:
I view it as not a downside, but an upside |
From Eirik-Berg.Hanssen@allverden.no
On Thu, Nov 5, 2015 at 3:47 PM, Jarkko Hietaniemi <jhi@iki.fi> wrote:
Wow. So … $ perl -wle 'print 0b1.1p0' … that's just emergent behaviour? Cool! :) Eirik |
From @jhi
On Thursday-201511-05 10:29, Eirik Berg Hanssen via RT wrote:
"Emergent behaviour" describes the whole of Perl rather beautifully, |
From @jhi
So it really does look like the hexfp parsing implementation is unintentionally leaking over into also supporting binary and octal...
My preference would be to stop this particular emergent behaviour, at least for now. Stopping the leak would be trivial:
--- a/toke.c
- if (UNLIKELY(HEXFP_PEEK(s))) {
and then e.g. 0b101.101p0 would barf as expected, on the "p0".
Whether we want to really support "binfp" and "octfp", I don't know. I'm worried about what surprising corners of the language that will reveal... |
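Editorial illustration of the "hex only" restriction discussed above. This is a standalone sketch, not the code in toke.c: the function `looks_like_hexfp` is hypothetical, and it ignores Perl details such as underscores inside numeric literals. It accepts only the 0x prefix, requires at least one mantissa digit on either side of the dot, and treats the `p` exponent as mandatory.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical validator: returns true only for strings of the form
 * 0x<hexdigits>[.<hexdigits>]p[+-]<decimal digits>. */
bool looks_like_hexfp(const char *s)
{
    int mantissa_digits = 0;

    /* Only the 0x/0X prefix is accepted, so 0b... and 0... fail here. */
    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X'))
        return false;
    s += 2;

    /* Digits before the optional dot. */
    while (isxdigit((unsigned char)*s)) {
        s++;
        mantissa_digits++;
    }
    /* Optional dot followed by more digits. */
    if (*s == '.') {
        s++;
        while (isxdigit((unsigned char)*s)) {
            s++;
            mantissa_digits++;
        }
    }
    /* "0x.p0" has no mantissa digits at all, so it is rejected. */
    if (mantissa_digits == 0)
        return false;

    /* The binary exponent marker is mandatory. */
    if (*s != 'p' && *s != 'P')
        return false;
    s++;
    if (*s == '+' || *s == '-')
        s++;
    /* The exponent itself is written in decimal and needs a digit. */
    if (!isdigit((unsigned char)*s))
        return false;
    while (isdigit((unsigned char)*s))
        s++;

    /* Nothing may follow the exponent. */
    return *s == '\0';
}
```

Under these assumptions 0x1.10p+0 and 0x0.p0 are accepted, while 0x1.10 (no exponent), 0x.p0, 01.8p0 and 0b101.101p0 are all rejected.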
From @pjacklam
2015-11-05 15:47 GMT+01:00 Jarkko Hietaniemi <jhi@iki.fi>:
Yes, but Perl being ... eh ... Perl, you never know. :-) And oct() Anyway, thanks for the explanations, everyone! Peter |
From @ikegami
On Thu, Nov 5, 2015 at 9:47 AM, Jarkko Hietaniemi <jhi@iki.fi> wrote:
That addresses the warning, but what about the fact that the result could |
From @jhi
I have no idea what you are saying here. If you are saying that hexadecimal floating constants should silently -- |
From Eirik-Berg.Hanssen@allverden.no
On Fri, Nov 6, 2015 at 10:14 PM, Jarkko Hietaniemi <jhi@iki.fi> wrote:
He's not saying anything about "silently". I think he's suggesting that dropping the low-order digits would be eirik@purplehat[23:08:10] I'm inclined to agree. (Hey, without warnings enabled, it is actually dropping the high-order Eirik |
From @jhi
On Friday-201511-06 17:10, Eirik Berg Hanssen wrote:
Okay, that's a definite bug. It seems I lacked enough creative Hmm. What would be a good mode of failure here? Emit a warning
|
From @jhi
On Friday-201511-06 17:10, Eirik Berg Hanssen wrote:
Now opened https://rt-archive.perl.org/perl5/Ticket/Display.html?id=126582 explicitly for this.
|
Migrated from rt.perl.org#122219 (status was 'resolved')
Searchable as RT122219$