Skip to content

Commit

Permalink
Switch MVM_coerce_n_s to Ryū from Grisu3 with a sprintf fallback.
Browse files Browse the repository at this point in the history
As Zoffix notes in commit 067c059 which switched *to* Grisu3, as
Grisu3 can only handle ~99.5% of cases, it needs to be paired with a
fallback algorithm for the other 0.5%. To save adding a *second* external
library, we had been using sprintf with the format %17g for this. Using %17g
instead of the default width of 15 produced *correct* results, in that the
output would always round trip correctly, but for some values it was not
producing the *optimal* result - ie the shortest decimal representation
where two or more decimal values map to the same binary value.

Unfortunately for some obvious cases, what we had exposed this. For example:

    nqp -e 'nqp::say(1e23)'
    9.9999999999999992e+22
    nqp -e 'nqp::say(1e23 - 9.9999999999999992e+22
    0

As noted in that commit, Dragon4 would have been a better fallback, and
at the time "Grisu3+Dragon4 [was] the current state of the art."

Since then Ryū has been published. The author states that it is faster,
but I haven't verified these benchmarks. As it is a *complete* algorithm -
it handles 100% of inputs - we can use it without needing any fallback,
thus simplifying our code and fixing the buggy sub-optimal results.
Hence I am not concerned if it isn't always faster than Grisu3, as
correctness trumps "getting the wrong answer as fast as possible".

Current state of the art might actually be Dragonbox, but the reference
implementation is in C++, and it is not clear whether that algorithm has
had sufficient peer review yet to find any subtle problems. If someone
else authors a suitably licensed implementation in portable C, we should
consider switching to it, but it is not worth doing that ourselves.

We build the single object file we need from 3rdparty/ryu/ryu by directly
adding it to our Makefile. The ryu codebase uses bazel as a build system,
which seems to be portable, but isn't popular enough to be packaged by
Debian. We don't need this, because it builds everything - C, Java,
benchmarks, etc. The ryu subdirectory has a fallback Makefile which
builds just the C code. This seems useful, but it fails unless it can
build all the object files, and the 128-bit generic code doesn't build
on (at least some) 32 bit platforms. Again, something we don't need.

So the simplest working solution seems to be this seeming hack - add
"3rdparty/ryu" to our C compiler include path and treat the object file
as one of our own.
  • Loading branch information
nwc10 committed May 5, 2021
1 parent 8cc7fbd commit 7907b85
Show file tree
Hide file tree
Showing 2 changed files with 180 additions and 9 deletions.
3 changes: 2 additions & 1 deletion build/Makefile.in
Expand Up @@ -42,6 +42,7 @@ CFLAGS = @cflags@ @ccdef@MVM_TRACING=$(TRACING) @ccdef@MVM_CGOTO=$(CGOTO) @cc
CINCLUDES = @moar_cincludes@ \
@ccinc@@shaincludedir@ \
@ccincsystem@3rdparty/tinymt \
@ccincsystem@3rdparty/ryu \
@ccincsystem@3rdparty/dynasm \
@ccincsystem@3rdparty/cmp \
@ccincsystem@3rdparty \
Expand Down Expand Up @@ -106,7 +107,6 @@ OBJECTS1 = src/core/callsite@obj@ \
src/core/ops@obj@ \
src/core/hll@obj@ \
src/core/loadbytecode@obj@ \
src/math/grisu@obj@ \
src/core/coerce@obj@ \
src/core/dll@obj@ \
src/core/ext@obj@ \
Expand Down Expand Up @@ -248,6 +248,7 @@ OBJECTS2 = src/6model/reprs/MVMDLLSym@obj@ \
src/platform/random@obj@ \
src/platform/memmem32@obj@ \
3rdparty/freebsd/memmem@obj@ \
3rdparty/ryu/ryu/d2s@obj@ \
src/platform/malloc_trim@obj@ \
src/moar@obj@ \
@platform@ \
Expand Down
186 changes: 178 additions & 8 deletions src/core/coerce.c
@@ -1,5 +1,5 @@
#include "moar.h"
#include "math/grisu.h"
#include "ryu/ryu.h"

#if defined(_MSC_VER)
#define strtoll _strtoi64
Expand Down Expand Up @@ -264,17 +264,187 @@ MVMString * MVM_coerce_n_s(MVMThreadContext *tc, MVMnum64 n) {
else if (n != n) {
return MVM_string_ascii_decode_nt(tc, tc->instance->VMString, "NaN");
}

char buf[64];
/* What we get back is 0E0, 1E0, 3.14E0, 1E2, ... Infinity.
* What we'd like is the classic "fixed decimal" representation for
* small values, and the exponent as a lower case 'e'. So we do some
* massaging, and handle infinity above. We could leave NaN to fall
* through here, but if so it would hit our "something went wrong" code,
* which somewhat downplays the absolute "this path means a bug". So I think
* that it's still clearer handling it above. */
const int orig_len = d2s_buffered_n(n, buf);
const char *first = buf;

/* Take any leading minus away. We put it back at the end. */
int len = orig_len;
if (*first == '-') {
++first;
--len;
}

if (len < 3 || !(first[1] == '.' || first[1] == 'E')) {
/* Well, this shouldn't be possible. */
}
else {
char buf[64];
if (dtoa_grisu3(n, buf, 64) < 0)
MVM_exception_throw_adhoc(tc, "Could not stringify number (%f)", n);
else {
MVMStringIndex len = strlen(buf);
MVMGrapheme8 *blob = MVM_malloc(len);
memcpy(blob, buf, len);
const char *end = first + len;
const char *E = NULL;
if (end[-2] == 'E') {
E = end - 2;
}
else if (end[-3] == 'E') {
E = end - 3;
}
else if (len >= 4 && end[-4] == 'E') {
E = end - 4;
}
else if (len > 4 && end[-5] == 'E') {
E = end - 5;
}

/* This is written out verbosely - arguably there is some DRY-violation.
* It seems clearer to write it all out, than attempt to fold the common
* parts together and make it confusing and potentially buggy.
* I hope that the C compiler optimiser can spot the common code paths.
*/

if (E) {
MVMGrapheme8 *blob;
size_t e_len = end - (E + 1);
if (e_len == 2 && E[1] == '-') {
/* 1E-1 etc to 1E-9 etc */
if (E[2] > '4') {
/* 1E-5 etc to 1E-9 etc. Need to add a zero. */
len = orig_len + 1;
blob = MVM_malloc(len);
/* Using buf here, not first, means that we copy any '-'
* too. */
size_t new_e = E - buf;
memcpy(blob, buf, new_e);
blob[new_e] = 'e';
blob[new_e + 1] = '-';
blob[new_e + 2] = '0';
blob[new_e + 3] = E[2];
}
else {
/* Convert to fixed format, value < 1 */
unsigned int zeros = E[2] - '0' - 1;
size_t dec_len;
if (E == first + 1) {
/* No trailing decimals */
dec_len = 0;
}
else {
dec_len = E - (first + 2);
}
len = 2 /* "0." */
+ zeros /* "", "0", "00" or "000" */
+ 1 /* first digit */
+ dec_len; /* rest */

MVMGrapheme8 *pos;
if (first == buf) {
blob = MVM_malloc(len);
pos = blob;
} else {
++len;
blob = MVM_malloc(len);
pos = blob;
*pos++ = '-';
}

*pos++ = '0';
*pos++ = '.';

while (zeros) {
*pos++ = '0';
--zeros;
}

*pos++ = *first;

if (dec_len) {
memcpy(pos, first + 2, dec_len);
}
}
}
else if (e_len == 1 || (e_len == 2 && E[1] == '1' && E[2] < '5')) {
/* 1E0 etc to 1E14 etc.
* Convert to fixed format, possibly needing padding,
* possibly with trailing decimals, possibly neither. */
unsigned int exp = e_len == 1 ? E[1] - '0' : 10 + E[2] - '0';
size_t dec_len;
if (E == first + 1) {
/* No trailing decimals */
dec_len = 0;
}
else {
dec_len = E - (first + 2);
}
size_t padding = exp > dec_len ? exp - dec_len : 0;
size_t before_dp = exp > dec_len ? dec_len : exp;
int has_dp = dec_len > exp;

len = 1 + padding + dec_len + has_dp;

MVMGrapheme8 *pos;
if (first == buf) {
blob = MVM_malloc(len);
pos = blob;
} else {
++len;
blob = MVM_malloc(len);
pos = blob;
*pos++ = '-';
}

*pos++ = *first;

if (before_dp) {
memcpy(pos, first + 2, before_dp);
pos += before_dp;
}

if (has_dp) {
/* In this case, we never need to pad with zeros. */
*pos++ = '.';
memcpy(pos, first + 2 + before_dp, dec_len - exp);
}
else {
/* In this case, we might need to pad with zeros. */
while (padding) {
*pos++ = '0';
--padding;
}
}
}
else if (E[1] == '-') {
/* Stays in scientific notation, but need to change to 'e'. */
len = orig_len;
blob = MVM_malloc(len);
size_t new_e = E - buf;
memcpy(blob, buf, new_e);
blob[new_e] = 'e';
memcpy(blob + new_e + 1, E + 1, e_len);
} else {
/* Stays in scientific notation, but need to change to 'e'
* and add a + */
len = orig_len + 1;
blob = MVM_malloc(len);
size_t new_e = E - buf;
memcpy(blob, buf, new_e);
blob[new_e] = 'e';
blob[new_e + 1] = '+';
memcpy(blob + new_e + 2, E + 1, e_len);
}
return MVM_string_ascii_from_buf_nocheck(tc, blob, len);
}
}

/* Something went wrong. Should we oops? */
MVMGrapheme8 *blob = MVM_malloc(orig_len);
memcpy(blob, buf, orig_len);
return MVM_string_ascii_from_buf_nocheck(tc, blob, orig_len);
}

void MVM_coerce_smart_stringify(MVMThreadContext *tc, MVMObject *obj, MVMRegister *res_reg) {
Expand Down

0 comments on commit 7907b85

Please sign in to comment.