Switch MVM_coerce_n_s to Ryū from Grisu3 with a sprintf fallback.

As Zoffix notes in commit 067c059 which switched *to* Grisu3, as Grisu3 can only handle ~99.5% of cases, it needs to be paired with a fallback algorithm for the other 0.5%. To save adding a *second* external library, we had been using sprintf with the format %17g for this. Using %17g instead of the default width of 15 produced *correct* results, in that the output would always round trip correctly, but for some values it was not producing the *optimal* result - ie the shortest decimal representation where two or more decimal values map to the same binary value. Unfortunately for some obvious cases, what we had exposed this. For example: nqp -e 'nqp::say(1e23)' 9.9999999999999992e+22 nqp -e 'nqp::say(1e23 - 9.9999999999999992e+22 0 As noted in that commit, Dragon4 would have been a better fallback, and at the time "Grisu3+Dragon4 [was] the current state of the art." Since then Ryū has been published. The author states that it is faster, but I haven't verified these benchmarks. As it is a *complete* algorithm - it handles 100% of inputs - we can use it without needing any fallback, thus simplifying our code and fixing the buggy sub-optimal results. Hence I am not concerned if it isn't always faster than Grisu3, as correctness trumps "getting the wrong answer as fast as possible". Current state of the art might actually be Dragonbox, but the reference implementation is in C++, and it is not clear whether that algorithm has had sufficient peer review yet to find any subtle problems. If someone else authors a suitably licensed implementation in portable C, we should consider switching to it, but it is not worth doing that ourselves. We build the single object file we need from 3rdparty/ryu/ryu by directly adding it to our Makefile. The ryu codebase uses bazel as a build system, which seems to be portable, but isn't popular enough to be packaged by Debian. We don't need this, because it builds everything - C, Java, benchmarks, etc. The ryu subdirectory has a fallback Makefile which builds just the C code. This seems useful, but it fails unless it can build all the object files, and the 128-bit generic code doesn't build on (at least some) 32 bit platforms. Again, something we don't need. So the simplest working solution seems to be this seeming hack - add "3rdparty/ryu" to our C compiler include path and treat the object file as one of our own.
MoarVM · May 5, 2021 · 7907b85 · 7907b85
1 parent 8cc7fbd
commit 7907b85
Show file tree

Hide file tree

Showing 2 changed files with 180 additions and 9 deletions.
diff --git a/build/Makefile.in b/build/Makefile.in
@@ -42,6 +42,7 @@ CFLAGS    = @cflags@ @ccdef@MVM_TRACING=$(TRACING) @ccdef@MVM_CGOTO=$(CGOTO) @cc
 CINCLUDES = @moar_cincludes@ \
             @ccinc@@shaincludedir@ \
             @ccincsystem@3rdparty/tinymt \
+            @ccincsystem@3rdparty/ryu \
             @ccincsystem@3rdparty/dynasm \
             @ccincsystem@3rdparty/cmp \
             @ccincsystem@3rdparty \
@@ -106,7 +107,6 @@ OBJECTS1 = src/core/callsite@obj@ \
           src/core/ops@obj@ \
           src/core/hll@obj@ \
           src/core/loadbytecode@obj@ \
-          src/math/grisu@obj@ \
           src/core/coerce@obj@ \
           src/core/dll@obj@ \
           src/core/ext@obj@ \
@@ -248,6 +248,7 @@ OBJECTS2 = src/6model/reprs/MVMDLLSym@obj@ \
           src/platform/random@obj@ \
           src/platform/memmem32@obj@ \
           3rdparty/freebsd/memmem@obj@ \
+          3rdparty/ryu/ryu/d2s@obj@ \
           src/platform/malloc_trim@obj@ \
           src/moar@obj@ \
           @platform@ \

diff --git a/src/core/coerce.c b/src/core/coerce.c
@@ -1,5 +1,5 @@
 #include "moar.h"
-#include "math/grisu.h"
+#include "ryu/ryu.h"
 
 #if defined(_MSC_VER)
 #define strtoll _strtoi64
@@ -264,17 +264,187 @@ MVMString * MVM_coerce_n_s(MVMThreadContext *tc, MVMnum64 n) {
     else if (n != n) {
         return MVM_string_ascii_decode_nt(tc, tc->instance->VMString, "NaN");
     }
+
+    char buf[64];
+    /* What we get back is 0E0, 1E0, 3.14E0, 1E2, ... Infinity.
+     * What we'd like is the classic "fixed decimal" representation for
+     * small values, and the exponent as a lower case 'e'. So we do some
+     * massaging, and handle infinity above. We could leave NaN to fall
+     * through here, but if so it would hit our "something went wrong" code,
+     * which somewhat downplays the absolute "this path means a bug". So I think
+     * that it's still clearer handling it above. */
+    const int orig_len = d2s_buffered_n(n, buf);
+    const char *first = buf;
+
+    /* Take any leading minus away. We put it back at the end. */
+    int len = orig_len;
+    if (*first == '-') {
+        ++first;
+        --len;
+    }
+
+    if (len < 3 || !(first[1] == '.' || first[1] == 'E')) {
+        /* Well, this shouldn't be possible. */
+    }
     else {
-        char buf[64];
-        if (dtoa_grisu3(n, buf, 64) < 0)
-            MVM_exception_throw_adhoc(tc, "Could not stringify number (%f)", n);
-        else {
-            MVMStringIndex len = strlen(buf);
-            MVMGrapheme8 *blob = MVM_malloc(len);
-            memcpy(blob, buf, len);
+        const char *end = first + len;
+        const char *E = NULL;
+        if (end[-2] == 'E') {
+            E = end - 2;
+        }
+        else if (end[-3] == 'E') {
+            E = end - 3;
+        }
+        else if (len >= 4 && end[-4] == 'E') {
+            E = end - 4;
+        }
+        else if (len > 4 && end[-5] == 'E') {
+            E = end - 5;
+        }
+
+        /* This is written out verbosely - arguably there is some DRY-violation.
+         * It seems clearer to write it all out, than attempt to fold the common
+         * parts together and make it confusing and potentially buggy.
+         * I hope that the C compiler optimiser can spot the common code paths.
+         */
+
+        if (E) {
+            MVMGrapheme8 *blob;
+            size_t e_len = end - (E + 1);
+            if (e_len == 2 && E[1] == '-') {
+                /* 1E-1 etc to 1E-9 etc */
+                if (E[2] > '4') {
+                    /* 1E-5 etc to 1E-9 etc. Need to add a zero. */
+                    len = orig_len + 1;
+                    blob = MVM_malloc(len);
+                    /* Using buf here, not first, means that we copy any '-'
+                     * too. */
+                    size_t new_e = E - buf;
+                    memcpy(blob, buf, new_e);
+                    blob[new_e] = 'e';
+                    blob[new_e + 1] = '-';
+                    blob[new_e + 2] = '0';
+                    blob[new_e + 3] = E[2];
+                }
+                else {
+                    /* Convert to fixed format, value < 1 */
+                    unsigned int zeros = E[2] - '0' - 1;
+                    size_t dec_len;
+                    if (E == first + 1) {
+                        /* No trailing decimals */
+                        dec_len = 0;
+                    }
+                    else {
+                        dec_len = E - (first + 2);
+                    }
+                    len = 2         /* "0." */
+                        + zeros     /* "", "0", "00" or "000" */
+                        + 1         /* first digit */
+                        + dec_len;  /* rest */
+
+                    MVMGrapheme8 *pos;
+                    if (first == buf) {
+                        blob = MVM_malloc(len);
+                        pos = blob;
+                    } else {
+                        ++len;
+                        blob = MVM_malloc(len);
+                        pos = blob;
+                        *pos++ = '-';
+                    }
+
+                    *pos++ = '0';
+                    *pos++ = '.';
+
+                    while (zeros) {
+                        *pos++ = '0';
+                        --zeros;
+                    }
+
+                    *pos++ = *first;
+
+                    if (dec_len) {
+                        memcpy(pos, first + 2, dec_len);
+                    }
+                }
+            }
+            else if (e_len == 1 || (e_len == 2 && E[1] == '1' && E[2] < '5')) {
+                /* 1E0 etc to 1E14 etc.
+                 * Convert to fixed format, possibly needing padding,
+                 * possibly with trailing decimals, possibly neither. */
+                unsigned int exp = e_len == 1 ? E[1] - '0' : 10 + E[2] - '0';
+                size_t dec_len;
+                if (E == first + 1) {
+                    /* No trailing decimals */
+                    dec_len = 0;
+                }
+                else {
+                    dec_len = E - (first + 2);
+                }
+                size_t padding = exp > dec_len ? exp - dec_len : 0;
+                size_t before_dp = exp > dec_len ? dec_len : exp;
+                int has_dp = dec_len > exp;
+
+                len = 1 + padding + dec_len + has_dp;
+
+                MVMGrapheme8 *pos;
+                if (first == buf) {
+                    blob = MVM_malloc(len);
+                    pos = blob;
+                } else {
+                    ++len;
+                    blob = MVM_malloc(len);
+                    pos = blob;
+                    *pos++ = '-';
+                }
+
+                *pos++ = *first;
+
+                if (before_dp) {
+                    memcpy(pos, first + 2, before_dp);
+                    pos += before_dp;
+                }
+
+                if (has_dp) {
+                    /* In this case, we never need to pad with zeros. */
+                    *pos++ = '.';
+                    memcpy(pos, first + 2 + before_dp, dec_len - exp);
+                }
+                else {
+                    /* In this case, we might need to pad with zeros. */
+                    while (padding) {
+                        *pos++ = '0';
+                        --padding;
+                    }
+                }
+            }
+            else if (E[1] == '-') {
+                /* Stays in scientific notation, but need to change to 'e'. */
+                len = orig_len;
+                blob = MVM_malloc(len);
+                size_t new_e = E - buf;
+                memcpy(blob, buf, new_e);
+                blob[new_e] = 'e';
+                memcpy(blob + new_e + 1, E + 1, e_len);
+            } else {
+                /* Stays in scientific notation, but need to change to 'e'
+                 * and add a + */
+                len = orig_len + 1;
+                blob = MVM_malloc(len);
+                size_t new_e = E - buf;
+                memcpy(blob, buf, new_e);
+                blob[new_e] = 'e';
+                blob[new_e + 1] = '+';
+                memcpy(blob + new_e + 2, E + 1, e_len);
+            }
             return MVM_string_ascii_from_buf_nocheck(tc, blob, len);
         }
     }
+
+    /* Something went wrong. Should we oops? */
+    MVMGrapheme8 *blob = MVM_malloc(orig_len);
+    memcpy(blob, buf, orig_len);
+    return MVM_string_ascii_from_buf_nocheck(tc, blob, orig_len);
 }
 
 void MVM_coerce_smart_stringify(MVMThreadContext *tc, MVMObject *obj, MVMRegister *res_reg) {