Improve control over the output format #27

HeroicKatora · 2018-07-27T22:26:08Z

Would it be possible to control whether the number is printed in exponent or decimal notation? Or, if more convenient, could the API return the chosen location of the decimal point (if present), the number of digits, and the exponent? It would also help to be able to get the chosen exponent as an integer.

I'm trying to integrate this into a formatting library and would therefore like to transform the output into all of the forms available for printf(). In particular, decimal formatting with user-provided precision makes it necessary to reformat the number heavily. While some forms are trivial to implement, reparsing the exponent to decide on the automatic formatting and moving the decimal point feels suboptimal.

The double-conversion interfaces could be a guideline here but the defaults and best options are quite specialized to ECMAScript.


StephanTLavavej · 2018-07-27T23:31:10Z

Note that Ryu is inherently incapable of emitting arbitrary precision. You could modify the algorithm to emit additional digits in scientific or fixed notation, but after running out of the digits that are currently trimmed to achieve the shortest round-trip representation, you'd have to fill with zeroes. While it would still round-trip, it would be mathematically incorrect. Consider this example:

```
C:\Temp>type precision.cpp
#include <stdio.h>

int main() {
    const double two_40 = 1LL << 40;
    const double two_minus40 = 1 / two_40;

    printf("%.27e from printf\n", two_minus40);
    puts("9.094947017729282379150390625e-13 expected for %.27e");
    printf("%.40f from printf\n", two_minus40);
    puts("0.0000000000009094947017729282379150390625 expected for %.40f");
}

C:\Temp>cl /EHsc /nologo /W4 /MTd precision.cpp
precision.cpp

C:\Temp>precision
9.094947017729282379150390625e-13 from printf
9.094947017729282379150390625e-13 expected for %.27e
0.0000000000009094947017729282379150390625 from printf
0.0000000000009094947017729282379150390625 expected for %.40f
```

Ryu emits 9.094947017729282E-13 for 2^-40.
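
The zero-filling caveat can be checked with a small sketch. It assumes a correctly rounding `strtod`; the helper names `parses_to` and `two_minus40` are made up for illustration and are not part of Ryu. Both the zero-padded shortest digits and the exact expansion parse back to the same double, even though only the second one is the mathematically exact value of 2^-40.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if strtod consumes all of `text`
   and yields exactly `expected`, 0 otherwise. */
int parses_to(const char *text, double expected) {
    assert(text != NULL);
    char *end = NULL;
    double parsed = strtod(text, &end);
    return end != text && *end == '\0' && parsed == expected;
}

/* 2^-40, built the same way as in the example above. */
double two_minus40(void) {
    return 1.0 / (double)(1LL << 40);
}
```

Calling `parses_to("9.094947017729282000000000000e-13", two_minus40())` and `parses_to("9.094947017729282379150390625e-13", two_minus40())` should both succeed: the padded string round-trips, but its digits differ from what `%.27e` prints.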

As part of implementing C++17 <charconv>, I'll need to modify Ryu to emit shortest round-trip in fixed notation, but I am not yet sure how to contribute that upstream (I need to implement two more variants that switch between fixed and scientific depending on either printf's rules or an overall shortest criterion, so I need to figure out what the interface will be). I might end up separating the algorithm into two parts - the core part that generates the digits in a uint32_t/uint64_t and the exponent, and then a formatting part, which should allow all four charconv formats to be cleanly implemented.
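
A rough sketch of what the "formatting part" of such a split could look like for fixed notation. This is not Ryu's actual interface: the `decimal_parts` struct and `format_fixed` function are hypothetical names, and the sketch assumes the core step has already produced the shortest digits as an integer together with a decimal exponent, so that the value equals `digits * 10^exponent`. Sign, infinity, and NaN handling are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical output of the core step: value == digits * 10^exponent. */
typedef struct {
    uint64_t digits;   /* shortest round-trip digits as an integer */
    int32_t exponent;  /* power of ten applied to digits */
} decimal_parts;

/* Writes fixed notation into out (NUL-terminated).
   Returns the number of characters written, or 0 if cap is too small. */
size_t format_fixed(decimal_parts d, char *out, size_t cap) {
    assert(out != NULL);
    char buf[20];  /* a uint64_t has at most 20 decimal digits */
    size_t n = 0;
    uint64_t v = d.digits;
    do {  /* collect digits, least significant first */
        buf[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    size_t len;
    if (d.exponent >= 0) {
        /* Integer: all digits, then trailing zeros. */
        size_t zeros = (size_t)d.exponent;
        len = n + zeros;
        if (len + 1 > cap) return 0;
        for (size_t i = 0; i < n; ++i) out[i] = buf[n - 1 - i];
        memset(out + n, '0', zeros);
    } else {
        size_t frac = (size_t)(-(int64_t)d.exponent);  /* digits after the point */
        if (frac >= n) {
            /* 0.000ddd: zeros are written first, so no digits are moved later. */
            size_t zeros = frac - n;
            len = 2 + zeros + n;
            if (len + 1 > cap) return 0;
            out[0] = '0';
            out[1] = '.';
            memset(out + 2, '0', zeros);
            for (size_t i = 0; i < n; ++i) out[2 + zeros + i] = buf[n - 1 - i];
        } else {
            /* ddd.ddd: the point falls inside the digit run. */
            size_t whole = n - frac;
            len = n + 1;
            if (len + 1 > cap) return 0;
            for (size_t i = 0; i < whole; ++i) out[i] = buf[n - 1 - i];
            out[whole] = '.';
            for (size_t i = whole; i < n; ++i) out[i + 1] = buf[n - 1 - i];
        }
    }
    out[len] = '\0';
    return len;
}
```

For 2^-40 the shortest digits are 9094947017729282 with exponent -28, which this sketch renders as `0.0000000000009094947017729282`. Because the layout is decided before any character is written, nothing has to be reparsed or shifted afterwards.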

HeroicKatora · 2018-07-27T23:47:33Z

~~Outputting trailing zeroes is fully within the specifications of printf as far as I am aware, basically guaranteeing roundtrip is the only necessity.~~ [As far as I am concerned this is acceptable for my use case] Padding with zeroes is also the current strategy. But generating decimal representation from scientific notation can involve a memory move (at most a few bytes but still) of the complete suffix to make space for preceding zero digits. Together with the necessity of reevaluating the exponent, this is quite an overhead over what I imagine would be more straightforward to do in the internal representation.

> As part of implementing C++17 <charconv>,

That is awesome. Finally hope for efficient string conversions in the standard.

Edit:

Although the wording is not extremely precise, it appears to imply that more digits should be printed than the source floating-point value has significant digits.

> The value is rounded to the appropriate number of digits.

7.21.6.1.8

> if the number of significant decimal digits is at most DECIMAL_DIG, then the result should be correctly rounded. If the number of significant decimal digits is more than DECIMAL_DIG but the source value is exactly representable with DECIMAL_DIG digits, then the result should be an exact representation with trailing zeros

7.21.6.1.13

ulfjack · 2018-07-28T13:18:11Z

Additional output formats would definitely be welcome, with the mentioned caveat.

ulfjack · 2018-08-07T07:17:07Z

Marking this as specific to the C implementation. If someone is interested in special output formats for Java, please file a separate issue.

StephanTLavavej · 2018-08-28T23:50:02Z

I have code for fixed notation and I should have time to polish it up and submit a pull request in October.

StephanTLavavej · 2018-10-23T01:35:13Z

Here's the fixed notation code that I wrote for VS 2017 15.9 (slightly revised): https://github.com/StephanTLavavej/ryu/blob/msvc-2018.10.22/ryu/d2s.c#L388

It isn't ready for a pull request yet. Outstanding issues:

- Mechanical: I wrote this after __uglifying Ryu's identifiers. (New identifiers are MSVC STL _Ugly, making them easy to distinguish.) De-uglifying isn't a problem, it just takes a bit of time.
- This uses C++17 charconv's interface: the [_First, _Last) range, chars_format requesting a format, and the bounds-checked to_chars_result. To upstream this, we'll probably need a different interface. (I am very interested in keeping my code closely aligned with upstream, but I still need to ship the charconv interface.)
- Also regarding the interface, the fixed notation codepath is currently intertwined with the Ryu scientific notation codepath. Always running Ryu allows us to use its output (suitably decimal/zero filled) for most cases, and even the "large exact integer" case benefits from Ryu's output to determine the output length. I think I could be far less invasive here, and extract this into separate functions. (I wrote this under a deadline, hence the hastily designed structure, although I have a high level of confidence in the logic itself.)
- I am currently using the MSVC intrinsics _BitScanForward and _BitScanForward64; replacing them with Clang/GCC intrinsics or portable code shouldn't be difficult.
- None of this applies to the generic128 codepaths which I haven't worked with at all.
- The digit-printing code is a copy-pasted mess; it may be worth centralizing now.
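
For the intrinsics item, one possible shape of a portable wrapper is sketched below. The function name `count_trailing_zeros64` is made up for illustration and is not taken from the linked code; the sketch assumes a nonzero argument, since both `_BitScanForward64` and `__builtin_ctzll` leave the zero case unspecified.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER) && !defined(__clang__) && (defined(_M_X64) || defined(_M_ARM64))
#include <intrin.h>
#define HAVE_BITSCANFORWARD64 1
#endif

/* Hypothetical helper: index of the lowest set bit of a nonzero value. */
unsigned count_trailing_zeros64(uint64_t v) {
    assert(v != 0);
#if defined(HAVE_BITSCANFORWARD64)
    unsigned long index;
    _BitScanForward64(&index, v);  /* MSVC intrinsic, 64-bit targets only */
    return (unsigned)index;
#elif defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_ctzll(v);  /* GCC/Clang builtin */
#else
    /* Portable fallback: shift until the low bit is set. */
    unsigned n = 0;
    while ((v & 1) == 0) {
        v >>= 1;
        ++n;
    }
    return n;
#endif
}
```

For example, `count_trailing_zeros64(8)` yields 3 on every branch.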

If this is interesting, I can continue to work on it after dealing with other things on my plate.

ulfjack · 2018-11-05T21:31:03Z

I'm definitely interested in seeing some progress here. Unfortunately, I haven't had any time to work on this.

Artoria2e5 · 2020-09-22T11:09:38Z

Just to fix a dead link: https://github.com/StephanTLavavej/ryu/blob/bb357f7/ryu/d2s.c#L317.

See https://reviews.llvm.org/D70631 for Steph's LLVM PR based on it (but updated with ryu printf); the PR implements %g precision by post-processing. A lookup table is used to skip the reformatting.

Personally I would like to see %g become a thing too.
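
For reference, the choice that %g makes between fixed and scientific style can be sketched as follows. This follows the rule in the C standard's description of the g conversion; it is not code from the LLVM patch, and `g_choice` and `choose_g_style` are hypothetical names. Note that X must be the exponent the value has after rounding to P significant digits, which is part of why doing this by post-processing is awkward. Removal of trailing zeros (done by %g unless the # flag is given) is not shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: which style %g selects and with what precision. */
typedef struct {
    bool scientific;  /* true: style e, false: style f */
    int precision;    /* digits after the decimal point in that style */
} g_choice;

/* precision: the requested %g precision (the caller substitutes 6 if omitted).
   x: decimal exponent the value would have in style e after rounding. */
g_choice choose_g_style(int precision, int x) {
    assert(precision >= 0);
    int p = (precision == 0) ? 1 : precision;  /* a precision of 0 is treated as 1 */
    g_choice c;
    if (x >= -4 && x < p) {
        c.scientific = false;
        c.precision = p - (x + 1);
    } else {
        c.scientific = true;
        c.precision = p - 1;
    }
    return c;
}
```

With the default precision of 6, a value such as 2^-40 (exponent -13) takes the scientific branch with precision 5, while a value with exponent 2 takes the fixed branch with precision 3.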

ulfjack mentioned this issue Jul 28, 2018

Improvement: Compare with more algorithms in benchmark #28

Open

ColinH mentioned this issue Jul 28, 2018

Double-conversion compatible output? #30

Closed

mikkelfj mentioned this issue Aug 6, 2018

Floating point precision and missing integer fields dvidelabs/flatcc#90

Open

ulfjack added the C Affects the C implementation in ryu/. label Aug 7, 2018

tiehuis mentioned this issue Aug 9, 2018

look into using Ryu instead of Errol3 for floating point printing ziglang/zig#1299

Closed

ulfjack mentioned this issue Dec 7, 2020

A DecimalFloatingPoint struct in the public side of the API #190

Closed
