New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate the Ryū algorithm for a simpler/faster implementation of float -> string conversion #52811

Open
bstrie opened this Issue Jul 28, 2018 · 10 comments

Comments

Projects
None yet
8 participants
@bstrie
Copy link
Contributor

bstrie commented Jul 28, 2018

There's a new paper making the rounds on the topic of converting floats to their decimal string representations that claims to be both simpler and faster than prior algorithms: https://pldi18.sigplan.org/event/pldi-2018-papers-ry-fast-float-to-string-conversion . I'm particularly interested in the simplicity aspect, since I recall some old conversations regarding our current machinery for this being somewhat subtle and complicated (for speed purposes, I imagine). If we could drastically simplify the implementation without sacrificing speed, that might be a win. Good student or intern project, methinks.

(Apologies for how wishlisty this issue is, I've no clue who might be the presiding expert on our current float->string implementation.)

@bstrie bstrie added the T-compiler label Jul 28, 2018

@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

Mark-Simulacrum commented Jul 28, 2018

cc @dtolnay in case this is of interest for serde

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Jul 28, 2018

@varkor

This comment has been minimized.

Copy link
Member

varkor commented Jul 28, 2018

The repository can be found at https://github.com/ulfjack/ryu and the paper at https://dl.acm.org/citation.cfm?id=3192369.

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Jul 29, 2018

I did a line-by-line translation of the C implementation to unsafe Rust in https://github.com/dtolnay/ryu. It performs better than dtoa on some inputs but the output is less human readable. For example dtoa and serde_json prefer to print 0.3 as 0.3, while Ryū currently prints it as 3E-1.

Input f64 dtoa Ryū
0.1234 47 ns 66 ns
2.718281828459045 64 ns 40 ns
1.7976931348623157e308 73 ns 42 ns

Benchmark commit: dtolnay/dtoa@655152b

@matthieu-m

This comment has been minimized.

Copy link

matthieu-m commented Jul 29, 2018

@dtolnay : I would imagine that changing the output format would be trivial, no?

On the other hand, the fact that it performs worse on 0.1234 (+40%) is a tad concerning. I know that in some application domains the numbers are decimals by nature, and only floating point representation for performance reasons. If Ryu performs worse on numbers with a small number of digits, then dtoa could be preferable.

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Jul 29, 2018

As I understand it, the fact that short numbers perform worse is an inherent aspect of the algorithm. Dtoa goes left to right and avoids generating too many noise digits, so the performance is better on short numbers as reflected in the table. Ryū generates an exact representation and then goes right to left removing noise digits, so the performance is better on the longest numbers.

The Ryū benchmarks all focus on a random distribution of numbers, and those have on average a lot of digits.

@matthieu-m

This comment has been minimized.

Copy link

matthieu-m commented Jul 29, 2018

As I understand it, the fact that short numbers perform worse is an inherent aspect of the algorithm. Dtoa goes left to right and avoids generating too many noise digits, so the performance is better on short numbers as reflected in the table. Ryū generates an exact representation and then goes right to left removing noise digits, so the performance is better on the longest numbers.

Thank you for confirming my suspicion.

The Ryū benchmarks all focus on a random distribution of numbers, and those have on average a lot of digits.

I would guess that whether the number of digits is random, or not, varies by application. This makes it quite difficult to pick one algorithm or the other.

@dzamlo

This comment has been minimized.

Copy link
Contributor

dzamlo commented Jul 29, 2018

Ryū is seems to be slower in a few case versus dtoa, but it seems to always be faster than the std implementation (which use grisu3 and dragon as a fallback if I'm not mistaken).

What trade-off does dtoa makes to be faster than std? And wouldn't it be fairer to compare Ryū with the std implementation ? As far as I understand Ryū should produce the same output as the std implementation.

prefer to print 0.3 as 0.3, while Ryū currently prints it as 3E-1

From what I understand, both Ryū and the std implementation compute the digits to print and the 10 exponent. And from that you can produce a string representation. It just happen that the implementation of the second part provided in the Ryū make a different choice, but using what we have now for that is very likely to be trivial.

@nagisa nagisa added I-slow T-libs and removed T-compiler labels Jul 30, 2018

@nagisa

This comment has been minimized.

Copy link
Contributor

nagisa commented Jul 30, 2018

Retagging as T-libs, T-compiler hardly has anything to do with libstd matters.

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Aug 14, 2018

I switched from dtoa (Grisu2) to the ryu crate in serde_json for a nice improvement in benchmarks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment