robust float parsing in the standard library #2207

Open
andrewrk opened this Issue Apr 7, 2019 · 2 comments

@andrewrk
Member

commented Apr 7, 2019

Thanks to @tiehuis's work in #1958, #375 is closed and Zig has basic float parsing in the standard library. But as he notes, it has the following deficiencies:

  • it doesn't handle some edge cases
  • error behavior is not zig-idiomatic
  • handling the edge cases might require an allocator
  • does not handle hex floats

I did a proof of concept of porting musl's float parsing code to implement f128 parsing in stage1. You can see this implementation in parse_f128.c. I believe that further porting this code to the Zig standard library would check all the above boxes.
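To make two of the items above concrete, here is a minimal sketch in Python, whose built-in `float()` is used only as a stand-in for a correctly rounding parser. It is not the Zig or musl code, and the variable names are made up for illustration. It shows what a hex float literal looks like, and one decimal edge case where a digit arbitrarily far to the right changes the result, which is why a correct parser may need unbounded working storage.

```python
# Hex floats spell out the binary significand and exponent exactly,
# so parsing them involves no decimal-to-binary rounding.
three = float.fromhex("0x1.8p1")   # 1.5 * 2**1
hex_tenth = (0.1).hex()            # exact spelling of the double nearest 0.1

# 2**53 + 1 lies exactly halfway between two adjacent doubles,
# so a correctly rounding parser resolves it with ties-to-even.
tie = "9007199254740993"
low = float(tie)

# One nonzero digit, 41 places after the decimal point, breaks the tie.
# A parser that stops reading digits early would miss it.
high = float(tie + "." + "0" * 40 + "1")
```

The second case is the kind of input where a fixed-size digit buffer gives the wrong answer, since the deciding digit can sit at any distance from the decimal point.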

This issue is the "deserialization" side of #1181. Note the goal stated in #1181 (comment):

I would like to propose that the default float printing and default float parsing in Zig should have roundtrip bit-for-bit value preservation for all values for all float types.
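As a rough illustration of what that goal means for a single value, the sketch below prints a double, parses it back, and compares the raw bit patterns. It is written in Python, with `repr` and `float` standing in for the default printer and parser. The helper names are hypothetical and nothing here is Zig standard library code.

```python
import struct

def f64_bits(x: float) -> int:
    """Return the IEEE 754 binary64 bit pattern of x as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def roundtrips_bit_for_bit(x: float) -> bool:
    """Print x, parse the text back, and compare bit patterns, not values."""
    return f64_bits(float(repr(x))) == f64_bits(x)

# A few values that tend to expose bugs: signed zero, a non-terminating
# binary fraction, the smallest subnormal, the largest finite double, infinity.
# NaN is left out because the text "nan" does not carry payload bits.
samples = [0.0, -0.0, 0.1, 1 / 3, 5e-324, 1.7976931348623157e308, float("inf")]
all_ok = all(roundtrips_bit_for_bit(x) for x in samples)
```

Comparing bits rather than values matters for `-0.0`, which compares equal to `0.0` but has a different bit pattern.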

@andrewrk andrewrk added this to the 1.0.0 milestone Apr 7, 2019

@mattsta

Contributor

commented Apr 8, 2019

Jumping in randomly here: for battle-tested float parsing, also check out the SQLite and LuaJIT input parsers and compare their tradeoffs and performance characteristics (both are also implemented to be more instruction-cache friendly than stdlib versions).

As for roundtrip bit-for-bit value preservation (ignoring things like f16 having about 2k "unique" NaNs, a same-meaning-different-bits problem that only gets worse with wider formats), the best I end up with when running my systems in "super paranoid mode" is: parse the string to a native float/double/long double, convert the native float back to a string, then string-compare against the original to see if the match is exact.
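A minimal sketch of that check, again in Python with `float` and `repr` as stand-ins for the native parser and printer. The function name is hypothetical. The last two lines work out the f16 NaN count mentioned above from the format's 10 mantissa bits.

```python
def survives_paranoid_check(text: str) -> bool:
    """Parse text, print the value again, and compare the strings exactly."""
    return repr(float(text)) == text

canonical = survives_paranoid_check("0.1")         # already in the printer's form
trailing_zero = survives_paranoid_check("0.10")    # same value, different spelling
bare_exponent = survives_paranoid_check("1e22")    # printer writes "1e+22"

# f16 NaNs: exponent all ones, any nonzero 10-bit mantissa, either sign.
F16_MANTISSA_BITS = 10
f16_nan_patterns = 2 * (2**F16_MANTISSA_BITS - 1)
```

The check only passes for input that is already spelled the way the printer would spell it, so its outcome depends on which printing algorithm is chosen.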

But, now an entire new decision opens up: which reliable float pretty print algorithm will conversions be tested against?

@tiehuis

Member

commented Apr 8, 2019

It should be safe to assume that Ryu will be used. Ryu is not currently in std (there is a bit more work to do regarding fixed precision modes), but I have an implementation here: https://github.com/tiehuis/zig-ryu that should be usable for testing.
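For readers who have not met Ryu: the property it provides is output with as few digits as possible that still parses back to the same float. The sketch below is not Ryu and is not taken from zig-ryu. It is a brute-force Python search, with a hypothetical function name, that returns the first precision at which correctly rounded output parses back to the input. It is not guaranteed to match Ryu's output digit for digit in every case.

```python
def shortest_roundtrip(x: float) -> str:
    """Return the first scientific-notation string that parses back to x.

    Tries 1 to 17 significant digits; 17 always suffice for a finite double.
    """
    for precision in range(17):
        text = format(x, f".{precision}e")
        if float(text) == x:
            return text
    # NaN never compares equal to itself, so it ends up here.
    raise ValueError("no round-tripping string found (NaN?)")

tenth = shortest_roundtrip(0.1)      # one significant digit is enough
third = shortest_roundtrip(1 / 3)    # needs 16 significant digits
```

A search like this is far slower than Ryu, but it is simple enough to serve as an independent reference when testing a parser against a printer.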
