Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large number of incorrect parsing #9

Closed
lemire opened this issue Mar 23, 2021 · 5 comments
Closed

Large number of incorrect parsing #9

lemire opened this issue Mar 23, 2021 · 5 comments

Comments

@lemire
Copy link
Contributor

lemire commented Mar 23, 2021

For https://github.com/fastfloat/fast_float, we have extensive tests. I have run them through on FastDoubleParser and found many failures, I have collected them in this gist...

https://gist.github.com/lemire/641a34589c36747f6d24ed6d29ac75f0

The algorithm at https://github.com/fastfloat/fast_float handles all of these cases correctly.

You may refer to https://arxiv.org/abs/2101.11408 or to the C# port at https://github.com/CarlVerret/csFastFloat

@wrandelshofer
Copy link
Owner

Thank you for the test cases!
Most of them were caused by wrong treatment of unsigned longs in the code. I have fixed those.
I am going to look at the remaining ones next.

@wrandelshofer
Copy link
Owner

Another class of inputs were wrong because the code did not compensate for skipped digits if there are more than 19 digits.

@wrandelshofer
Copy link
Owner

Thank you very much for the pointers to the corresponding code sections!

3be8131 includes now the code that you have marked in https://github.com/fastfloat/fast_float/blob/main/include/fast_float/parse_number.h#L111-L116.

When that fails, the code currently falls back to java.lang.Double.parseDouble(), so that all test cases should pass now.

I am planning to replace all fall back calls to Double.parseDouble() in upcoming commits.

For decimal floating point literals, I believe, it is best to port the parse_long_mantissa function from your C++ code, and everything that comes with it. Your class "decimal" appears to do the stuff that I need. (I tinkered with class java.math.BigDecimal. But it round-trips to a String which it then feeds into Double.parseDouble for doing the conversion - which kind of defeats the purpose).

I am not sure how to implement the corresponding cases for hexadecimal floating point literals. In this case, we have a number that is composed of 1^sign * mantissa * 2^exponent. I tried with the class java.math.BigInteger, but memory usage was too high. Thats why in 3be8131 the code currently only tries Clingers fast path, and then immediately falls back to Double.parseDouble.

@wrandelshofer
Copy link
Owner

This is fixed in ac66003

This revision includes a port of the Decimal class from the C++ code in fast_float. However, it turns out that the code in OpenJDK class jdk.internal.math.FloatingDecimal is by at least by one order of magnitudes faster. So performance is better if method parseRestOfDecimalFloatLiteralTheHardWay() in FastDoubleParser just calls Double.parseDouble().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants