dom and OnDemand resolve double types to different results #2017

Closed
DenineLu opened this issue Jun 7, 2023 · 9 comments

Comments

@DenineLu

DenineLu commented Jun 7, 2023

When I parse 0.8825149536132812, the DOM output differs from the On Demand output in the last digit. Did I write something wrong? Thanks.

code:

padded_string paddedInputJson = R"({"score":0.8825149536132812})"_padded;
simdjson::dom::parser parser_dom;
simdjson::dom::element ele;
ele = parser_dom.parse(paddedInputJson);
std::cout << "     dom:" << ele["score"] << std::endl;

simdjson::ondemand::parser parser_ondemand;
simdjson::ondemand::value val;
val = parser_ondemand.iterate(paddedInputJson);
std::cout << "ondemand:" << val["score"] << std::endl;

result:

     dom:0.8825149536132813
ondemand:0.8825149536132812
@DenineLu
Author

DenineLu commented Jun 7, 2023

The DOM output seems to keep only 16 decimal places, and the 16th digit is no longer exact.

@lemire
Member

lemire commented Jun 7, 2023

In binary64, the decimal string 0.8825149536132812 is represented as 115673 * 2**(-17) which is 0.88251495361328125.

You therefore have that 0.8825149536132812 == 0.88251495361328125 == 0.8825149536132813.
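
The equality above can be checked with the standard library alone. The following is a minimal sketch, assuming `double` is IEEE 754 binary64 and that `strtod` rounds to the nearest value; the helper names `parse_double`, `format17` and `check_same_double` are made up for this illustration and are not part of simdjson.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse a decimal string to the nearest double.
double parse_double(const char* text) { return std::strtod(text, nullptr); }

// Hypothetical helper: format with 17 significant digits, which is enough
// to tell any two distinct binary64 values apart.
std::string format17(double x) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.17g", x);
    return buf;
}

void check_same_double() {
    const double a = parse_double("0.8825149536132812");
    const double b = parse_double("0.8825149536132813");
    // Both 16-digit strings land on the same binary64 value...
    assert(a == b);
    // ...which is exactly 115673 * 2^-17 (131072 == 2^17).
    assert(a == 115673.0 / 131072.0);
    // With 17 significant digits the stored value is shown in full.
    std::printf("%s\n", format17(a).c_str());
}
```

Calling `check_same_double()` from `main` should pass both assertions, since the two decimal strings sit at the same distance on either side of the stored value.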

@lemire
Member

lemire commented Jun 7, 2023

Recommended reading:

The number parser in simdjson is state-of-the-art. It has been widely adopted: it is part of the Rust, Go, C# standard libraries. It is part of the GCC C++ standard library, and so forth.

We always welcome bug reports, but you should assume that we are correct: we have not yet encountered a single bug regarding our number parsing.

lemire closed this as completed Jun 7, 2023
@lemire
Member

lemire commented Jun 7, 2023

From the paper above...

[Screenshot from the paper, captured 2023-06-07 at 09:55:56]

You cannot expect every 16-digit number to be represented exactly in binary64. You only have 15 digits of guaranteed accuracy. There is no way in the current C++ standard to have more precision than that, short of rolling your own formats.
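
The 15-digit figure is exposed by the standard library as `std::numeric_limits<double>::digits10`. The sketch below assumes an IEEE 754 binary64 `double`; the names `kGuaranteedDigits`, `kRoundTripDigits` and `format_digits` are illustrative only.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

static_assert(std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE 754 doubles");

// Decimal digits that always survive text -> double -> text.
constexpr int kGuaranteedDigits = std::numeric_limits<double>::digits10;     // 15 for binary64
// Decimal digits needed so that double -> text -> double is lossless.
constexpr int kRoundTripDigits = std::numeric_limits<double>::max_digits10;  // 17 for binary64

// Hypothetical helper: format x with the requested number of significant digits.
std::string format_digits(double x, int digits) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*g", digits, x);
    return buf;
}
```

A 15-digit input such as 0.882514953613281 comes back unchanged from `format_digits(std::strtod("0.882514953613281", nullptr), kGuaranteedDigits)`, whereas a 16-digit input carries no such guarantee.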

@lemire
Member

lemire commented Jun 7, 2023

Note that this has nothing to do with simdjson. It is just standard C++.

@DenineLu
Author

DenineLu commented Jun 8, 2023

Thank you very much for your answer, now I understand the reason for this! But since my use case needs the unparsed value exactly as it is written in the JSON, I will go with the On Demand approach.

@lemire
Member

lemire commented Jun 8, 2023

@DenineLu

Both DOM and ondemand give the same result. It is not possible that you get the double value 0.8825149536132812 in one case and a distinct double value 0.8825149536132813 in the second case.

There are two good reasons for that:

  1. The two values are the same. There is no 0.8825149536132812 distinct from 0.8825149536132813 in C++.
  2. The number parser is the same. Both DOM and ondemand use the same number parsing functions.

Your code example does not do what you think it does. In one instance (ondemand), you are just printing out the content of the JSON string, and in the other (DOM), you are re-serializing a parsed number.
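
The difference between the two code paths can be sketched without simdjson. This is only an illustration of the idea, using the standard library: `echo_raw` and `parse_then_print` are hypothetical names, and the choice of 17 significant digits is an assumption of this sketch, not what simdjson's serializer does (it produced 0.8825149536132813 in the report above).

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Path A: hand back the number token exactly as it appears in the input text.
// No numeric conversion happens, so every character is preserved.
std::string echo_raw(const std::string& token) { return token; }

// Path B: convert the token to a double, then write that double back as text.
// The output is a decimal rendering of the stored binary64 value, so it need
// not match the input characters even though the value is the same.
std::string parse_then_print(const std::string& token) {
    const double value = std::strtod(token.c_str(), nullptr);
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.17g", value);
    return buf;
}
```

For the token `0.8825149536132812`, path A returns the input unchanged, while path B returns a different string that still parses back to the same double. The two outputs differ as text, not as numbers.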

@lemire
Member

lemire commented Jun 8, 2023

Your title was... "dom and OnDemand resolve double types to different results"

But it is not what your code does at all...

Try the following code...

  padded_string paddedInputJson = R"({"score":0.8825149536132812})"_padded;
  simdjson::dom::parser parser_dom;
  simdjson::dom::element ele;
  ele = parser_dom.parse(paddedInputJson);
  double score_dom = ele["score"];
  std::cout << "     dom:" << score_dom << std::endl;

  simdjson::ondemand::parser parser_ondemand;
  simdjson::ondemand::value val;
  val = parser_ondemand.iterate(paddedInputJson);
  double score_ondemand = val["score"];
  std::cout << "ondemand:" << score_ondemand << std::endl;

@DenineLu
Author

DenineLu commented Jun 8, 2023

Thank you very much for your guidance, I understand my mistake, thanks!
