Clean updating ondemand #2
@lemire The main problem that I am having is that the |
Getting back to you tomorrow! |
@NicolasJiaxin Let me try to understand the issue. |
BTW your struggles are exactly the point. :-) I want you to find all the confusing parts so we can discuss and see if we can improve our API later. Or, at least, the documentation. |
Here is your own code in simdjson (tests):
bool run_success_test(const padded_string & json, std::string_view json_pointer, std::string expected) {
  TEST_START();
  ondemand::parser parser;
  ondemand::document doc;
  ondemand::value val;
  std::string_view actual;
  ASSERT_SUCCESS(parser.iterate(json).get(doc));
  ASSERT_SUCCESS(doc.at_pointer(json_pointer).get(val));
  ASSERT_SUCCESS(simdjson::to_json_string(val).get(actual));
  ASSERT_EQUAL(actual, expected);
  ...
As you can see, you do |
if(parsed.at_pointer(std::string_view(query)).get(queried) == simdjson::SUCCESS) {
return deserialize(queried, parse_opts); // #nocov
auto queried = parsed.at_pointer(std::string_view(query));
if(queried.second == simdjson::SUCCESS) {
It troubles me that you are able to do queried.second. I thought we had made this impossible.
The syntax is supposed to be...
ondemand::value queried;
simdjson::error_code error = parsed.at_pointer(std::string_view(query)).get(queried);
if(error ==...}
Does that not work?
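To make the intended calling convention concrete, here is a minimal, self-contained sketch of the pattern. It does not use simdjson itself: `error_code`, `result<T>`, `lookup` and `name_or` are hypothetical stand-ins written for this illustration, and the real simdjson types may be implemented differently. The point is only that the value/error pair is private, so callers have to go through `get(out)` and check the returned error code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; NOT simdjson's real types.
enum class error_code { SUCCESS, NO_SUCH_FIELD };

template <typename T>
class result {
public:
  result(T value) : storage_(std::move(value), error_code::SUCCESS) {}
  result(error_code error) : storage_(T{}, error) {}

  // Writes to `out` only on success; always returns the error code.
  error_code get(T &out) {
    if (storage_.second == error_code::SUCCESS) {
      out = std::move(storage_.first);
    }
    return storage_.second;
  }

private:
  // Kept private so that callers cannot reach for .first / .second.
  std::pair<T, error_code> storage_;
};

// Hypothetical lookup that succeeds for exactly one key.
result<std::string> lookup(const std::string &key) {
  if (key == "/name") { return std::string("simdjson"); }
  return error_code::NO_SUCH_FIELD;
}

// Caller side: declare the output, call get(), then test the error code.
std::string name_or(const std::string &key, const std::string &fallback) {
  std::string value;
  const error_code error = lookup(key).get(value);
  if (error != error_code::SUCCESS) { return fallback; }
  return value;
}
```

With this shape, an expression such as `lookup(key).second` does not compile, which is the behaviour the thread is asking for.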
That is what I had before, but if you go back to yesterday's commit e91ee40, I still get an error from the get() method, which says that it is not implemented for the given type (simdjson::ondemand::value). Also, when I look in the documentation, it says that it should not be supported, so I was surprised to see it work in our own tests.
I'll investigate right now. Meanwhile, please don't use first/second. We don't want end users to use them. They are confusing and error-prone.
Do you want me to revert to commit e91ee40 right now?
I just finished producing a PR directly on simdjson so that it will no longer be possible to use first/second. I do not yet have a good idea of the issue you are encountering. Let me have a look first.
I'll need more coffee too!
Ok I will revert then, so this is not broken code.
You should not be able to use |
Somehow one is able to access first and second. I am investigating. It should not be possible. |
We can revert to commit e91ee40 once that permission is removed where I used |
176c1be to e91ee40
You probably know this, but there is a complete log of the tests and the failed tests in |
I realize this but thanks for the reminder. I have not managed to get to it today, more later. |
@NicolasJiaxin Let us try to improve support for integers. Could you have a look at my proposal at simdjson/simdjson#1703 ? |
@lemire It works! As expected, all failed tests (except one) were related to the numbers issue. The only test still failing was one with valid_json. I changed it because I think this is another case where On Demand does not know (yet) that the JSON is invalid, but you should check it to be sure the change is OK.
expect_false(any(is_valid_json(valid_utf8)))
# Change to expect_true since valid json is only detected when parsed/accessed.
#expect_false(any(is_valid_json(valid_utf8)))
expect_true(any(is_valid_json(valid_utf8)))
This one.
if (FALSE) {
.write_file("JUNK JSON", test_file1)
.write_file('"VALID JSON"', test_file2)
Also, this one I removed.
@NicolasJiaxin That's fantastic. I think that's all we needed to do for the summer. Would you do a PR from your repo to the eddelbuettel/rcppsimdjson repo, while explaining your work? Label it clearly as a prototype and explain a bit what you did. This way people will be able to build on your work (if they choose to do so). |
A few remarks regarding On Demand after doing this work:
I will close this PR as I have opened one to the main repo here. |
I have opened an issue.
Can you elaborate on how this might be used? I am concerned about double and triple parsing. It would be a terrible pattern to do
Right now, you can check if the number is negative. This is fast. If it is positive, then you can do
I am not dismissing your proposal. I just want to understand it. That is, we do not want to simply throw new functions into the API. We want to keep the API as tight as possible. Adding more functions makes it harder to use. Now, if we add more functions that can be used in a counterproductive manner, we risk making things worse. (I am not dismissing your proposal. To be sure.)
I thought that |
@NicolasJiaxin So let us say I have a number string ... 'xxxxxxxxx'. Now, I do not want you to ever scan the number twice. It seems to me that what you would do is something
That's very bad because you could call is_large_integer, then is_integer, then get_double, thus scanning the input string three times. Even just scanning the input string twice is really bad. I'd never want anyone to do it.
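One way to picture the alternative is a single call that classifies and converts the number together, so the caller never chains several type checks over the same text. The sketch below is only an illustration under stated assumptions: `number` and `parse_number` are hypothetical names, not part of simdjson, and the parsing is not strict JSON. The integer path reads the digits once. The floating-point fallback re-reads the text through `std::strtod`, which accepts inputs JSON does not, such as "inf", and depends on the C locale; a real parser would avoid both.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>
#include <string_view>
#include <variant>

// Hypothetical tagged result: the caller learns type and value from one call.
using number = std::variant<std::int64_t, std::uint64_t, double>;

// Hypothetical helper, not a simdjson function.
// Returns false if `text` cannot be read as a number.
inline bool parse_number(std::string_view text, number &out) {
  if (text.empty()) { return false; }
  const bool negative = text.front() == '-';  // cheap sign check
  std::size_t i = negative ? 1 : 0;
  if (i == text.size()) { return false; }     // a lone "-"

  constexpr std::uint64_t u64_max = std::numeric_limits<std::uint64_t>::max();
  constexpr std::uint64_t i64_max =
      static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());

  std::uint64_t magnitude = 0;
  bool is_integer = true;
  for (; i < text.size(); ++i) {
    const char c = text[i];
    if (c < '0' || c > '9') { is_integer = false; break; }  // '.', 'e', ...
    const std::uint64_t digit = static_cast<std::uint64_t>(c - '0');
    if (magnitude > (u64_max - digit) / 10) { is_integer = false; break; }  // overflow
    magnitude = magnitude * 10 + digit;
  }

  if (is_integer && !negative) {
    if (magnitude <= i64_max) {
      out.emplace<std::int64_t>(static_cast<std::int64_t>(magnitude));
    } else {
      out.emplace<std::uint64_t>(magnitude);  // large positive integer
    }
    return true;
  }
  if (is_integer && magnitude <= i64_max + 1) {
    out.emplace<std::int64_t>(
        magnitude == i64_max + 1
            ? std::numeric_limits<std::int64_t>::min()
            : -static_cast<std::int64_t>(magnitude));
    return true;
  }

  // Fallback for everything else: floating point (non-strict, see lead-in).
  const std::string buffer(text);
  char *end = nullptr;
  const double value = std::strtod(buffer.c_str(), &end);
  if (end != buffer.c_str() + buffer.size()) { return false; }
  out.emplace<double>(value);
  return true;
}
```

A caller would then branch once on the returned alternative, for example with `std::holds_alternative<std::uint64_t>(n)`, instead of calling separate type checks that each scan the input.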
Yeah. Parsing the numbers twice is not good. |
@lemire Ahh... yes, I see what your concern was now. You are right, there is probably no useful point of having |
Let me see if I can do a patch that solves this, somewhat. |
Duplicate of #1.
Clean version of update.