-
Notifications
You must be signed in to change notification settings - Fork 980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
This exposes 'reset' for object and array instances. #1696
Conversation
I'm OK with it. We can mark it dangerous later if we come up with something better, but it is OK to have things like |
@jkeiser I'd rather not expose it if we can avoid it, but @NicolasJiaxin has been doing some serious work with it and he'll be able to give us feedback soon enough. |
@jkeiser Thanks for the feedback btw. |
I agree we should try not to expose it if we can do fork or something safer/clearer instead, just saying it's not necessarily a disastrous API sin in this case :) |
@NicolasJiaxin I have exposed the bug you reported as part of this PR. |
@lemire If this is ever merged, maybe you could consider adding arrays/objects/values rewind for value::rewind() {
json_type t;
SIMDJSON_TRY(type().get(t));
switch (t)
{
case json_type::array:
array(value).rewind();
break;
case json_type::object:
object(value).rewind();
break;
default:
return INCORRECT_TYPE;
}
} Also, this does not even have to be public. It can be protected, and I think this would work internally for |
Yes. I expect to merge this PR. I think that @jkeiser has de facto approved it (he wrote "I am OK with it") so it is not silly.
At a glance, it seems like a good proposal but I am concerned with a confusion between So, suppose I have this... {"a":"my string"} And I do However, we don't do that for object and arrays. We just set back the point at the beginning. If someone grabbed a string value, it got written to our string buffer. So if you see a loop where you repeatedly access the same values in the same array, you will end up with a buffer overflow. We could still support what you seek, but we would need that a rewind call on the an object or array resets the string buffer. I will open a distinct issue. (Update: see #1701 ) |
I am now thinking that we could call rewind for object and array "reset" instead of "rewind" so that it is clearly distinct from "document.rewind()". It is indeed a distinct functionality and I used the same name for both. Thoughts? cc @jkeiser @NicolasJiaxin |
Yes I think that is a nice distinction to make, rewind is exclusive for documents |
@NicolasJiaxin I will make the change now. |
@NicolasJiaxin I have just renamed these functions to reset. Note that your proposal is not a bad one and I think we will implement it. I just don't want us and the users to get confused. |
@lemire Yes, I just wanted to share the thought, because I came across an issue in lemire/rcppsimdjson#2 where |
I love working with smart people. |
I am going to merge this. I am not super happy, but it looks like a step forward. |
I am not super happy about exposing these methods, but it appears that we need to be able to scan arrays and objects twice (@NicolasJiaxin ?) in some instances. These rewind functions will do just that.
It is somewhat dangerous and should be used with caution because if you access string values multiple times, it will be unsafe (and inefficient). That is, they do not rewind the string buffer.
Note that this functionality competes with @jkeiser 's 'fork' idea but I really would prefer to release 1.0 in the near future and the fork functionality seems speculative. Currently, we have always just one json_iterator instance, and it seems that fork would allow us to copy that... which could potentially expose other bugs.
The benefit of this implementation is that it is non-invasive, so realistic for release 1.0.
If it works out for 1.0, we can revisit this design post-1.0.