Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

json_object: implement json_object_deep_copy() function #328

Merged
merged 2 commits into from Nov 30, 2017
Merged

json_object: implement json_object_deep_copy() function #328

merged 2 commits into from Nov 30, 2017

Conversation

commodo
Copy link
Contributor

@commodo commodo commented Jun 16, 2017

Seems to perform better than outputting to string and re-parsing it.

BENCHMARK - 1000000 iterations of 'dst2 = json_tokener_parse(json_object_get_string(src2))' took 20 seconds
BENCHMARK - 1000000 iterations of 'dst2 = json_tokener_parse(json_object_get_string(src2))' took 7 seconds

This function will be needed for JSON patch [RFC 6902] ; before applying a patch, a backup of the original object will be made ; in case any patch operation fails, the JSON object should not be modified at all.
This is covered in section 5. Error Handling

This PR also adds the json_object_new_null() function (which is requested here #226 ).
While copying the JSON object, it's easier to have an object-ual representation for the JSON null.

Signed-off-by: Alexandru Ardelean ardeleanalex@gmail.com

@commodo
Copy link
Contributor Author

commodo commented Jun 19, 2017

I thought a bit about the json_object_new_null() change.
This might have a slight change, which should not be a problem with json_object_is_type(jso, json_type_null) and json_object_is_type(json_object_new_null(), json_type_null) returning both true.

I don't feel this should be a problem in the whole JSON context.
After taking a quick look through the JSON RFC I did not find anything to enforce the null-ness case.

But, if there's another opinion here, I'm open for discussion.

@@ -31,7 +30,7 @@ int main(void)
printf("parsed string: ");
puts(json);
puts("FAIL");
retval=1;
return 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you bailing early? I expect that this will cause memory leak errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea here was that if the test fails here, stop it here, so that we don't generate more output
but, since the output is compared against an expected output the test should fail;
will fix

printf("%s\n", json_object_get_string(jso));

json_object_put(null_object);
puts("PASS");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be left out, since whether it actually passed depends on whether all of the output matches what's in the .expected file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or rather, we should adjust the "PASS" lines here to more accurately indicate that it's just the individual sections of the test that passed.

json_object.c Outdated
{
int rc;
/* Nothing to do, so return succes */
if (!src && !dst)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO !dst should always be an error. If you're not providing some place to put the copied object you're doing something wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

json_object.c Outdated
}

/* Free previous destination object */
json_object_put(*dst);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skip this if *dst == NULL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

json_object_put() handles null;
adding a null-check here may yield a small performance improvement, in the sense that a jump to the json_object_put() block, won't happen if that is null

most modern compilers would probably optimize this call
not sure about older compilers & deployment [of json-c on them]

hmm ; i guess i'm neutral about adding or not-adding it;
instead of over-thinking it, i'll just add it

json_object.c Outdated
return 0;

/* Check if arguments are sane ; refcount must not be higher than 1 */
if (!src || !dst || (*dst && (*dst)->_ref_count > 1)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to allow *dst to be set at all, then it seems unnecessarily limiting to have the ref count restriction here. I'd say just json_object_put it if there's something there, do a *dst = NULL, and continue on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but, if refcount > 1, then it implies that the object may be shared by some other entity;
overwriting it in that case may cause weird/unexpected behavior for any other entity holding a refcount to this object

we can drop the refcount check [if you really insist] ;
i'd feel a bit uncomfortable about it, though, but i could live with it

@hawicz
Copy link
Member

hawicz commented Jul 8, 2017

Having 'null' in the JSON that you parse turn into something other than a C NULL is a fairly significant API change. If we were to really switch to having 'null' values show up as json_type_null C objects then we need to go whole-hog and start treating NULL as an invalid value in many places (such as json_object_object_add), and have a major library version bump.
The fact that 'null' values show up as C NULL's will make the deepcopy code a bit more complicated, but it doesn't seem like it should be all that bad, and trying to tie implementing deepcopy to all the API changes needed to really support json_type_null as an object seems like it's asking for trouble.

json_object.h Outdated
* If the destination object is non-NULL, it will be overwrriten
* completely, if the refcount is 1.
*
* Note: the same result could be achieved by using
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be reworded a bit. Perhaps just:

This does roughly the same thing as json_tokener_parse(json_object_get_string(src))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

json_object.h Outdated
/**
* Copy the contents of the JSON object.
* If the destination object is NULL, it will be allocated.
* If the destination object is non-NULL, it will be overwrriten
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

If *dst is non-NULL, it will be passed to json_object_put (which may free it if the refcount drops to 0)
and a completely new object will be created and assigned to *dst.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before addressing this, let's resolve this comment first :)
9f420cd#r126346219

/* errno should be set by the function that faulted */
if (!*dst)
return -1;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to copy over at least the serializer and data used by json_object_new_double_s(), and ideally any user-specified ones. However, for user-specified ones we have the problem that we can free userdata, but have no way to copy it. hmm...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack
i had no idea about that serializer data;

@commodo
Copy link
Contributor Author

commodo commented Jul 10, 2017

Having 'null' in the JSON that you parse turn into something other than a C NULL is a fairly significant API change. If we were to really switch to having 'null' values show up as json_type_null C objects then we need to go whole-hog and start treating NULL as an invalid value in many places (such as json_object_object_add), and have a major library version bump.
The fact that 'null' values show up as C NULL's will make the deepcopy code a bit more complicated, but it doesn't seem like it should be all that bad, and trying to tie implementing deepcopy to all the API changes needed to really support json_type_null as an object seems like it's asking for trouble.

i think i could implement a deep-copy that may work without this null JSON object change ;
i am no longer as sure [how possible it is] as i would have been when i first implemented the change tho ;
but I could try to re-visit this ;

the null object change, is mostly a personal ambition that i tried to do ;
i agree the API change is significant ;
maybe i could re-implement the deep copy without this, and re-visit the whole null object thing after a version bump ;

@commodo
Copy link
Contributor Author

commodo commented Jul 11, 2017

@hawicz

  • dropped json_object_new_null() function ; maybe will re-visit this later [ let's see ]
  • added copying for serializer data [from json_object_new_double_s() ] ; it's function json_object_copy_serializer_data() ; please check if you agree with it it :)
  • add null check to *dst
  • returning EINVAL now ; even if both !src & !dst

Still need to resolve the discussion [and related comment] if (*dst)->_refcount > 1.

json_object.c Outdated
if (src->o_type != json_type_double)
return 0;
/* Nothing to do */
if (!src->_userdata)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_userdata doesn't actually need to be non-NULL. You can set a serializer that doesn't need userdata.

json_object.c Outdated
/* Nothing to do */
if (!src->_userdata)
return 0;
if (!src->_user_delete) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please drop this check too. As with _userdata, _user_delete is optional.

json_object.c Outdated
static int json_object_copy_serializer_data(struct json_object *src, struct json_object *dst)
{
/* `src` and `dst` were validated by now ==> no null check for them */
if (src->o_type != json_type_double)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although we currently only use json_object_userdata_to_json_string for double's, there's no particular reason it couldn't be used for other types as well, and if it's in place we really don't care what type the object is, so please drop this check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, regarding other/external serializers, maybe a _user_copy function would be needed

json_object.c Outdated
json_object_object_foreachC(src, iter) {
/* This handles the `json_type_null` case */
if (!iter.val) {
json_object_object_add(*dst, iter.key, NULL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about have this line be "jso = NULL;", drop the continue, and change 1331 to "else if ...". That way we have a single call to json_object_object_add, and the return value is checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

json_object.c Outdated
json_object_object_add(*dst, iter.key, NULL);
continue;
}
if ((rc = json_object_deep_copy_recursive(iter.val, &jso)) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intend to do something with rc before returning? If not, don't bother using it. It looks like the only place it's actually used at all is right at the end of this function, and even there it's not really needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack;
not sure anymore ;
the rc is mostly out of habit, when I try to use more goto statements ; which is not the case
will remove rc altogether

json_object.c Outdated
struct json_object *jso1 = json_object_array_get_idx(src, i);
/* This handles the `json_type_null` case */
if (!jso1) {
json_object_array_add(*dst, NULL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as for the json_type_object case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@@ -0,0 +1,12 @@
#!/bin/sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change this to be a symlink to the newly added test_basic.test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack ;
missed it through all the rebasing

@hawicz
Copy link
Member

hawicz commented Jul 13, 2017

Did you respond about the (*dst)->_refcount > 1 check somewhere? Github isn't showing that to me (nor my original comment :( )
Anyway, I think that check should be dropped. Either it should support the normal reference count handling, where passing something with _refcount==1 frees the object, and _refcount>1 just decrements by 1, or it should enforce that *dst is NULL, with my preference being the former.

@commodo
Copy link
Contributor Author

commodo commented Jul 13, 2017

Did you respond about the (*dst)->_refcount > 1 check somewhere? Github isn't showing that to me (nor my original comment :( )

It was this comment series:

Github is kinda crappy after a long review ; Bitbucket is s bit better, but not by much.
On Github, per-line comments get imploded after a force-push, because the lines are outdated [so you need to manually find them and expand them ].
One way to go about it, is to close a PR and open a new one [and reference the old PR/comments] ; after a few force pushes, it gets even harder to find old comments.
[ semi-offtopic: i see a business opportunity to build a per-line-of-codes review tool that is better/easier to use ]

Anyway, I think that check should be dropped. Either it should support the normal reference count handling, where passing something with _refcount==1 frees the object, and _refcount>1 just decrements by 1, or it should enforce that *dst is NULL, with my preference being the former.

Regarding Enforcing that *dst is NULL, with my preference being the former.
I can agree with this.

For the _refcount == 1 enforcement, I had in mind the ability to do some sort of construct like:

struct json_object dst = NULL;
for ( ;; ) {
   < logic1 >
   json_deep_copy(src, &dst)
   < logic2 >
}

So, in this case you can rewrite the *dst object.
The thought came to be during the JSON Patch work. A JSON Patch replace operation would simply be a json_deep_copy(src, &dst).

But I can understand [what you're saying] that, as a standalone function, json_deep_copy() would better enforce that *dst is NULL, and let the caller handle free-ing/prep-ing the *dst object.
[ Assuming, that's the direction you're aiming for ]

@commodo
Copy link
Contributor Author

commodo commented Jul 13, 2017

what if we change

extern int json_object_deep_copy(struct json_object *src, struct json_object **dst); 

to

extern struct json_object *json_object_deep_copy(struct json_object *src); 

?

@commodo
Copy link
Contributor Author

commodo commented Jul 19, 2017

re-spinned with review comments addressed
[ hopefully all of them [that were made so far]
feel free to add more

Because doing `json_tokener_parse(json_object_get_string(src))`
feels sloppy, dirty, and makes me want to cry at night
sometimes.

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>
Seems to perform better than outputting to string
and re-parsing it.

BENCHMARK - 1000000 iterations of 'dst2 = json_tokener_parse(json_object_get_string(src2))' took 20 seconds
BENCHMARK - 1000000 iterations of 'dst2 = json_tokener_parse(json_object_get_string(src2))' took 7 seconds

It should make a difference on embedded systems.
The test was performed on a i5 desktop CPU [~3.5 years of age].

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>
@hawicz hawicz merged commit 1eab22f into json-c:master Nov 30, 2017
@commodo commodo deleted the json_deep_copy branch November 25, 2020 07:33
@commodo commodo restored the json_deep_copy branch November 25, 2020 07:33
@commodo commodo deleted the json_deep_copy branch November 25, 2020 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants