Speedup JsonValue serialization. #1052

Merged
merged 1 commit into from
Aug 18, 2021

@ryzhyk ryzhyk commented Aug 17, 2021

`JsonValue` is the type we use to represent dynamically typed JSON in
DDlog. We used to serialize it by converting to `serde_json::Value`
and serializing that. This conversion is expensive and unnecessary, so
we implement `Serialize` for `JsonValue` natively instead. We should
do the same for `Deserialize`, but that is much more work, so we leave it
until we actually need it.

One side effect of this change is that keys in JSON maps, which are
`istring`s, are now serialized in a deterministic but unpredictable order
instead of alphabetical order.

Signed-off-by: Leonid Ryzhyk <lryzhyk@vmware.com>
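
The ordering side effect can be illustrated with a minimal, std-only sketch. It assumes (the PR does not confirm this) that interned strings are ordered by an ID handed out by the interner rather than by their contents. `Interner`, `keys_by_intern_id`, and `keys_alphabetical` are hypothetical names for illustration, not DDlog APIs.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical interner: hands out integer IDs in first-seen order.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, u32>,
    names: Vec<String>,
}

impl Interner {
    /// Return the ID for `s`, assigning a fresh one on first sight.
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = self.names.len() as u32;
        self.ids.insert(s.to_string(), id);
        self.names.push(s.to_string());
        id
    }

    /// Look up the string behind an ID produced by `intern`.
    fn resolve(&self, id: u32) -> &str {
        &self.names[id as usize]
    }
}

/// Key order of a map sorted by interned ID (assumed ordering, see above).
fn keys_by_intern_id(keys: &[&str]) -> Vec<String> {
    let mut interner = Interner::default();
    let map: BTreeMap<u32, ()> = keys.iter().map(|k| (interner.intern(k), ())).collect();
    map.keys().map(|&id| interner.resolve(id).to_string()).collect()
}

/// Key order of a map sorted by string contents.
fn keys_alphabetical(keys: &[&str]) -> Vec<String> {
    let map: BTreeMap<String, ()> = keys.iter().map(|k| (k.to_string(), ())).collect();
    map.keys().cloned().collect()
}

fn main() {
    let keys = ["zeta", "alpha", "mid"];
    println!("{:?}", keys_by_intern_id(&keys)); // ["zeta", "alpha", "mid"]
    println!("{:?}", keys_alphabetical(&keys)); // ["alpha", "mid", "zeta"]
}
```

Under that assumption the output order is stable for a given run but depends on when each key was first interned, which is one way an order can be deterministic without being predictable from the keys alone.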

@ryzhyk ryzhyk requested a review from Kixiron August 17, 2021 00:38
@@ -80,18 +94,6 @@ impl<'de> Deserialize<'de> for ValueWrapper {
}
}

impl From<ValueWrapper> for JsonValue {

Is this where the benefit is coming from?

Contributor Author

yes

@ryzhyk ryzhyk merged commit 243760f into vmware:master Aug 18, 2021
@ryzhyk ryzhyk deleted the optimize_json_serialization branch August 18, 2021 17:09
match self {
    JsonValue::JsonNull => serializer.serialize_unit(),
    JsonValue::JsonBool { b } => serializer.serialize_bool(*b),
    JsonValue::JsonNumber { ref n } => val_from_num(n.clone()).serialize(serializer),
Contributor

We should try and avoid this clone

Contributor Author

This one should be cheap; it's just cloning a numeric value.

  pub struct ValueWrapper(serde_json::value::Value);

- impl serde::Serialize for ValueWrapper {
+ impl serde::Serialize for JsonValue {
      fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
      where
          S: serde::Serializer,
      {
          if serializer.is_human_readable() {
Contributor

Do we need to handle this explicitly? Doesn't serde's infrastructure do this on its own?

Contributor Author

This code could use some documentation, but it's there for a reason. The problem is that `serde_json::Value` cannot be deserialized from non-self-describing formats like bincode. As a workaround, we first serialize the JSON value into a JSON string and store that string in bincode. The next problem, for which there is currently no good solution, is that a `Serialize`/`Deserialize` implementation has no way to tell whether it is working with a self-describing format. We use `is_human_readable` as a very rough and generally incorrect approximation of that.
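
The workaround described in this comment can be sketched without serde. This is a simplified model under stated assumptions, not the PR's code: `Json` is a hypothetical stand-in for `JsonValue`, the `human_readable` flag plays the role of `serializer.is_human_readable()`, and the length-prefixed layout is an illustrative guess at how a binary format might store a string.

```rust
/// Hypothetical stand-in for a dynamically typed JSON value.
enum Json {
    Null,
    Bool(bool),
    Str(String),
    Array(Vec<Json>),
}

/// Render the value as JSON text by walking the enum directly.
fn to_json_text(v: &Json, out: &mut String) {
    match v {
        Json::Null => out.push_str("null"),
        Json::Bool(b) => out.push_str(if *b { "true" } else { "false" }),
        // String escaping is omitted to keep the sketch short.
        Json::Str(s) => {
            out.push('"');
            out.push_str(s);
            out.push('"');
        }
        Json::Array(items) => {
            out.push('[');
            for (i, item) in items.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                to_json_text(item, out);
            }
            out.push(']');
        }
    }
}

/// Human-readable output: emit the JSON structure as is.
/// Binary output: render to JSON text first, then store that text as one
/// length-prefixed string (assumed layout: u64 little-endian length + bytes).
fn encode(v: &Json, human_readable: bool) -> Vec<u8> {
    let mut text = String::new();
    to_json_text(v, &mut text);
    if human_readable {
        text.into_bytes()
    } else {
        let mut out = (text.len() as u64).to_le_bytes().to_vec();
        out.extend_from_slice(text.as_bytes());
        out
    }
}

/// Read the binary form back: only a string has to be decoded, so the
/// format needs no type tags. Parsing the JSON text is a separate step.
fn decode_binary_to_text(bytes: &[u8]) -> Option<String> {
    if bytes.len() < 8 {
        return None;
    }
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(&bytes[..8]);
    let len = u64::from_le_bytes(len_bytes) as usize;
    let end = 8usize.checked_add(len)?;
    let body = bytes.get(8..end)?;
    String::from_utf8(body.to_vec()).ok()
}

fn main() {
    let v = Json::Array(vec![Json::Null, Json::Bool(true), Json::Str("hi".to_string())]);
    println!("{}", String::from_utf8_lossy(&encode(&v, true)));
    println!("{:?}", decode_binary_to_text(&encode(&v, false)));
}
```

In this model the binary branch reuses the human-readable rendering, so it is not dead code: it changes what the outer format sees from a nested structure to a single string.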

Comment on lines +70 to 60
serde_json::to_string(self)
    .map_err(|e| serde::ser::Error::custom(e))?
    .serialize(serializer)
Contributor

This looks really wrong. We shouldn't need this branch at all: the first branch just serializes the value, and this branch only ends up calling the first branch when it serializes `self`.

Contributor Author

That's correct, it will use the first branch to serialize itself into a JSON string and then store the string inside the serializer.

3 participants