-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speedup JsonValue serialization. #1052
Conversation
`JsonValue` is the type we use to represent dynamically typed JSON in DDlog. We used to serialize it by converting to `serde::json::Value` and serializing that. This conversion is expensive and unnecessary, so we implement `Serialize` for `JsonValue` natively instead. We should do the same for deserialize, but this is much more work, so we leave it for when we really need it in the future. One side effect of this change is that keys in JSON maps, which are `istring`'s, are serialized in a deterministic, but unpredictable order, instead of alphabetical order. Signed-off-by: Leonid Ryzhyk <lryzhyk@vmware.com>
@@ -80,18 +94,6 @@ impl<'de> Deserialize<'de> for ValueWrapper { | |||
} | |||
} | |||
|
|||
impl From<ValueWrapper> for JsonValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this where the benefit is coming from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
match self { | ||
JsonValue::JsonNull => serializer.serialize_unit(), | ||
JsonValue::JsonBool { b } => serializer.serialize_bool(*b), | ||
JsonValue::JsonNumber { ref n } => val_from_num(n.clone()).serialize(serializer), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should try and avoid this clone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one should be cheap, it's just cloning a numeric value.
pub struct ValueWrapper(serde_json::value::Value); | ||
|
||
impl serde::Serialize for ValueWrapper { | ||
impl serde::Serialize for JsonValue { | ||
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> | ||
where | ||
S: serde::Serializer, | ||
{ | ||
if serializer.is_human_readable() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to explicitly handle this? Doesn't serde's infrastructure do the oh its own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code could use some documentation, but it's there for a reason. The problem is that serde::json::Value
cannot be deserialized from non-self-describing formats like bincode. As a workaround, we first serialize the JSON value into a JSON string and store the string in bincode. The next problem for which there is currently no good solution is that there is no way for the Serialize
/Deserialize
implementation to tell whether it is working with a self-describing format. We use is_human_readable
as a very rough and generally incorrect approximation of that.
serde_json::to_string(self) | ||
.map_err(|e| serde::ser::Error::custom(e))? | ||
.serialize(serializer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really wrong, we shouldn't need this branch at all since the first branch just serializes the value and this branch will only end up calling the first branch when it serializes self
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, it will use the first branch to serialize itself into a JSON string and then store the string inside the serializer.
JsonValue
is the type we use to represent dynamically typed JSON inDDlog. We used to serialize it by converting to
serde::json::Value
and serializing that. This conversion is expensive and unnecessary, so
we implement
Serialize
forJsonValue
natively instead. We shoulddo the same for deserialize, but this is much more work, so we leave it
for when we really need it in the future.
One side effect of this change is that keys in JSON maps, which are
istring
's, are serialized in a deterministic, but unpredictable order,instead of alphabetical order.
Signed-off-by: Leonid Ryzhyk lryzhyk@vmware.com