[proposal] Make an option for lazy-deserialization of complex types #1373
Comments
Thanks for the clear write-up. This seems like a very interesting idea which I'd like us to explore! A few thoughts:
In summary, I'm interested in seeing this in a PR and ultimately landing it as an optional feature, though let's weigh the complexity it adds to the code base against the benefits once there's a PR.
Actually, the simplest example is the way they treat String values in generated code:
private java.lang.Object name_ = "";
/**
 * <code>string name = 1;</code>
 * @return The name.
 */
public java.lang.String getName() {
  java.lang.Object ref = name_;
  if (!(ref instanceof java.lang.String)) {
    com.google.protobuf.ByteString bs =
        (com.google.protobuf.ByteString) ref;
    java.lang.String s = bs.toStringUtf8();
    name_ = s;
    return s;
  } else {
    return (java.lang.String) ref;
  }
}
You may note here that they also throw away the underlying ByteString (which is good for the memory footprint), but I'm not sure whether it is better to keep it for later serialization. Anyway, this code looks hacky and follows "at least once" deserialization.
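To make the trade-off above concrete, here is a minimal Scala sketch of the same decode-on-first-access pattern with a switch for keeping or dropping the wire bytes. Everything in it is an assumption for illustration: LazyString and keepBytes are hypothetical names, Array[Byte] stands in for protobuf's ByteString, and the class is deliberately single-threaded.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical holder, not part of ScalaPB or protobuf-java.
// Array[Byte] stands in for com.google.protobuf.ByteString.
// Not thread-safe: this sketch assumes single-threaded access.
final class LazyString(raw: Array[Byte], keepBytes: Boolean) {
  private var bytes: Array[Byte] = raw
  private var decoded: String = null

  // Decodes on first access and caches the result.
  def value: String = {
    if (decoded == null) {
      decoded = new String(bytes, UTF_8)
      // Dropping the bytes mirrors the Java code above (smaller footprint).
      if (!keepBytes) bytes = null
    }
    decoded
  }

  // Reuses the original bytes when they were kept; re-encodes otherwise.
  def toBytes: Array[Byte] =
    if (bytes != null) bytes else decoded.getBytes(UTF_8)
}
```

With keepBytes = true, serializing a message that was read and passed along costs no re-encoding, at the price of holding both representations once the field has been accessed.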
I suppose this is true, but I think it may be relaxed a bit by specifying the types in advance and doing a 'little bit' of refactoring in the generator. I'm working on the PoC PR for this issue, and type derivation is indeed one of the most painful places.
That's not a problem, since it will use the same implicit conversion as regular user code, just in the generated context.
I'm on it, and I'll try to measure it in benchmarks.
Closing due to inactivity. Feel free to comment if this is still needed.
One of the crucial performance bottlenecks that protobuf-java addresses is eager deserialization of complex (e.g. variable-sized) types. There are even classes like LazyField<T> that store a ByteString underneath until the field is actually accessed.
If we don't actually care about binary compatibility (which is usually the case for generated code), we can accomplish that in a neat API-compatible way.
Firstly, we introduce a sealed LazyField[T] and case classes that model such a datatype:
And then, when the option for lazy fields is enabled, we just generate LazyField[T] instead of T for any type we consider "complex" (e.g. string, repeated, message).
Implicits from case object LazyField should maintain API compatibility, and end users shouldn't notice anything.
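One way the proposed datatype could look is sketched below. This is an illustration under assumptions, not the code from the PoC: only the name LazyField comes from the proposal, while Eager, Deferred, wrap, unwrap and the use of Array[Byte] in place of ByteString are hypothetical. A plain companion object is used to hold the implicits.

```scala
import scala.language.implicitConversions

// Hypothetical model of the proposal; only the name LazyField is from the issue.
sealed trait LazyField[T] {
  def value: T
}

object LazyField {
  // A value that is already materialized, e.g. one set by user code.
  final case class Eager[T](value: T) extends LazyField[T]

  // Wire bytes plus a decoder; decoding is deferred until first access.
  // Array[Byte] stands in for protobuf's ByteString.
  final case class Deferred[T](bytes: Array[Byte], decode: Array[Byte] => T)
      extends LazyField[T] {
    // lazy val: decode runs at most once and the result is cached.
    lazy val value: T = decode(bytes)
  }

  // Conversions that let code written against T keep compiling.
  implicit def wrap[T](value: T): LazyField[T] = Eager(value)
  implicit def unwrap[T](field: LazyField[T]): T = field.value
}
```

Because the conversions live in the companion object, they are found without an import: assigning a String where a LazyField[String] is expected wraps it, and ascribing a LazyField[String] to String triggers the decode. The compatibility is at the source level only and has limits. For example, `val n = msg.name` now infers LazyField[String] rather than String, and implicit conversions do not chain, which fits the remark in the comments that type derivation is a painful spot.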