[Java] support type forward/backward compatibility #197

Closed
chaokunyang opened this issue May 10, 2023 · 0 comments · Fixed by #195
Labels
enhancement New feature or request java

Comments

@chaokunyang
Collaborator

chaokunyang commented May 10, 2023

Is your feature request related to a problem? Please describe.
In #181, we implemented ObjectSerializer for custom object serialization. But many applications are more complicated: the serialization peer and the deserialization peer may have inconsistent schemas, and thus different class definitions. Each peer evolves its schema independently.

In such cases, deserialization fails in ObjectSerializer: because it does not write class field metadata, deserialization only works if the class has the same metadata as on the serialization peer. We need a way to skip a field when deserialization meets one that does not exist in the local class, and to set a field to null if it does not exist in the serialized data.

Describe the solution you'd like
Under the type compatibility mode, Fury divides fields into four types:

  • Fields that can be represented by four bytes to indicate the type information: the field type is a final type, and the class ID is less than 63, occupying one byte, while the field name occupies three bytes;
  • Fields that can be represented by eight bytes to indicate the type information: the field type is a final type, and the class ID is less than 127, occupying one byte, while the field name occupies seven bytes. Each character is represented by 6 bits, and seven bytes can represent nine characters;
  • Other fields that are final types: the field name and the field type are encoded together;
  • Other fields that are non-final types: the field name and the field type are encoded separately.
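The 6-bits-per-character packing mentioned above can be illustrated with a minimal sketch. The alphabet, the class name `FieldNameEncodingSketch`, and the `encode` helper are assumptions made for illustration only; the issue does not specify the actual character table or how the class ID bits are combined with the name. Code 0 is reserved here so that names of different lengths cannot collide.

```java
final class FieldNameEncodingSketch {
  // Hypothetical 63-symbol alphabet (codes 1..63); code 0 is reserved as "no character".
  // The real encoding table is not given in this issue.
  static final String ALPHABET =
      "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";

  // Packs up to maxChars characters into a long, 6 bits per character.
  // With maxChars = 9 the result uses at most 54 bits, which fits in seven bytes.
  static long encode(String name, int maxChars) {
    if (name.length() > maxChars) {
      throw new IllegalArgumentException("name too long for this category: " + name);
    }
    long encoded = 0;
    for (int i = 0; i < name.length(); i++) {
      int code = ALPHABET.indexOf(name.charAt(i)) + 1; // 0 means unsupported character
      if (code == 0) {
        throw new IllegalArgumentException("unsupported character in: " + name);
      }
      encoded = (encoded << 6) | code;
    }
    return encoded;
  }

  public static void main(String[] args) {
    System.out.println(encode("age", 4));       // short name, fits in 24 bits
    System.out.println(encode("firstName", 9)); // nine characters, fits in 54 bits
  }
}
```

A name that is too long or contains an unsupported character would, under this assumption, fall into one of the two "other fields" categories instead.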

If fields with the same name exist in both parent and child classes, the class name is included as part of the encoded field name. The fields are then sorted in ascending order by the integer value of their field name encoding. During serialization, the encoded integer value is written first, followed by the field data. During deserialization, whether the current field in the byte stream exists among the current type's fields can be determined directly by comparing integer values. The field may sort before the first field of the type, fall between two fields of the type (and so not exist in it), match a field of the type, or sort after the last field of the type. Depending on the case, the field is either deserialized or skipped. Specific details can be found in the deserialization code.
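As a rough illustration of the comparison described above (not the implementation from the pull request), the sketch below walks two ascending lists of encoded field IDs in a single pass. The class `CompatibleReadSketch`, the `readFields` method, and the use of pre-parsed arrays in place of a real byte stream are all assumptions; a real reader would have to skip the bytes of unknown fields using the type information encoded alongside the name.

```java
import java.util.Arrays;

final class CompatibleReadSketch {
  // streamIds/streamValues: fields written by the peer, sorted by ascending encoded id.
  // localIds: encoded ids of the local class's fields, also sorted ascending.
  // Returns one value per local field; fields absent from the stream stay null.
  static Object[] readFields(long[] streamIds, Object[] streamValues, long[] localIds) {
    Object[] result = new Object[localIds.length]; // all entries start as null
    int local = 0;
    for (int s = 0; s < streamIds.length; s++) {
      long id = streamIds[s];
      // Local fields with smaller ids were never written by the peer: leave them null.
      while (local < localIds.length && localIds[local] < id) {
        local++;
      }
      if (local < localIds.length && localIds[local] == id) {
        result[local++] = streamValues[s]; // field exists on both sides
      }
      // Otherwise the id is unknown to the local class and the value is skipped.
    }
    return result;
  }

  public static void main(String[] args) {
    long[] streamIds = {10, 20, 30, 50};
    Object[] streamValues = {"a", "b", "c", "e"};
    long[] localIds = {5, 20, 40, 50};
    // prints [null, b, null, e]
    System.out.println(Arrays.toString(readFields(streamIds, streamValues, localIds)));
  }
}
```

Because both sequences are sorted, each field is resolved with integer comparisons only, with no string decoding, hashing, or binary search.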

This avoids the overhead of deserializing String field names, as Hessian and Kryo do, as well as the overhead of hash lookup and binary search.

Describe alternatives you've considered
The solution proposed in this issue writes class meta every time an object is serialized. If multiple objects of the same type are serialized as a whole, the meta will be serialized multiple times, which is unnecessary.

We can use meta sharing to write the meta only once per serialization of an object graph. The meta can also be encoded to binary ahead of time, so the actual meta writing will be just a memory copy, which is much faster.
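One way to picture meta sharing is sketched below, under several assumptions: the class `MetaSharingSketch`, the flag bytes, and the use of the class name as a stand-in for the real field meta are all hypothetical, and single-byte lengths and indexes are used only to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class MetaSharingSketch {
  // Pre-encoded meta, shared across serializations, so writing it is a plain byte copy.
  private static final Map<Class<?>, byte[]> ENCODED_META = new ConcurrentHashMap<>();

  // Per object graph: which classes already had their meta written, and at which index.
  private final Map<Class<?>, Integer> written = new HashMap<>();
  final ByteArrayOutputStream out = new ByteArrayOutputStream();

  void writeClassMeta(Class<?> type) {
    Integer index = written.get(type);
    if (index != null) {
      out.write(1);     // hypothetical flag: meta was written earlier in this graph
      out.write(index); // sketch only: assumes fewer than 256 classes
      return;
    }
    written.put(type, written.size());
    // Placeholder meta: the class name. Real meta would describe field names and types.
    byte[] meta = ENCODED_META.computeIfAbsent(
        type, t -> t.getName().getBytes(StandardCharsets.UTF_8));
    out.write(0);           // hypothetical flag: full meta follows
    out.write(meta.length); // sketch only: assumes meta shorter than 256 bytes
    out.write(meta, 0, meta.length);
  }

  public static void main(String[] args) {
    MetaSharingSketch writer = new MetaSharingSketch();
    writer.writeClassMeta(String.class);
    int afterFirst = writer.out.size();
    writer.writeClassMeta(String.class);
    System.out.println(afterFirst + " bytes, then " + (writer.out.size() - afterFirst) + " more");
  }
}
```

In this sketch the second and later objects of a type cost two bytes of meta instead of the full encoding.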

Additional context
#180 #181
