-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-6869] [scala] Specify serialVersionUID for all Scala serializers #4090
Conversation
R: @StefanRRichter @aljoscha tagging you because I talked to you about the issue offline :) Could you have a quick look? |
One caveat that this PR does not yet fully fix: At this moment, I've hit a bit of a wall trying to resolve this. The problem was always there pre-1.3, as if users simply change the order of their Scala type serializer generation (simply changing call order of Note that this problem only exists for the heap backends, because there we deserialize eagerly on restore. |
Overall, I think this is ok as a best effort until we have some eager registration that helps with the remaining problems in the heap backend. |
…hot classes Prior to this commit, the configuration snapshot classes of Scala serializers did not have the proper default empty constructor that is used for deserializing the configuration snapshot. Since some Scala serializers' config snapshots extend the Java CompositeTypeSerializerConfigSnapshot, their config snapshot classes are also changed to be implemented in Java since in Scala we can only call a single base class constructor from subclasses.
Previously, Scala serializers did not specify the serialVersionUID, and therefore prohibited restore from previous Flink version snapshots because the serializers' implementations changed. The serialVersionUIDs added in this commit are identical to what they were (as generated by Java) in Flink 1.2, so that we can at least restore state that were written with the Scala serializers as of 1.2.
…s classed serializers This commit lets the TypeSerializerSerializationProxy be tolerable for serialVersionUID mismatches when reading anonymous classed serializers. Our Scala case class serializers require this since they use Scala macros to be generated at compile time, and therefore is not possible to fix a certain serialVersionUID for them. This commit also updates the streaming state docs to educate the user to avoid using anonymous classes for their state serializers.
try { | ||
typeSerializer = InstantiationUtil.deserializeObject(buffer, userClassLoader); | ||
|
||
ClassLoader old = Thread.currentThread().getContextClassLoader(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to previousClassLoader
…d anonymous serializers This commit lets the TypeSerializerSerializationProxy be tolerable for serialVersionUID mismatches when reading anonymous classed serializers or our Scala serializers. Our Scala serializers require this since they use Scala macros to be generated at compile time, and therefore is not possible to fix a certain serialVersionUID for them. For non-generated Scala serializers, we still also need this because their serialVersionUIDs pre-1.3 may vary depending on the Scala version used. This can be seen as a workaround, and should be reverted once 1.2 savepoint compatibility is no longer maintained. This commit also updates the streaming state docs to educate the user to avoid using anonymous classes for their state serializers. This closes #4090.
This PR fixes 2 issues:
1. Configuration snapshots of Scala serializers were not readable:
Prior to this PR, the configuration snapshot classes of Scala serializers did not have the proper default empty constructor that is used for deserializing the configuration snapshot. Since some Scala serializers' config snapshots extend the Java
CompositeTypeSerializerConfigSnapshot
, their config snapshot classes are also changed to be implemented in Java since in Scala we can only call a single base class constructor from subclasses.2. Scala serializers did not specify the serialVersionUID:
Previously, Scala serializers did not specify the
serialVersionUID
, and therefore prohibited restore from previous Flink version snapshots because the serializers' implementations changed in 1.3. TheserialVersionUID
s added in this PR are identical to what they were (as generated by Java) in Flink 1.2, so that we can at least restore state that were written with the Scala serializers as of 1.2.