[FLINK-11334][core] Migrate EnumValueSerializer to use new serialization compatibility abstractions #7734

klion26
Member

@klion26 klion26 commented Feb 18, 2019

What is the purpose of the change

Migrate EnumValueSerializer to use new serialization compatibility abstractions

Brief change log

This patch contains:

  • add a new class ScalaEnumSerializerSnapshot
  • return a ScalaEnumSerializerConfigSnapshot with ScalaEnumSerializerSnapshot when calling EnumValueSerializer#snapshotConfiguration
  • add a migration test EnumValueSerializerSnapshotMigrationTest to test the compatibility
  • remove function EnumValueSerializer#ensureCompatibility()

Verifying this change

This change is already covered by the existing tests EnumValueSerializerTest and EnumValueSerializerUpgradeTest, and it adds a migration test, EnumValueSerializerSnapshotMigrationTest.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (yes)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

@flinkbot
Collaborator

flinkbot commented Feb 18, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ❌ 1. The [description] looks good.
  • ❌ 2. There is [consensus] that the contribution should go into Flink.
  • ❗ 3. Needs [attention] from.
  • ❌ 4. The change fits into the overall [architecture].
  • ❌ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve the 1st aspect (similarly, it also supports the consensus, architecture and quality keywords)
  • @flinkbot approve all to approve all aspects
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval

…ion compatibility abstractions

This commit migrates EnumValueSerializer to use new serialization compatibility abstractions
  * add a new class `ScalaEnumSerializerSnapshot`
  * return a `ScalaEnumSerializerConfigSnapshot` with `ScalaEnumSerializerSnapshot` when calling `EnumValueSerializer#snapshotConfiguration`
  * add a migration test `EnumValueSerializerSnapshotMigrationTest` to test the compatibility
  * remove function `EnumValueSerializer#ensureCompatibility()`
@klion26 klion26 force-pushed the Scala_FLINK11334_EnumValueSerializer_Compatibility branch from 41718d5 to 00e1022 Compare February 18, 2019 12:24
@klion26
Member Author

klion26 commented Feb 18, 2019

@flinkbot attention @tzulitai

@aljoscha aljoscha self-requested a review February 19, 2019 10:07
Contributor

@aljoscha aljoscha left a comment

Thanks for your contribution! Overall, the code is good, but I requested some changes to functionality, which we also need to discuss with @igalshilman.

finally if (inViewWrapper != null) inViewWrapper.close()
}

override def restoreSerializer(): TypeSerializer[E#Value] = {
Contributor

This does not create a serializer but tries to create an instance of the enum (which I think would fail). I'm wondering why this method is never called in the tests.

Member Author

@aljoscha I think the implementation is wrong; it should create an EnumValueSerializer. I'll update it, and I'll dig into why this function was never called.

Member Author

I think the function is called during restore. In the migration test we only restore the serializers from 1.6 and 1.7, so nothing there calls this function. I will add a test that covers it.
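
A test along these lines might restore a serializer from a snapshot and check that a value survives a round trip. The following is only a rough, self-contained sketch: it does not use Flink's classes, and `SketchEnumSerializer`, `SketchSnapshot`, `roundTrip`, and `Weekday` are hypothetical names invented for illustration. For brevity the sketch snapshot keeps the enum object itself, which is a simplification.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object RestoreSketch {
  // Example enumeration used only for this sketch.
  object Weekday extends Enumeration { val Mon, Tue, Wed = Value }

  // Hypothetical, simplified serializer: it writes only the id of the value.
  final class SketchEnumSerializer(val enumObj: Enumeration) {
    def serialize(v: Enumeration#Value, out: DataOutputStream): Unit = out.writeInt(v.id)
    def deserialize(in: DataInputStream): Enumeration#Value = enumObj(in.readInt())
  }

  // Hypothetical snapshot (assumption: it simply holds the enum object).
  final class SketchSnapshot(enumObj: Enumeration) {
    // The point raised in review: return a serializer, not an instance of the enum.
    def restoreSerializer(): SketchEnumSerializer = new SketchEnumSerializer(enumObj)
  }

  // Serialize a value to bytes and read it back with the same serializer.
  def roundTrip(s: SketchEnumSerializer, v: Enumeration#Value): Enumeration#Value = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    s.serialize(v, out)
    out.flush()
    s.deserialize(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray)))
  }
}
```

Calling `restoreSerializer()` on the snapshot and passing the result to `roundTrip` exercises the restore path that the 1.6 and 1.7 migration tests do not reach.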

if (!previousEnumConstant.equals(enumValue.toString)) {
// compatible only if new enum constants are only appended,
// and original constants must be in the exact same order
return TypeSerializerSchemaCompatibility.compatibleAfterMigration()
Contributor

I know that the previous checking code in EnumValueSerializer also returned compatible after migration here but I don't think we do that migration here, so what happens if the index of the value actually changed? Maybe we should be conservative here and disallow any changes to the enum.

@igalshilman, what do you think about this?

Member Author

If the index of a value actually changed, the result should be incompatible, because EnumValueSerializer#serialize writes Enumeration#Value.id. I'll update it.
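
The reasoning in this thread can be illustrated with a small, self-contained sketch. It does not use Flink's classes: `Compat`, `snapshot`, `resolve`, and the example enumerations are hypothetical names invented here. It assumes, as stated above, that only `Value.id` is written, so previously written data stays readable only if every recorded constant name still sits at the same id.

```scala
object EnumCompatSketch {
  // Hypothetical stand-ins for a compatibility result (not Flink's API).
  sealed trait Compat
  case object CompatibleAsIs extends Compat
  case object Incompatible extends Compat

  // Enum as it might have looked when the snapshot was written.
  object ColorV1 extends Enumeration { val Red, Green, Blue = Value }
  // One constant appended: ids 0..2 keep their meaning.
  object ColorAppended extends Enumeration { val Red, Green, Blue, Yellow = Value }
  // Constants reordered: id 0 now means Green instead of Red.
  object ColorReordered extends Enumeration { val Green, Red, Blue = Value }
  // One constant removed: id 2 no longer exists.
  object ColorRemoved extends Enumeration { val Red, Green = Value }

  // Record (id, name) pairs, since the id is what ends up in serialized data.
  def snapshot(e: Enumeration): List[(Int, String)] =
    e.values.toList.map(v => (v.id, v.toString))

  // Compatible only if every previously recorded id still maps to the same name.
  def resolve(previous: List[(Int, String)], current: Enumeration): Compat = {
    val currentNames: Map[Int, String] = snapshot(current).toMap
    val unchanged = previous.forall { case (id, name) => currentNames.get(id).contains(name) }
    if (unchanged) CompatibleAsIs else Incompatible
  }
}
```

In this sketch, appended constants are accepted while reordered or removed constants are rejected. Whether appending should be allowed at all is the open question in the review; a more conservative variant would also require the two lists to have the same length.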

readVersion: Int, in: DataInputView, userCodeClassLoader: ClassLoader): Unit = {
val inViewWrapper = new DataInputViewStream(in)
try {
if (readVersion == 1) {
Contributor

I don't think this path can ever be taken, because we never try to restore the old TypeSerializerConfigSnapshot in here.

@igalshilman Could you confirm?

}

object ScalaEnumSerializerSnapshot {
val VERSION = 3
Contributor

To be in line with the other newly added TypeSerializerSnapshots, this should maybe be version 2. But I see that version 3 could also be valid since the old TypeSerializerConfigSnapshot for enums was already at version 2.

@igalshilman Sorry for bothering again, but what do you think?

@@ -100,15 +100,15 @@ class EnumValueSerializerUpgradeTest extends TestLogger with JUnitSuiteLike {
*/
@Test
def checkRemovedField(): Unit = {
- assertTrue(checkCompatibility(enumA, enumC).isIncompatible)
+ assertTrue(checkCompatibility(enumA, enumC).isCompatibleAfterMigration)
Contributor

As mentioned above, this should probably be incompatible.

}

/**
* Check that changing the enum field order requires migration
*/
@Test
def checkDifferentFieldOrder(): Unit = {
- assertTrue(checkCompatibility(enumA, enumD).isIncompatible)
+ assertTrue(checkCompatibility(enumA, enumD).isCompatibleAfterMigration)
Contributor

As mentioned above, this should probably be incompatible.

@aljoscha
Contributor

Please don't change anything right now. I'm preparing a set of changes on your PR.

@klion26
Member Author

klion26 commented Feb 19, 2019

Got it.

@aljoscha
Contributor

@klion26 Thanks for your contribution again! 😄 It turns out that the enum serializers are quite tricky. @igalshilman and I spent some time creating an updated PR that includes your commit: #7766
