Skip to content

[FLINK-12064] [core, State Backends] RocksDBKeyedStateBackend snapshot uses incorrect key serializer if reconfigure happens during restore#8076

Closed
carp84 wants to merge 1 commit intoapache:masterfrom
carp84:flink-12064
Closed

Conversation

@carp84
Copy link
Member

@carp84 carp84 commented Mar 29, 2019

What is the purpose of the change

This PR fixes the issue of RocksDBKeyedStateBackend snapshot using incorrect key serializer if reconfigured during restore in RocksDBKeyedStateBackendBuilder, which is a regression caused by FLINK-10043. We also reenforce the UT cases to cover the issue scenario.

Brief change log

Remove keySerializer from AbstractKeyedStateBackendBuilder and use keySerializerProvider.currentSchemaSerializer() instead in RocksDBKeyedStateBackendBuilder. For the test case part, perform an additional snapshot after state migration.

Verifying this change

This change added tests and can be verified as follows:

  • Reenforced StateBackendMigrationTestBase, adding an additional snapshot after state migration to verify the snapshot part.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (yes)
    • This change assures to use correct serializer in snapshot
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
    • This change prevents possible corruption of snapshot data
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

…correct key serializer if reconfigure happens during restore
@flinkbot
Copy link
Collaborator

flinkbot commented Mar 29, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ✅ 1. The [description] looks good.
  • ✅ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ✅ 4. The change fits into the overall [architecture].
  • ✅ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@tzulitai
Copy link
Contributor

Just to clarify:
the problem is that the state backend uses the correct key serializer for state access, but the wrong one is snapshotted in checkpoints, is that correct?

Otherwise, the title / description implies that it is using the wrong key serializer for runtime state access.

@carp84 carp84 changed the title [FLINK-12064] [core, State Backends] RocksDBKeyedStateBackend uses incorrect key serializer if reconfigure happens during restore [FLINK-12064] [core, State Backends] RocksDBKeyedStateBackend snapshot uses incorrect key serializer if reconfigure happens during restore Mar 30, 2019
@tzulitai
Copy link
Contributor

tzulitai commented Apr 1, 2019

Thanks for the fix @carp84.

I've verified that the modified tests fail without the fixes to the key serializer access.
+1 to merge, will proceed to merge this to release-1.8 and master.

@tzulitai
Copy link
Contributor

tzulitai commented Apr 1, 2019

@flinkbot approve all

@asfgit asfgit closed this in c5e4acc Apr 1, 2019
asfgit pushed a commit that referenced this pull request Apr 1, 2019
…ts uses incorrect key serializer if reconfigure happens during restore

This closes #8076.
@carp84
Copy link
Member Author

carp84 commented Apr 1, 2019

Thanks for the review! @tzulitai

HuangZhenQiu pushed a commit to HuangZhenQiu/flink that referenced this pull request Apr 22, 2019
…ts uses incorrect key serializer if reconfigure happens during restore

This closes apache#8076.
sunhaibotb pushed a commit to sunhaibotb/flink that referenced this pull request May 8, 2019
…ts uses incorrect key serializer if reconfigure happens during restore

This closes apache#8076.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants