[FLINK-21505] Enforce common/unified savepoint format at operator level #14982

Closed


aljoscha
Contributor

@aljoscha aljoscha commented Feb 22, 2021

Before, we were relying on the fact that all keyed backends would use the same strategy for savepoints.

Now, we're forcing them at the API level to provide a SavepointResources that we can then use to create a unified savepoint using SavepointSnapshotStrategy.
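
To make the idea concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the code in this PR: only the names SavepointResources and SavepointSnapshotStrategy come from the description, while their shapes here, the KeyedBackend interface, the two toy backends, and the takeSavepoint helper are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class UnifiedSavepointSketch {

    /** Hypothetical shape: a backend-independent view of the keyed state to write. */
    interface SavepointResources {
        Map<String, String> entries();
    }

    /** Hypothetical contract: every keyed backend must hand out SavepointResources. */
    interface KeyedBackend {
        SavepointResources savepoint();
    }

    /** Hypothetical shape: the single place that decides the savepoint format. */
    static final class SavepointSnapshotStrategy {
        List<String> snapshot(SavepointResources resources) {
            List<String> out = new ArrayList<>();
            // Sort by key so the output does not depend on a backend's internal ordering.
            Map<String, String> sorted = new TreeMap<String, String>(resources.entries());
            for (Map.Entry<String, String> e : sorted.entrySet()) {
                out.add(e.getKey() + "=" + e.getValue());
            }
            return out;
        }
    }

    /** Toy backend that keeps its state in a hash map. */
    static final class MapBackend implements KeyedBackend {
        private final Map<String, String> state = new HashMap<>();

        MapBackend() {
            state.put("k2", "2");
            state.put("k1", "1");
        }

        @Override
        public SavepointResources savepoint() {
            return () -> new HashMap<>(state);
        }
    }

    /** Toy backend that keeps its state as a flat list of key/value pairs. */
    static final class PairListBackend implements KeyedBackend {
        private final List<String[]> pairs = new ArrayList<>();

        PairListBackend() {
            pairs.add(new String[] {"k1", "1"});
            pairs.add(new String[] {"k2", "2"});
        }

        @Override
        public SavepointResources savepoint() {
            return () -> {
                Map<String, String> view = new HashMap<>();
                for (String[] pair : pairs) {
                    view.put(pair[0], pair[1]);
                }
                return view;
            };
        }
    }

    /** The caller never asks a backend to write a savepoint itself; it only asks for resources. */
    static List<String> takeSavepoint(KeyedBackend backend) {
        return new SavepointSnapshotStrategy().snapshot(backend.savepoint());
    }

    public static void main(String[] args) {
        System.out.println(takeSavepoint(new MapBackend()));
        System.out.println(takeSavepoint(new PairListBackend()));
    }
}
```

Both toy backends store their state differently, but because the format is produced in one place, both calls print the same list. That is the property the description is after: the format is enforced by the interface instead of by convention.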

Verifying this change

This is a refactoring and is covered by existing tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@aljoscha aljoscha marked this pull request as draft February 22, 2021 10:27
@flinkbot
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 69e3b6f (Mon Feb 22 10:29:32 UTC 2021)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!
  • Invalid pull request title: No valid Jira ID provided

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Feb 22, 2021

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@dawidwys
Contributor

Generally speaking I like the idea. I think it is a nice way to ensure all state backends produce a common unified format.

I will go over the PR and do a proper review.

"Asynchronous full Savepoint",
savepointSnapshotStrategy,
closeableRegistry,
ASYNCHRONOUS);

Do we want to hardcode ASYNCHRONOUS here?

@aljoscha aljoscha force-pushed the flink-xxx-savepoints-at-operator-level branch from da7d208 to 99de033 Compare February 24, 2021 15:37
@aljoscha aljoscha force-pushed the flink-xxx-savepoints-at-operator-level branch from 56576b4 to 1738289 Compare February 25, 2021 15:04
@aljoscha aljoscha changed the title [FLINK-XXXXX] Enfore common/unified savepoint format at operator level [FLINK-21505] Enfore common/unified savepoint format at operator level Feb 25, 2021
@aljoscha aljoscha marked this pull request as ready for review February 25, 2021 15:07
@aljoscha aljoscha changed the title [FLINK-21505] Enfore common/unified savepoint format at operator level [FLINK-21505] Enforce common/unified savepoint format at operator level Feb 25, 2021
@aljoscha aljoscha force-pushed the flink-xxx-savepoints-at-operator-level branch from 1738289 to 912c9b4 Compare February 25, 2021 15:30
… level

Before, we were relying on the fact that all keyed backends would use
the same strategy for savepoints.

Now, we're forcing them at the API level to provide a SavepointResources
that we can then use to create a unified savepoint using
SavepointSnapshotStrategy.
Contributor

@dawidwys dawidwys left a comment


Good job! I really like the flow now!

@aljoscha
Contributor Author

@flinkbot run azure

@aljoscha
Contributor Author

@flinkbot run azure

@dawidwys
Copy link
Contributor

I think the problem with azure is that it is not rebased on the fixed master. I have rebased the PR on current master and I am running it on my private azure: https://dev.azure.com/wysakowiczdawid/Flink/_build/results?buildId=707&view=results

@dawidwys dawidwys closed this in 7f09235 Mar 1, 2021