[FLINK-34210][Runtime/Checkpointing] Fix DefaultExecutionGraphBuilder#isCheckpointingEnabled return wrong value when checkpoint disabled #24173
Thanks for the PR.
LGTM.
@flinkbot run azure
…#isCheckpointingEnabled return wrong value when checkpoint disabled
@masteryhx
…eckpointCoordinatorConfiguration.DISABLED_CHECKPOINT_INTERVAL when enableCheckpointing
@mayuehappy I noticed that the creation of the CheckpointCoordinator currently depends on the isCheckpointingEnabled condition. This implies that if we disable checkpointing, the entire coordinator won't get created. Are we certain this won't pose a problem? For instance, could this also prevent triggering savepoints? cc @masteryhx
This PR is currently approved, but it skips creating the checkpoint coordinator when checkpoint scheduling is disabled, so we should double-check that behavior first. I'm therefore changing the status to "request changes" until this is clarified.
@JunRuiLee Thanks for joining the discussion.
Yes, actually I think that's also why we made this change. Flink does not need components such as CheckpointCoordinator or CheckpointIDCounter in many scenarios, such as OLAP or Flink batch. In these scenarios, I believe it is reasonable not to create a CheckpointCoordinator.
Yes. With this change, a savepoint cannot be triggered once checkpointing is disabled. This is indeed worth discussing, though. In my opinion, if a user disables checkpointing, doesn't that mean the user can accept state loss? I can't imagine a scenario where users need savepoints but not checkpoints. So I think it's fine as is, or else we need to notify the user that savepoints can only be triggered while checkpointing is enabled.
@mayuehappy The key point is that if users run the job in streaming execution mode and set the checkpoint interval to Long.MAX_VALUE or to a small value less than 10, savepoints will not be supported at all.
In my opinion, disabling checkpoints does not necessarily equate to disabling savepoints or other functionalities of the Checkpoint Coordinator, such as restoring from state. Some users may prefer to avoid the automatic triggering of checkpoints, yet still retain the option to initiate savepoints manually when needed.
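To make the distinction above concrete, here is a minimal sketch of a coordinator that is always created, where only the periodic trigger depends on the checkpointing flag. `CoordinatorSketch`, `Coordinator`, `onTimerTick`, and `triggerSavepoint` are hypothetical names invented for illustration; this is not Flink's CheckpointCoordinator API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; not Flink's actual classes.
public class CoordinatorSketch {

    static final class Coordinator {
        private final boolean periodicEnabled;
        private final List<String> triggered = new ArrayList<>();

        Coordinator(boolean periodicEnabled) {
            this.periodicEnabled = periodicEnabled;
        }

        // Called by a (hypothetical) timer; does nothing when periodic checkpoints are off.
        void onTimerTick() {
            if (periodicEnabled) {
                triggered.add("checkpoint");
            }
        }

        // Manual savepoints stay available regardless of the periodic flag.
        void triggerSavepoint(String targetDir) {
            triggered.add("savepoint:" + targetDir);
        }

        List<String> triggered() {
            return triggered;
        }
    }

    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator(false); // periodic checkpoints disabled
        coordinator.onTimerTick();                        // no checkpoint is recorded
        coordinator.triggerSavepoint("/tmp/sp");          // savepoint is still recorded
        System.out.println(coordinator.triggered());
    }
}
```

In this model, skipping the coordinator entirely when checkpointing is disabled would also remove the `triggerSavepoint` path, which is the concern raised in this thread.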
@JunRuiLee @WencongLiu Thanks for replying.
Maybe that makes sense, although I haven't used it this way.
Thanks @mayuehappy. In adaptive batch scenarios, creating a CheckpointCoordinator is definitely not necessary. In other scenarios, we may need to sort out carefully whether the CheckpointCoordinator is needed. Although Flink's execution modes are currently limited to streaming and batch, the boundary between these two modes may become more blurred in the future as unified stream and batch processing progresses. Against this background, retaining the CheckpointCoordinator at least has no downside.
Thanks @JunRuiLee and @WencongLiu for pointing this out.
From my side, not supporting savepoints when checkpointing is disabled is indeed a breaking change. Usually, jobs without checkpoints don't need savepoints either, but users or platforms may already rely on the stop-with-savepoint feature to stop a Flink job gracefully, whether or not checkpointing is enabled. It's worth further discussion, but maybe a FLIP is needed.
What is the purpose of the change
Fix DefaultExecutionGraphBuilder#isCheckpointingEnabled returning the wrong value when checkpointing is disabled.
Brief change log
Reuse the result of jobGraph.isCheckpointingEnabled() when calling DefaultExecutionGraphBuilder#isCheckpointingEnabled.
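The change can be illustrated with a minimal, self-contained sketch. Everything below is a hypothetical model, not Flink's actual code: `JobGraphModel`, `builderCheckBefore`, and `builderCheckAfter` are invented names, and the assumptions that the old check only tested whether checkpointing settings were present and that the disabled interval equals `Long.MAX_VALUE` are inferred from this thread rather than confirmed against the source.

```java
// Hypothetical stand-ins for illustration; not Flink's actual classes.
public class CheckpointEnabledSketch {

    // Assumed sentinel meaning "periodic checkpointing is disabled".
    static final long DISABLED_CHECKPOINT_INTERVAL = Long.MAX_VALUE;

    /** Minimal job graph model: settings may be absent, or present with an interval. */
    static final class JobGraphModel {
        private final Long checkpointInterval; // null means no checkpointing settings at all

        JobGraphModel(Long checkpointInterval) {
            this.checkpointInterval = checkpointInterval;
        }

        boolean hasCheckpointingSettings() {
            return checkpointInterval != null;
        }

        // Enabled only for a positive interval below the disabled sentinel.
        boolean isCheckpointingEnabled() {
            return checkpointInterval != null
                    && checkpointInterval > 0
                    && checkpointInterval < DISABLED_CHECKPOINT_INTERVAL;
        }
    }

    // Assumed old behavior: any settings object counts as "enabled".
    static boolean builderCheckBefore(JobGraphModel graph) {
        return graph.hasCheckpointingSettings();
    }

    // Behavior described in the change log: reuse the job graph's own answer.
    static boolean builderCheckAfter(JobGraphModel graph) {
        return graph.isCheckpointingEnabled();
    }

    public static void main(String[] args) {
        JobGraphModel disabled = new JobGraphModel(DISABLED_CHECKPOINT_INTERVAL);
        System.out.println("before: " + builderCheckBefore(disabled));
        System.out.println("after:  " + builderCheckAfter(disabled));
    }
}
```

Under these assumptions, a job whose settings exist but carry the disabled interval is reported as enabled by the old check and as disabled by the new one.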
Verifying this change
This change is a trivial rework.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (no)
Documentation