[FLINK-4754] [checkpoints] Make number of retained checkpoints user configurable #3374
Hi @tony810430, thank you for the pull request! My feeling, though, is that the number of checkpoints to retain belongs in the configuration of the JobManager rather than in the program's snapshot settings. Think of it like this: there are often two roles, developer and ops.
Having multiple retained checkpoints is something that concerns the ops person more. What do you think? |
Hi @StephanEwen, thanks for your feedback. I totally agree with your opinion. I will make this setting be configured in Besides, I have some questions about the following implementations.
Looking forward to your opinion. Thank you. |
I think having it only in the configuration is probably fine. I think we do not need both paths here. It would be nice to have a "configuration validator" early, but we do not have something like that currently. |
Hi @StephanEwen, thanks for your comment. I have made some changes to this PR and would appreciate it if you have time to review it. |
Looks good in general. I would suggest two improvements:
|
Hi @StephanEwen, thank you for the review. I have rebased and made the improvements you suggested. |
Looks good, thanks. |
@tony810430 I am adding small followups. Most notably, I renamed the config parameter to Please comment if you have objections. |
@StephanEwen I think |
@StephanEwen |
…onfigurable This closes apache#3374
I added CheckpointConfig.setMaxNumberOfCheckpointsToRetain to expose the configuration for the number of retained checkpoints to the user, and updated the constructor of JobSnapshottingSettings to pass the value to CheckpointRecoveryFactory.
However, I did not make this value lazy in the checkpoint store implementations.
Making it lazy would be useful if the value needs to be reconfigured while the job is running, but I think that should be a separate issue.
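
To illustrate what a fixed (non-lazy) retention limit means for a checkpoint store, here is a minimal sketch. It is not the code from this PR and not Flink's actual store: the class RetainedCheckpointStore and its methods are hypothetical, and checkpoints are reduced to plain long ids. The limit is read once in the constructor, so changing it would require creating a new store.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration only; not Flink's checkpoint store implementation. */
class RetainedCheckpointStore {
    // Fixed at construction time, i.e. not re-read while the job is running.
    private final int maxNumberOfCheckpointsToRetain;
    // Oldest checkpoint id at the head, newest at the tail.
    private final Deque<Long> checkpointIds = new ArrayDeque<>();

    RetainedCheckpointStore(int maxNumberOfCheckpointsToRetain) {
        if (maxNumberOfCheckpointsToRetain < 1) {
            throw new IllegalArgumentException("Must retain at least one checkpoint.");
        }
        this.maxNumberOfCheckpointsToRetain = maxNumberOfCheckpointsToRetain;
    }

    /** Adds a completed checkpoint and returns the ids that were dropped to honor the limit. */
    List<Long> add(long checkpointId) {
        checkpointIds.addLast(checkpointId);
        List<Long> dropped = new ArrayList<>();
        while (checkpointIds.size() > maxNumberOfCheckpointsToRetain) {
            dropped.add(checkpointIds.removeFirst());
        }
        return dropped;
    }

    /** Returns a copy of the retained checkpoint ids, oldest first. */
    List<Long> retainedIds() {
        return new ArrayList<>(checkpointIds);
    }
}
```

With a limit of 2, adding checkpoints 1, 2, and 3 in order leaves 2 and 3 retained and reports 1 as dropped.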