[FLINK-33057][Checkpointing] Add options to disable creating job-id subdirectories under the checkpoint directory #23509

Closed
wants to merge 1 commit into from

Contributor

@Zakelly Zakelly commented Oct 11, 2023

What is the purpose of the change

By default, Flink creates a subdirectory named by UUID (the job ID) under the checkpoint directory for each job. This is a good way to avoid collisions. However, it also adds effort: users must remember or find the right directory when recovering from a previous checkpoint, and must clean up old subdirectories. Based on previous discussions (Yun Tang's and Stephan Ewen's), I think it would be useful to add an option to disable creating the UUID subdirectories under the checkpoint directory. For compatibility, the subdirectories are still created by default.

Brief change log

  • Provide a new configuration option, 'state.checkpoints.create-subdir', that controls whether job-id subdirectories are created under the checkpoint directory. This is an advanced option; its documentation includes a warning.
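
To illustrate the intended behavior, here is a minimal sketch of how a checkpoint directory could be resolved with and without the job-id subdirectory. This is not Flink's implementation: the class `CheckpointDirSketch`, the helper `resolveCheckpointDir`, the base path, and the job ID string are all hypothetical and exist only for this example. Only the option name 'state.checkpoints.create-subdir' comes from this PR.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class CheckpointDirSketch {

    /**
     * Hypothetical helper (not part of Flink). Mirrors the behavior described
     * for 'state.checkpoints.create-subdir': when the flag is true, checkpoints
     * go into a per-job subdirectory; when false, they go directly into the
     * base checkpoint directory.
     */
    static Path resolveCheckpointDir(Path baseDir, String jobId, boolean createSubdir) {
        return createSubdir ? baseDir.resolve(jobId) : baseDir;
    }

    public static void main(String[] args) {
        // Both values are placeholders chosen for illustration.
        Path base = Paths.get("/flink/checkpoints");
        String jobId = "a1b2c3";

        // Default behavior (flag true): a job-id subdirectory is used.
        System.out.println(resolveCheckpointDir(base, jobId, true));

        // Flag false: the base directory is used as-is.
        System.out.println(resolveCheckpointDir(base, jobId, false));
    }
}
```

With the subdirectory disabled, two jobs configured with the same base directory would resolve to the same path in this sketch, which is the collision the default layout avoids.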

Verifying this change

This change is covered by a newly added parameterized test, AbstractFileCheckpointStorageAccessTestBase#testCreateCheckpointSubDirs.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

Collaborator

flinkbot commented Oct 11, 2023

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

Contributor Author

Zakelly commented Oct 31, 2023

@masteryhx @Myasuka What do you think about adding this option?

@masteryhx masteryhx self-requested a review November 6, 2023 09:03
Contributor

@masteryhx masteryhx left a comment

Thanks for the PR, PTAL my comments.
Could you also add an IT that tests switching this option? That matters for users who want to use it out of the box.

@Myasuka Myasuka self-requested a review November 9, 2023 13:17
Member

@Myasuka Myasuka left a comment

Thanks for creating this PR, please take a look at my comments.

Contributor Author

Zakelly commented Dec 13, 2023

@masteryhx @Myasuka Thanks for your review! I have addressed your comments. Would you please take another look? Thanks!

Contributor

@masteryhx masteryhx left a comment

Thanks for the update.
The current PR LGTM % the conflict.
Could you rebase all commits onto master to resolve the conflict?
BTW, please remember to add the component tag to the title.

…ubdirectories under the checkpoint directory
@Zakelly Zakelly changed the title from "[FLINK-33057] Add options to disable creating job-id subdirectories under the checkpoint directory" to "[FLINK-33057][Checkpointing] Add options to disable creating job-id subdirectories under the checkpoint directory" Jan 14, 2024
Contributor Author

Zakelly commented Jan 14, 2024

> Thanks for the update. Current PR LGTM % the conflict. Could you rebase all commits into master to resolve the conflict ? BTW, please remember to rename the title with component tag.

Already done, thanks @masteryhx

Contributor

@masteryhx masteryhx left a comment

LGTM.

masteryhx pushed a commit that referenced this pull request Jan 15, 2024
@masteryhx masteryhx closed this Jan 15, 2024
4 participants