[FLINK-13027][streaming]: StreamingFileSink bulk-encoded writer supports customized checkpoint policy #10653
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
Automated Checks
Last check on commit b36738f (Sat Dec 21 00:12:28 UTC 2019)
Warnings:
Mention the bot in a comment to re-run the automated checks.
Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:
@flinkbot run travis
Hi @yxu-valleytider, thanks for the PR. I am on holiday, so I do not think I will have time to give it a thorough review, but from a first look, I think that the […] In addition, given that this PR introduces a new feature, you should also update the documentation section of the […] What do you think?
@kl0u thanks for the quick comments. Yes, I agree the main intention of the PR is to allow […] As for the documentation, there's a significant amount of changes in master. Compared with the stable version, there is no mention of wording such as:
@yxu-valleytider I think this is an improvement compared to what we have now. As I said, I am on holiday and will not be able to make a thorough pass on the PR, but I hope that someone else will, or I will do it when I am back.
@kl0u not an issue. It is not urgent, so please take your time.
Looks good, thanks for the contribution!
@yxu-valleytider are you going to submit the documentation changes as a follow-up PR?
@tweise Thanks for the review. I can follow up with a separate PR addressing the documentation issue.
Created FLINK-15476
…ed checkpoint policy
4da3119 to f82f5eb
What is the purpose of the change
This PR allows the bulk-encoded StreamingFileSink to be instantiated from a generic family of rolling policies that roll files at checkpoint time. A base CheckpointRollingPolicy class is defined, which is extended by the existing OnCheckpointRollingPolicy and by a new rolling policy, FSizeCheckpointRollingPolicy. The latter rolls a file not only at checkpoint time but possibly also before the file size reaches a certain limit, which is useful for preventing files from growing too big. The recurrent builder pattern described in [1] and [2] is used to instantiate the rolling policies where appropriate, which also makes each individual rolling policy extensible.
Brief change log
CheckpointRollingPolicy
FSizeCheckpointRollingPolicy
StreamingFileSink
CheckpointRollingPolicy
during instantiation. OnCheckpointRollingPolicy is still the default, but won't be the only option.
Verifying this change
This change is an interface change and is already covered by existing tests, such as BulkWriterTest. A new test case for the new FSizeCheckpointRollingPolicy has also been added to the RollingPolicyTest.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes) StreamingFileSink.
Documentation
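As an illustration of the policy family outlined under "What is the purpose of the change", here is a minimal standalone Java sketch. It is not the code in this PR and does not use Flink's classes: the types ending in Sketch, the PartFileInfo interface, the withMaxPartSize method, the 128 MiB default, and the choice to roll once the size limit is reached are all assumptions made for illustration.

```java
// Standalone sketch; none of these types are Flink classes.

/** Hypothetical view of an in-progress part file. */
interface PartFileInfo {
    long getSize(); // bytes written so far
}

/** Base policy: every subclass rolls at checkpoint time. */
abstract class CheckpointRollingPolicySketch {

    /** Fixed by the base class, so no subclass can opt out of rolling on checkpoint. */
    final boolean shouldRollOnCheckpoint(PartFileInfo partFile) {
        return true;
    }

    /** Subclasses decide whether to roll earlier, when a record is written. */
    abstract boolean shouldRollOnEvent(PartFileInfo partFile);

    /** Recurrent builder: T is the concrete builder type, so setters can return it. */
    abstract static class PolicyBuilder<
            P extends CheckpointRollingPolicySketch, T extends PolicyBuilder<P, T>> {
        @SuppressWarnings("unchecked")
        T self() {
            return (T) this;
        }

        abstract P build();
    }
}

/** Rolls on checkpoint only. */
final class OnCheckpointPolicySketch extends CheckpointRollingPolicySketch {
    @Override
    boolean shouldRollOnEvent(PartFileInfo partFile) {
        return false;
    }
}

/** Rolls on checkpoint, and also once the part file reaches maxPartSize bytes (assumed rule). */
class SizeCheckpointPolicySketch extends CheckpointRollingPolicySketch {
    private final long maxPartSize;

    SizeCheckpointPolicySketch(long maxPartSize) {
        this.maxPartSize = maxPartSize;
    }

    @Override
    boolean shouldRollOnEvent(PartFileInfo partFile) {
        return partFile.getSize() >= maxPartSize;
    }

    static DefaultBuilder builder() {
        return new DefaultBuilder();
    }

    /** Extensible builder; a subclass binds T to itself to keep fluent chaining. */
    static class Builder<T extends Builder<T>>
            extends CheckpointRollingPolicySketch.PolicyBuilder<SizeCheckpointPolicySketch, T> {
        private long maxPartSize = 128L * 1024L * 1024L; // assumed default: 128 MiB

        T withMaxPartSize(long bytes) {
            this.maxPartSize = bytes;
            return self();
        }

        @Override
        SizeCheckpointPolicySketch build() {
            return new SizeCheckpointPolicySketch(maxPartSize);
        }
    }

    /** Concrete builder that closes the recursion. */
    static final class DefaultBuilder extends Builder<DefaultBuilder> {}
}
```

In this sketch, a call such as SizeCheckpointPolicySketch.builder().withMaxPartSize(1024L).build() yields a policy that always rolls on checkpoint and additionally rolls when a part file holds at least 1024 bytes. Making shouldRollOnCheckpoint final in the base class is one way to guarantee the checkpoint-time roll that bulk-encoded writers depend on, while still leaving the event-time decision open to subclasses.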