Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-9609] [connectors] Add bucket ready mechanism for BucketingSin… #6375

Closed
wants to merge 1 commit into from

Conversation

rice668
Copy link

@rice668 rice668 commented Jul 19, 2018

What is the purpose of the change

Currently, BucketingSink only support notifyCheckpointComplete. However, users want to do some extra work when a bucket is ready. It would be nice if we can support BucketReady mechanism for users or we can tell users when a bucket is ready for use. For example, One bucket is created for every 5 minutes, at the end of 5 minutes before creating the next bucket, the user might need to do something as the previous bucket ready, like sending the timestamp of the bucket ready time to a server or do some other stuff.

Here, Bucket ready means all the part files suffix name under a bucket neither .pending nor .in-progress. Then we can think this bucket is ready for user use. Like a watermark means no elements with a timestamp older or equal to the watermark timestamp should arrive at the window. We can also refer to the concept of watermark here, or we can call this BucketWatermark if we could.

Brief change log

Add an interface BucketReady .

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (not documented)

}
}

@Override
public boolean isBucketReady(Set<Path> bucketPathes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems this implementation of the method has logic problem? shall we return true; after iterating the for loop and remove the return true; above the catch?

Is the purpose of the method for checking all the bucketPathes never end with pending or inProgress suffix?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we finish the for loop and haven't returned true yet, then return false. I think this is reasonable. If a file end with pending or inProgress, This means that the file is not ready. and we can not use it.

@rice668
Copy link
Author

rice668 commented Jul 20, 2018

Hi, @tillrohrmann Could you please take a look on this PR ? Actually, I don't know which committer is more familiar with BucketingSink. Thank you so much.

return false;
}
}
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean this return statement, can not verify all the bucket path is ready, right? because the loop is not finished.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh. Oops! Yes, you are very right! :)

@zentol
Copy link
Contributor

zentol commented Nov 11, 2022

The BucketingSink has been removed.

@zentol zentol closed this Nov 11, 2022
@yanghua
Copy link
Contributor

yanghua commented Nov 12, 2022

The BucketingSink has been removed.

The lazy review of the community wastes the huge enthusiasm and energy of early contributors. This is the reason for the decline or the homogenization of the community.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants