pubsublite: Reduce commit logspam #22762
Assigning reviewers. If you would like to opt out of this review, comment R: @kennknowles for label java. Available commands:
The PR bot will only process comments in the main thread (not review comments).
Thanks for the work. Would you mind providing more context in the description, or is there a linked issue? What kind of "log spam" do we currently have?
There is no linked issue. Whenever Dataflow scales up or down, it closes the reader while there are still outstanding CheckpointMarks; those are then committed asynchronously and fail, logging a bunch of warnings. CheckpointMark has no lifecycle methods, so the only way to ensure the committer lives long enough for all the CheckpointMarks to be used (while also not requiring that they are used) is with finalize.
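To make the idea above concrete, here is a minimal sketch of a shared holder whose finalize() closes the committer only once no checkpoint mark references it any more. This is not the code in this PR: `SketchCommitter`, `SharedCommitter`, `SketchMark`, and `closeOnce` are hypothetical names chosen for illustration, and the real Pub/Sub Lite committer has a different interface.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the real committer; not the Pub/Sub Lite API.
interface SketchCommitter {
  void commit(long offset);

  void close();
}

// One holder is shared by every checkpoint mark produced by a reader.
// It becomes unreachable only after all marks that reference it are gone.
final class SharedCommitter {
  private final SketchCommitter committer;
  private final AtomicBoolean closed = new AtomicBoolean(false);

  SharedCommitter(SketchCommitter committer) {
    this.committer = committer;
  }

  void commit(long offset) {
    if (closed.get()) {
      throw new IllegalStateException("committer already closed");
    }
    committer.commit(offset);
  }

  // Idempotent: the underlying committer is closed at most once.
  void closeOnce() {
    if (closed.compareAndSet(false, true)) {
      committer.close();
    }
  }

  // Runs (at some unspecified time) after the holder becomes unreachable.
  // finalize() has been deprecated since Java 9; used here only to mirror the discussion.
  @Override
  @SuppressWarnings({"deprecation", "removal"})
  protected void finalize() {
    closeOnce();
  }
}

// Hypothetical checkpoint mark: it has no close hook of its own, it only
// keeps the shared holder strongly reachable for as long as it lives.
final class SketchMark {
  private final SharedCommitter shared;
  private final long offset;

  SketchMark(SharedCommitter shared, long offset) {
    this.shared = shared;
    this.offset = offset;
  }

  void finalizeCheckpoint() {
    shared.commit(offset);
  }
}
```

Under these assumptions, a reader that is closed during scaling would drop its own reference to the holder instead of closing the committer eagerly. Marks that are finalized later can still commit, and marks that are never used simply become garbage. When the garbage collector actually runs finalize() is not deterministic.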
Reminder, please take a look at this PR: @kennknowles @Abacn
...-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsublite/internal/Refcounted.java
Assigning new set of reviewers because the PR has gone too long without review. If you would like to opt out of this review, comment R: @robertwb for label java. Available commands:
waiting on author
Applied the minimal suggestion. Please merge if there's nothing else.
…should substantially reduce logspam.
b739711 to eb45a9a
Run Java PostCommit
This is somewhat beyond my understanding. I'd prefer not to use a deprecated method if possible, but for the release I believe this is OK. Was there a reason you can't simply add a finalize() method to CheckpointMarkImpl directly, avoiding this indirection?
LGTM (non-committer).
Ideally we could put a TODO to remember to refactor it to use java.lang.ref.Cleaner instead of finalize() if we ever move to JDK 9+.
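For reference, a rough sketch of what a java.lang.ref.Cleaner-based variant could look like on JDK 9+. The class name `CleanedCommitterHandle` and the `Runnable` used to close the committer are assumptions for illustration, not part of this PR.

```java
import java.lang.ref.Cleaner;

// Hypothetical handle shared by all checkpoint marks (requires JDK 9+).
final class CleanedCommitterHandle {
  private static final Cleaner CLEANER = Cleaner.create();

  // The cleaning action must not hold a reference to the handle itself;
  // otherwise the handle could never become phantom reachable.
  private static final class CloseAction implements Runnable {
    private final Runnable closeCommitter;

    CloseAction(Runnable closeCommitter) {
      this.closeCommitter = closeCommitter;
    }

    @Override
    public void run() {
      closeCommitter.run();
    }
  }

  private final Cleaner.Cleanable cleanable;

  CleanedCommitterHandle(Runnable closeCommitter) {
    // The action runs once the handle is unreachable, or on an explicit clean().
    this.cleanable = CLEANER.register(this, new CloseAction(closeCommitter));
  }

  // Optional eager close; Cleanable.clean() invokes the action at most once.
  void closeNow() {
    cleanable.clean();
  }
}
```

As with finalize(), each checkpoint mark would hold a strong reference to the handle, so the close action fires only after the last mark is gone. Unlike finalize(), the action cannot resurrect the handle, and an explicit clean() remains available.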
Yes, we only want to close when no CheckpointMarks exist that are using this, not when each one goes out of scope.
* Only close committers after all CheckpointMarks have gone away. This should substantially reduce logspam.