issue #2879 : let bookie quit if journal thread exit #2887
We don't need a new configuration for this. This kind of thing should be a hidden implementation detail; there is no benefit to allowing users to configure an interval. Another approach would be something like passing a single CountDownLatch to each journal that can be waited on from the bookie thread. Also note that any dependency changes need to be added to the Gradle build too, or it won't compile.
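The latch idea suggested above could look roughly like the following. This is a minimal sketch, not the code from this PR: `FakeJournal`, `JournalLatchSketch`, and `demo()` are hypothetical names, and the only real API used is `java.util.concurrent.CountDownLatch`. Every journal shares one latch with a count of 1 and counts it down in a `finally` block, so the waiting thread wakes up as soon as any journal thread exits, with no polling interval to configure.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: these classes are stand-ins, not BookKeeper's Journal or BookieImpl.
final class JournalLatchSketch {

    /** Stand-in for a journal thread that signals the shared latch when it exits for any reason. */
    static final class FakeJournal extends Thread {
        private final CountDownLatch anyJournalExited;
        private final Runnable work;

        FakeJournal(String name, CountDownLatch anyJournalExited, Runnable work) {
            super(name);
            this.anyJournalExited = anyJournalExited;
            this.work = work;
        }

        @Override
        public void run() {
            try {
                work.run();
            } finally {
                // Counting down an already-zero latch is a no-op, so several exits are harmless.
                anyJournalExited.countDown();
            }
        }
    }

    /** Returns true if the waiting ("bookie") thread observed a journal exit within the timeout. */
    static boolean demo() {
        CountDownLatch anyJournalExited = new CountDownLatch(1);
        CountDownLatch stopHealthy = new CountDownLatch(1);

        FakeJournal healthy = new FakeJournal("journal-0", anyJournalExited, () -> {
            try {
                stopHealthy.await(); // keeps running until the demo releases it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Simulated crash: this journal's loop returns immediately.
        FakeJournal crashing = new FakeJournal("journal-1", anyJournalExited, () -> { });

        healthy.start();
        crashing.start();
        try {
            boolean exited = anyJournalExited.await(5, TimeUnit.SECONDS);
            stopHealthy.countDown(); // a real bookie would start its shutdown here
            healthy.join();
            crashing.join();
            return exited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("journal exit detected: " + demo());
    }
}
```

The design point is that the waiter blocks on `await` rather than checking thread liveness on a timer, which is why no user-visible interval setting is needed.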
@Vanlightly
I'm not familiar with Gradle, so I'll check the dependency problem first.
@Vanlightly PTAL, thanks.
The JournalAliveListener approach looks good. Should it be named JournalDeathListener though? We do need to check that concurrent invocations of shutdown(), whether triggered directly by a BookieServer shutdown or by a journal exiting, are handled. There is already handling for multiple invocations of triggerBookieShutdown, but not for one of those combined with a BookieServer shutdown. BookieServer.shutdown() should block on an already in-progress BookieImpl.shutdown() to avoid the Java process exiting too soon.
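To make the listener idea and the concurrency concern concrete, here is a small sketch under stated assumptions. The interface name `JournalAliveListener` comes from the review, but its method `onJournalExit()`, the `FakeJournal` and `FakeBookie` classes, and `demo()` are hypothetical and do not reflect the actual BookKeeper signatures. Each journal calls the listener when its thread exits, and the bookie's `shutdown()` is `synchronized` and idempotent, so a journal-triggered shutdown and a direct shutdown can race safely: the later caller blocks until the earlier one finishes, then returns without repeating the work.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: illustrative stand-ins, not BookKeeper's real classes.
final class JournalListenerSketch {

    /** Callback invoked by a journal when its thread exits (method name is an assumption). */
    interface JournalAliveListener {
        void onJournalExit();
    }

    /** Stand-in for a journal thread that reports its own exit. */
    static final class FakeJournal extends Thread {
        private final JournalAliveListener listener;

        FakeJournal(String name, JournalAliveListener listener) {
            super(name);
            this.listener = listener;
        }

        @Override
        public void run() {
            try {
                // The journal loop would run here; this sketch exits immediately.
            } finally {
                listener.onJournalExit();
            }
        }
    }

    /** Stand-in for the bookie with a synchronized, idempotent shutdown. */
    static final class FakeBookie {
        private final AtomicInteger shutdownRuns = new AtomicInteger();
        private boolean shutdownDone = false;

        synchronized void shutdown() {
            if (shutdownDone) {
                return; // a concurrent caller only got here after the first run completed
            }
            shutdownRuns.incrementAndGet(); // the real shutdown work would happen here
            shutdownDone = true;
        }

        int shutdownRuns() {
            return shutdownRuns.get();
        }
    }

    /** Two journals exit while a direct shutdown is also requested; returns how often the work ran. */
    static int demo() {
        FakeBookie bookie = new FakeBookie();
        JournalAliveListener listener = bookie::shutdown;
        FakeJournal j0 = new FakeJournal("journal-0", listener);
        FakeJournal j1 = new FakeJournal("journal-1", listener);
        j0.start();
        j1.start();
        bookie.shutdown(); // simulates a shutdown requested from outside, e.g. by the server
        try {
            j0.join();
            j1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return bookie.shutdownRuns();
    }

    public static void main(String[] args) {
        System.out.println("shutdown work ran " + demo() + " time(s)");
    }
}
```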
@aloyszhang I mentioned in a previous comment that we need to ensure that if a shutdown is triggered by an OS signal while another shutdown is in progress, it waits for that shutdown to complete; otherwise the process may exit before the shutdown is complete. I didn't see that the BookieImpl.shutdown() method is synchronized but it is, and so should do that blocking for us. Being synchronized also means that we shouldn't need that AtomicBoolean. So that leads us to using an explicit lock on the shutdown method to replace Note that I have done a test of a crashing journal followed by an OS SIGINT and the bookie does terminate before the shutdown has completed, so this is something that does need addressing.
Yes, the original
Do you mean that there is a common shutdown in progress which is invoked by the stop command I think the bad case is that if a journal crashes first and already triggers
@aloyszhang Exactly, if the
PR Validation failed.
@aloyszhang run this to get the report:
@Vanlightly Validation issue has been resolved, PTAL, thanks!
@aloyszhang An exception thrown during the shutdown would leave the CountDownLatch at 1, blocking the second caller. I would replace the AtomicBoolean and CountDownLatch with a simple lock that is acquired before
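The failure mode described above, and the lock-based alternative, can be sketched as follows. This is an illustration under assumptions, not the change that was merged: `FakeBookie`, `shutdown(boolean)`, and `secondCallerReturns()` are hypothetical, and the sketch uses `java.util.concurrent.locks.ReentrantLock` as one possible "simple lock". Because the lock is released in a `finally` block, an exception thrown by the first caller cannot leave a later caller blocked forever, which is what would happen with a latch that is only counted down on the success path.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: FakeBookie is a stand-in, not BookieImpl.
final class ShutdownLockSketch {

    static final class FakeBookie {
        private final ReentrantLock shutdownLock = new ReentrantLock();
        private boolean shutdownStarted = false;

        /** failMidway simulates an exception thrown while shutting down. */
        void shutdown(boolean failMidway) {
            shutdownLock.lock(); // a second caller blocks here while a shutdown is in progress
            try {
                if (shutdownStarted) {
                    return; // an earlier caller already ran (or attempted) the shutdown
                }
                shutdownStarted = true;
                if (failMidway) {
                    throw new IllegalStateException("simulated failure during shutdown");
                }
                // The real shutdown work would happen here.
            } finally {
                shutdownLock.unlock(); // always released, even when shutdown throws
            }
        }
    }

    /** First caller fails mid-shutdown; returns true if a second caller still returns promptly. */
    static boolean secondCallerReturns() {
        FakeBookie bookie = new FakeBookie();
        try {
            bookie.shutdown(true);
        } catch (IllegalStateException expected) {
            // The first shutdown attempt failed; the lock must have been released anyway.
        }
        Thread second = new Thread(() -> bookie.shutdown(false), "second-caller");
        second.start();
        try {
            second.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !second.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("second caller returned: " + secondCallerReturns());
    }
}
```

A `synchronized` method gives the same release-on-exception guarantee; the explicit lock is shown only because the comment above proposes one.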
@Vanlightly PTAL
LGTM
@eolivelli @nicoloboschi PTAL
CC @eolivelli PTAL, thanks!
LGTM
can you please resolve the conflicts?
the patch is ready to be merged
After rebasing onto the master branch, some tests failed. I'll resolve this soon.
@aloyszhang Please rebase/resolve merge conflicts
Sorry for the delay, I'll do this soon.
ping @aloyszhang, would you please rebase onto master?
@aloyszhang Could you rebase onto master and address the comment?
this function looks nice
Good job!
Descriptions of the changes in this PR:

fix #2879

This pull request lets the bookie quit when a journal thread exits.

### Motivation

As described in #2879, a bookie with multiple journal directories runs multiple journal threads. Once one journal thread exits, the bookie becomes unhealthy because all bookie-io threads block; the bookie stops working but its process stays alive. This pull request tries to fix this problem.

### Changes

Check that the journal threads are alive at a fixed interval, and let the bookie quit once a journal thread has exited.

(cherry picked from commit 67208fb)
### Motivation

PR #2887 introduced shutting down the bookie service after any Journal thread exits, so we don't need to wait in BookieImpl for all Journal threads to exit before shutting down the Bookie service, because this cannot happen.