Introduce lifecycle components for managing components in bookie server #508

Closed
zhaijack opened this issue Sep 13, 2017 · 0 comments

FEATURE REQUEST

  1. Please describe the feature you are requesting.

In a bookie server, we have multiple components to run, including:

  • stats provider
  • bookie server (both storage and netty server)
  • autorecovery
  • http endpoint

Managing the components that run in a bookie has become a bit messy. This issue proposes introducing a lifecycle component for each service component, so that we can run these components in a clear way.

This will also help with #491 when we introduce a metadata RPC component.
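
To illustrate the idea, here is a minimal sketch of what a lifecycle abstraction could look like. All names in it (`Lifecycle`, `LifecycleStack`, `NamedComponent`) are hypothetical and are not BookKeeper's actual API; the sketch only assumes that components are started in registration order and stopped in reverse order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical lifecycle contract for illustration; not BookKeeper's actual interface. */
interface Lifecycle {
    String name();
    void start();
    void stop();
}

/** Starts components in registration order and stops them in reverse order. */
final class LifecycleStack implements Lifecycle {
    private final List<Lifecycle> components = new ArrayList<>();
    private final List<String> events = new ArrayList<>();

    LifecycleStack add(Lifecycle component) {
        components.add(component);
        return this;
    }

    @Override public String name() { return "stack"; }

    @Override public void start() {
        for (Lifecycle c : components) {
            c.start();
            events.add("start:" + c.name());
        }
    }

    @Override public void stop() {
        // Reverse order, so later components stop before the ones they may depend on.
        for (int i = components.size() - 1; i >= 0; i--) {
            Lifecycle c = components.get(i);
            c.stop();
            events.add("stop:" + c.name());
        }
    }

    /** Record of start/stop calls, kept only so the ordering can be inspected. */
    List<String> events() { return events; }
}

/** Trivial component used only for this illustration. */
final class NamedComponent implements Lifecycle {
    private final String name;
    private boolean running;

    NamedComponent(String name) { this.name = name; }

    @Override public String name() { return name; }
    @Override public void start() { running = true; }
    @Override public void stop() { running = false; }
    boolean isRunning() { return running; }
}

final class LifecycleStackDemo {
    public static void main(String[] args) {
        LifecycleStack stack = new LifecycleStack()
            .add(new NamedComponent("stats-provider"))
            .add(new NamedComponent("bookie-server"))
            .add(new NamedComponent("autorecovery"))
            .add(new NamedComponent("http-endpoint"));
        stack.start();
        stack.stop();
        System.out.println(stack.events());
    }
}
```

With a stack like this, the bookie entry point only needs to start and stop one object instead of handling each service separately.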

  1. Indicate the importance of this issue to you (blocker, must-have, should-have, nice-to-have). Are you currently using any workarounds to address this issue?

should-have

  1. Provide any additional detail on your proposed use case for this feature.

N/A

eolivelli added this to the 4.6.0 milestone Sep 19, 2017
sijie added a commit to sijie/bookkeeper that referenced this issue Jul 14, 2018
…exit the BookieProcess

 ### Motivation

Fixes the issue at apache#1540.

If Bookie/BookieServer components are shut down internally because of a fatal error
(ExitCode - INVALID_CONF, SERVER_EXCEPTION, ZK_EXPIRED, ZK_REG_FAIL, BOOKIE_EXCEPTION), they
go through the shutdown method logic, which shuts down the components internal to Bookie/BookieServer
but does not succeed in bringing down the bookie process.

This is because BookieServer.main / server.Main.doMain waits for the startComponent
future to complete:
http://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/server/Main.java#L227 .
The startComponent future is marked complete only in the runtime shutdown hook:
https://github.com/apache/bookkeeper/blob/master/bookkeeper-common/src/main/java/org/apache/bookkeeper/common/component/ComponentStarter.java#L66.

The problem is that nowhere in the Bookie/BookieProcess shutdown path do we call System.exit(), so
the runtime shutdown hook is never executed and the startComponent future is never marked complete. Hence
Main.doMain waits forever on this future even though the Bookie/BookieServer components have been shut down
because of known fatal errors.
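
The hang described above can be reproduced in miniature. The following is a sketch under stated assumptions, not the actual `Main` or `ComponentStarter` code: `FakeServer`, `startComponent` and `wouldHang` are hypothetical names, and the only behavior modeled is a future that is completed solely by a JVM shutdown hook.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class HangSketch {
    /** Hypothetical stand-in for a server that can shut itself down on a fatal error. */
    static final class FakeServer {
        private volatile boolean running;

        void start() { running = true; }

        /** Internal shutdown: stops the server but never calls System.exit(). */
        void shutdownOnFatalError() { running = false; }

        boolean isRunning() { return running; }
    }

    /** Returns a future that is completed only by the JVM shutdown hook. */
    static CompletableFuture<Void> startComponent(FakeServer server) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        Runtime.getRuntime().addShutdownHook(new Thread(() -> done.complete(null)));
        server.start();
        return done;
    }

    /** Returns true if the wait timed out, i.e. a caller blocking on the future would still be stuck. */
    static boolean wouldHang(CompletableFuture<Void> future, long timeoutMillis) {
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FakeServer server = new FakeServer();
        CompletableFuture<Void> future = startComponent(server);
        server.shutdownOnFatalError();
        System.out.println("server running: " + server.isRunning());
        System.out.println("main would hang: " + wouldHang(future, 200));
    }
}
```

The server is stopped, yet nothing triggers the shutdown hook, so a thread waiting on the future without a timeout would never return.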

 ### Regression

Issue apache#508 introduced this regression. Before that change, the main thread blocked on `BookieServer#join()`.
When the bookie died for any reason, the DeathWatchThread killed the bookie and bookie server, so the main thread would quit.
After apache#508, however, lifecycle management is disconnected from the bookie and bookie server, so when they die,
lifecycle management is unaware of the situation and the main thread doesn't quit.

 ### Changes

- Add `UncaughtExceptionHandler` to lifecycle components
- When a lifecycle component hits an error, it can use `UncaughtExceptionHandler` to notify the lifecycle component stack to shut down the whole stack
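
The two changes above can be sketched as follows. This is an illustration under assumptions, not the code from the actual patch: `ReportingComponent`, `setExceptionHandler`, `onFatalError` and `startComponent` are hypothetical names. The only real API relied on is the JDK's `Thread.UncaughtExceptionHandler`, `CompletableFuture` and `Runtime#addShutdownHook`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

final class HandlerSketch {
    /** Hypothetical component that reports fatal errors to a handler instead of dying silently. */
    static final class ReportingComponent {
        private final AtomicBoolean running = new AtomicBoolean();
        private volatile Thread.UncaughtExceptionHandler handler;

        void setExceptionHandler(Thread.UncaughtExceptionHandler handler) {
            this.handler = handler;
        }

        void start() { running.set(true); }
        void stop() { running.set(false); }
        boolean isRunning() { return running.get(); }

        /** Called by the component itself when it hits an error it cannot recover from. */
        void onFatalError(Throwable cause) {
            Thread.UncaughtExceptionHandler h = handler;
            if (h != null) {
                h.uncaughtException(Thread.currentThread(), cause);
            }
        }
    }

    /** Starts the component and returns a future completed on JVM shutdown or on a reported error. */
    static CompletableFuture<Void> startComponent(ReportingComponent component) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        Runnable shutdown = () -> {
            component.stop();
            done.complete(null);
        };
        // Normal path: the JVM is exiting.
        Runtime.getRuntime().addShutdownHook(new Thread(shutdown));
        // Error path: the component reports a fatal error, so run the same shutdown logic.
        component.setExceptionHandler((thread, error) -> shutdown.run());
        component.start();
        return done;
    }

    public static void main(String[] args) {
        ReportingComponent component = new ReportingComponent();
        CompletableFuture<Void> done = startComponent(component);
        component.onFatalError(new IllegalStateException("simulated fatal error"));
        done.join(); // returns because the handler completed the future
        System.out.println("future completed: " + done.isDone());
    }
}
```

Because the handler runs the same shutdown logic as the hook, a thread waiting on the future is released as soon as a component reports a fatal error.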
sijie added a commit that referenced this issue Jul 23, 2018
… exit the BookieProcess

Master Issue: #1540

Author: Sijie Guo <sijie@apache.org>

Reviewers: Andrey Yegorov <None>, Charan Reddy Guttapalem <reddycharan18@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>

This closes #1543 from sijie/fix_lifcycle_components, closes #1540
sijie added a commit that referenced this issue Jul 23, 2018
… exit the BookieProcess

(cherry picked from commit 50f29ed)
Signed-off-by: Sijie Guo <sijie@apache.org>
reddycharan pushed a commit to reddycharan/bookkeeper that referenced this issue Jul 24, 2018
…hutdown will fail to end exit the BookieProcess

reddycharan pushed a commit to reddycharan/bookkeeper that referenced this issue Jul 24, 2018
…hutdown will fail to end exit the BookieProcess

reddycharan pushed a commit to reddycharan/bookkeeper that referenced this issue Aug 2, 2018
…hutdown will fail to end exit the BookieProcess
