Fix auditor shutdown logic and move decommission tests out of BookKeeperAdminTest #1099
sijie wants to merge 1 commit into
Conversation
…erAdminTest
- The auditor shutdown logic is problematic. Most of the tests finish quickly, but each one spends more than 30 seconds shutting down, because the shutdown logic blocks until `awaitTermination` times out.
- Most of the tests in BookKeeperAdminTest don't need 6 bookies, so the decommission tests are moved to a separate class.
|
An example Jenkins job shows how long each test takes. Most of the tests run for more than 1 minute. |
|
@jvrao @reddycharan can you take a look at the change here? I think the decommission tests were contributed by you. It would be good if you could check that my change doesn't alter what the tests are meant to verify. |
|
Will look into this. Here you have reduced the number of bookies to 2 and moved the decommission tests to a new test suite with 6 bookies. That's it, right? |
|
@reddycharan yes |
LOG.info("Shutting down auditor");
submitShutdownTask();

executor.shutdown();
|
Wondering why you made this change. Why are you replacing the submitShutdownTask call?
|
I explained this in the description:
"The auditor shutdown logic is problematic. Most of the tests finish quickly, but each one spends more than 30 seconds shutting down, because the shutdown logic blocks until awaitTermination times out."
If the Auditor is running, it might be blocking the executor thread, so the submitted shutdown task never gets executed. The caller then has to wait for the 30-second awaitTermination timeout before it issues shutdownNow. This change issues shutdown, which will interrupt the auditor task and shut down the executor thread.
Hope this makes things clear.
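The blocking behaviour described in this reply can be reproduced outside BookKeeper. What follows is a minimal Java sketch, not the Auditor's actual code: the class `QueuedShutdownSketch`, its method name, and the millisecond-scale wait are hypothetical stand-ins (the timeout discussed in this thread is 30 seconds). It assumes a single-threaded executor and shows that a task queued behind a blocked task does not get to run before the wait expires.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; none of these names come from BookKeeper.
final class QueuedShutdownSketch {

    /**
     * Returns true if a task queued on a single-thread executor managed to run
     * within waitMillis while an earlier task was still occupying the thread.
     */
    static boolean queuedTaskRanWhileBlocked(long waitMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean shutdownTaskRan = new AtomicBoolean(false);
        try {
            // Stand-in for a long-running audit that occupies the only thread.
            executor.submit(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();

            // Stand-in for a shutdown task that is submitted to the same executor.
            executor.submit(() -> shutdownTaskRan.set(true));

            // Stand-in for the awaitTermination wait: the executor cannot
            // terminate while the first task is blocked, so this times out.
            executor.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
            return shutdownTaskRan.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            // Clean up so the sketch does not leak the worker thread.
            release.countDown();
            executor.shutdownNow();
        }
    }
}
```

Calling `QueuedShutdownSketch.queuedTaskRanWhileBlocked(200)` returns `false`: the queued task is still waiting behind the blocked one when the wait ends, which mirrors the delay each test paid at shutdown.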
|
Yeah, I get that, but I'm wondering why it wasn't like this before. Why did the original author (e5b0dd0) deliberately go through this extra hassle for shutdown? Is it OK now to drop this extra care in the shutdown process?
|
No idea why it was done that way. @ivankelly can chime in since he made that change.
|
It was a long time ago, but I think it was to avoid killing a running audit before it was done, since it's a long-running task.
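Whether an in-flight audit gets cut short depends on which executor call is used. In the standard `java.util.concurrent` API, `shutdown()` stops accepting new tasks but lets a running task finish, while `shutdownNow()` also interrupts running tasks. The sketch below illustrates that difference under stated assumptions: the class `ShutdownVsShutdownNowSketch` and its method are hypothetical, the blocking task is a stand-in for an audit, and nothing here is taken from the Auditor's code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; none of these names come from BookKeeper.
final class ShutdownVsShutdownNowSketch {

    /**
     * Starts a blocking task, stops the executor with shutdown() or
     * shutdownNow(), and reports whether the task saw an interrupt.
     */
    static boolean taskInterrupted(boolean useShutdownNow) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);
        try {
            // Stand-in for a long-running audit.
            executor.submit(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    interrupted.set(true);
                }
            });
            started.await();

            if (useShutdownNow) {
                executor.shutdownNow(); // interrupts the running task
            } else {
                executor.shutdown();    // lets the running task continue
            }

            // Give an interrupt, if any, a moment to be delivered.
            executor.awaitTermination(200, TimeUnit.MILLISECONDS);

            // Let the task finish normally if it is still waiting.
            release.countDown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return interrupted.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Here `taskInterrupted(false)` returns `false` and `taskInterrupted(true)` returns `true`. A common pattern that respects a long-running task is to call `shutdown()`, wait with `awaitTermination`, and fall back to `shutdownNow()` only if the wait times out.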
|
Okay, so the question is whether we can change it to shutdown. Even if we don't shut it down explicitly, it will be shut down after 30 seconds anyway.
|
@ivankelly can you check whether it is okay for us to change it to shutdown?
|
@reddycharan - can you review this PR again after Ivan's reply?
setAutoRecoveryEnabled(true);
}

@FlakyTest("https://github.com/apache/bookkeeper/issues/502")
|
I have a subsequent change to remove FlakyTest at #1100
|
LGTM +1 |
|
retest this please |
|
Can one of the admins verify this patch? |
|
IGNORE CI (All CI passed except "Integration Tests". This change only affects tests, so it doesn't need "Integration Tests" to verify it.) |
|
IGNORE CI |