Skip to content

Fix auditor shutdown logic and move decommision tests out of BookKeeperAdminTest#1099

Closed
sijie wants to merge 1 commit into
apache:masterfrom
sijie:improve_admin_tests
Closed

Fix auditor shutdown logic and move decommision tests out of BookKeeperAdminTest#1099
sijie wants to merge 1 commit into
apache:masterfrom
sijie:improve_admin_tests

Conversation

@sijie
Copy link
Copy Markdown
Member

@sijie sijie commented Feb 1, 2018

Descriptions of the changes in this PR:

  • the auditor shutdown logic is problematic. most of the tests can finish quickly however it spend more 30 seconds on shutting down.
    because the shutdown logic will be blocked until awaitTermination timed out.
  • most of the tests in BookKeeperAdminTest don't need 6 bookies. so move the decommission tests to a separate class.

…erAdminTest

- the auditor shutdown logic is problematic. most of the tests can finish quickly however it spend more 30 seconds on shutting down.
  because the shutdown logic will be blocked until `awaitTermination` timed out.
- most of the tests in BookKeeperAdminTest don't need 6 bookies. so move the decommission tests to a separate class.
@sijie
Copy link
Copy Markdown
Member Author

sijie commented Feb 1, 2018

@sijie
Copy link
Copy Markdown
Member Author

sijie commented Feb 1, 2018

@jvrao @reddycharan can you guys take the change here? I think the decommission test was contributed by you guys. It would be good if you guys can check to ensure my change doesn't change the testing purpose.

@reddycharan
Copy link
Copy Markdown
Contributor

will look into this. Here you have reduced numofbookies to 2 and moved decomm tests to a new testsuite with 6 bookies. Thats it, right?

@sijie
Copy link
Copy Markdown
Member Author

sijie commented Feb 1, 2018

@reddycharan yes

Copy link
Copy Markdown
Member

@jiazhai jiazhai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

LOG.info("Shutting down auditor");
submitShutdownTask();

executor.shutdown();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering why did you make this change? why are you replacing submitShutdownTask call?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explained this in the description:

"the auditor shutdown logic is problematic. most of the tests can finish quickly however it spend more 30 seconds on shutting down.
because the shutdown logic will be blocked until awaitTermination timed out."

if Auditor is running, it might be blocking in the thread, submit shutdown task will never be executed, then it will have to wait for 30 seconds timeout (awaitTermination) and issue shutdownNow. this change issues shutdown, which will interrupt the auditor task and shutdown the executor thread.

Hope this make things clear

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I get that, but I'm wondering why it wasn't like this before? why did the original author - e5b0dd0 went through this extra round of hassle for shutdown deliberately. Is it ok now to not to have this extra scrutiny with the shutdown process?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea why it was done in that way. @ivankelly can chime in since he made that change.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long time ago, but I think it was to avoid killing a running audits before it was done, since it's a long running task.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so the question is can we change it to shutdown. Because even we don’t shutdown, it will be shutdown in 30 seconds.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivankelly can you check if it is okay for us to change it to shutdown?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reddycharan - can you review this PR again after Ivan's reply?

Copy link
Copy Markdown
Contributor

@ivankelly ivankelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm +1

setAutoRecoveryEnabled(true);
}

@FlakyTest("https://github.com/apache/bookkeeper/issues/502")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it still flaky?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a subsequent change to remove FlakyTest at #1100

@reddycharan
Copy link
Copy Markdown
Contributor

LGTM +1

@sijie
Copy link
Copy Markdown
Member Author

sijie commented Feb 14, 2018

retest this please

@asfgit
Copy link
Copy Markdown

asfgit commented Feb 16, 2018

Can one of the admins verify this patch?

@sijie
Copy link
Copy Markdown
Member Author

sijie commented Feb 16, 2018

IGNORE CI (All CI passed except "Integration Tests". This change is purely on changing tests. So it doesn't need "Integration Tests" to verify it)

@sijie
Copy link
Copy Markdown
Member Author

sijie commented Feb 16, 2018

IGNORE CI

@sijie sijie closed this in c43d8c4 Feb 16, 2018
@sijie sijie deleted the improve_admin_tests branch July 16, 2018 02:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants