KAFKA-15036: Add a test case for controller failover #13777

Closed
wants to merge 3 commits into from

dengziming
Member

@dengziming dengziming commented May 30, 2023

More detailed description of your change
We introduced a bug when updating handleApiVersionRequest, but CI did not detect it. The bug has been fixed, and we should add a test case for controller failover.

Summary of testing strategy (including rationale)
KRaftClusterTest.testCreateClusterAndRestartControllerNode()

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@dengziming
Member Author

@showuon @KarboniteKream @hachikuji I think this bug is very serious since it will stop any leader failover from functioning normally, so please take a look as soon as possible.

@KarboniteKream KarboniteKream left a comment

Confirmed locally and on a deployed cluster that the error no longer reproduces.

Member

@soarez soarez left a comment

Is the new test, testCreateClusterAndRestartControllerNode, checking for this issue?

Member Author

@dengziming dengziming left a comment

Thanks for your comments, @soarez and @KarboniteKream; I have addressed them. @cmccabe, do you have any suggestions for this change? You are familiar with the changes in QuorumController and ControllerServer.

@@ -96,6 +96,32 @@ class KRaftClusterTest {
}
}

@Test
def testCreateClusterAndRestartControllerNode(): Unit = {
Member Author

@soarez Yes, we shut down the active controller, then restart it so that it sends an ApiVersionRequest to a standby controller.
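
For readers following the thread, the scenario described above can be sketched as a toy model. This is not the test added in this PR and it does not use any Kafka classes: `ToyQuorum`, `failoverSucceeds`, and the `standbyAnswersApiVersions` flag are hypothetical names introduced only for illustration. It also assumes, as a simplification, that the bug shows up as an UnknownServerError when a standby controller handles the request.

```scala
import scala.collection.mutable

// Toy model only: none of these types exist in Kafka.
object ControllerFailoverSketch {
  sealed trait Response
  case object ApiVersionsOk extends Response
  case object UnknownServerError extends Response

  // Hypothetical quorum of controller ids. `standbyAnswersApiVersions` is an
  // assumed switch: false imitates the buggy behaviour, true the fixed one.
  final class ToyQuorum(ids: Seq[Int], standbyAnswersApiVersions: Boolean) {
    private val running = mutable.SortedSet(ids: _*)
    private var activeId: Int = running.head

    def active: Int = activeId

    // Stop a node; if it was the active controller, fail over to the
    // lowest-id node that is still running (an arbitrary election rule).
    def shutdown(id: Int): Unit = {
      running -= id
      if (id == activeId) running.headOption.foreach(next => activeId = next)
    }

    // Restart a node and have it send an ApiVersionRequest to every peer
    // that was already running, including standby controllers.
    def restart(id: Int): Seq[Response] = {
      val peers = running.toSeq
      running += id
      peers.map(handleApiVersionRequest)
    }

    private def handleApiVersionRequest(target: Int): Response =
      if (target == activeId || standbyAnswersApiVersions) ApiVersionsOk
      else UnknownServerError
  }

  // Mirrors the steps in the comment: shut down the active controller,
  // restart it, and check that every ApiVersionRequest was answered.
  def failoverSucceeds(standbyAnswersApiVersions: Boolean): Boolean = {
    val quorum = new ToyQuorum(Seq(0, 1, 2), standbyAnswersApiVersions)
    val oldActive = quorum.active
    quorum.shutdown(oldActive)
    val responses = quorum.restart(oldActive)
    quorum.active != oldActive && responses.forall(_ == ApiVersionsOk)
  }
}
```

In this sketch, `failoverSucceeds(standbyAnswersApiVersions = false)` fails only because the restarted node contacts a standby, which is the path a test that never restarts a controller would not exercise.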

@dengziming
Member Author

This will be fixed in #13799.

@dengziming dengziming closed this Jun 2, 2023
@dengziming dengziming changed the title KAFKA-15036: UnknownServerError on any leader failover KAFKA-15036: Add a test case for controller failover Jun 9, 2023
3 participants