-
Notifications
You must be signed in to change notification settings - Fork 13.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-16171 Fix hybrid mode controller race #15238
KAFKA-16171 Fix hybrid mode controller race #15238
Conversation
@@ -139,14 +139,18 @@ public ZkMigrationLeadershipState setMigrationRecoveryState(ZkMigrationLeadershi | |||
|
|||
@Override | |||
public ZkMigrationLeadershipState claimControllerLeadership(ZkMigrationLeadershipState state) { | |||
this.state = state; | |||
return state; | |||
if (state.zkControllerEpochZkVersion() == ZkMigrationLeadershipState.EMPTY.zkControllerEpochZkVersion()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just compare against -2 (or at least have a constant for this like ZkMigrationLeadershipState.UNKNOWN_ZK_VERSION
?)
@@ -635,12 +635,24 @@ class BecomeZkControllerEvent extends MigrationEvent { | |||
public void run() throws Exception { | |||
if (checkDriverState(MigrationDriverState.BECOME_CONTROLLER, this)) { | |||
applyMigrationOperation("Claiming ZK controller leadership", zkMigrationClient::claimControllerLeadership); | |||
if (migrationLeadershipState.zkControllerEpochZkVersion() == -1) { | |||
if (migrationLeadershipState.zkControllerEpochZkVersion() == -2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a constant for this like ZkMigrationLeadershipState.UNKNOWN_ZK_VERSION
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @mumrah . Left two minor comments.
This patch causes the active KRaftMigrationDriver to reload the /migration ZK state after electing itself as the leader in ZK. This closes a race condition where the previous active controller could make an update to /migration after the new leader was elected. The update race was not actually a problem regarding the data since both controllers would be syncing the same state from KRaft to ZK, but the change to the znode causes the new controller to fail on the zk version check on /migration. This patch also fixes a as-yet-unseen bug where the active controllers failing to elect itself via claimControllerLeadership would not retry. Reviewers: Colin P. McCabe <cmccabe@apache.org>
committed, thanks! |
This patch causes the active KRaftMigrationDriver to reload the /migration ZK state after electing itself as the leader in ZK. This closes a race condition where the previous active controller could make an update to /migration after the new leader was elected. The update race was not actually a problem regarding the data since both controllers would be syncing the same state from KRaft to ZK, but the change to the znode causes the new controller to fail on the zk version check on /migration. This patch also fixes a as-yet-unseen bug where the active controllers failing to elect itself via claimControllerLeadership would not retry. Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch causes the active KRaftMigrationDriver to reload the /migration ZK state after electing itself as the leader in ZK. This closes a race condition where the previous active controller could make an update to /migration after the new leader was elected. The update race was not actually a problem regarding the data since both controllers would be syncing the same state from KRaft to ZK, but the change to the znode causes the new controller to fail on the zk version check on /migration. This patch also fixes a as-yet-unseen bug where the active controllers failing to elect itself via claimControllerLeadership would not retry. Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch causes the active KRaftMigrationDriver to reload the /migration ZK state after electing itself as the leader in ZK. This closes a race condition where the previous active controller could make an update to /migration after the new leader was elected. The update race was not actually a problem regarding the data since both controllers would be syncing the same state from KRaft to ZK, but the change to the znode causes the new controller to fail on the zk version check on /migration. This patch also fixes a as-yet-unseen bug where the active controllers failing to elect itself via claimControllerLeadership would not retry. Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch causes the active KRaftMigrationDriver to reload the /migration ZK state after electing itself as the leader in ZK. This closes a race condition where the previous active controller could make an update to /migration after the new leader was elected. The update race was not actually a problem regarding the data since both controllers would be syncing the same state from KRaft to ZK, but the change to the znode causes the new controller to fail on the zk version check on /migration. This patch also fixes a as-yet-unseen bug where the active controllers failing to elect itself via claimControllerLeadership would not retry. Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch causes the active KRaftMigrationDriver to reload the /migration ZK state after electing itself as the leader in ZK. This closes a race condition where the previous active controller could make an update to /migration after the new leader was elected. The update race was not actually a problem regarding the data since both controllers would be syncing the same state from KRaft to ZK, but the change to the znode causes the new controller to fail on the zk version check on /migration. This patch also fixes a as-yet-unseen bug where the active controllers failing to elect itself via claimControllerLeadership would not retry. Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch causes the active KRaftMigrationDriver to reload the
/migration
ZK state after electing itself as the leader in ZK. This closes a race condition where the previous active controller could make an update to/migration
after the new leader was elected. The update race was not actually a problem regarding the data since both controllers would be syncing the same state from KRaft to ZK, but the change to the znode causes the new controller to fail on the zk version check on/migration
.This patch also fixes a as-yet-unseen bug where the active controllers failing to elect itself via
claimControllerLeadership
would not retry.