Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-15109 Ensure the leader epoch bump occurs for older MetadataVersions #13910

Merged
merged 3 commits into from Jun 27, 2023

Conversation

mumrah
Copy link
Contributor

@mumrah mumrah commented Jun 23, 2023

This patch fixes a bug introduced in the earlier KAFKA-15109 commit (#13890) where the leader epoch bump would be skipped for older MetadataVersion-s.

@mumrah mumrah requested review from jsancio and cmccabe June 23, 2023 14:08
@@ -207,7 +211,7 @@ public void testTriggerLeaderEpochBumpIfNeeded() {
createFooBuilder()
.setTargetIsrWithBrokerStates(
AlterPartitionRequest.newIsrToSimpleNewIsrWithBrokerEpochs(Arrays.asList(2, 1, 3, 4)))
.setBumpLeaderEpochOnIsrShrink(true),
.enableBumpLeaderEpochOnIsrShrink(true),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Shouldn't this test fail? Let's add a comment explaining what this test is doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case is an ISR expansion, so we don't expect an epoch bump (I added a comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Outside the scope of this PR but it looks like we are missing some validation. This is adding 4 to the ISR but 4 is not a replica for this partition. cc @ahuang98

@cmccabe
Copy link
Contributor

cmccabe commented Jun 26, 2023

It's confusing to have a function named enableBumpLeaderEpochOnIsrShrink that sometimes doesn't actually enable (or disable) the thing, based on some other variables. If we're going to go this route, then I'd say just add back zkMigrationInProgress as a flag.

The other approach would be to have a function like forceBumpLeaderEpochOnIsrShrink() with no arguments that just does what it says. After all, bumping the epoch is always safe so we'd expect code like:

if (zkMigration) {
  builder.forceBumpLeaderEpochOnIsrShrink();
}

By NOT providing a function to disable leader epoch bumping, we can avoid the bug we encountered here.

I think either of these approaches could work. Let me know what you think.

@mumrah mumrah changed the title KAFKA-15109 Don't skip leader epoch bump for older MetadataVersions KAFKA-15109 Ensure the leader epoch bump occurs for older MetadataVersions Jun 26, 2023
Copy link
Contributor

@cmccabe cmccabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@jsancio jsancio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor suggestions. Looks good otherwise.

record.setLeader(partition.leader);
} else if (bumpLeaderEpochOnIsrShrink && !Replicas.contains(targetIsr, partition.isr)) {
} else if (!Replicas.contains(targetIsr, partition.isr) && bumpLeaderEpochOnIsrShrink) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check bumpLeaderEpochOnIsrShrink first as it is faster to compute. It avoids a search through the ISR arrays when it is false (most cases after MV 3.6).

@@ -207,7 +211,7 @@ public void testTriggerLeaderEpochBumpIfNeeded() {
createFooBuilder()
.setTargetIsrWithBrokerStates(
AlterPartitionRequest.newIsrToSimpleNewIsrWithBrokerEpochs(Arrays.asList(2, 1, 3, 4)))
.setBumpLeaderEpochOnIsrShrink(true),
.enableBumpLeaderEpochOnIsrShrink(true),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Outside the scope of this PR but it looks like we are missing some validation. This is adding 4 to the ISR but 4 is not a replica for this partition. cc @ahuang98

@jsancio jsancio added the kraft label Jun 27, 2023
@jsancio jsancio self-assigned this Jun 27, 2023
Copy link
Member

@jsancio jsancio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mumrah mumrah merged commit fc7d912 into apache:trunk Jun 27, 2023
1 check was pending
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants