Kafka-16540: Update partitions if min isr config is changed. #15702

Open. Wants to merge 7 commits into base: trunk.


CalvinConfluent
Contributor

https://issues.apache.org/jira/browse/KAFKA-16540
If the min ISR config (min.insync.replicas) is changed, we need to update the ELR (eligible leader replicas) of the affected partitions where possible.

@CalvinConfluent
Contributor Author

@mumrah Can you help take a look?

Comment on lines 2367 to 2370
void maybeTriggerMinIsrConfigUpdate(Optional<String> topicName) throws InterruptedException, ExecutionException {
    appendWriteEvent("partitionUpdateForMinIsrChange", OptionalLong.empty(),
        () -> replicationControl.getPartitionElrUpdatesForConfigChanges(topicName)).get();
}
Contributor

Calling .get() on an appendWriteEvent doesn't look right to me. If I understand correctly, appendWriteEvents are handled in the quorum controller event loop thread.

We would expect replay() to also be called in the event loop thread. So if we trigger an appendWriteEvent from there and block waiting for the result, it would always time out, since we are blocking the very thread that has to process the event.
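
To illustrate the concern, here is a minimal sketch that uses a plain single-threaded ExecutorService as a stand-in for the controller event loop. The class and method names are hypothetical and this is not the QuorumController code; it only shows that a task which enqueues a second task on the same single thread and then blocks on its result can never see that result.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class EventLoopBlockingSketch {
    /** Returns true if waiting on a nested task from inside the loop thread times out. */
    static boolean nestedWaitTimesOut(long timeoutMs) {
        // Stand-in for the event loop: exactly one thread processes all events.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = eventLoop.submit(() -> {
                // The inner event is queued behind the event that is currently running.
                Future<String> inner = eventLoop.submit(() -> "partition change records");
                try {
                    // Blocking here keeps the only thread busy, so inner cannot start.
                    inner.get(timeoutMs, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    return true;
                }
            });
            return outer.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            eventLoop.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("nested wait timed out: " + nestedWaitTimesOut(200));
    }
}
```

Without the timeout on the inner get(), the outer task would block forever, which is the failure mode described above.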

Contributor Author

Got it. We basically only need to call appendWriteEvent and should not wait for the replay().

Contributor

@splett2 Apr 16, 2024

Hmm, I think a better way to think about it is that we want to append the min ISR config update atomically with the partition change records. Appending the partition change records only after the config change has been replayed is difficult to reason about and possibly incorrect. Thinking a bit more about it, triggering a write event from the replay() of the config change record means that every time we reload the metadata log, we would replay the config change record and generate new partition change records.

Perhaps one example to look at is ReplicationControlManager.handleBrokerFenced. When a broker is fenced, we generate a broker registration change record along with the leaderAndIsr partition change records. I assume we want to follow a similar model with the topic configuration change events.
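
A rough sketch of that model, under simplifying assumptions: the record type, field names, and the rule "clear the ELR once the ISR size reaches the new min ISR" are hypothetical stand-ins, not the actual Kafka record classes or the logic in this PR. The point is only the split of responsibilities: the write path builds the config record and the dependent partition records as one batch, and replay() applies state without ever generating records, so reloading the log is harmless.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AtomicConfigChangeSketch {
    /** Hypothetical, simplified metadata record. */
    static final class Rec {
        final String type; // "CONFIG" or "PARTITION_CHANGE"
        final String key;  // topic name for CONFIG, partition name for PARTITION_CHANGE
        final int value;   // new min ISR for CONFIG, unused otherwise
        Rec(String type, String key, int value) {
            this.type = type;
            this.key = key;
            this.value = value;
        }
    }

    final Map<String, Integer> minIsr = new HashMap<>();    // topic -> min ISR
    final Map<String, Integer> isrSize = new HashMap<>();   // partition -> current ISR size
    final Set<String> partitionsWithElr = new HashSet<>();  // partitions whose ELR is non-empty

    /** Write path: computes the config record and the partition records together. */
    List<Rec> alterMinIsr(String topic, int newMinIsr) {
        List<Rec> batch = new ArrayList<>();
        batch.add(new Rec("CONFIG", topic, newMinIsr));
        for (String partition : partitionsWithElr) {
            boolean sameTopic = partition.startsWith(topic + "-");
            // Assumption for this sketch: the ELR can be cleared once ISR size >= min ISR.
            if (sameTopic && isrSize.getOrDefault(partition, 0) >= newMinIsr) {
                batch.add(new Rec("PARTITION_CHANGE", partition, 0));
            }
        }
        return batch;
    }

    /** Replay path: applies state only and never appends new records. */
    void replay(Rec rec) {
        if (rec.type.equals("CONFIG")) {
            minIsr.put(rec.key, rec.value);
        } else {
            partitionsWithElr.remove(rec.key);
        }
    }
}
```

Because replay() is a pure state update, replaying the same batch a second time (as happens on a metadata log reload) leaves the state unchanged and adds nothing to the log.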

Contributor Author

Makes sense, I had some misunderstanding about the controller events. Will update. Thanks!

Comment on lines +338 to +346
        if (configRecord.name().equals(TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG)) {
            minIsrRecords.add(configRecord);
            if (Type.forId(configRecord.resourceType()) == Type.TOPIC) {
                if (configRecord.value() == null) topicMap.put(configRecord.resourceName(), configRecord.value());
                else configRemovedTopicMap.put(configRecord.resourceName(), configRecord.value());
            }
        }
    }
});
Contributor

What is the behavior if the default broker config for min.insync.replicas is changed?
I am not actually sure how that impacts min.insync.replicas for existing topics.

Contributor Author

If min.insync.replicas is not set at the topic level, the effective min.insync.replicas of a topic will change when the default broker config is updated.
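
As a small illustration of that fallback, here is a sketch with hypothetical names (it is not the ConfigurationControlManager lookup): a topic-level override wins, otherwise the dynamic cluster-wide default applies, otherwise the static default. The parameter shapes are assumptions made for the example.

```java
import java.util.Map;

class EffectiveMinIsrSketch {
    /**
     * Resolves the effective min.insync.replicas for a topic.
     * topicOverrides: topic name -> explicit topic-level value (absent if not set).
     * dynamicDefault: dynamically configured broker default, or null if not set.
     * staticDefault:  value from the static broker configuration.
     */
    static int effectiveMinIsr(String topic,
                               Map<String, Integer> topicOverrides,
                               Integer dynamicDefault,
                               int staticDefault) {
        Integer topicValue = topicOverrides.get(topic);
        if (topicValue != null) {
            return topicValue;      // topic-level config takes precedence
        }
        if (dynamicDefault != null) {
            return dynamicDefault;  // then the dynamic broker default
        }
        return staticDefault;       // finally the static default
    }
}
```

With this shape, changing only the dynamic default changes the result for every topic that has no override, while topics with an explicit value are unaffected.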

Comment on lines 363 to 365
for (ConfigRecord record : minIsrRecords) {
    replayInternal(record, configDataCopy, localSnapshotRegistry);
}
Contributor

Why are we calling replay here?

Contributor Author

This is the challenging part of implementing this PR. Finding the effective min ISR value requires checking topic config -> dynamic broker config -> default broker config -> ...
Let's say the user updates the default broker config:

  1. All the topics could be affected.
  2. The effective min ISR values should be recalculated.
  3. We need to generate the partition change records along with the config change records, which means the ReplicationControlManager can't use the regular methods for the effective min ISR value. The value should be determined by the config records and the current configs.

I found it easier to make a copy of the configs and apply the min ISR updates on the copy. Then let the ReplicationControlManager check all the partitions with the config copy.
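
The copy-and-apply idea can be sketched roughly as follows. All names here are hypothetical and the config model is flattened to a single map (an empty-string key stands in for the broker default), so this is only an illustration of the approach, not the code in the PR: pending updates are applied to a copy, and each topic's effective value is compared before and after without touching the live config.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class MinIsrConfigCopySketch {
    // Hypothetical key under which the broker default is stored in this sketch.
    static final String DEFAULT_KEY = "";

    static int effective(Map<String, Integer> config, String topic) {
        Integer topicValue = config.get(topic);
        return topicValue != null ? topicValue : config.getOrDefault(DEFAULT_KEY, 1);
    }

    /**
     * Returns the topics whose effective min ISR would change if pendingUpdates
     * were applied. A null value in pendingUpdates means the config is removed.
     * The current map is never modified.
     */
    static SortedSet<String> affectedTopics(Map<String, Integer> current,
                                            Map<String, Integer> pendingUpdates,
                                            Collection<String> topics) {
        Map<String, Integer> copy = new HashMap<>(current);
        for (Map.Entry<String, Integer> update : pendingUpdates.entrySet()) {
            if (update.getValue() == null) {
                copy.remove(update.getKey());
            } else {
                copy.put(update.getKey(), update.getValue());
            }
        }
        SortedSet<String> changed = new TreeSet<>();
        for (String topic : topics) {
            if (effective(current, topic) != effective(copy, topic)) {
                changed.add(topic);
            }
        }
        return changed;
    }
}
```

Only the partitions of the returned topics would then need to be re-checked, and the partition change records can be produced in the same batch as the config records.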

@@ -66,6 +69,7 @@ public class ConfigurationControlManager {
    private final TimelineHashMap<ConfigResource, TimelineHashMap<String, String>> configData;
    private final Map<String, Object> staticConfig;
    private final ConfigResource currentController;
    private final MinIsrConfigUpdatePartitionHandler minIsrConfigUpdatePartitionHandler;
Contributor

Maybe more of a question for someone with more ownership of the quorum controller code, but I wonder if it would be preferable to generate the replication control manager records in QuorumController.incrementalAlterConfigs. That would also make it a bit easier to handle validateOnly, which we are not currently handling.
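
A very rough sketch of what handling validateOnly at that level could look like. The method, its parameters, and the string records are hypothetical and deliberately simplified (for example, it emits a partition record for every partition passed in); it is not the QuorumController API. The idea is that validation and record generation always run, but nothing is returned for appending when validateOnly is set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ValidateOnlySketch {
    /** Returns the records to append, or an empty list when validateOnly is set. */
    static List<String> alterMinIsr(String topic,
                                    int newMinIsr,
                                    List<String> affectedPartitions,
                                    boolean validateOnly) {
        // Validation runs in both modes.
        if (newMinIsr < 1) {
            throw new IllegalArgumentException("min.insync.replicas must be at least 1");
        }
        List<String> records = new ArrayList<>();
        records.add("ConfigRecord(" + topic + ", min.insync.replicas=" + newMinIsr + ")");
        for (String partition : affectedPartitions) {
            records.add("PartitionChangeRecord(" + partition + ")");
        }
        // In validateOnly mode the request is checked but nothing is written.
        return validateOnly ? Collections.<String>emptyList() : records;
    }
}
```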
