Fix the UTO upgrade issue reported in strimzi#9470.
This issue is caused by stale metadata on one or more brokers after restarting the cluster (no risk of data loss).

Using the reproducer, we can see that the UTO fails at 14:27:39 with UnknownTopicOrPartitionException (retriable), while one of the brokers first learns about my-topic at 14:27:44. The failed describe triggers the topic creation logic, which then fails with TopicExistsException because the topic already exists.

UTO log:
2023-12-17 14:27:39,55262 TRACE [kafka-admin-client-thread | strimzi-topic-operator-a93c1635-76c3-4c9f-b61f-68c1a6ac98c3] BatchingTopicController:754 - Admin.describeTopics([__strimzi_store_topic, strimzi.cruisecontrol.partitionmetricsamples, __strimzi-topic-operator-kstreams-topic-store-changelog, timer-topic, connect-cluster-status, strimzi.cruisecontrol.modeltrainingsamples, strimzi.cruisecontrol.metrics, my-topic, __consumer_offsets, connect-cluster-offsets]) failed with java.util.concurrent.CompletionException: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.

Broker log:
2023-12-17 14:27:44,209 TRACE [Broker id=1000] Cached leader info UpdateMetadataPartitionState(topicName='my-topic', partitionIndex=0, controllerEpoch=1, leader=2000, leaderEpoch=4, isr=[1001, 2000, 2001], zkVersion=7, replicas=[2000, 2001, 1001], offlineReplicas=[]) for partition my-topic-0 in response to UpdateMetadata request sent by controller 1001 epoch 2 with correlation id 0 (state.change.logger) [control-plane-kafka-request-handler-0]

I'm proposing to catch and ignore the TopicExistsException, which is also what the BTO does. If the topic was created by a third party before the UTO, the next reconciliation will try to revert any configuration drift.

Signed-off-by: Federico Valeri <fedevaleri@gmail.com>
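The "catch and ignore" approach described above could look roughly like the following. This is a minimal sketch, not the actual BatchingTopicController change: the class name, the `Outcome` enum, and `createIgnoringExists` are hypothetical, and the nested `TopicExistsException` is a local stand-in for `org.apache.kafka.common.errors.TopicExistsException` so that the example compiles without the Kafka client on the classpath.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Supplier;

public class CreateTopicSketch {

    /** Local stand-in (assumption) for org.apache.kafka.common.errors.TopicExistsException. */
    static class TopicExistsException extends RuntimeException {
        TopicExistsException(String message) {
            super(message);
        }
    }

    /** Hypothetical result type: did we create the topic, or was it already there? */
    enum Outcome { CREATED, ALREADY_EXISTS }

    /**
     * Runs an asynchronous create call and treats "topic already exists" as success.
     * Any other failure is rethrown unchanged so the caller can still handle or retry it.
     */
    static Outcome createIgnoringExists(Supplier<CompletableFuture<Void>> createCall) {
        try {
            // join() wraps the original failure in a CompletionException.
            createCall.get().join();
            return Outcome.CREATED;
        } catch (CompletionException e) {
            if (e.getCause() instanceof TopicExistsException) {
                // Brokers with stale metadata may hide a topic that already exists.
                return Outcome.ALREADY_EXISTS;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        // Simulate a create call that fails because the topic is already present.
        CompletableFuture<Void> alreadyThere = new CompletableFuture<>();
        alreadyThere.completeExceptionally(new TopicExistsException("Topic 'my-topic' already exists."));

        System.out.println(createIgnoringExists(() -> CompletableFuture.completedFuture(null)));
        System.out.println(createIgnoringExists(() -> alreadyThere));
    }
}
```

Returning a distinct outcome rather than swallowing the exception silently lets the caller log the case and leave any configuration drift to the next reconciliation, as the message above describes.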