[Bug]: Wrong error message on resource conflicts due to same topicName #9270

Closed
fvaleri opened this issue Oct 19, 2023 · 5 comments

@fvaleri
Contributor

fvaleri commented Oct 19, 2023

Bug Description

This is related to the Unidirectional Topic Operator (UTO).

Two KafkaTopic resources with the same topicName may be created at the exact same creationTime. In that case, only one of them is reconciled, while the other reports a ResourceConflict in its status (single-resource principle: 1 KafkaTopic -> 1 topic -> 1 cluster).

The problem is that, when we build the condition message for the failing KafkaTopic status, we use creationTime to determine the name of the reconciled topic, so we may report the wrong name (i.e. the KubeRef of the failing KafkaTopic itself).

https://github.com/strimzi/strimzi-kafka-operator/blob/main/topic-operator/src/main/java/io/strimzi/operator/topic/v2/BatchingTopicController.java#L209
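For illustration only, here is a minimal sketch of why selecting the managing resource by creationTime alone is ambiguous; the Ref record and oldest method are stand-ins, not the repository's actual code:

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;

// Stand-in for a KafkaTopic reference with its creation timestamp.
record Ref(String namespace, String name, Instant creationTimestamp) { }

class ConflictMessageSketch {
    static Ref oldest(List<Ref> managingSameTopic) {
        // With identical timestamps, min() keeps whichever element happens
        // to come first in the list, so the reported "managing" resource
        // depends on encounter order rather than on anything stable.
        return managingSameTopic.stream()
                .min(Comparator.comparing(Ref::creationTimestamp))
                .orElseThrow();
    }
}
```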

Example:

$ kubectl get kt -o yaml -n test 
apiVersion: v1
items:
- apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaTopic
  metadata:
    creationTimestamp: "2023-10-19T14:46:47Z"
    generation: 1
    labels:
      bar: BAR
      foo: FOO
    name: bar
    namespace: test
    resourceVersion: "56920"
    uid: 89947e54-f9b8-4bf7-a287-1e1b3427fe2b
  spec:
    config:
      compression.type: snappy
    partitions: 1
    replicas: 1
    topicName: foo
  status:
    conditions:
    - lastTransitionTime: "2023-10-19T14:46:47.311487120Z"
      message: Managed by Ref{namespace='test', name='bar'}
      reason: ResourceConflict
      status: "False"
      type: Ready
    observedGeneration: 1
    topicName: foo
- apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaTopic
  metadata:
    creationTimestamp: "2023-10-19T14:46:47Z"
    finalizers:
    - strimzi.io/topic-operator
    generation: 1
    labels:
      bar: BAR
      foo: FOO
    name: foo
    namespace: test
    resourceVersion: "56918"
    uid: f7e7bae1-3bd6-4afa-aa2e-45349e926190
  spec:
    partitions: 1
    replicas: 1
  status:
    conditions:
    - lastTransitionTime: "2023-10-19T14:46:47.230152899Z"
      status: "True"
      type: Ready
    observedGeneration: 1
    topicName: foo
kind: List
metadata:
  resourceVersion: ""

Steps to reproduce

Run the flaky test TopicControllerIT.shouldFailIfNumPartitionsDivergedWithConfigChange multiple times.

Expected behavior

No response

Strimzi version

0.38.0

Kubernetes version

1.27.3

Installation method

No response

Infrastructure

No response

Configuration files and logs

No response

Additional context

No response

@fvaleri
Contributor Author

fvaleri commented Oct 19, 2023

@tombentley fyi

@tombentley
Member

Can we sort first by creationTimestamp and then by (e.g.) uid and change the logic around lines 214 to 230?
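A minimal sketch of such a total ordering, assuming a stand-in TopicRef record rather than the operator's real types:

```java
import java.time.Instant;
import java.util.Comparator;

// Stand-in for the bits of KafkaTopic metadata the comparison needs.
record TopicRef(Instant creationTimestamp, String uid) { }

class TopicOrder {
    // creationTimestamp first, then metadata.uid as an immutable tie-breaker,
    // so two resources created in the same instant still sort deterministically.
    static final Comparator<TopicRef> BY_AGE_THEN_UID =
            Comparator.comparing(TopicRef::creationTimestamp)
                      .thenComparing(TopicRef::uid);
}
```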

@ppatierno
Member

Discussed on the Community call on 02.11.2023: @fvaleri is already working on it.

@tombentley
Member

I've looked at this and #9303 in more detail. The current algorithm is certainly wrong. The question is really how best to fix it. Fundamentally the problem revolves around the fact that Kubernetes doesn't provide us with an efficient way to query the KafkaTopics to find those trying to manage the same topic in Kafka. It's for this reason that we maintain the BatchingTopicController#topics map.

That map isn't being maintained correctly. In particular, it is added to only as a side effect of a reconciliation. This means that when the UTO starts up (when there are pre-existing KafkaTopics), it has partial information for a time. But the UTO will still act on this partial information as if it were complete.

I think the algorithm we want would prioritise the existing KafkaTopic which "owns" the topic in Kafka, even if it's not as old as some other KafkaTopic for the same topic. In other words, consistency of ownership is more important than age. It's only when there is no existing owner that we should consider the creation time. And when we do use the creation time it's only really best effort (i.e. we shouldn't assume that the modifications to #topics are driven by observations where creation time is monotonic). It looks something like this:

  1. Find any existing owner, defined as a ready KafkaTopic resource managing the topic in Kafka.
  2. If there is an existing owner:
    1. If this resource is the existing owner, then return right(true) (i.e. reconcile it).
    2. If this resource is not the existing owner, then return left(ResourceConflict) (i.e. mark it as ready=false, reason=ResourceConflict).
  3. Otherwise there is no existing owner. Find all the resources managing this topic and order them first by increasing creation date and then by something that cannot vary over time (metadata.uid will do).
    1. If the first resource in this list is this resource, then return right(true).
    2. Otherwise return left(ResourceConflict).
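A compact sketch of this decision logic, under assumed helper types (CandidateRef and the boolean return are hypothetical stand-ins for the right(true)/left(ResourceConflict) results above):

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Stand-in for a KafkaTopic managing a given Kafka topic; ready marks the
// current owner, i.e. a resource already reconciled with Ready=True.
record CandidateRef(String uid, Instant creationTimestamp, boolean ready) { }

class OwnershipSketch {
    /** true = reconcile this resource; false = ready=false, reason=ResourceConflict.
     *  Assumes managingSameTopic includes self, so the list is never empty. */
    static boolean shouldReconcile(CandidateRef self, List<CandidateRef> managingSameTopic) {
        // 1. Find any existing owner (unique by construction; see property 1 below).
        Optional<CandidateRef> owner = managingSameTopic.stream()
                .filter(CandidateRef::ready)
                .findFirst();
        if (owner.isPresent()) {
            // 2. Consistency of ownership beats age.
            return owner.get().uid().equals(self.uid());
        }
        // 3. No owner yet: oldest wins, with uid as a stable tie-breaker.
        CandidateRef first = managingSameTopic.stream()
                .min(Comparator.comparing(CandidateRef::creationTimestamp)
                               .thenComparing(CandidateRef::uid))
                .orElseThrow();
        return first.uid().equals(self.uid());
    }
}
```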

Important properties of this algorithm are:

  1. Only a single KafkaTopic will be reconciled with ready=true (i.e. the owner is unique)
  2. The owner cannot change, even if the operator doesn't encounter KafkaTopic resources in order of increasing creation time, and even when several resources share the oldest creation time.
  3. The algorithm works consistently no matter how KafkaTopics get batched (a consequence of 2).

One necessary precondition of this algorithm is that we build BatchingTopicController#topics before processing any events, so that when we do process events we're using complete information. Once built, it's enough to maintain the field as a side effect of reconciliations.

This is of sufficient complexity that it's probably worth pulling the management of #topics into a separate class (OwnerIndex?) that can easily be understood in isolation and unit tested independently.
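One possible shape for such a class, with hypothetical names throughout (OwnerIndex is only suggested above, and the real #topics field may differ):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical OwnerIndex: a thread-safe map from Kafka topic name to the
// uids of the KafkaTopic resources known to manage it. It would be built
// fully before any events are processed, then maintained as a side effect
// of reconciliations.
class OwnerIndex {
    private final Map<String, Set<String>> managersByTopic = new ConcurrentHashMap<>();

    /** Record that the resource with this uid manages the given Kafka topic. */
    void add(String topicName, String resourceUid) {
        managersByTopic
                .computeIfAbsent(topicName, k -> ConcurrentHashMap.newKeySet())
                .add(resourceUid);
    }

    /** Forget a resource, e.g. when its KafkaTopic is deleted or unmanaged. */
    void remove(String topicName, String resourceUid) {
        managersByTopic.computeIfPresent(topicName, (k, uids) -> {
            uids.remove(resourceUid);
            return uids.isEmpty() ? null : uids;  // drop empty entries
        });
    }

    /** All resources currently known to manage this topic. */
    Set<String> managersOf(String topicName) {
        return managersByTopic.getOrDefault(topicName, Set.of());
    }
}
```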

Thoughts, @fvaleri?

@fvaleri
Contributor Author

fvaleri commented Nov 24, 2023

TBH, I don't see any risk of ownership change. The problem is limited to the error message on the failed KafkaTopic CR, which sometimes reports the wrong name for the KafkaTopic that is managing the Kafka topic. This is a corner case that can happen only when there are two KafkaTopics with the same topicName and creationTime processed in the same batch.

As an example, let's say we have t1 and t2 with the same topicName that are created at about the same time (t1.topicName == t2.topicName, t1.creationTime == t2.creationTime). If the UTO processes them in the same batch, it can happen that t1 is reconciled and t2 fails with an error saying the topic is already managed by t2, when it should say t1. This behavior is nondeterministic, caused by the unstable ordering in validateSingleManagingResource.

My last commit in #9303 addresses exactly this corner case, introducing the concept of processing order. Do you see anything wrong with that logic?
