Issue when creating new topics with TopicOperator #5943
Comments
I suspect this is the same issue as #5691. On each reconciliation, the topics are fetched from Kafka to check for updates. When there is a large number of topics, a lot of requests are sent to the Kafka broker, and it basically refuses any new ones. The |
There is a difference between 40 topics and 3200 topics. So which one is it you are using? |
We have a total of 3200+ topics, and the issue occurs when we create 40 new topics. Of those 40, 5-10 topics are not created. If I later delete the KafkaTopic custom resources for those 5-10 topics and create them again, the creation works. |
I guess that could be as suggested by @sknot-rh => you might be reaching the limits of the system. Increasing the resources for the Kafka cluster or for the Topic Operator might help. But it is not an exact science ... so you would need to give it a try. |
Ok, let me try increasing that and share the results. |
Describe the bug
Some of the topics are not getting created by the Topic Operator. I am creating around 40 topics (16 partitions, 3 replicas) on a Kafka cluster of 3 brokers.
The issue is intermittent: sometimes all the topics get created, but other times some of them (5-10 topics) are not. The Kubernetes custom resource (KafkaTopic) is there, but the actual Kafka topic is not available.
To Reproduce
Steps to reproduce the behavior:
kubectl get kt -n kafka --context=testing -o json | jq -r '[.items[] |select(.status.conditions[0].type != "Ready")| .metadata.name]'
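The jq filter above looks only at the first entry of `.status.conditions`. As a hedged alternative, here is a minimal Python sketch that treats a topic as ready only when some condition has type `Ready` and status `True`, and that also reports topics with no status at all. The function name `not_ready_topics` and the sample data are hypothetical; the input is assumed to have the shape printed by `kubectl get kt -o json`.

```python
def not_ready_topics(kt_list):
    """Return the names of KafkaTopic items that lack a Ready=True condition."""
    names = []
    for item in kt_list.get("items", []):
        # A resource the operator has not processed yet may have no status.
        conditions = (item.get("status") or {}).get("conditions") or []
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            names.append(item["metadata"]["name"])
    return names


# Hypothetical sample mimicking the structure of `kubectl get kt -o json`.
sample = {
    "items": [
        {"metadata": {"name": "topic-a"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "topic-b"},
         "status": {"conditions": [{"type": "NotReady", "status": "True"}]}},
        {"metadata": {"name": "topic-c"}},  # no status written yet
    ]
}

print(not_ready_topics(sample))  # ['topic-b', 'topic-c']
```

To run this against a real cluster, one option is to replace `sample` with `json.load(sys.stdin)` and pipe the `kubectl get kt -n kafka -o json` output into the script.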
Expected behavior
All the topics should be created without any issue.
Environment (please complete the following information):
logs
I am not getting any substantial information from the Topic Operator logs either. This is the status of a topic which is in the NotReady state:
Additional context
I have a couple of questions here:
1- We have around 80 test customers with 40 topics each (16 partitions, 3 replicas), which makes around 3,200 topics and 51k partitions, or roughly 150k partition replicas, in the cluster. Can a Kafka cluster of 3 brokers handle that?
2- How can we start multiple instances of the Entity Operator so that the topic management load is distributed and we don't end up in this kind of race condition?
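The sizing in question 1 can be checked with a short calculation. This is only a back-of-the-envelope sketch using the numbers quoted in this issue (80 customers, 40 topics each, 16 partitions, replication factor 3, 3 brokers); the variable names are illustrative, and it assumes replicas are spread evenly across brokers. It suggests that the figure of about 150k corresponds to partition replicas, while the number of partitions is about 51k.

```python
customers = 80
topics_per_customer = 40
partitions_per_topic = 16
replication_factor = 3
brokers = 3

topics = customers * topics_per_customer        # 3200 topics
partitions = topics * partitions_per_topic      # 51200 partitions
replicas = partitions * replication_factor      # 153600 partition replicas

# With replication factor 3 on 3 brokers, every broker holds one replica
# of every partition (assuming an even spread).
replicas_per_broker = replicas // brokers       # 51200 replicas per broker

print(f"topics={topics} partitions={partitions} "
      f"replicas={replicas} replicas_per_broker={replicas_per_broker}")
```

Whether three brokers can carry that many replicas depends on the hardware and configuration, which this arithmetic does not answer.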
This does seem related to #1775. Let me know if more details are required.