NPE when incorrect topologySpreadConstraints are applied on a fresh cluster #7007
Comments
Can you please share the full Kafka CR to reproduce this? Without the full operator, it is also quite hard to understand what exactly happened. Are you saying that by using this
Yes.
Debug-level logs leading up to the NPE:
Kafka CR:
Note: Some sensitive info has been redacted from the CR/logs.
Great, thanks. I will have a look.
This still seems to be a problem in the current main when StatefulSets are used. Just the stacktrace changed a bit:
When StrimziPodSets are used, this particular Kafka CR also ends up with other errors. But even then, other situations might trigger this. When the pod does not exist, we for sure don't need to reconfigure or restart it. So we should probably wait for readiness? When the pod does not exist, it will fail on readiness sooner or later anyway. Another alternative might be to just kill the reconciliation right away and throw some error? Also, before it gets to the rolling part, it tries for 5 minutes to connect and collect the configuration, which also seems unnecessary.
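The two options above (wait for readiness, or kill the reconciliation right away) could be read as a guard that runs before any restart or reconfiguration decision. The following is only a minimal Java sketch of that idea, not the operator's actual code: `MissingPodGuard`, `Pod`, `Action`, and the `failFast` flag are hypothetical names made up for illustration.

```java
/**
 * Hypothetical sketch, not Strimzi code: decide what to do with a pod
 * before trying to reconfigure or restart it, so that a missing pod
 * never reaches code that would dereference it.
 */
final class MissingPodGuard {

    /** Next step for a given pod (hypothetical enum). */
    enum Action {
        WAIT_FOR_READINESS,
        FAIL_RECONCILIATION,
        CHECK_RESTART_OR_RECONFIGURE
    }

    /** Minimal stand-in for a Kubernetes Pod (hypothetical type). */
    static final class Pod {
        final String name;

        Pod(String name) {
            this.name = name;
        }
    }

    /**
     * @param pod      the pod as returned by a lookup, or null if it does not exist
     * @param failFast assumed switch: true to abort the reconciliation right away,
     *                 false to fall through to the readiness wait
     */
    static Action decide(Pod pod, boolean failFast) {
        if (pod == null) {
            // A pod that does not exist cannot need a restart or a reconfiguration.
            return failFast ? Action.FAIL_RECONCILIATION : Action.WAIT_FOR_READINESS;
        }
        return Action.CHECK_RESTART_OR_RECONFIGURE;
    }

    public static void main(String[] args) {
        // The pod name below is an illustrative placeholder.
        System.out.println(decide(null, false));
        System.out.println(decide(null, true));
        System.out.println(decide(new Pod("my-cluster-kafka-0"), false));
    }
}
```

Either branch turns the missing pod into an explicit outcome instead of an NPE. Which of the two is preferable is the open question raised in the comment above.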
Might be best to fix after #6663 is merged to avoid conflicts. The NPEs are not nice, but technically, this would end in an error anyway, just a different one. So I don't think it matters that much whether it is fixed now or in a few days.
Describe the bug
I was trying out topologySpreadConstraints on our testing environment. I added the following configuration for broker:
The second item in the constraints array was a mistake and was supposed to be applied to zookeeper. Nonetheless, this is what I observed:
NPE stacktrace:
This was no longer an issue once the zookeeper topology constraint was removed from the broker config, but I think we should handle this error better instead of throwing an NPE.