User and Permission stuck in Terminating #324
Comments
Hey there! Thank you for submitting this issue and PR #325. To add a bit more context: the controller does not block owner deletion since e09d8df because we found incompatibilities with Openshift when it was set to I like the approach proposed 👍 I'll have a look at the PR and leave some later today.
hey @LucasBoisserie I also met this issue, I thought I made a mistake. Thanks for the PR 👍
Fixed by PR #325
Describe the bug
We use ArgoCD to deploy our code in Kubernetes. We create and delete multiple review apps each hour; each one includes exchanges, queues, permissions and users.
We notice that User and Permission resources sometimes get stuck in the Terminating phase.
In the operator log, we found the following error message:
failed to retrieve user credentials secret from status
To Reproduce
Steps to reproduce the behavior:
We notice that step 3 sometimes also stays stuck in the Terminating phase.
Expected behavior
All components are removed
Version and environment information
Additional context
After investigating the logs and the source code of the operator, we found two issues:
Permission with userReference stuck in Terminating if User is removed first
If the User is removed before the Permission, the Permission can't be removed because the operator can't find the RabbitMQ username stored inside the User's credentials secret (https://github.com/rabbitmq/messaging-topology-operator/blob/main/controllers/permission_controller.go#L63).
It returns an error and requeues the object.
As a workaround, we added a sync-wave to force ArgoCD to remove the Permission before the User. However, if someone manually deletes a User before its Permission, we will have the problem again.
We can make a PR to fix the test inside the permission_controller.
User deletion needs the user credentials secret
We deploy and destroy multiple review apps each hour, and we noticed that User deletion is sometimes triggered as expected but gets stuck in Terminating.
We looked at the user credentials secret and found that ownerReference.blockOwnerDeletion is set to false (https://github.com/rabbitmq/messaging-topology-operator/blob/main/controllers/user_controller.go#L169).
ArgoCD uses foreground deletion by default, and sometimes the User credentials secret is removed before the User. The User CR doesn't contain the RabbitMQ username because it is generated inside the secret.
The result is the same as in #277, where the secret is removed before the User.
We can add a fix for this to the same PR.
We propose adding a field to the User status that holds the real username, so that deletion does not depend on the secret.
Do you have a better recommendation?