Installing a Redpanda cluster with the same name as the Operator breaks the Operator #105

Open
voutilad opened this issue Apr 1, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@voutilad

voutilad commented Apr 1, 2024

Found this using Terraform, but that seems unrelated.

1. Install an Operator instance using Helm with the name "redpanda" in a namespace of your choosing.
2. Create a Redpanda CR with the name "redpanda" in the same namespace and apply it.

The Operator will trash its own Helm release, removing critical resources like Roles.

JIRA Link: K8S-207

@voutilad
Author

voutilad commented Apr 1, 2024

I've confirmed things still break without Terraform and the helm_release resource from the helm provider, but in that case things seem to just never deploy. Once the Operator is installed with the name redpanda, it's unable to deploy any clusters, regardless of the name of the Redpanda CR.

@alejandroEsc
Contributor

This is probably a naming collision of objects on release. There may be some things we can do to resolve this, like using the annotation:
helm.sh/resource-policy: keep
However, I do not recommend this.

Another approach is to prefix the objects that the operator needs with the release name, something like {releasename}-operator-role.yaml, but you will get funny names like operator-operator.yaml.

Ultimately if this is happening maybe we should just declare this a known issue and simply not allow an operator release name of "redpanda" in general.

@chrisseto
Contributor

We could have the operator check whether there's a Helm-managed release with a conflicting name by looking for the secrets that Helm uses for managing releases before creating the HelmRelease CR. Kinda weird that Flux doesn't do something like that, but it's not worth digging into the internals of Flux.
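
A minimal sketch of that check, written in Python purely for illustration. It assumes Helm 3's default Secret storage driver, which names each release revision `sh.helm.release.v1.<release>.v<revision>`. The function names are hypothetical, and the list of Secret names is assumed to come from listing Secrets in the CR's namespace.

```python
import re

# Helm 3's default storage driver names each release revision Secret
# "sh.helm.release.v1.<release>.v<revision>".
_HELM_SECRET = re.compile(
    r"^sh\.helm\.release\.v1\.(?P<release>.+)\.v(?P<revision>\d+)$"
)


def helm_release_names(secret_names):
    """Return the set of Helm release names encoded in the given Secret names."""
    releases = set()
    for name in secret_names:
        match = _HELM_SECRET.match(name)
        if match:
            releases.add(match.group("release"))
    return releases


def has_conflicting_release(cr_name, secret_names):
    """True if a Helm release with the same name as the Redpanda CR already exists."""
    return cr_name in helm_release_names(secret_names)
```

A check like this only makes sense before the first HelmRelease for a cluster is created; afterwards the cluster's own release would match too, so the operator would need some way to tell its own releases apart.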

@chrisseto
Contributor

@voutilad what did you expect to happen OOC? I would expect something to go wrong but I write the software. If you expected that the operator would "take over" the existing release, we might want to bump this up in priority as I expect most users would share your point of view and not ours.

If memory serves there's a doc that talks about migrating from helm to the operator. Does it recommend doing this?

@voutilad
Author

voutilad commented Apr 3, 2024

I’d have expected it to fail to deploy the Redpanda cluster and keep the Operator in a functioning state or generate a release name that doesn’t conflict and allows me to name my Redpanda CR how I want.
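
To illustrate the second option, here is a rough Python sketch of deriving a non-conflicting release name. Everything in it is hypothetical (the function name, the use of the CR's UID, the hash suffix) and is not how the Operator behaves today. The one fixed constraint is Helm's 53-character limit on release names.

```python
import hashlib

MAX_RELEASE_NAME_LEN = 53  # Helm rejects release names longer than 53 characters


def pick_release_name(cr_name, cr_uid, taken):
    """Return cr_name if it is free and short enough, else a suffixed variant.

    The suffix is derived from cr_uid so the same CR always maps to the
    same release name across reconciles.
    """
    if cr_name not in taken and len(cr_name) <= MAX_RELEASE_NAME_LEN:
        return cr_name
    suffix = "-" + hashlib.sha256(cr_uid.encode()).hexdigest()[:8]
    # Truncate the CR name so the result still fits Helm's length limit.
    return cr_name[: MAX_RELEASE_NAME_LEN - len(suffix)] + suffix
```

This sketch does not re-check whether the suffixed name is itself taken; a real implementation would have to handle that case.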

The way I view this is: how am I (the user) supposed to know I’m actually causing Helm release name collisions? From my point of view, I’m giving a name to a Redpanda custom resource.

I understand how things work underneath, but that’s only because I’ve used the Operator enough and read all the docs.

I can also envision a scenario in which someone doesn’t actually know the Helm release name for the Operator install.

@chrisseto chrisseto added the bug Something isn't working label Apr 10, 2024
@chrisseto
Contributor

Makes sense! I propose that we make the operator fail to deploy the cluster if there's an existing helm release with a conflicting name as that's the most straightforward and least fragile solution. We'll still need to double check the migration docs before starting work on this ticket.
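
As a rough illustration of that fail-fast behaviour (in Python, with hypothetical names throughout), the guard would run before any cluster resources are created and would leave the existing release untouched:

```python
class ReleaseNameConflict(Exception):
    """Raised when a Redpanda CR would reuse an existing Helm release name."""


def guard_cluster_deploy(cr_name, existing_releases):
    """Refuse to deploy if cr_name matches a Helm release already in the namespace.

    existing_releases is assumed to be the set of Helm release names
    currently present in the CR's namespace.
    """
    if cr_name in existing_releases:
        raise ReleaseNameConflict(
            f"Redpanda CR {cr_name!r} conflicts with an existing Helm release "
            "of the same name in this namespace; rename the CR or the release"
        )
```

In an actual operator the failure would more likely be surfaced as a status condition or event on the Redpanda CR than as an exception, but the ordering is the point: detect the collision first, then refuse to proceed.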


3 participants