
Incremental upgrade of the k8ssandra-operator #1035

Open
dnugmanov opened this issue Aug 7, 2023 · 0 comments
Labels
assess (Issues in the state 'assess'), enhancement (New feature or request)

Comments

@dnugmanov
Contributor

Hello,

Currently, I am facing an issue with the incremental upgrade of the k8ssandra-operator.

My use case is as follows:

  • The operator is running in cluster-scope and manages all entities across all namespaces (NS).
  • Namespaces and k8ssandraCluster entities are dynamically created through an external API based on client requests.
  • When a k8ssandraCluster is created via the external API, the operator reconciles the custom resource and launches a Cassandra cluster along with its associated components.
  • I need to upgrade both the k8ssandra-operator and cass-operator from version A to version B. However, since upgrades often trigger a rolling restart of all k8ssandraClusters (mostly due to podTemplateSpec changes), this carries significant risks.
  • I want to reduce these risks by deploying version B of the operator in only one (or several) namespaces. Meanwhile, all resources in the namespaces where version B is installed need to be excluded from reconciliation by the cluster-scoped version A operator.

Available options:

Option 1:

  • Deploy version B of the operator in namespace scope.
  • For version A of the operator, set the WATCH_NAMESPACE environment variable with a list of all existing namespaces, excluding the test namespace.
  • However, a problem arises because namespaces and k8ssandraClusters are created dynamically through the external API, so a k8ssandraCluster may be created in a new namespace that neither version A nor version B of the operator reconciles.
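The gap in Option 1 can be illustrated with a minimal Go sketch. It assumes WATCH_NAMESPACE holds a comma-separated list of namespace names (the usual operator convention); the namespace names are illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// watchedNamespaces parses a comma-separated WATCH_NAMESPACE value into a set.
func watchedNamespaces(env string) map[string]bool {
	set := make(map[string]bool)
	for _, ns := range strings.Split(env, ",") {
		if ns = strings.TrimSpace(ns); ns != "" {
			set[ns] = true
		}
	}
	return set
}

func main() {
	// Version A watches a static allow-list; version B watches only "test".
	versionA := watchedNamespaces("prod-1,prod-2")
	versionB := watchedNamespaces("test")

	// A namespace created later through the external API...
	newNS := "prod-3"

	// ...is picked up by neither operator until version A's
	// WATCH_NAMESPACE list is manually updated and the pod restarted.
	fmt.Println(versionA[newNS] || versionB[newNS]) // false
}
```

Because both lists are static, every dynamically created namespace falls through this check until an operator restart, which is exactly the race described above.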

Option 2:

  • ???

Desired implementation of this use case:

  • Deploy the namespace-scoped version B operator in the test namespace.
  • For the cluster-scoped version A operator, add an exception to exclude the reconciliation of the test namespace.
  • All newly created namespaces and k8ssandraClusters should be handled by the version A operator (cluster-scoped).
  • To implement this approach, a predicate function is needed that excludes the specified namespaces from reconciliation, configured through an environment variable such as K8SSANDRA_SKIP_NS.
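The proposed skip logic can be sketched as follows. This is a minimal, stdlib-only Go sketch: K8SSANDRA_SKIP_NS is the variable name proposed above, and shouldReconcile stands in for the predicate body; the actual wiring into the operator's watches (e.g. via controller-runtime's predicate package) is omitted:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// skipSet parses the proposed K8SSANDRA_SKIP_NS value (a comma-separated
// list of namespace names) into a lookup set.
func skipSet(env string) map[string]bool {
	set := make(map[string]bool)
	for _, ns := range strings.Split(env, ",") {
		if ns = strings.TrimSpace(ns); ns != "" {
			set[ns] = true
		}
	}
	return set
}

// shouldReconcile is the core of the proposed predicate: return false for
// any object living in an excluded namespace, so the cluster-scoped
// version A operator ignores it.
func shouldReconcile(objNamespace string, skip map[string]bool) bool {
	return !skip[objNamespace]
}

func main() {
	os.Setenv("K8SSANDRA_SKIP_NS", "test")
	skip := skipSet(os.Getenv("K8SSANDRA_SKIP_NS"))

	fmt.Println(shouldReconcile("test", skip))   // false: left to version B
	fmt.Println(shouldReconcile("prod-1", skip)) // true: version A keeps it
}
```

In the real operator this would likely be wrapped in a controller-runtime predicate (for instance predicate.NewPredicateFuncs taking the object's namespace) and attached to each controller's watches, so events from skipped namespaces are filtered before reconciliation.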

How would you feel about me submitting a PR for an optional environment variable which, when set, adds a predicate implementing the functionality described above? Alternatively, could you suggest other ways to achieve the desired behavior without additional code changes?

This way, we can achieve a safe operator upgrade approach on large production clusters.

@dnugmanov dnugmanov added the enhancement New feature or request label Aug 7, 2023
@adejanovski adejanovski added ready-for-review Issues in the state 'ready-for-review' assess Issues in the state 'assess' and removed ready-for-review Issues in the state 'ready-for-review' labels Aug 28, 2023
Projects
Status: Assess/Investigate
Development


2 participants