
Architectural design recommendation and documentation #485

Closed
jdonenine opened this issue Mar 3, 2021 · 5 comments

Comments

@jdonenine
Contributor

As the project has evolved and attempted to solve increasingly complex deployment management challenges, we have found limitations with our Helm-focused approach. Some thoughts on the challenges we're facing have been documented in https://docs.google.com/document/d/1SYidr3kFLyRY4ttnqBGNO_peDol7nRfcOPtnCcdio6w/edit?usp=sharing, and this issue intends to continue that conversation and push toward a decision point on the long-term implementation approach the project will take.

@jsanda
Contributor

jsanda commented Mar 3, 2021

That Google Doc is not publicly visible, so I will port the relevant parts to this ticket.

@jsanda
Contributor

jsanda commented Mar 10, 2021

Where Helm falls short

Leading up to the 1.0 release we went through a chart refactoring in which we implemented more of an umbrella chart design. Originally we had two charts - k8ssandra and k8ssandra-cluster. We now have the k8ssandra parent chart and several child charts on which it depends. The child charts can be deployed independently of the k8ssandra parent chart.

The motivations for this refactoring include:

  • It simplifies how we enable/disable a component (see the sketch after this list).
    • When a component is part of the k8ssandra chart, we need to wrap each of its templates in if blocks to check whether the component is enabled. Those checks go away when the component is moved into its own chart.
  • It is easier and more intuitive to install a component on its own from a separate chart than to install k8ssandra and disable everything you do not want.
  • Most importantly, moving a component into its own chart simplifies naming.
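
To make the first point concrete, here is a minimal sketch of the kind of guard that has to be repeated when a component lives inside the k8ssandra chart itself. The resource and value names are illustrative rather than taken from the actual charts.

# Hypothetical template bundled directly in the k8ssandra chart. Every
# template belonging to the component needs the same guard; moving the
# component into its own chart makes these checks unnecessary.
{{- if .Values.reaper.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}-reaper-service
spec:
  selector:
    app: reaper
  ports:
    - port: 8080
{{- end }}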

This refactoring, along with some other work, has brought to light some areas where Helm falls short.

Programming language support

Helm and its template engine do not amount to a full-blown programming language (i.e., the template language is not Turing complete). For situations that involve a lot of complex logic, having a full language (e.g., Go, Java, Python) at your disposal not only makes things easier but also makes them less error prone.

Tool Support

This is an extension of the programming language point. IDEs provide great support for navigating through a codebase. Code navigation is much more limited with Helm template code.

We have a good number of unit tests, but I am not sure if we will be able to generate test coverage reports for them.

We lack debugger support.

Extensibility

Helm has a large library of built-in template functions. Obviously those functions do not and cannot cover every use case. I spent time researching how to build custom functions. I do not think it is possible. If it is, it is not documented and I have not been able to find any examples.

Keep it DRY

I actually like using and developing with Helm a lot; so, it pains me to say that Helm does a poor job of making it possible to keep chart code DRY. I will illustrate with a couple of examples.

Parent charts can override child chart properties and push them down to child charts, but child charts cannot access parent chart values. If there is a set of properties we want the parent chart to push down to a child chart, those properties must be defined in the child chart as well. Consider Cassandra authentication/authorization as an example. We declare and configure Cassandra auth in the k8ssandra chart. In particular, we declare whether it is enabled. That configuration has to be duplicated in any child chart that needs to support auth. The duplication cannot be avoided even when using global properties.

# k8ssandra/values.yaml
#
# We want to expose this setting to child charts. The reaper chart for
# example will need to know whether or not to configure Reaper to use
# credentials for CQL connections. This could apply to other, future
# charts (e.g., nosqlbench) as well.
cassandra:
  auth:
    enabled: true

#############
# reaper    #
#############
# Properties in this section configure the reaper chart
#
reaper:
  cassandra:
    auth:
      enabled: true

############
# stargate #
############
# Properties in this section configure the stargate chart
stargate:
  cassandra:
    auth:
      enabled: true

This is a scaled-down example that demonstrates how properties are pushed down from the parent chart to child charts.

Here is an analogy to help explain how this works. Let's say that the charts are Java objects. The parent chart/object needs a common interface with which to communicate with the child charts/objects. By declaring the auth properties, the child charts implement an interface that the parent chart knows about.

Unfortunately you cannot use variables or templates in values.yaml. This means that if we want to enable/disable Cassandra we need to update the enabled flag for each chart. Suffice it to say, this would be very error prone.

There has been some discussion around using global variables to work around this issue. I investigated it. Global variables solve part of the problem while introducing others. They potentially eliminate the need to update the enabled flag across charts, but making this work requires additional function calls in the child chart templates, and those calls have to be duplicated in each child chart template where the global variables are needed.
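
As a rough sketch of what that workaround looks like (the value and file names here are illustrative), the global value is declared once in the parent chart, yet every child chart template that needs it still repeats the same lookup:

# k8ssandra/values.yaml
global:
  cassandraAuthEnabled: true

# reaper/templates/deployment.yaml
{{- if .Values.global.cassandraAuthEnabled }}
  # ... mount the CQL credentials secret ...
{{- end }}

# stargate/templates/deployment.yaml - the same check, duplicated
{{- if .Values.global.cassandraAuthEnabled }}
  # ... mount the CQL credentials secret ...
{{- end }}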

This issue manifests itself with Cassandra topology as well. Reaper and Stargate need to know about the cluster size, the cluster name, and racks. It will likely get even worse when we add multi-DC support. We will need to introduce control flow logic to iterate over the datacenters specified in values.yaml. The control flow logic will have to be duplicated across templates (where it is needed).
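
To illustrate (the values layout below is hypothetical), any template that needs per-datacenter information would have to carry its own copy of the loop:

# Hypothetical values layout for multiple datacenters
cassandra:
  datacenters:
    - name: dc1
      size: 3
    - name: dc2
      size: 3

# This loop has to be repeated in every template that needs the
# per-datacenter values (e.g., Reaper's seeds, Stargate's contact points).
{{- range .Values.cassandra.datacenters }}
- datacenterName: {{ .name }}
  size: {{ .size }}
{{- end }}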

Variable scoping

Helm allows you to define variables within a template. This can be particularly useful to encapsulate and eliminate duplication of complex logic expressions. Unfortunately variables are scoped to the template in which they are defined. If we need to reuse a complex logic expression in another template, we have to violate the DRY principle.
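
For example (the expression below is illustrative, not lifted from the actual charts), a template-local variable like this cannot be referenced from any other template, so the expression has to be re-declared wherever it is needed:

# stargate/templates/deployment.yaml (hypothetical)
{{- $useCredentials := and .Values.cassandra.auth.enabled (not .Values.cassandra.auth.existingSecret) }}
# $useCredentials is only visible in this template; stargate/templates/service.yaml
# would have to declare the same expression again.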

CRDs

Helm 3 takes a hands-off approach with CRDs. If a CRD is not installed, Helm will install it; otherwise, Helm ignores it. This means that Helm provides no support for managing CRD changes, which is why we need a pre-upgrade hook (see here for details).

Hooks

Helm provides chart hooks to perform actions during the lifecycle of a release (see here). We already have a couple of hooks and could certainly wind up with more. Hooks themselves present some challenges though.

Depending on the hook's deletion policy, you may not be able to access its logs when it fails, because Helm will delete it. This is how the cleaner hook is configured. It is the behavior we want in general, so as to avoid extra resources lying around after uninstalling a release.

Things can get complicated if you have multiple hooks running in the same phase. They are run in a particular order, but the hooks themselves need to be aware of that ordering if there is any dependency between them.
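
For reference, here is a hedged sketch of a hook Job showing the annotations involved; the Job, its name, and its image are illustrative, not the actual cleaner hook.

apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-example-hook
  annotations:
    "helm.sh/hook": pre-upgrade
    # Lower weights run first within a phase, but the hooks themselves still
    # need to account for any dependencies between them.
    "helm.sh/hook-weight": "0"
    # hook-succeeded deletes the Job after it succeeds; adding hook-failed
    # would also delete it on failure, which is when the logs are lost.
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: example-hook
          image: example/hook:latest
          command: ["/bin/sh", "-c", "echo running pre-upgrade tasks"]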

The path forward

From a pure engineering perspective, the issues outlined above are significant enough to merit asking the following question:

Should we continue to rely so heavily on Helm? The answer is no. Just to be clear, I am not suggesting that we abandon Helm.

We need to build an operator for K8ssandra. The operator might consist of multiple CRDs and multiple controllers.

K8ssandra would still be packaged and installed via Helm, and it would still allow you to enable/disable each of the components. Rather than relying on Helm to configure and create many resources as we currently do, the Helm charts would instead focus on configuring and deploying the various operators. The custom controllers would take over the responsibility of creating and configuring the resources that k8ssandra creates and deploys today.

K8ssandra CRD

A K8ssandra custom resource might look something like this:

apiVersion: k8ssandra.k8ssandra.io/v1alpha1
kind: K8ssandra
metadata:
  name: test
spec:
  cassandra:
    clusterName: "test"
    version: "4.0.0" 
    auth:
      enabled: true
    datacenters:
    - name: dc1
      size: 3
    - name: dc2
      size: 3
    - name: dc3
      size: 3
  reaper:
    enabled: true
  stargate:
    enabled: true
  medusa:
    enabled: true
  prometheus:
    enabled: true
  grafana:
    enabled: true

You may have noticed that the structure looks very similar to the values.yaml file of the k8ssandra chart. That is no coincidence.

K8ssandra custom controllers

The k8ssandra operator might consist of controllers for each of the following:

  • Cassandra
  • Stargate
  • Reaper
  • Prometheus
  • Grafana
  • Medusa

Each controller would, for the most part, delegate work to the respective operator (note that a Stargate operator does not exist today).
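
For instance, the Cassandra controller would presumably hand the actual provisioning off to cass-operator by creating a CassandraDatacenter per entry in the K8ssandra spec. A rough sketch, using the values from the example resource above:

apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: test
  serverType: cassandra
  serverVersion: "4.0.0"
  size: 3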

If most of the work is delegated to the respective operators, couldn't this all be done with a single controller? Yes, it could. Multiple controllers, however, are more modular and promote loose coupling, which will pay dividends as the code base grows and matures.

Additional Capabilities

Unified status

With the current implementation there is no single object whose status you can check to see if everything is in a healthy state.

With an operator and CRD we can provide much better status reporting. For example, there could be a single flag that indicates if the whole stack is in the ready state.
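
As a hedged sketch (the field names are illustrative, not a committed API), the status of a K8ssandra resource could roll up per-component readiness into a single condition:

status:
  conditions:
    - type: Ready
      status: "True"
  cassandra:
    ready: true
  stargate:
    ready: true
  reaper:
    ready: true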

Multi-DC support #524

Due to the challenges with Helm previously discussed, it would be difficult to implement multi-DC support with the current Helm-based implementation.

Configure replication factor of system keyspaces #485

Due to a bug in how we currently set the replication factor for system keyspaces, changing the cluster size triggers a rolling restart of the cluster after nodes are added or removed.

Fixing this in such a way where we still automatically compute the RF for system keyspaces will be difficult with the current Helm-based implementation. It will be much easier with an operator.

Install from a backup #525

This feature request ticket describes installing k8ssandra where the C* cluster is provisioned from a backup. This also will be much easier to implement with an operator.

Conclusion

Helm presents enough challenges that we need to build out more of an operator-based solution. We should start working on this very soon and target a 2.0 release.

@jdonenine
Contributor Author

jdonenine commented Mar 10, 2021

Lots of great background and discussion here @jsanda, thanks!

One thing you don't mention in the later section is ingress. Is it safe to assume that, by omission, you'd want to continue deploying our ingress rules via Helm? Or would that be the responsibility of each operator/controller to handle for its specific area of the stack?

Another question that comes to mind is related to the separation of the operators, which I like. But how would that work in the case of Prometheus/Grafana, where we use kube-prometheus-stack to deploy those resources? Would we instead use their individual operators/CRDs to deploy them now?

@jsanda
Contributor

jsanda commented Mar 10, 2021

I would handle ingress with the operators. Since we set up ingress for several components I expect that we would end up with some helper functions or a library that is reused by the various operators.

kube-prometheus-stack uses prometheus-operator to deploy and manage Prometheus. We would have a prometheus controller that is responsible for creating and managing Prometheus objects.

Before we switched to kube-prometheus-stack we were using grafana-operator. If it makes sense, we could go back to it and, as with Prometheus, have a dedicated controller for Grafana.

@jsanda
Contributor

jsanda commented Mar 22, 2021

The purpose of the ticket was to document challenges around the current Helm-based implementation and to propose a solution. I am going to close the ticket since that information has been documented.

@jsanda jsanda closed this as completed Mar 22, 2021