Commit 018e65a
renamed to HIP-0011, reformatted, and addressed some comments
Signed-off-by: Matt Butcher <matt.butcher@microsoft.com>
technosophos committed Mar 26, 2021 (1 parent 078d94c, commit 018e65a)
Showing 1 changed file with 49 additions and 26 deletions: architecture/crds.md → hips/hip-0011.md
---
hip: 0011
title: "CRD Handling in Helm 3"
authors: [ "Matt Butcher <matt.butcher@microsoft.com>" ]
created: "2021-03-26"
type: "informational"
status: "draft"
---

## Abstract

This document talks about the problems the Helm team has had dealing with CRDs, and lays out criteria for how to move forward. While it discusses a few solution _paths_, it does not provide a single solution. It is also an instrument for ruling out solutions that do not match Helm's guiding principles.

## The Core Problem
The most difficult problem in Helm's history has been how to handle Kubernetes CRDs. We've tried a variety of approaches, none of which has proven satisfactory to all users. Our current solution, while not comprehensive, is designed to privilege safety over flexibility. We are considering options for a Helm 4 time frame, and this document is a first step in that exercise. (Backward-compatible features for CRDs could still be merged into Helm 3.)

## Rationale

This section, which makes up the bulk of this informational HIP, describes why we made the decisions we have made thus far. It highlights the challenges that we think any suitable implementation must address. And it may serve as a guide for those who wish to tackle the problem.

### The Core Problem

The core problem is that CRDs (being globally shared resources) are fragile. Once a CRD is installed, we typically have to assume (all other things being equal) that it is shared across namespaces and groups of users.

For that reason, installing, modifying, or deleting a CRD is an operation with ramifications for every user and system on that cluster.

In spite of this, the Kubernetes API server is permissive about CRDs. CRDs are mutable (even without a version change). When a CRD is deleted, all instances of it are deleted without warning. A CRD can be upgraded to be entirely incompatible with previous versions. And there is no way to programmatically inspect the CRDs in a cluster to determine whether they are used, how they are used, and by what.
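
To ground the discussion, here is a minimal sketch of a CRD manifest, adapted from the `CronTab` example in the Kubernetes documentation (the group and names are illustrative, not anything this document prescribes). Note that the object carries no namespace: this single record defines the resource for every user of the cluster.

```yaml
# Minimal CRD sketch (hypothetical names, adapted from the Kubernetes docs).
# The CRD itself is cluster-scoped: it has no metadata.namespace, and this
# one record defines the CronTab resource for every namespace and user.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com  # must be <plural>.<group>
spec:
  group: stable.example.com
  scope: Namespaced  # the CRs live in namespaces; this definition does not
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                cronSpec:
                  type: string
```
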

### Users First: A Core Principle

Over the years, several proposals have surfaced and been rejected for one reason: They did not protect the user from badly written charts. This section explains the reasoning process behind our decision-making.

Helm distinguishes between at least two roles:

- A chart author: A person filling this role _creates_ and _maintains_ Helm charts
- A Helm user: This person installs and manages instances of charts

#### The Chart Author Role

We assume that a _chart author_ has three specific areas of domain knowledge:


We do not assume that chart authors have knowledge about the clusters into which their charts are deployed, knowledge of the source code for the packages they install, or knowledge of the extended toolchains that their users have employed. Furthermore, we do not assume that chart authors will always follow best practices or accommodate use cases that may be important to some class of users. In fact, the Helm security model urges us to include bad actors in the class of chart authors. (That is, there is or may be a small subclass of chart authors that have intentions counter to the desires of their target Helm users.)

#### The Helm User Role

Our assumptions about the base level of _Helm users_ are more modest. While some users may be experts, we do not assume that a Helm user _must_ be at that level. We do not assume they know much about Kubernetes or Helm--perhaps only enough to follow the Quickstart guide for Helm. With the base Helm user, we do not assume that they know what a Pod is, let alone a CRD. While we do assume that they know a little about YAML, we make no assumptions that they know about the Kubernetes flavor of YAML.

#### The Importance of This Distinction

Over time, our assumptions have shown themselves true. Many (perhaps even most) Helm users are new to Kubernetes, and we hear repeatedly that people have learned Kubernetes via Helm. Our issue queue is replete with examples of people who have installed Helm charts in production, but who are not Kubernetes experts by any measure. Chart authoring, on the other hand, has remained the domain of experienced Kubernetes users, and the questions we receive from chart authors indicate a high degree of comfort with Kubernetes itself.

Some have attempted to argue that "really," the chart developer and the Helm user are the same person -- that most of the time, people build their own charts. Our usage information shows otherwise. The usage pattern we see most often with Helm is that the chart author is a different person than the Helm user. That is, most of the time, one group of persons creates charts, and another (non-overlapping) group of persons uses the charts.

As a consequence of this, we can assume that the _person using the chart does not know or understand the internals of the packages they install_. This is not merely a statement that they have not read the templates, but that they would not understand them even if they did, because they have not had to (nor should they have to) become fluent in the chart system to use the charts.


We will not ignore it for CRDs.

### CRDs and How They Are Used

CRDs are shared global resources. A "shared global resource" is a resource that can be installed only once (globally, not within a namespace) and which may be used by many different consumers. CRDs have one canonical record, which covers all the different versions of that CRD.

The following facts about CRDs should be kept in mind throughout this paper:

- CRDs are mutable. An operator can update versions, schemas, names, etc. on a CRD _ad hoc_. The API server will not enforce restrictions like it does on Deployments or other objects.
- When a CR (CRD instance) is written to Kubernetes (on an update, for example), it will be _rewritten_ to the version that the CRD has marked as default. This means that backward compatibility can break merely by updating the default version on a CRD object. [See this section of the docs](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#writing-reading-and-updating-versioned-customresourcedefinition-objects), which reveals a few other edge cases.
- During a read operation, Kubernetes can rewrite an object's version without rewriting the body of the CR. So you can get back a resource marked with a version whose schema its body does not actually match.
- On a write operation, a version field may be permanently rewritten to a version other than the version given in the object to be written, but the body is not updated
- A retrieved object may not match the schema of the version that is in its apiVersion field because of the above.
- A developer can install a Kubernetes webhook that will auto-convert CRD fields. The controller does not have visibility into this conversion: It happens before the event triggers inside of Kubernetes.
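
To make this fragility concrete, below is a hedged sketch of the stanzas involved, reusing the hypothetical `crontabs` CRD from earlier (per-version schemas omitted for brevity). The version marked `storage: true` (roughly, what the bullets above call the cluster's default) is the one objects are persisted and rewritten in, and the optional conversion webhook operates invisibly to clients such as Helm.

```yaml
# Fragment of the hypothetical crontabs CRD spec; per-version schemas omitted.
spec:
  versions:
    - name: v1alpha1
      served: true
      storage: false  # still served, but objects are no longer stored in this version
    - name: v1
      served: true
      storage: true   # flipping this marker silently changes how CRs are rewritten
  conversion:
    strategy: Webhook
    webhook:
      # An author-supplied webhook converts CRs between versions; controllers
      # and clients (including Helm) get no visibility into this conversion.
      conversionReviewVersions: ["v1"]
      clientConfig:
        service:
          namespace: default        # hypothetical conversion service
          name: crontab-conversion
          path: /convert
```
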

A quick glance through this section should reveal one stark fact: Everything that uses a CRD should (and perhaps must) use the same _version_ of the CRD as the one marked as the default for that cluster. This is the single most fragile aspect of the entire CRD system.

### How This Impacts Helm

Helm has had a difficult time dealing with CRDs since the beginning. And over time, the decisions made in Kubernetes around CRDs have made it more (not less) difficult for us to work with them.

Originally, we believed that CRDs would be used as they were originally intended: as descriptors of controllers that added Kubernetes functionality. As such, we initially thought a CRD could simply be treated as a regular resource _because it would always only ever be bundled with a single controller that ran cluster-wide_. But that has proven not to be the case. Furthermore, the (anti-)pattern of distributing a CRD with multiple pre-populated CR instances is now a regularly encountered phenomenon. Consequently, we have been forced to treat CRDs as a special class of resource, because a CRD must be loaded into the Kubernetes API server before a CR can reference that CRD. As we have seen the usage of CRDs expand well beyond the original intent, the patterns listed in the previous section are not anomalies, but standard practices in the community. Thus, our original designs for CRD handling have been completely re-thought--first in Helm 2 with the addition of CRD hooks, and then again in Helm 3 with the special `crds/` directory in Helm charts.

Our current solution (Helm 3) supports installing CRDs, but does not support modifying or deleting CRDs. Those two operations currently must be done out of band. The sections below explain this decision.

#### Installing CRDs

There are a number of well-described issues with installing CRDs. Users must have broad permissions. A CRD must be fully installed before a manifest that references it can be accepted by the Kubernetes API server. (However, a CR can be accepted before there is anything available to handle the CR event.) It is entirely possible for two different applications to require different versions of the same CRD, yet there is no clear way to support or declare that need within Kubernetes. This is exacerbated by the fact that while a CRD is a global object, the controllers that listen for CRD events may be namespace-bound. This means that two instances of the same controller (in different namespaces) can use the same CRD definition. There is no way to query Kubernetes to discover this fact.
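
As a small illustration of the ordering constraint, again using the hypothetical `crontabs` CRD: the CR below is rejected by the API server until the CRD that defines it is installed and established, yet it is accepted even when no controller exists to act on it.

```yaml
# Rejected with "no matches for kind CronTab" until the crontabs CRD is
# installed and established -- but accepted even if no controller is running.
apiVersion: stable.example.com/v1
kind: CronTab
metadata:
  name: sample
  namespace: team-a  # hypothetical namespace
spec:
  cronSpec: "*/5 * * * *"
```
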


> NOTE: There is no requirement that CRDs be placed only in the `crds/` directory. They can be put alongside other resources in the `templates/` directory. This was an intentional design choice to preserve backward compatibility.
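
Concretely, the Helm 3 mechanism looks like the following sketch (chart and file names are hypothetical). A manifest placed under `crds/` is applied at install time, before any templates are rendered, and is thereafter left alone.

```yaml
# mychart/crds/crontab-crd.yaml  (hypothetical path)
#
# Helm 3 applies files in crds/ verbatim at install time, before rendering
# templates/. They are not templated, and `helm upgrade`, `helm rollback`,
# and `helm uninstall` leave them untouched. `helm install --skip-crds`
# skips them entirely.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com
spec:
  group: stable.example.com
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true
```
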
### Deleting CRDs

We'll delay talking about upgrades for just a moment, and skip to the easiest case: deleting CRDs.

Helm currently does not delete CRDs. The reasoning behind this decision is the confluence of two aspects of CRDs: shared global resources and cascading deletes.

The next two subsections explain these.

#### Shared Global Resources

Earlier, we looked at how Kubernetes CRDs are _shared global resources_. That is, only one resource describes the CRD and all of its versions, and that one resource is shared among all namespaces on a cluster.


The frequent response to this point is to say, "users deserve to experience the outage if they uninstall a CRD." We do not think this is fair or accurate. Many times (especially in multi-tenant clusters), the team that uninstalls the CRD is not the group of people harmed. It's the _other_ cluster users. It is patently unfair to say "If Team A makes a mistake, Team B should pay the consequences even if they did nothing wrong." See the "Users First" section above.

#### Cascading Deletions

Another frightening aspect of CRDs that has prompted us to not support CRD deletion is the cascading effect: When a CRD is deleted, all of the instances of that CRD (the CRs) are deleted as well.


Again, this is not a story friendly to our "users first" philosophy.

### Upgrading and Modifying CRDs

This is perhaps the most vexed part of CRD handling. While the ramifications of deletion are straightforward, modifying CRDs is nuanced and complicated.


Finally, as stated previously, there is no way for Helm (or any other process) to see which things on a cluster consume a given CRD. And there is certainly no way to determine what particular version of a CRD they consume. At best, there are weak inductive methods that could be used to say "during a given period, container A made a request to the API server that makes it look like it understands version X of CRD Y." But those methods are decidedly non-trivial and non-exhaustive. This is something we view as a serious design flaw in Kubernetes itself.

#### Upgrading CRDs

Given the issues called out above, the core problem with upgrading a CRD is that Helm cannot determine whether the upgrade will break other things in the cluster. It is worth calling out this fact again: CRDs are cluster-wide. A user may have no idea that updating a Helm chart breaks other parts of the cluster that the user does not even have access to. For example, a user might have permission to create CRDs, and permission to do anything in that user's own namespace, but not even read access to any other namespaces on the cluster. Yet by updating the CRD, the user may break other things on the cluster to which the user has no access.


Finally, note that flags on `helm upgrade` such as `--force`, `--wait`, and `--atomic` introduce additional complexity. For example, `--atomic` could allow a CRD to become active for a period, and then roll it back, but without repairing any CRs that were altered during the window that the CRD was active. In the worst cases, some suggested changes to Helm might actually cause the CRD to be _deleted and recreated_ during an upgrade, which would have the side-effect of deleting all CRs for that CRD. This, of course, could have dire unintended consequences.

#### Rollbacks

Rollbacks in Helm are a special kind of upgrade in which an old release is replayed over a newer release. Along with all of the drawbacks of upgrades, rollbacks present a few special challenges.

If a CRD is rolled back, then the old version will overwrite the new version.

There are two important cases here: (a) the older chart _does not have the CRD_, or (b) the older chart has an _older version of the CRD_.

##### Rollback target does not have the CRD

Consider the case where we roll back from revision 2 to revision 1 of a release, and revision 2 introduced a CRD that was not present in revision 1.

In this case, the _proper_ behavior for Helm is to _delete the CRD_ from revision 2 when rolling back to revision 1. As previously mentioned, that will cause a cascading delete of all CRs for that CRD. In some circumstances, this is the desired outcome. But in other circumstances, this could destroy resources belonging to other consumers of that CRD, or even destroy data that was needed during a recovery (e.g. roll back to one version, then upgrade from there). Thus, there are multiple avenues in which undesirable destruction of data may occur.

##### Rollback target has an older version of the CRD

In this case, both revision 1 and revision 2 have the CRD, but the version of the CRD changes between revisions. During a rollback, the proper behavior would be to roll back the CRD to its older version.


Thus, rollbacks have a few special cases where the uncertainty of Kubernetes' behavior could cause issues that are exceedingly hard to debug.

## Rejected Ideas

In this section, we turn from enumerating problems to evaluating potential solutions. We understand that Helm users want a way to hide the complexity of CRDs and be able to do "simple things" like upgrade a CRD from one version to another without accidentally destroying data or harming other parts of the cluster. This has been a difficult goal to achieve, though.

As things currently stand, there is no way for us to implement this.

Basically every problem raised in this document is unsolved for this particular proposal. Helm can reliably determine if a CRD already exists, and can install a CRD if it does not exist and if the Helm user has sufficient permissions. But we have been unable to devise any way to solve the myriad upgrade, rollback, and deletion problems as well as the upgrade-on-install case for a CRD that already exists.

## Nobody is "Blocked" on This
## How To Teach This

A critical challenge for Helm 3 has been educating the Helm community about why we, the maintainers, have made the choices we have.

The first step in teaching this has been to make the rationale generally accessible. We hope the present HIP accomplishes this. But beyond that, we may need to head off some common misconceptions directly. The remainder of this section discusses those.

### Nobody is "Blocked" on This

It has been claimed that this issue is a "blocker" to using Helm. While we hear this claim on occasion, it stems from a misunderstanding.

You are free to handle CRDs however you want.

Helm maintainers are not trying to prevent usage of CRDs. As we've stated above, though, our quality bar is high because we have a lot of people depending on us. But we have provided ways for you to achieve your own goals without meeting our standards or objectives. And perhaps others would find use in the plugins or tools you provide.

### Helm Is Still Working On This, But We Cannot Solve It Alone

CRDs continue to present a huge challenge to Kubernetes users in general. Helm perhaps has it worse than other projects, as Helm is a generic solution to a generic problem (Kubernetes package management). Helm knows nothing of a cluster's stability, intent, architecture, or the social organization around it. Thus, to Helm, a development cluster with one user is no different than a thousand-node multi-tenant cluster with rigid RBACs.

