Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial KEP for immutable fields #1099

Closed
wants to merge 10 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
204 changes: 204 additions & 0 deletions keps/sig-api-machinery/20190603-immutable-fields.md
@@ -0,0 +1,204 @@
---
title: Immutable Fields
authors:
- "@apelisse"
owning-sig: sig-api-machinery
participating-sigs:
- sig-api-machinery
reviewers:
approvers:
editor: TBD
creation-date: 2019-06-03
last-updated: 2019-06-03
status: provisional
see-also:
- "/keps/sig-api-machinery/0006-apply.md"
---

# Immutable fields

## Release Signoff Checklist

- [ ] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)
- [ ] KEP approvers have set the KEP status to `implementable`
- [ ] Design details are appropriately documented
- [ ] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [ ] Graduation criteria is in place
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

## Summary

A lot of fields in APIs tend to be "immutable", they can't be changed after
creation. This is true for example for many of the fields in pods. There is
currently no way to declaratively document that fields are immutable, and one
has to rely on either built-in validation for core types, or have to build a
validating webhooks for CRDs.

Providing a new `// +immutable` marker would make the API more descriptive to
users and help API developers by verifying these assertions automatically.

## Motivation

There are resources in Kubernetes which have immutable fields by design,
i.e. after creation of an object, those fields cannot be mutated anymore. E.g. a
pod's specification is mostly unchangeable once it is created. To change the
pod, it must be deleted, recreated and rescheduled. Users want to implement the
same kind of read-only semantics for CustomResources, for example:
https://github.com/kubernetes/kubernetes/issues/65973

### Goals

The goals are mostly twofold:
- Provide a mechanism to document APIs (both built-in types and CRDs) by having
that additional marker provide information to users AND tools (including
generated documentation, kubectl explain),
- Automatically provide logic for developers (of core-types and CRDs) to
guarantee immutability of specific fields.

### Non-Goals

There are a few non-goals:
- OpenAPI has a notion of "readOnly", but this is only meant to simplify schema,
and one is never supposed to send a readOnly value. This is different because
we want "writeOnce", not "readOnly".

## Proposal

This proposal attempts to create a concept of immutable and to visit some of
what this can mean, but doesn't intend to propose all the possible
semantics. The proposal should not close the door to further improvements.

### Semantics

We'll define immutable as "writeOnce", which means that the fields can only be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior art, same idea but they opted to name the concept "CreateOnly" which I personally found non-confusing.

set at creation time, and can never be updated after. Attempts to update an
immutable field will result in an error (as opposed to being ignored), though we
could potentially add that semantics later.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add the following, to make it explicit:


Clients implement idempotent creation by retrying creation, and they currently do not get errors if the same creation is tried twice. This KEP will not change that. Specifically, two consecutive POSTs of the same object, which are mutated the same way by mutating admission controller (which is the typical case), will both succeed. If an error were returned the second time, and the client did not see the first success response, the client would be confused.


It's also important to note that this concept is orthogonal to field ownership
and list associativity. Ideally we would manage to keep the concepts entirely
separate. e.g. An atomic list is a different concept from an immutable list.

There are different semantics for a field to be immutable, especially if we
consider non-leaf fields like lists and maps.
apelisse marked this conversation as resolved.
Show resolved Hide resolved

#### Field selection
apelisse marked this conversation as resolved.
Show resolved Hide resolved

Scalar fields have an obvious "selection" mechanism: they are either immutable
or not. Things get more complicated for lists and maps, since we need to know if
the flag applies to the list/map itself, to its members, or both.

A few possible semantics are described here:

- Non-recursive: For a list or a map, one can not remove or add new items, but
existing items can be modified.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to implement an object with the same semantics as Pod.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xref #1099 (comment) and #1099 (comment) for more detail on how Pod immutability can be supported

- Recursive: means that none of their field can change, and new fields are not
accepted.
- Recursive with addition and/or deletion:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this can be done by making the list/map mutable, but the values of the list/map immutable.

- New/deleted fields in maps and lists are tolerated.
- The added/deleted type would probably be "Recursive" after the first level
(addition/delete is not recursive).
- For associative lists, keys are already "immutable" in a way (similar to how
names are "immutables" for objects). Changing the keys would mean deleting
and creating a new item in the list, which would be permitted by these
rules.
- Mutable: A field set as mutable could cancel the recursive immutability.

#### On mutation of selected fields

- Error: mutation of an immutable field returns an error and the request is
rejected. No mutation is performed at all.
- Ignored: mutation of the field (and/or sub-fields) are completely ignored, and
apelisse marked this conversation as resolved.
Show resolved Hide resolved
not persisted. They do not trigger an error.
apelisse marked this conversation as resolved.
Show resolved Hide resolved

#### Equality

Equality of two fields is very important:
- Scalar fields are compared directly: fields and values must exactly match from old
object to new object.
- Lists fields are compared using the same logic currently used for native
kubernetes types: Empty and null lists are equal. A missing list is NOT equal
to a null or empty list.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like to have a way in CRD OpenAPI to specify a normalization behaviour for lists and maps:

x-kubernetes-normalize-null: empty
x-kubernetes-normalize-null: undefined
x-kubernetes-normalize-empty: null
x-kubernetes-normalize-empty: undefined

And then we should be strict here in the mutation test.

Also note that in Proto we have neither null nor empty by default. I still think we need a more broad discussion of null, undefined, empty values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree, I don't know if this should be blocking?

Copy link
Contributor

@sttts sttts Jun 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least we have to leave it open here and have a plan before marking it implementable. But I would not like to move forward with implementation before being very clear about this topic and where it is heading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or in other words: whatever we decide here will drive the API design of the whole CRD ecosystem. We should take the time it needs to understand the topic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make a decision and enforce it rather than offer too many complicated options. We currently consider empty and null to be equal for existing type, and I think that we should just tell people that's how things are going to work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main problem with any non-strict behaviour for CRDs is that it feels very odd to have different objects in etcd, but consider them equal.

For native resource with proto encoding this is not the because proto unifies undefined, null and empty all to one value und unmarshal those to null for comparison.

I wonder whether we should talk about normalization for CRDs before talking about equality:

  • we have one normalization already: undefined => default value
  • we don't have normalization for empty
  • we don't have normalization for null

With normalization happening as part of our deserialization stack, we can separate both concerns and use strict equality in immutability checks.

Today we use semantic equality mostly because JSON and Proto have different normalization behaviours (partly driven by omitempty as well).

normalization JSON omitempty
undefined undefined
null undefined
empty undefined
normalization JSON
undefined null
null null
empty empty
normalization Proto omitempty/non-omitempty
undefined undefined
null undefined
empty undefined
normalization CRD omitempty/non-omitempty
undefined undefined
null null
empty empty
normalization CRD default=null
undefined null
null null
empty empty
normalization CRD default=empty
undefined empty
null null
empty empty

Copy link
Contributor

@jpbetz jpbetz Sep 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding all the normalization rules right, then for native types we have something like this?

normalization native (Proto in etcd), client speaking JSON non-omitempty omitempty
undefined empty undefined
null empty undefined
empty empty undefined
normalization native omitempty/non-omitempty, client speaking Proto
undefined undefined
null undefined
empty undefined

And we don't have the concept of omitempty empty for CRDs?

Would it make sense to normalize the same way for CRDs as we do for native types (using proto)? I.e. normalize null and empty to undefined. We'd need to normalize all data received from clients and all data read from etcd (since we have non-normalized data at rest). If we did that, we should be able to do strict equality in the api-server code.

I'm guesses that to be consistent we should also normalize native types stored as json like we do for CRDs?

(edit: Adding the normalization would be breaking to any CRD users that expect normalization to not occur.. )

Copy link
Contributor

@sttts sttts Sep 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about the first table, what are the values?

And we don't have the concept of omitempty empty for CRDs?

There is no Golang type that could have omitempty in the request handler logic. JSON unmarshalling for CRDs is faithful, i.e. does not swallow or normalize anything.

I am not convinced we want to start putting up fuzzy glasses for CRDs by adding normalization by default. Moreover, we cannot even do that without breaking compatibility, at least not in general (we could do it for equality only).

I don't like the idea of having the normal REST logic with strict empty-null-undefined logic, but immutability to act differently. That's pretty counter-intuitive.

I tend to think about normalization as a property of the fields, i.e. the schema properties. For example we can express nullable fields (which preserve null in all cases, and empty and undefined) using special Golang types (like extra pointers) and at worst custom marshallers. What we don't do today is publishing of this as part of OpenAPI, and in general we never formalized that. But my point is that there are native types which do not normalize prior to persisting values and before comparing, it is just not what we get by default with naive or lazy Golang type definitions and our serializer stack.

In other words, we might want to formalize the normalization of any slice and map in the schemas, for native types and for CRDs. CRD schemas could get a x-kubernetes-normalize-empty: undefined vendor extensions (not necessarily for 1.17, could be added later). Equality should be strict, relative to the defined normalization (which is not the indenty normalization for native types then).

Also for later introduction of protobuf support for a CRDs, we can require a certain normalization that is compatible with proto. I.e. the CRD validation of a CRD instance with proto enabled, would validate the proto numerals are all specified, and it would validate that normalization is compatible.

Copy link
Contributor

@sttts sttts Sep 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

properties:
  foo:
    x-kubernetes-proto: 6
    x-kubernetes-normalize-empty: empty
    x-kubernetes-normalize-null: empty # if nullable
    x-kubernetes-normalize-undefined: undefined
    nullable: true
    x-kubernetes-immutable: true

validation error: properties.foo.x-kubernetes-normalize-undefined must be empty if x-kubernetes-proto is set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. I don't think I appreciated the extend our existing golang type definitions play into the situation. That's good to know.

From a usability perspective, it wish we could provide a sane default normalization rule that users could get by default and not have to specify any normalization settings explicitly, but I agree we can't just start doing that for CRDs since it breaks backward compatibility.

I really like the idea of specifying the normalization we expect then validating our encodings and go types are all aligned with that.

- Same rule for list applies to maps and struct: An empty map and a null map are
considered equal. The keys in the map must be the same. Values are compared
using this same set of rules.

#### Proposal

The behavior on mutation of immutable field is orthogonal to selection of
immutable fields. If we accepted these two semantics at the same time, we would
have to either have two different markers (one for selection and one for
behavior) or have a single field with multiple options.

For the moment, we understand that "Recursive" is the most important and simple
selection mechanism, and "error" is the most natural way to handle
mutations. For this reason, this proposal focuses on providing those two options
only, with the opportunity to extend this functionality later on.

### Go IDL

The proposal is to add a new marker with the following (only) format:

```
// The name can not be changed after creation.
// +immutable
Name string

// The list of containers can not change AT ALL after creation.
// No single field in existing containers can be changed, added or deleted,
// no new containers can be added, no existing container can be removed.
// +immutable
Containers []Containers
```

That immutable tag would be extensible in the future by adding extra-parameters if needed:
```
// +immutable=ignore
```

### OpenAPI Extension

Since the semantics of the "readOnly" tag in OpenAPI is not the one we're trying
to have here. We propose a new Kubernetes specific extensions, which has to
allow for further changes in the semantics of our immutable marker:
```
"x-kubernetes-immutable": ["recursive", "error"]
```

### Mutating admission chain

Mutating admission chain would have the exact same effects as user changes,
meaning that they wouldn't be able to change an object after creation. This is
very similar to what is done today since validation for update is run AFTER all
mutations. On creation, mutation are permitted.

### Where does this happen

This process is meant to happen right before the update validation and after
mutating and validating webhooks, and only run on updates. This will allow us to
keep the exact same behavior while removing the validation code that checks the
immutability of fields.

### Risks and Mitigations

One risk would have been to block updates to some metadata fields. But CRD
validation already guarantees that this is not possible for CRDs. For
core-types, we would have to assert that `metadata` fields are not immutable.

## Design Details

### Test Plan

Not ready yet

### Graduation Criteria

Because of the very limited risk and existing infra-structure set-up by
server-side apply, this feature will start as Beta and be promoted to GA a few
cycles later, based on the received feedback.

## Implementation History

N/A