-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Update guidance on required field serialization #8486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update guidance on required field serialization #8486
Conversation
Thinking through different types we might see, and the implications of this suggestion NullableAnything that needs to be a pointer to allow the zero value to be set, that also has StringAny required string should have a minimum length. If that minimum length is 0, or unset, then the string should be a pointer+omitempty to allow the empty string to be an explicit choice. Except when the required string is not actually a string.
Integer/FloatAs per example in the decsription, if the 0 is in the valid range of values (based on minimum/maximum markers) then pointer and omitempty should be used to allow BoolAny required bool should be a pointer+omitempty unless validation only allows the value ListsA required list should have a minimum length validation ( If the list wasn't a pointer, but had omitempty, the structured client would not be able to explicitly set an empty list. MapsThe same as lists? Not sure what the appropriate marker is for the minimum size of a map (it's ObjectsFor the zero value to be valid, an object should have no required fields, and have no validation that requires a minimum number of fields be set (e.g. If the object does have a required field, but that required field would marshal the zero value and that would be accepted by validation, then the zero value of the struct would also be considered valid. I think this would happen generally on a poorly validated set of fields. E.g. a string that doesn't have an explicit minimum length, but has not been added based on the rules outlined here. The object |
@@ -843,10 +843,10 @@ having the `+optional` comment tag is highly recommended. | |||
Required fields have the opposite properties, namely: | |||
|
|||
- They do not have an `+optional` comment tag. | |||
- They do not have an `omitempty` struct tag. | |||
- They are not a pointer type in the Go definition (e.g. `AnotherFlag SomeFlag`). | |||
- They mark themselves as required explicitly with a `+required` comment tag, or implicitly by not setting an `omitempty` struct tag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have enough context, but could we simplify this guidance to required fields are only marked via the +required marker (given that's how it should be)
Or at least recommend it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 - explicit FTW. If declarative validation flies, I want explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also believe in being explicit, but I know some tooling is implicitly assuming this, and we have a long history of that implicitness.
If we say they should be explicitly marked, should we also add an N.B.
that explains that historically this has been implicitly detected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we say they should be explicitly marked, should we also add an N.B. that explains that historically this has been implicitly detected?
I was thinking something like this, yes. Recommend to be explicit and then mention in some way how some tools behave (probably these tools (thinking about controller-gen as an example) also won't change to avoid breaking changes)
/hold I've been through this with @liggitt and we need to make some more updates to the guidance. Will look at creating a decision tree to help folks understand whether or not they should be setting fields as pointers/omitempty etc |
(notes from our discussion in https://docs.google.com/document/d/1k3XxUTEjcQifSXjoy5CcQY9QttdhQ-EG-R3vrKq6SUc/edit?pli=1&resourcekey=0-5xMYchKvoisABPw9tsR3nA&tab=t.0#bookmark=id.ks1ziis4vgzz - shared with https://groups.google.com/a/kubernetes.io/g/dev, can join that to get read access) |
- The API server should not allow POSTing or PUTing a resource with this field | ||
unset. | ||
- They _typically_ do not use pointer types in the Go definition (e.g. `AnotherFlag SomeFlag`), though required fields where the zero value is a valid value may use pointer types, likely paired with an `omitempty` struct tag to avoid spurious null serializations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"may" or "must" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is must
and likely paired
, should remove likely
c2ee563
to
8856289
Compare
22f7099
to
a40a583
Compare
@JoelSpeed thanks for this PR, this clarifications are really helpful! |
- The API server should not allow POSTing or PUTing a resource with this field | ||
unset. | ||
- They _typically_ do not use pointer types in the Go definition (e.g. `AnotherFlag SomeFlag`), though required fields where the zero value is a valid value must use pointer types, paired with an `omitempty` struct tag to avoid spurious null serializations. | ||
|
||
For more details on how to use pointers and `omitempty` with fields, see [Serialization of optional/required fields](#serialization-of-optionalrequired-fields). | ||
|
||
Using the `+optional` or the `omitempty` tag causes OpenAPI documentation to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
q: should this (openAPI documentation) change now that we can have omitempty on required fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need to make sure that it respects +required
tags. If the field doesn't have +required
and has omitempty
, then I think there are many tools that still assume optional and we shouldn't change that.
In the long run, we need to make sure all fields are marked as +optional
xor +required
explicitly
This is expected to change in the future and having the `+optional` comment tag is | ||
now required on new optional fields. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe rephrase this to avoid interleaving optional ... required ... optional
wording (even though it refers to different contexts)
Maybe something like this:
This is expected to change in the future, and new fields should explicitly set either an
+optional
or+required
comment tag.
B -- No --> D{Is the field optional, or required?}; | ||
D -- Optional --> E[Does the field have a built-in nil value?]; | ||
E -- Yes --> F[Do not use a pointer, use omitempty]; | ||
E -- no --> C; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E -- no --> C; | |
E -- No --> C; |
A[Start] --> B{Is the zero value a valid user choice?}; | ||
B -- Yes --> C[Use a pointer and omitempty]; | ||
B -- No --> D{Is the field optional, or required?}; | ||
D -- Optional --> E[Does the field have a built-in nil value?]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D -- Optional --> E[Does the field have a built-in nil value?]; | |
D -- Optional --> E[Does the field type have a built-in nil value?]; |
graph TD; | ||
A[Start] --> B{Is the zero value a valid user choice?}; | ||
B -- Yes --> C[Use a pointer and omitempty]; | ||
B -- No --> D{Is the field optional, or required?}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
B -- No --> D{Is the field optional, or required?}; | |
B -- No --> D{Is the field optional or required?}; |
A[Start] --> B{Is the zero value a valid user choice?}; | ||
B -- Yes --> C[Use a pointer and omitempty]; | ||
B -- No --> D{Is the field optional, or required?}; | ||
D -- Optional --> E[Does the field have a built-in nil value?]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be simpler to inline the types which do have built-in nil values? "Does the field type have a built-in nil value (map or slice)?")
- When a field is required, and the zero value is not valid, a structured client who has not expressed an explicit choice will have their request rejected by the API server based on the invalid value, rather than the field being unset. | ||
- For example, a string with a minimum length of 1; Validation would not understand if the field was unset, or set to the empty string deliberately, but would still reject the request because it did not meet the length requirements. | ||
- Technically, using a pointer in these cases is also acceptable, but not advised as it makes coding more complex, and increases the risk of nil pointer exceptions. | ||
- In these cases, not using `omitempty` provides the same result, but pollutes the marhsalled object with zero values and is not recommended. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- In these cases, not using `omitempty` provides the same result, but pollutes the marhsalled object with zero values and is not recommended. | |
- In these cases, not using `omitempty` provides the same result, but pollutes the marshaled object with zero values and is not recommended. |
|
||
- It can be difficult for implementors to anticipate all cases where an empty | ||
value might need to be distinguished from a zero value. | ||
- Structs are not omitted from encoder output even where `omitempty` is specified, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about omitzero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Omitzero should work, but as far as I know, we haven't tested this yet, or explored what would happen re-protobuf for built-in types. For now I'd rather not mention omitzero until we've had a chance to test it.
I'd see that as a follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to this -- we should be clear that "encoding" means a lot more than just JSON :)
To determine whether a field should be a pointer, consider the following: | ||
|
||
```mermaid | ||
graph TD; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not in sync with Kal with whenrequired setting, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. KAL's optionalfields implementation of WhenRequired
is intended for CRD authors where the behaviour differs slightly.
For CRDs, because the validation of required vs optional happens at the openapi stage, and for any field where the zero value is not valid, openapi would pick this up before a structured go client were to see the object, we can safely assume for CRDs that for a field where the zero value isn't valid, that the user did not specify the zero value when we see the zero value in the Go struct.
With built-in types, all of the validation above is done using Go. And as such, we must be able to distinguish between unset and the zero value to be able to enforce that the zero value is not valid, hence a pointer is always required in these cases.
I think we should probably explain this somewhere as an addendum to this advice.
@@ -847,8 +847,8 @@ See https://github.com/kubernetes/kubernetes/issues/34641. | |||
|
|||
Note that for backward compatibility, any field that has the `omitempty` struct | |||
tag, and is not explicitly marked as `+required`, will be considered to be optional. | |||
This is expected to change in the future and having the `+optional` comment tag is | |||
now required on new optional fields. | |||
This is expected to change in the future, and new fields should explicitly set eith |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: eith => either
@@ -893,10 +893,22 @@ There are several implications of the above: | |||
- When a field is required, and the zero value is not valid, a structured client who has not expressed an explicit choice will have their request rejected by the API server based on the invalid value, rather than the field being unset. | |||
- For example, a string with a minimum length of 1; Validation would not understand if the field was unset, or set to the empty string deliberately, but would still reject the request because it did not meet the length requirements. | |||
- Technically, using a pointer in these cases is also acceptable, but not advised as it makes coding more complex, and increases the risk of nil pointer exceptions. | |||
- In these cases, not using `omitempty` provides the same result, but pollutes the marhsalled object with zero values and is not recommended. | |||
- In these cases, not using `omitempty` provides the same result, but pollutes the marhsaled object with zero values and is not recommended. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: marhsaled
64c654f
to
566dd72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few tweaks, looks good overall
|
||
#### Serialization in CRDs | ||
|
||
When CRDs are admitted by the API server, openapi validation is applied to the object _prior_ to any structured client observing the object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When CRDs are admitted by the API server, openapi validation is applied to the object _prior_ to any structured client observing the object. | |
When custom resources are admitted by the API server, openapi validation is applied to the object _prior_ to any structured client observing the object. |
- For structs, the zero value can only be valid when the struct has no required fields, and does not require at least one property to be set. | ||
- Required structs should use `omitzero` to avoid marshalling the zero value. | ||
|
||
#### Serialization in CRDs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#### Serialization in CRDs | |
#### Serialization of custom resources |
which is messy; | ||
- having a pointer consistently imply optional is clearer for users of the Go | ||
language client, and any other clients that use corresponding types | ||
For a field where the zero value is not valid, the openapi validation will reject the object if the field is set, and is set to the zero value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a field where the zero value is not valid, the openapi validation will reject the object if the field is set, and is set to the zero value, | |
For a field where the zero value is not valid, the openapi validation will reject the object if the field is present and set to the zero value, |
|
||
Therefore, we ask that pointers always be used with optional fields that do not | ||
have a built-in `nil` value. | ||
This means that there is no need to be able to distinguish, in a CRD, between unset and the zero value for fields where the zero value is not valid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means that there is no need to be able to distinguish, in a CRD, between unset and the zero value for fields where the zero value is not valid. | |
This means that there is no need to distinguish in a custom resource between unset and the zero value for fields where the zero value is not valid. |
A[Start] --> B{Is the zero value a valid user choice?}; | ||
B -- Yes --> C[Use a pointer and omitempty]; | ||
B -- No --> D{Is the field optional or required?}; | ||
D -- Optional --> E[Does the field type have a built-in nil value (map or slice)?]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the (map or slice)?
bit is confusing mermaid... might need to escape that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you quote the whole string this is fixed
566dd72
to
48f3ac9
Compare
/lgtm |
tagging other chairs |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dims, JoelSpeed, liggitt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel I think we are good to move forward |
When the field type has a built-in `nil` value, such as a map or a slice, and | ||
your use case means that you need to be able to distinguish between | ||
"field not set" and "field set to an empty value", you should use a pointer to | ||
the type, even though it has a built-in `nil` value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK we do not have any pointer-to-{slice,map} fields and I have told people in the past that we DO NOT want them. This makes it sound like it would be OK, but I don't think it would (I strongly suspect some codegen would break). I was actually a bit surprised that JSON unmarshal handles it properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One use case I have seen is *[]corev1.Taint
. Where the use case will default "I have no opinion" to include a set of default taints, but, if you specifically do not want those taints, it documents that you should explicitly provide an empty list []
for the field.
Do you think that would not actually be possible then? What would you recommend instead for this use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect it doesn't work as described. We tried to do this in network policy and gave up because it did not work.
The only answer I know, and the way we did it in straight proto (caveat: it's been a while) is to put a (pointer to) struct in between. Nil struct means default. Non-nul means the list is present. All languages can decode that into something intelligible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about builtin types, for CRDs it looks like controller-gen works (we generate deep copy, conversion and CRD YAMLs). I just replaced a custom MarshalJSON func for []corev1.Taint (to preserve []
) with *[]corev1.Taint
and it worked out of the box
|
||
- It can be difficult for implementors to anticipate all cases where an empty | ||
value might need to be distinguished from a zero value. | ||
- Structs are not omitted from encoder output even where `omitempty` is specified, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to this -- we should be clear that "encoding" means a lot more than just JSON :)
D -- Required --> F | ||
``` | ||
There are several implications of the above: | ||
- For lists and maps where the zero valid is a valid user choice, this means that `[]` and `{}` have a semantically different meaning than unset, in this case it is appropriate to use `*[]T` or `*map[T]S` respectively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above - I am not sure we want to allow this. IIRC protobuf doesn't have a way to represent this without an intermediate struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrmm... you're probably right about proto-persisted things.
For typed clients that use go but are backed on the server by a CRD, that means that serialization is always json and this is representable. I agree a pointer to a map or slice is weird. I think it's useful to describe this implication / edge, but would be fine discouraging making use of it ("avoid map or slice fields where {}
or []
is valid and treated differently from an absent value")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have one use case in #8486 (comment), but yes, when I've thought through this before, I've never worked out an issue with doing this in CRDs. If it's an issue for proto, then perhaps the right thing to do here is update the wording to say never do this for built-in (?) types, but it can work for custom resources?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps the right thing to do here is update the wording to say never do this for built-in (?) types
There are three areas rationale is based on:
- client-side behavior... making sure a client can transmit on the wire what it means
- server-side validation / defaulting behavior ... making sure a server can distinguish what the client sent in ways that let it validate and default sensibly
- server-side persistence / round-tripping behavior ... making sure a server can persist distinctions that matter
2 and 3 are definitely different for custom-resources vs built-in/server-typed resources, and custom resources are actually more predictable since they store as json and don't conflate absent and present-with-zero-value on the server side. I guess we could branch the guidance early on if we think it would lead to different/simpler/better recommendations for one case or the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that the word "pointer" has no real meaning in CRD and in fact our API docs should describe a language-neutral data model. We happen to use Go to implement that model, and while Go has the concept of array, nil-slice, empty-slice, and pointer-to-slice, those are not part of our data model.
The question then is whether CRD should be able to do things that proto can't, because proto is static and CRD less so? I am fairly sure we could put tag numbers into CRDs and synthesize a proto descriptor, but we don't.
I think our data model should be limited to the common subset (basically "what proto can do"), for CRD /and/ built-in, otherwise we end up with divergent API patterns depending on implementation. Blech.
We support list (aka repeated) fields. They can have 0 or more items. They are not present or absent, they just have a size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We happen to use Go to implement that model, and while Go has the concept of array, nil-slice, empty-slice, and pointer-to-slice, those are not part of our data model.
Do we have any data on how many custom resource APIs for K8s are written not in Go? My assumption has always been that the majority are, and therefore having guidance that applies to the majority, and helps prevent them from making serialization mistakes has value. Perhaps the missing part is that we don't talk about these being Go specific in the guidance?
@liggitt and I were discussing serialization of required fields in a thread and realised that there are valid cases where a required field should be a pointer and should be omitempty.
Take the example:
Here, the value
0
is a valid user choice.For a structured Go client, if the field were not a pointer, and not omitempty per the existing guidance, then the structured client is not actually required to set a value for the field, as the json marshalling process will marshal
foo: 0
and set a choice for them.This defeats the purpose of the field being required for structured clients, and creates a discrepancy in behaviour between structured and unstructured clients.
Would appreciate thoughts of other API reviewers on this
CC @thockin @deads2k