New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The metadata.generation of a Custom Resource is always incremented #69059

Merged
merged 1 commit into from Oct 29, 2018

Conversation

@caesarxuchao
Member

caesarxuchao commented Sep 25, 2018

The old behavior is that the metadata.generation never changes if the user doesn't participate the spec/status convention. See the release note for the new behavior.

We changed when the `metadata.generation` of a custom resource (CR) increments.

If the CR participates the spec/status convention, the metadata.generation of the CR increments when there is any change, except for the changes to the metadata or the changes to the status.

If the CR does not participate the spec/status convention, the metadata.generation of the CR increments when there is any change to the CR, except for changes to the metadata.

A CR is considered to participate the spec/status convention if and only if the "CustomResourceSubresources" feature gate is turned on and the CRD has `.spec.subresources.status={}`.

/assign @nikhita @mbohlool
/sig api-machinery

@neolit123

This comment has been minimized.

Member

neolit123 commented Sep 26, 2018

/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug and removed needs-kind labels Sep 26, 2018

@@ -87,7 +87,7 @@ func (a customResourceStrategy) PrepareForCreate(ctx context.Context, obj runtim
// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (a customResourceStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
if !utilfeature.DefaultFeatureGate.Enabled(apiextensionsfeatures.CustomResourceSubresources) || a.status == nil {
if !utilfeature.DefaultFeatureGate.Enabled(apiextensionsfeatures.CustomResourceSubresources) {

This comment has been minimized.

@nikhita

nikhita Sep 26, 2018

Member

A CR can have the .status field, while opted-out of the /status endpoint. User would expect the metadata.generation to auto-increment.

My understanding is that .metadata.generation should be incremented only when the user specifically says that they want to follow spec/status conventions - which means they need to explicitly set the .subresources.status field in the CRD and opt-in to /status. Enabling the feature gate does not guarantee a status subresource because it could also mean that they only want to enable /scale.

Related: #58778 (comment)

This comment has been minimized.

@caesarxuchao

caesarxuchao Sep 26, 2018

Member

I almost forgot I had this early involvement in the 58778 :)

I think our disagreement is what signals the CR is following the spec/status convention. I think a CR that has a top level spec and status field is ready to follow the spec/status convention. But the current code assumes that's not enough. The current code assumes the signal is the enablement of the /status subresource.

Taking one step back, what's wrong if the generation is always auto-incremented, even if the CR doesn't follow the spec/status convention? The field isn't writable by users anyway.

This comment has been minimized.

@enisoc

enisoc Sep 26, 2018

Member

@caesarxuchao Do you have a use case in mind for wanting generation to increment even if the CRD author did not explicitly opt into the full spec/status convention (by enabling /status subresource)?

We intentionally chose not to infer that the CRD author wants to use the spec/status convention just because they have fields named spec and/or status. That would make it impossible to use those field names without participating in the convention, which would be a backwards-incompatible change.

Or are you just saying it's a confusing UX that you have to turn on a subresource to opt into the spec/status convention? If that's the case, perhaps there could be a separate field to enable generation for spec without enabling the /status subresource, with validation that you can't have the subresource without having generation too. I'm not aware of a strong use case for wanting those things separately, though.

This comment has been minimized.

@caesarxuchao

caesarxuchao Sep 26, 2018

Member

Maybe users just want to see how many times the spec has been changed.

The explanation in the second paragraph makes sense.

It's a confusing UX that I have to turn on a subresource to make apiserver manages my generation. Adding a specific field to turn on generation management is better than the status quo. But why is it bad to always let the apiserver manage the generation field as long as the CR has the spec field?

This comment has been minimized.

@nikhita

nikhita Sep 27, 2018

Member

I almost forgot I had this early involvement in the 58778 :)

:)

If that's the case, perhaps there could be a separate field to enable generation for spec without enabling the /status subresource, with validation that you can't have the subresource without having generation too.

IMHO adding a new field is too much complexity for this.

But why is it bad to always let the apiserver manage the generation field as long as the CR has the spec field?

Thinking about this more, this makes sense but I still have one concern.

The current behaviour is to increment the generation if the user opts in to /status i.e. they opt-in to spec and status conventions. We did this because the general expectation is that metadata.generation will be useful in a controller with a corresponding status.observedGeneration (which implies that the user should also opt-in to status). This is also what #58778 (comment) said.

But metadata.generation is technically a "per object sequence number" (taking the terminology from #7328). The sequence number of an object should not depend on the presence of status.observedGeneration. It should increment with every spec change, irrespective of whether it will prove useful.

This means that we just require the user to opt-in to spec conventions. Opting in to status conventions should not be a requirement. .spec holds special meaning for both /status and /scale. So, if one of these is enabled, I think we should increment generation.

My only concern is that just enabling the feature without opting in to /status or /scale doesn't guarantee that I want to follow spec conventions. Right now, enabling the feature gate does not hold much meaning unless the user specifically opts-in to either one of the subresources.

That would make it impossible to use those field names without participating in the convention, which would be a backwards-incompatible change.

With this change, we redefine the feature gate to also mean that .spec will participate in the spec conventions. I'm concerned that it will be a backwards-incompatible change.

(But this doesn't really help your use case :/)

@nikhita

This comment has been minimized.

Member

nikhita commented Sep 26, 2018

/area custom-resources

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Sep 27, 2018

My only concern is that just enabling the feature without opting in to /status or /scale doesn't guarantee that I want to follow spec conventions.

I agree that the CustomResourceSubresource feature flag doesn't imply if user wants to follow the spec conventions. I updated the code to always increment generation as long as spec is changed, no matter if the CustomResourceSubresource feature flag is enabled, and no matter if /status is installed. This unconditional increment is similar to PrepareForCreate, where the apiserver always sets generation to 1.

RE. backwards compatibility, I think unconditional increment is ok, because I can't find any reason why a user would depend on .generation always being 1.

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Sep 27, 2018

/retest

@enisoc

This comment has been minimized.

Member

enisoc commented Sep 27, 2018

I could buy that it makes sense to always increment generation, but if the user does not explicitly opt into spec/status convention, I argue we should increment if anything changes, rather than treating a duck-typed spec field specially.

The API reference defines generation as:

A sequence number representing a specific generation of the desired state. [emphasis added]

If a resource does not participate in the spec/status convention, which for backwards compatibility we have to assume if they do not explicitly opt in, then the default meaning of "desired state" is the entire object.

The only reason we treat spec specially is because, in the spec/status convention, status is set aside as a part of the object that does not represent desired state. If there's no status, there's nothing special about spec; the entire object is desired state. That's why it makes sense to tie "ignore status when incrementing generation" together with "turn on the /status subresource that only touches status", and "prevent status modification via the main resource".

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Sep 27, 2018

@enisoc, I agree with your reasoning in #69059 (comment).

However, I'm concerned with UX. The effect of crd.spec.subresources.status becomes very subtle. It decides the meaning of generation. Maybe my view is biased, this is how I came across this issue: I was experimenting with CR, I created it with .spec and .status. I didn't need a /status subresource initially. I would be surprised if changes to .status caused metadata.generation to increment. It's actually more surprising than the current behavior. OTOH, introducing crd.spec.follow-spec-convention sounds overly complicated.

@enisoc

This comment has been minimized.

Member

enisoc commented Oct 3, 2018

I created it with .spec and .status. I didn't need a /status subresource initially.

Even if you didn't need it, or if a hypothetical user doesn't know they need it, we wanted to force you to have a /status subresource before taking any action that would imply you have spec/status support in your custom resource.

One of our design goals was to treat "spec/status convention" as a holistic set of behaviors, which includes a /status subresource and special treatment of metadata.generation (modifying the default assumption that the whole object is part of "desired state"). We wanted to discourage people from opting in to only bits and pieces of the overall convention.

In retrospect, maybe we should have made it controlled by a top-level field like spec.useSpecStatus instead of burying it in spec.subresources.status, but I still would not want people to think they are participating in spec/status ("I magically got the special metadata.generation behavior, so I guess I'm good") without having a /status subresource.

I would be surprised if changes to .status caused metadata.generation to increment.

You're much more familiar with the inner workings of core controllers, though. If I try to imagine I'm a typical user, and I see this is how metadata.generation is defined:

A sequence number representing a specific generation of the desired state.

I would argue that typical user would be surprised if certain fields were treated specially by default. There's no mention in that definition of spec or status. Why isn't generation incrementing when I update my top-level .image field?

@smarterclayton

This comment has been minimized.

Contributor

smarterclayton commented Oct 6, 2018

We don’t require that all objects have spec and status, nor do we require that status be paired with spec, but we very strongly suggest both.

However, we do not bump generation when metadata is changed, so neither labels and annotations can be input to generation, even though in practice many controllers may rely on that action.

I would probably be inclined to have generation be bumped on CRDs be bumped when any non metadata field is updated, and if status is defined, any non-metadata, non-status field is updated.

@k8s-ci-robot k8s-ci-robot added size/L and removed size/S labels Oct 12, 2018

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Oct 12, 2018

@enisoc PTAL. The new behavior is close to what you and Clayton described. Thanks.

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Oct 12, 2018

/retest

1 similar comment
@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Oct 12, 2018

/retest

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Oct 12, 2018

The failed integration test has nothing to do with this PR:

I1012 17:48:16.994] apps.sh:515: FAIL!
I1012 17:48:16.994] Get pods -l "tier=frontend" {{range.items}}{{(index .spec.containers 0).name}}:{{end}}
I1012 17:48:16.994]   Expected: php-redis:php-redis:php-redis:
I1012 17:48:16.995]   Got:  

I'll retry later.

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Oct 12, 2018

The flake is #69376
/retest

},
},
}
utilfeature.DefaultFeatureGate.Set(fmt.Sprintf("%s=true", apiextensionsfeatures.CustomResourceSubresources))

This comment has been minimized.

@nikhita

nikhita Oct 15, 2018

Member

We can use:

func SetFeatureGateDuringTest(t *testing.T, gate feature.FeatureGate, feature feature.Feature, value bool) func() {

which will make sure to restore the feature gate value once the test is finished.

This comment has been minimized.

@caesarxuchao

caesarxuchao Oct 15, 2018

Member

Thanks. Done.

@nikhita

This comment has been minimized.

Member

nikhita commented Oct 15, 2018

Overall looks great. 👍

if utilfeature.DefaultFeatureGate.Enabled(apiextensionsfeatures.CustomResourceSubresources) && a.status != nil {
// If the /status subresource endpoint is installed, it
// indicates that the user wants to participate the spec/status
// convention, then only changes to `.spec` increments the

This comment has been minimized.

@enisoc

enisoc Oct 16, 2018

Member

This is slightly different from what @smarterclayton suggested. Is there a reason for that?

I was expecting that in this case, we would delete metadata like below, then also delete status, and increment if anything else changed. This only matters in the weird case that someone has spec, status, and some other top-level field, but I agree with @smarterclayton that it's the most consistent definition of what Generation means.

} else {
// except for the changes to `metadata`, any other changes
// cause the generation to increment.
newObjCopy := newCustomResourceObject.DeepCopy()

This comment has been minimized.

@enisoc

enisoc Oct 16, 2018

Member

Are the scalability folks still policing new deep copies as much as they used to? I can think of a couple ways we might avoid them but it's not pretty. For example, you could do a shallow copy while omitting metadata, or make a helper that knows how to ignore metadata while checking for mismatches in other top-level fields on either side of the diff. I wouldn't think it's worthwhile, but I've been told in the past to avoid deep copy whenever possible.

oldAccessor, _ := meta.Accessor(oldCustomResourceObject)
newAccessor, _ := meta.Accessor(newCustomResourceObject)
newAccessor.SetGeneration(oldAccessor.GetGeneration() + 1)
}
// If the /status subresource endpoint is installed, update is not allowed to set status.

This comment has been minimized.

@enisoc

enisoc Oct 16, 2018

Member

It seems like if you do this before the if/else branch that sets incrementGeneration, you can delete everything in the if branch and just do the stuff in else every time, assuming that you accept the suggestion from @smarterclayton on what Generation means when the resource has opted into spec/status convention.

@k8s-ci-robot k8s-ci-robot removed the lgtm label Oct 16, 2018

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Oct 16, 2018

@enisoc I updated the code to what you suggested, PTAL. Thanks.

@enisoc

enisoc approved these changes Oct 16, 2018

/lgtm

incrementGeneration = true
}
if incrementGeneration {

This comment has been minimized.

@enisoc

enisoc Oct 16, 2018

Member

Nit: The bool var doesn't seem necessary anymore. Can't you just put this block under if !DeepEqual?

This comment has been minimized.

@caesarxuchao

caesarxuchao Oct 16, 2018

Member

Done. Mind applying the lgtm again? Thanks.

@k8s-ci-robot k8s-ci-robot added lgtm and removed lgtm labels Oct 16, 2018

@enisoc

This comment has been minimized.

Member

enisoc commented Oct 16, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Oct 16, 2018

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Oct 17, 2018

/retest

if (ok1 && ok2 && !apiequality.Semantic.DeepEqual(oldSpec, newSpec)) || (ok1 && !ok2) || (!ok1 && ok2) {
// except for the changes to `metadata`, any other changes
// cause the generation to increment.
newObjCopy := newCustomResourceObject.DeepCopy()

This comment has been minimized.

@sttts

sttts Oct 17, 2018

Contributor

no need for a deepcopy here. Do a shallow-copy of the top-level map instead.

This comment has been minimized.

@caesarxuchao
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)
func generation1() map[string]interface{} {

This comment has been minimized.

@sttts

sttts Oct 17, 2018

Contributor

wow, two 5 line helpers for expressions that are one-liners: map[string]interface{}{"generation": int64(1)}.

This comment has been minimized.

@caesarxuchao

caesarxuchao Oct 17, 2018

Member

I would trade 15 long one-liners with 2 five-line helpers :)

@sttts

This comment has been minimized.

Contributor

sttts commented Oct 17, 2018

Looks good overall. Some nits.

Changes when .metadata.generation of a CR increments.
If the custom resource participates the spec/status convention, the
metadata.generation of the CR increments whenever there is any change,
except for the changes to the metadata or the changes to the status.

If the CR does not participate the spec/status convention, the
metadata.generation of the CR increments whenever there is any change,
except for the changes to the metadata.

A CR is considered to participate the spec/status convention if and only if the
"CustomResourceSubresources" feature gate is turned on and when the CRD
has `.spec.subresources.status={}`.

@k8s-ci-robot k8s-ci-robot removed the lgtm label Oct 17, 2018

@nikhita

This comment has been minimized.

Member

nikhita commented Oct 22, 2018

Lgtm from my side. Can't tag because GH is down.

@nikhita

This comment has been minimized.

Member

nikhita commented Oct 22, 2018

/lgtm

@nikhita

This comment has been minimized.

Member

nikhita commented Oct 23, 2018

@sttts can you approve?

@caesarxuchao

This comment has been minimized.

Member

caesarxuchao commented Oct 25, 2018

@sttts can you approve? Thanks.

@sttts

This comment has been minimized.

Contributor

sttts commented Oct 29, 2018

/approve

@k8s-ci-robot

This comment has been minimized.

Contributor

k8s-ci-robot commented Oct 29, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: caesarxuchao, sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nikhita

This comment has been minimized.

Member

nikhita commented Oct 29, 2018

/retest

@k8s-ci-robot k8s-ci-robot merged commit 5cf9079 into kubernetes:master Oct 29, 2018

18 checks passed

cla/linuxfoundation caesarxuchao authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gke Skipped
pull-kubernetes-e2e-kops-aws Job succeeded.
Details
pull-kubernetes-e2e-kubeadm-gce Skipped
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped
pull-kubernetes-local-e2e-containerized Skipped
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
tide In merge pool.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment