Fix missing resource version when updating the scale subresource of custom resource #80572

Merged
merged 2 commits into from Nov 14, 2019


@knight42
Contributor

knight42 commented Jul 25, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
Do not clear the resource version of a custom resource when saving it to etcd.

Which issue(s) this PR fixes:
Fixes #80515

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Scale custom resource unconditionally if resourceVersion is not provided

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot

Contributor

k8s-ci-robot commented Jul 25, 2019

Hi @knight42. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knight42 knight42 changed the title from "fix: be able to scale CustomResource" to "fix: be able to scale CustomResource using kubectl" Jul 25, 2019
@k8s-ci-robot k8s-ci-robot requested review from juanvallejo and rootfs Jul 25, 2019
@knight42

Contributor Author

knight42 commented Jul 28, 2019

/assign @apelisse

@apelisse

Member

apelisse commented Jul 28, 2019

I don't think that we should be doing a GET before the request. Scale is an imperative command and unless you specify the resource-version, it should just be executed as specified.

This comment from #80515 in particular makes me think that there is something wrong:

On that note, I also noticed that if you want to update a CR by crafting the request manually and using curl, you also need to specify the resource version or you will get the same error. I'm not sure if this is expected and related, but I'm writing it down as it might be related. As far as I know, this is not required for some native resources at all, but I'm not sure is this the case for CRDs.

I'd like to understand why that is the case. I suspect it may be a bug, or a behavior we don't understand, in the CRD handler.

@knight42

Contributor Author

knight42 commented Jul 28, 2019

@apelisse That makes sense, I'll try to dig deeper into it.

@knight42 knight42 changed the title from "fix: be able to scale CustomResource using kubectl" to "[WIP] fix: be able to scale CustomResource using kubectl" Jul 28, 2019
@knight42

Contributor Author

knight42 commented Jul 28, 2019

@apelisse I think I have found the root cause.

When updating an object, the apiserver checks whether the object can be updated unconditionally:

https://github.com/kubernetes/apiserver/blob/781c3cd1b3dc5b6f79c68ab0d16fe544600421ef/pkg/registry/generic/registry/store.go#L481-L489

If not, the validation will fail:

https://github.com/kubernetes/apiserver/blob/781c3cd1b3dc5b6f79c68ab0d16fe544600421ef/pkg/registry/generic/registry/store.go#L531-L537

And custom resources currently cannot be updated unconditionally:

https://github.com/kubernetes/apiextensions-apiserver/blob/102230e288fd77afcf1a6e7258eac008a891d885/pkg/registry/customresource/strategy.go#L158-L161

@apelisse Do you have any idea why custom resources cannot be updated unconditionally?
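The check described in this comment can be sketched roughly as follows. This is a simplified illustration, not the code behind the linked store.go lines: `checkUpdate`, its parameters, and the two error values are hypothetical names, and resource versions are modeled as plain integers where 0 means "not provided".

```go
package main

import "errors"

// Hypothetical error values standing in for the API errors the server returns.
var (
	errMissingVersion = errors.New("metadata.resourceVersion: must be specified for an update")
	errConflict       = errors.New("the object has been modified; please apply your changes to the latest version and try again")
)

// checkUpdate models the optimistic-concurrency decision for an update.
// storedVersion is the version currently in storage, requestVersion is the
// version carried by the incoming object (0 means none was provided), and
// allowUnconditional mirrors what the resource's update strategy permits.
func checkUpdate(storedVersion, requestVersion uint64, allowUnconditional bool) error {
	// An update is unconditional only if the client sent no version
	// and the strategy allows skipping the version check.
	if requestVersion == 0 && allowUnconditional {
		return nil
	}
	// Otherwise a version is mandatory.
	if requestVersion == 0 {
		return errMissingVersion
	}
	// A provided version must match what is stored.
	if requestVersion != storedVersion {
		return errConflict
	}
	return nil
}
```

Under these assumptions, a request without a version succeeds only when the strategy allows unconditional updates, which matches the failure described above for custom resources.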

@apelisse

Member

apelisse commented Jul 29, 2019

@knight42

Contributor Author

knight42 commented Jul 29, 2019

@apelisse Yeah it is, but the problem is still related to https://github.com/kubernetes/apiserver/blob/781c3cd1b3dc5b6f79c68ab0d16fe544600421ef/pkg/registry/generic/registry/store.go#L531-L537.

Here is how it goes:
When the handler processes an update request for the scale subresource, the scale variable defined here is actually the request body sent by kubectl, which contains no resource version. The handler later executes cr.SetResourceVersion(scale.ResourceVersion), which leaves the CR without a resource version. As a result, r.store.Update, which invokes https://github.com/kubernetes/apiserver/blob/781c3cd1b3dc5b6f79c68ab0d16fe544600421ef/pkg/registry/generic/registry/store.go#L453, returns an error because the resource version is missing.

So I guess the fix may be simply removing this line:

since the CR fetched by the handler already contains a resource version. What do you think?
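To make the sequence above concrete, here is a minimal sketch of the two behaviors. The types and function names are hypothetical stand-ins, not the handler's real code. The second function shows one possible fix consistent with the release note (copy the version only when the client sent one); it is an assumption about the approach, not a copy of the merged change.

```go
package main

// scaleRequest is a hypothetical stand-in for the Scale body sent by the client.
type scaleRequest struct {
	Replicas        int64
	ResourceVersion string // empty when the client did not provide one
}

// customResource is a hypothetical stand-in for the CR read from storage.
type customResource struct {
	Replicas        int64
	ResourceVersion string
}

// applyScaleAlwaysCopy mirrors the problematic behavior: the version from the
// request always overwrites the stored one, even when the request has none.
func applyScaleAlwaysCopy(stored customResource, req scaleRequest) customResource {
	stored.Replicas = req.Replicas
	stored.ResourceVersion = req.ResourceVersion
	return stored
}

// applyScaleKeepVersion copies the request's version only when it is set, so
// a request without a version keeps the version read from storage.
func applyScaleKeepVersion(stored customResource, req scaleRequest) customResource {
	stored.Replicas = req.Replicas
	if req.ResourceVersion != "" {
		stored.ResourceVersion = req.ResourceVersion
	}
	return stored
}
```

With a stored version of "42" and a request that carries none, the first function yields an empty version (which the later update rejects), while the second keeps "42".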

@k8s-ci-robot k8s-ci-robot added size/XS and removed size/S labels Jul 31, 2019
@knight42

Contributor Author

knight42 commented Jul 31, 2019

@apelisse PTAL

@knight42 knight42 changed the title from "[WIP] fix: be able to scale CustomResource using kubectl" to "fix: be able to scale CustomResource using kubectl" Jul 31, 2019
@knight42

Contributor Author

knight42 commented Jul 31, 2019

/assign @lavalamp

@k8s-ci-robot k8s-ci-robot added this to the v1.17 milestone Nov 10, 2019
@liggitt

Member

liggitt commented Nov 10, 2019

one comment update, and one additional test, then this LGTM

@knight42 knight42 force-pushed the knight42:fix/scale-cr branch from 61d07f1 to af755f2 Nov 11, 2019
@knight42

Contributor Author

knight42 commented Nov 11, 2019

@liggitt I found that the "retry on conflicts" mechanism may not be optimal. As shown in https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/80572/pull-kubernetes-bazel-test/1193803978668773382, in extreme cases, such as frequent concurrent patches, we may keep retrying until the timeout and then return the last error. So I decided to switch to patching the replicas field on the server side. How does that sound to you?
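The concern with retrying can be illustrated with a small, generic retry loop. This is a sketch under stated assumptions, not the code that was in the PR: `retryOnConflict` and `errConflict` are hypothetical names, and a fixed attempt budget stands in for the timeout.

```go
package main

import "errors"

// errConflict is a hypothetical stand-in for an optimistic-lock conflict error.
var errConflict = errors.New("conflict: the object has been modified")

// retryOnConflict calls update until it succeeds, fails with a non-conflict
// error, or the attempt budget is exhausted. It always tries at least once.
// When the budget runs out, the last conflict error is returned to the caller.
func retryOnConflict(maxAttempts int, update func() error) error {
	for attempt := 1; ; attempt++ {
		err := update()
		if !errors.Is(err, errConflict) || attempt >= maxAttempts {
			return err
		}
	}
}
```

If concurrent writers keep changing the object, every attempt can conflict and the caller ends up with the last conflict error, which is the situation described in the comment above.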

@knight42 knight42 force-pushed the knight42:fix/scale-cr branch from e231b22 to 73e46f7 Nov 12, 2019
@knight42 knight42 force-pushed the knight42:fix/scale-cr branch from 73e46f7 to dc2d639 Nov 13, 2019
@liggitt

Member

liggitt commented Nov 13, 2019

two comments on the test, then lgtm

Signed-off-by: knight42 <anonymousknight96@gmail.com>
@knight42 knight42 force-pushed the knight42:fix/scale-cr branch from dc2d639 to da24601 Nov 14, 2019
@liggitt

Member

liggitt commented Nov 14, 2019

/lgtm
/approve

@k8s-ci-robot

Contributor

k8s-ci-robot commented Nov 14, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knight42, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@liggitt

Member

liggitt commented Nov 14, 2019

/hold cancel

@k8s-ci-robot k8s-ci-robot merged commit a6f51da into kubernetes:master Nov 14, 2019
15 checks passed
cla/linuxfoundation: knight42 authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-dependencies: Job succeeded.
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-kind: Job succeeded.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-node-e2e-containerd: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
tide: In merge pool.
@knight42 knight42 deleted the knight42:fix/scale-cr branch Nov 14, 2019
@aermakov-zalando

aermakov-zalando commented Jan 24, 2020

Any chance for this to be backported to 1.15/1.16? @liggitt

@liggitt

Member

liggitt commented Jan 24, 2020

This is a more invasive change than is typically backported. Note that #81342 made it into 1.16 and modifies kubectl to use patch when scaling, which works with custom resources in 1.15/1.16 servers.
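As a rough illustration of scaling via patch, the body of such a request could be built as below. This assumes a JSON merge patch that sets spec.replicas on the Scale object; the type and function names are hypothetical, and nothing here is taken from #81342.

```go
package main

import "encoding/json"

// scalePatchSpec and scalePatch are hypothetical types describing the only
// field the patch touches: spec.replicas.
type scalePatchSpec struct {
	Replicas int32 `json:"replicas"`
}

type scalePatch struct {
	Spec scalePatchSpec `json:"spec"`
}

// buildScalePatch returns a JSON merge patch body that sets spec.replicas.
// Because the body carries no resourceVersion, the change does not depend on
// the client having read the object first.
func buildScalePatch(replicas int32) ([]byte, error) {
	return json.Marshal(scalePatch{Spec: scalePatchSpec{Replicas: replicas}})
}
```

For example, `buildScalePatch(3)` produces `{"spec":{"replicas":3}}`.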

@aermakov-zalando

aermakov-zalando commented Jan 24, 2020

@liggitt I see, thanks!
