Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow update/patch of CRs while CRD is terminating #60542

Merged
merged 1 commit into from Mar 1, 2018

Conversation

@liggitt
Copy link
Member

liggitt commented Feb 28, 2018

Fixes #60538

Update/patch need to be allowed so finalizers can act on custom resources for terminating CRDs

Fixes potential deadlock when deleting CustomResourceDefinition for custom resources with finalizers
@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented Feb 28, 2018

@liggitt liggitt force-pushed the liggitt:terminating-crd branch from 8e0cb60 to df7f96f Feb 28, 2018

@k8s-ci-robot k8s-ci-robot added size/S and removed size/XS labels Feb 28, 2018

@sttts

This comment has been minimized.

Copy link
Contributor

sttts commented Feb 28, 2018

Can we restrict the mutations to finalizer changes?

@ncdc

This comment has been minimized.

Copy link
Member

ncdc commented Feb 28, 2018

@sttts is there an easy way to check for that?

@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented Feb 28, 2018

Can we restrict the mutations to finalizer changes?

I don't think so... part of the finalizer's work could involve moving other bits of the object through a workflow that involves changing fields.

Is there a concrete reason to disallow update/patch operations while terminating? Disallowing create makes a lot of sense.

@sttts

This comment has been minimized.

Copy link
Contributor

sttts commented Feb 28, 2018

We quite certainly do not want spec (if we ever support e.g. versions), status and scale changes. I can imagine that there are other metadata changes.

@sttts

This comment has been minimized.

Copy link
Contributor

sttts commented Feb 28, 2018

An alternative: we could also delay the termination (= removal of CRs) until customresourcecleanup.apiextensions.k8s.io is the last remaining finalizer, then mark it read-only and start removing objects.

@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented Feb 28, 2018

in this case, it was finalizers on the custom resource, not the CRD, that were the issue. customresourcecleanup.apiextensions.k8s.io was already the only finalizer remaining on the CRD

@sttts sttts changed the title Allow update/patch of CRD while terminating Allow update/patch of CR while CRD is terminating Feb 28, 2018

@sttts

This comment has been minimized.

Copy link
Contributor

sttts commented Feb 28, 2018

in this case, it was finalizers on the custom resource, not the CRD, that were the issue

right, updated the title of the issue.

@sttts sttts changed the title Allow update/patch of CR while CRD is terminating Allow update/patch of CRs while CRD is terminating Feb 28, 2018

@sttts

This comment has been minimized.

Copy link
Contributor

sttts commented Feb 28, 2018

After we are talking about the same thing, I tend to agree that we can allow updates and patches. @deads2k you introduced this restriction originally. What was the motivation?

@deads2k

This comment has been minimized.

Copy link
Contributor

deads2k commented Feb 28, 2018

After we are talking about the same thing, I tend to agree that we can allow updates and patches. @deads2k you introduced this restriction originally. What was the motivation?

Didn't want new things as we cleaned up and I didn't want to risk any conflicts. I don't object to allowing update and patch as long as the cleanup tolerates retrying.

@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented Mar 1, 2018

I don't object to allowing update and patch as long as the cleanup tolerates retrying.

Yeah, deletecollection doesn't hit conflicts, and if there are pending finalizers, the CRD finalizer updates the condition saying it couldn't verify zero remaining instances and requeues

@k8s-cherrypick-bot

This comment has been minimized.

Copy link

k8s-cherrypick-bot commented Mar 1, 2018

Removing label cherrypick-candidate because no release milestone was set. This is an invalid state and thus this PR is not being considered for cherry-pick to any release branch. Please add an appropriate release milestone and then re-add the label.

@ncdc

This comment has been minimized.

Copy link
Member

ncdc commented Mar 1, 2018

/lgtm

@deads2k

This comment has been minimized.

Copy link
Contributor

deads2k commented Mar 1, 2018

/approve

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Mar 1, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, liggitt, ncdc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@deads2k

This comment has been minimized.

Copy link
Contributor

deads2k commented Mar 1, 2018

/status approved-for-milestone

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

k8s-github-robot commented Mar 1, 2018

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

k8s-github-robot commented Mar 1, 2018

Automatic merge from submit-queue (batch tested with PRs 60542, 60237). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 571b1e2 into kubernetes:master Mar 1, 2018

14 of 15 checks passed

Submit Queue Required Github CI test is not green: pull-kubernetes-e2e-gce
Details
cla/linuxfoundation liggitt authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gke Skipped
pull-kubernetes-e2e-kops-aws Job succeeded.
Details
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce Job succeeded.
Details
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-unit Context retired. Status moved to "pull-kubernetes-integration".
pull-kubernetes-verify Job succeeded.
Details
@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

k8s-github-robot commented Mar 1, 2018

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@liggitt liggitt deleted the liggitt:terminating-crd branch Mar 12, 2018

k8s-github-robot pushed a commit that referenced this pull request Mar 14, 2018

Kubernetes Submit Queue
Merge pull request #60649 from liggitt/automated-cherry-pick-of-#6054…
…2-upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #60542: Allow update/patch of CRD while terminating

Cherry pick of #60542 on release-1.9.

#60542: Allow update/patch of CRD while terminating

satyasm pushed a commit to satyasm/kubernetes that referenced this pull request Mar 27, 2018

Kubernetes Submit Queue
Merge pull request kubernetes#60592 from xmudrii/crd-tests
Automatic merge from submit-queue (batch tested with PRs 61402, 61143, 61427, 60592). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

apiextensions-apiserver: TestFinaliazationAndDeletion integration test

**What this PR does / why we need it**: This PR adds an integration test for [behavior described in the issue kubernetes#60538 (comment)](kubernetes#60538 (comment)).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Partially addresses kubernetes#60538 -- tests

**Special notes for your reviewer**: This is based of the PR kubernetes#60542, so this shouldn't be merged before that one. Tests are failing on the line 155, due to the timeout/non-nil error. I've tried to debug it furtherly, and found out that the CRD is still terminating. 

Actually, the CRD gets deleted, but it takes some time. For example, if you add `time.Sleep(x * time.Second)` on the line 152, the test would pass. The problem is this isn't predictable. With `x` of 6-10 seconds, I manage to get this test passing, but there's zero guarantees the CRD will get deleted for that 10 seconds.

I have found the `wait.Poll` function in some tests, but I'm not sure can it fix the problem I have.

This is my first PR to the Kubernetes project, so I could be missing something. Any idea is appreciated. :)

**Release note**:

```release-note
NONE
```

/sig api-machinery
/area custom-resources
/cc nikhita sttts liggitt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.