Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use JSON patch for many VolumeSnapshot and VolumeSnapshotContent updates #526

Merged

Conversation

ggriffiths
Copy link
Member

@ggriffiths ggriffiths commented May 26, 2021

Signed-off-by: Grant Griffiths ggriffiths@purestorage.com

What type of PR is this?
/kind bug

What this PR does / why we need it:
This PR addresses the "object has been modified" issue we see a lot in the snapshot-controller/snapshotter.

Before this PR, we were seeing ~30-50 occurrences of this error in the snapshot-controller/snapshotter logs.

This PR only addresses the problematic areas found in the e2e PR tests. If we continue to see more of these errors, we can address replace the remaining Update/UpdateStatus calls in a separate PR.

Which issue(s) this PR fixes:
Fixes #525

Special notes for your reviewer:
n/a

Does this PR introduce a user-facing change?:

Replaces many VolumeSnapshot/VolumeSnapshotContent Update/UpdateStatus operations with Patch. This lowers the probability of the "object has been modified" update API errors occurring. This change introduces a dependency on two new RBAC rules for the snapshotter: volumesnapshotcontents:patch, volumesnapshotcontents/status:patch and four new RBAC rules for the snapshot-controller: volumesnapshotcontents:patch, volumesnapshotcontents/status:patch, volumesnapshots:patch, and volumesnapshots/status: patch.

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 26, 2021
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 26, 2021
@ggriffiths
Copy link
Member Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 26, 2021
@ggriffiths
Copy link
Member Author

ggriffiths commented May 26, 2021

50 error occurrences with MergePatchType as expected with ccfe3c7

$ curl -s https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-csi_external-snapshotter/526/pull-kubernetes-csi-external-snapshotter-1-21-on-kubernetes-1-21/1397628215157067776/artifacts/kube-system/snapshot-controller-58f54d8f6c-btxgf/snapshot-controller.log | grep "object has been modified" | wc -l
50

Trying with JSONPatch now as suggested.

@ggriffiths
Copy link
Member Author

JSONPatch seems to bring this count down for the one area I added it (updateContentErrorStatusWithEvent)

$ curl -s https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-csi_external-snapshotter/526/pull-kubernetes-csi-external-snapshotter-1-21-on-kubernetes-1-21/1397649661208039424/artifacts/kube-system/snapshot-controller-58f54d8f6c-s46br/snapshot-controller.log | grep "object has been modified" | wc -l
34

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 26, 2021
@ggriffiths ggriffiths force-pushed the updatecontentstatuserror_patch branch 4 times, most recently from ab6307e to a635fdc Compare May 27, 2021 02:35
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 17, 2021
@ggriffiths ggriffiths force-pushed the updatecontentstatuserror_patch branch from abd5ce6 to ab4503b Compare July 1, 2021 23:42
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 1, 2021
@ggriffiths ggriffiths closed this Jul 21, 2021
@ggriffiths ggriffiths force-pushed the updatecontentstatuserror_patch branch from ab4503b to e6e14c1 Compare July 21, 2021 19:57
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 21, 2021
@ggriffiths ggriffiths reopened this Jul 21, 2021
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 21, 2021
@ggriffiths ggriffiths force-pushed the updatecontentstatuserror_patch branch from 85a2612 to 294be83 Compare July 21, 2021 20:20
@ggriffiths
Copy link
Member Author

@xing-yang the latest PR tests have zero occurrences of the object has been modified API errors! Ready for review.

Thanks @bswartz for the JSON patch suggestion!

@ggriffiths
Copy link
Member Author

/assign @xing-yang

@xing-yang
Copy link
Collaborator

Thanks @ggriffiths! This is great progress!

- Also address PR feedback re: avoid a deepCopy for annotations patch

Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
@ggriffiths
Copy link
Member Author

Will squash these commits once an initial LGTM is given. Smallers commits have been useful while testing whether or not individual patch additions break the build.

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Sep 30, 2021

@ggriffiths: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-csi-external-snapshotter-1-18-on-kubernetes-1-18 ab4503b link /test pull-kubernetes-csi-external-snapshotter-1-18-on-kubernetes-1-18

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@ggriffiths
Copy link
Member Author

Ready for re-review @xing-yang

Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
@ggriffiths ggriffiths force-pushed the updatecontentstatuserror_patch branch from 0c60d1e to ad7dfe6 Compare October 5, 2021 22:39
@xing-yang
Copy link
Collaborator

Good job @ggriffiths!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 5, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ggriffiths, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 5, 2021
@k8s-ci-robot k8s-ci-robot merged commit 8ba6ab7 into kubernetes-csi:master Oct 5, 2021
@xing-yang xing-yang added kind/bug Categorizes issue or PR as related to a bug. and removed kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels Dec 30, 2021
jsafrane added a commit to jsafrane/cluster-csi-snapshot-controller-operator that referenced this pull request Jan 25, 2022
The operand now requires permissions to patch VolumeSnapshot +
VolumeSnapshotContent (+ their statuses).

See kubernetes-csi/external-snapshotter#526
@ggriffiths ggriffiths mentioned this pull request Sep 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VolumeSnapshotContent update fails with "object has been modified"
3 participants