Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcd3 store: retry with live object on conflict if there was a suggestion #43152

Conversation

ncdc
Copy link
Member

@ncdc ncdc commented Mar 15, 2017

Retry with a live object instead of the cached version if the watch
cache receives a conflict trying to do the update.

Fixes #41892

Fix spurious resource version conflict error during `kubectl replace`.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 15, 2017
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Mar 15, 2017
@ncdc ncdc added kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed release-note-label-needed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 15, 2017
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Mar 15, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@ncdc
Copy link
Member Author

ncdc commented Mar 15, 2017

Was the change to use the watch cache introduced in 1.5 or 1.6? I'm wondering if we need a release note.

@ncdc
Copy link
Member Author

ncdc commented Mar 15, 2017

If we want this for 1.6, we should move #41892 back to the v1.6 milestone too

@ncdc ncdc force-pushed the watch-cache-retry-live-object-on-conflict branch from 75bcb7c to 8e72586 Compare March 15, 2017 19:51
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 15, 2017
return nil
}

func TestGuaranteedUpdate(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we give this test with a more descriptive name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiang90 TestGuaranteedUpdateUsesLiveObjectOnConflict?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DropSuggestionsOnConflict?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@xiang90
Copy link
Contributor

xiang90 commented Mar 15, 2017

@ncdc

I feel it would be cleaner just do a pass through inside storage's GuaranteedUpdate func. But I guess we might argue to keep the actual storage implement simple for potential other backends.

I do not have strong feeling here.
So LGTM.

@ncdc
Copy link
Member Author

ncdc commented Mar 15, 2017

@xiang90 I don't understand what you're suggesting. Could you please clarify?

@liggitt
Copy link
Member

liggitt commented Mar 15, 2017

I'm trying to put together a test to satisfy myself that this does the right thing for Updates on AllowCreateOnUpdate items

@xiang90
Copy link
Contributor

xiang90 commented Mar 15, 2017

I don't understand what you're suggesting. Could you please clarify?

The storage implementation with curObj here: https://github.com/kubernetes/kubernetes/pull/43152/files#diff-dc17590f6b960d2fea2b44de5c0fbc10R529

The storage implementation can be smarter about dropping the suggested curObj and try again inside the func. So we do not need to explicitly drop the curObj outside the storage implementation.

@ncdc
Copy link
Member Author

ncdc commented Mar 15, 2017

@xiang90 ok I understand, thanks. I'll wait for more review comments before I make any adjustments.

@ncdc
Copy link
Member Author

ncdc commented Mar 15, 2017

@k8s-bot gce etcd3 e2e test this
@k8s-bot cvm gce e2e test this
@k8s-bot kops aws e2e test this
@k8s-bot gci gce e2e test this

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enj, liggitt, ncdc, wojtek-t

Associated issue: 41892

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@wojtek-t
Copy link
Member

@ncdc @liggitt - what do you think about cherrypicking it? I think we should.

@liggitt
Copy link
Member

liggitt commented Sep 15, 2017

@ncdc @liggitt - what do you think about cherrypicking it? I think we should.

absolutely, especially given the etcd bug. I'd like to see this in 1.7 and 1.6

@k8s-cherrypick-bot
Copy link

Removing label cherrypick-candidate because no release milestone was set. This is an invalid state and thus this PR is not being considered for cherry-pick to any release branch. Please add an appropriate release milestone and then re-add the label.

@liggitt
Copy link
Member

liggitt commented Sep 15, 2017

separately, it seems like we'd want to pick up an etcd level with a fix for that bug

@liggitt
Copy link
Member

liggitt commented Sep 15, 2017

/retest

1 similar comment
@liggitt
Copy link
Member

liggitt commented Sep 16, 2017

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 52176, 43152). If you want to cherry-pick this change to another branch, please follow the instructions here..

@k8s-github-robot k8s-github-robot merged commit d48611a into kubernetes:master Sep 16, 2017
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Sep 16, 2017

@ncdc: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-bazel-test bf33df1 link /test pull-kubernetes-bazel-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@liggitt
Copy link
Member

liggitt commented Sep 17, 2017

will prep a pick to 1.7 (along with #48394 which was never picked as well)

k8s-github-robot pushed a commit that referenced this pull request Sep 20, 2017
…#43152-upstream-release-1.6

Automatic merge from submit-queue.

Automated cherry pick of #48394 #43152

Cherry pick of #48394 #43152 on release-1.6.

#48394: GuaranteedUpdate must write if stored data is not canonical
#43152: etcd3 store: retry w/live object on conflict
@wojtek-t wojtek-t added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Oct 2, 2017
k8s-github-robot pushed a commit that referenced this pull request Oct 2, 2017
…#43152-upstream-release-1.7

Automatic merge from submit-queue.

Automated cherry pick of #48394 #43152

Cherry pick of #48394 #43152 on release-1.7.

#48394: GuaranteedUpdate must write if stored data is not canonical
#43152: etcd3 store: retry w/live object on conflict
@k8s-cherrypick-bot
Copy link

Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet