Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collapse patch conflict retry onto GuaranteedUpdate #63146

Merged
merged 2 commits into from
Apr 26, 2018

Conversation

liggitt
Copy link
Member

@liggitt liggitt commented Apr 25, 2018

xref #63104

This PR builds on #62868

  1. When the incoming patch specified a resourceVersion that failed as a precondition, the patch handler would retry uselessly 5 times. This PR collapses onto GuaranteedUpdate, which immediately stops retrying in that case.

  2. When the incoming patch did not specify a resourceVersion, and persisting to etcd contended with other etcd updates, the retry would try to detect patch conflicts with deltas from the first 'current object' retrieved from etcd and fail with a conflict error in that case. Given that the user did not provide any information about the starting version they expected their patch to apply to, this does not make sense, and results in arbitrary conflict errors, depending on when the patch was submitted relative to other changes made to the resource. This PR changes the patch application to be performed on the object retrieved from etcd identically on every attempt.

fixes #58017
SMP is no longer computed for CRD objects

fixes #42644
No special state is retained on the first attempt, so the patch handler correctly handles the cached storage optimistically trying with a cached object first

/assign @lavalamp

fixed spurious "unable to find api field" errors patching custom resources

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 25, 2018
@lavalamp
Copy link
Member

Looks good so far.

If we add the 3rd mode I discussed in the issue we'd likely want to start by resurrecting some of this code, but I think we can just get it from git history in that case.

@liggitt
Copy link
Member Author

liggitt commented Apr 25, 2018

If we add the 3rd mode I discussed in the issue we'd likely want to start by resurrecting some of this code, but I think we can just get it from git history in that case.

Agree on pulling from git if we need it. Some of the structure might end up similar, but we'd want to be informed by the SMP and stale-first-version issues in how we reintroduced pinning state and computing delta patches to the first version seen.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 25, 2018
@liggitt liggitt changed the title WIP - collapse patch conflict retry onto GuaranteedUpdate collapse patch conflict retry onto GuaranteedUpdate Apr 25, 2018
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Apr 25, 2018
@wojtek-t
Copy link
Member

/assign

@liggitt - if I will be slow with review, don't block on me (but I will try to look tomorrow)

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 25, 2018
@liggitt
Copy link
Member Author

liggitt commented Apr 25, 2018

rebased

"k8s.io/kubernetes/test/integration/framework"
)

// Tests that the apiserver retries non-overlapping conflicts on patches
// Tests that the apiserver retries patches
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will admit that this test change is making me a bit nervous, but I think it's correct. Is the behavior when RV is set tested?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the behavior when RV is set tested?

I added the ability to directly specify patches in the unit tests and added tests for a patch with a stale RV (one that doesn't even match the current object) and a racing RV (one that matches the current object, but encounters a mismatch persisting the update). That test update is in a separate commit.

@@ -549,8 +549,7 @@ func TestPatchResourceWithConflict(t *testing.T) {
startingPod: &example.Pod{},
changedPod: &example.Pod{},
updatePod: &example.Pod{},

expectedError: `Operation cannot be fulfilled on pods.example.apiserver.k8s.io "foo": existing 2, new 1`,
expectedPod: &example.Pod{},
Copy link
Member Author

@liggitt liggitt Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test computes and sends a patch of {"spec":{"nodeName":"there"}} (computed from delta(startingPod, changedPod)), which applies cleanly on top of updatePod, and should succeed in a retry, not fail (see PR description for removal of conflicts-with-delta-from-first-seen-current-object behavior).

As a follow-up, it would be good to refactor these tests to make it clearer what patch is actually being sent and tested

builds on kubernetes#62868

1. When the incoming patch specified a resourceVersion that failed as a precondition,
the patch handler would retry uselessly 5 times. This PR collapses onto GuaranteedUpdate,
which immediately stops retrying in that case.

2. When the incoming patch did not specify a resourceVersion, and persisting to etcd
contended with other etcd updates, the retry would try to detect patch conflicts with
deltas from the first 'current object' retrieved from etcd and fail with a conflict error
in that case. Given that the user did not provide any information about the starting version
they expected their patch to apply to, this does not make sense, and results in arbitrary
conflict errors, depending on when the patch was submitted relative to other changes made
to the resource. This PR changes the patch application to be performed on the object retrieved
from etcd identically on every attempt.

fixes kubernetes#58017
SMP is no longer computed for CRD objects

fixes kubernetes#42644
No special state is retained on the first attempt, so the patch handler correctly handles
the cached storage optimistically trying with a cached object first
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 26, 2018
@liggitt
Copy link
Member Author

liggitt commented Apr 26, 2018

I vote we go ahead with this. Can you squash?

done. original commits are at https://github.com/liggitt/kubernetes/commits/archive/remove-patch-retry

I wonder if we'll see a performance improvement after this merges?

I expect we'll see some. Internal patch retry just got a lot cheaper.

Maybe we should set a release note on this though.

I'll note the bugs fixed, at least. If we can come up with a comprehensible version of "uncoordinated patch requests no longer fail randomly with conflict errors", I'm happy to include it

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Apr 26, 2018
@wojtek-t
Copy link
Member

LGTM (not lgtm-ing to let @lavalamp to look again)

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 26, 2018
@lavalamp
Copy link
Member

/lgtm

In the middle of the night I was worrying about what happens if someone used a precondition other than RV, but I think it is still fine, even in that case, the base object was effectively arbitrary.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 26, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, liggitt, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 6aad80c into kubernetes:master Apr 26, 2018
@liggitt
Copy link
Member Author

liggitt commented Apr 26, 2018

having removed the two test workarounds for TestPatch flakes as part of this, I'll keep an eye on CI triage results over the next couple days:

https://storage.googleapis.com/k8s-gubernator/triage/index.html?pr=1&test=TestPatch

@liggitt liggitt deleted the remove-patch-retry branch April 26, 2018 20:05
liggitt added a commit to liggitt/kubernetes that referenced this pull request Jul 13, 2018
Minimal backport of kubernetes#63146

Fixes spurious patch errors for CRDs
Fixes patch errors for nodes when the watch cache has a persistently stale version of an object
liggitt added a commit to liggitt/kubernetes that referenced this pull request Jul 13, 2018
Minimal backport of kubernetes#63146

Fixes spurious patch errors for CRDs
Fixes patch errors for nodes when the watch cache has a persistently stale version of an object
k8s-github-robot pushed a commit that referenced this pull request Jul 21, 2018
Automatic merge from submit-queue.

Remove patch retry conflict detection

Minimal backport of #63146
Fixes #58002

Fixes spurious patch errors for CRDs
Fixes patch errors for nodes when the watch cache has a persistently stale version of an object

```release-note
fixes spurious "meaningful conflict" error encountered by nodes attempting to update status, which could cause them to be considered unready
```
k8s-github-robot pushed a commit that referenced this pull request Jul 21, 2018
Automatic merge from submit-queue.

Remove patch retry conflict detection

Minimal backport of #63146
Fixes #58002

Fixes spurious patch errors for CRDs
Fixes patch errors for nodes when the watch cache has a persistently stale version of an object

```release-note
fixes spurious "meaningful conflict" error encountered by nodes attempting to update status, which could cause them to be considered unready
```
killwing pushed a commit to qiniu-ava/kubernetes that referenced this pull request Aug 7, 2018
* avoid dobule RLock() in cpumanager

* use ListByResourceGroup instead of List()

* Fix AWS NLB delete error

* Let the kubernetes service reconciler timeout on shutdown

* Skip NodeAffinity validation test.

NodeAffinity validation logic changed in 1.10.2, and the tests were
updated to reflect it. E2E upgrade tests use 1.9 tests against a 1.10
server, though, so this specific test fails.

Fix it by calling framework.Skipf when the server version is >=1.10.2.
This test was deleted in 1.10.2.
See kubernetes#63815

* Always track kubelet -> API connections

* Close all kubelet->API connections on heartbeat failure

* update cadvisor godeps to v0.28.4 to fix container start times

* Scrape service for all nodeports

* fix IsLikelyNotMountPoint func on Windows

* add IsLikelyNotMountPoint test on Windows

fix comments

fix comments on unit test

fix comments

* Only mount subpath as readonly if specified in volumeMount

* Fix panic while provisioning Azure security group rules

* Bump version of prometheus-to-sd.

* Cacher stopLock should be unlocked

* fix typos

* Create system:cluster-autoscaler account & role and introduce it to CA start-up script

* apiextensions: fix concurrent map access copying items' ObjectMeta in Unstructured

The list+get endpoints sets the self-link. If we do not create a (shallow)
copy of ObjectMeta this will mutate the cached objects.

* Fix some log issues in flexvolume

* bump(github.com/json-iterator/go): f2b4162afba35581b6d4a50d3b8f34e33c144682

* v1.9: make json serializer case sensitive

* update staging godeps

* update cAdvisor godeps to v0.28.5

* Disable session affinity for internal kuberntes service

Under following conditions session affinity leads to a deadlock:
  - Self hosted controller-manager, where it talks to API servers
    via kubernetes service ClusterIP
  - default master-count reconcilier is used
  - --apiserver-count is set to >1 according to the help message
  - number of responsive APIServers goes below `apiserver-count`
  - all controller-managers happen to be hashed to APIServers which
    are down.

What then happens is that controller managers never be able to
contact APIServer, despite correctly working APIServer available.

Less serious outages also possible for other consumers of kubernetes
service, such as operators, kube-dns, flannel & calico, etc.  There is
always non zero chance, that given consumer is hashed  to an apiserver
which is down.

Revert "give the kubernetes service client ip session affinity"
This reverts commit e21ebbc.

* Remove item from taint manager workqueue on completion

* Fix scheduler config decoding

* Openstack: Fill size attribute for the V3 API volumes

The getVolume method in OpenStack provider is not
filling the Size for the V3 API type volumes. This
breaks the PV resizing of Cinder volumes.

* update NPD version to v0.5.0 for gci

* etcd: reuse leases for keys in a time window

Reuse leases for keys in a time window, to reduce the overhead to etcd
caused by using massive number of leases

Fixes kubernetes#47532

* fix formatAndMount func issue on Windows

fix comments

* add formatAndMount unit test on Windows

fix comments

fix comments

fix comments

fix comments

* BUGFIX: must use ID, not name, of the node security group when adding rules to it

* In case storage class parameters are empty, create a new map for Portworx volume labels

Fixes kubernetes#64894

Signed-off-by: Harsh Desai <harsh@portworx.com>

* Kubernetes version v1.9.10-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.9.md for v1.9.9.

* Bug fix: Should allow alias range size equals to max number of pods * 2

* Reload systemd config files before starting kubelet.

In some environments, the os image comes with preloaded kubelet.service,
so we need to reload systemctl configs to make changes effective.

* fix smb mount security issue

* Update Cluster Autoscaler version to 1.1.3

* GC: remove CRD and APIService from ignored resources

* Update Calico addon yamls to make it work for both 2.x and 3.x. versions.

Co-authored-by: Casey Davenport <casey@tigera.io>

* A few cleanups (remove duplicated env vars & unnecessary comments) on yaml files.

* Add a helper function to customize K8s addon yamls and use it to customize Calico addons on GKE.

* Remove unnecessary spaces ahead of custom yaml.

* Fix pod worker deadlock.

Signed-off-by: Lantao Liu <lantaol@google.com>

* Fix NPD preload.

* Compare stateful set updates semantically

* Remove patch retry conflict detection

Minimal backport of kubernetes#63146

Fixes spurious patch errors for CRDs
Fixes patch errors for nodes when the watch cache has a persistently stale version of an object

* Don't validate HealthzBindAddress in KubeProxyConfiguration if it's empty

* Fix locating resporce-pool for volume provisioning

* feature gate influxdb test

influxdb test has been removed as of 1.11 so it should be behind
a feature gate for older releases

* client-go: fix arg in t.Fatalf

* Kubernetes version v1.9.11-beta.0 openapi-spec file updates

* Add/Update CHANGELOG-1.9.md for v1.9.10.
ypnuaa037 added a commit to ypnuaa037/kubernetes that referenced this pull request Jan 25, 2019
Minimal backport of kubernetes#63146

Fixes spurious patch errors for CRDs
Fixes patch errors for nodes when the watch cache has a persistently stale version of an object

PR: kubernetes#66171
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
5 participants