collapse patch conflict retry onto GuaranteedUpdate #63146
Conversation
Force-pushed from 494ecd6 to 70a83a9
Looks good so far. If we add the 3rd mode I discussed in the issue we'd likely want to start by resurrecting some of this code, but I think we can just get it from git history in that case.
Agree on pulling from git if we need it. Some of the structure might end up similar, but we'd want to be informed by the SMP and stale-first-version issues in how we reintroduce pinning state and computing delta patches to the first version seen.
Force-pushed from 70a83a9 to 9e6d4fa
Force-pushed from 9e6d4fa to 12665fd
/assign @liggitt - if I'm slow with review, don't block on me (but I will try to look tomorrow)
Force-pushed from 12665fd to 8054065
rebased
"k8s.io/kubernetes/test/integration/framework" | ||
) | ||
|
||
// Tests that the apiserver retries non-overlapping conflicts on patches | ||
// Tests that the apiserver retries patches |
I will admit that this test change is making me a bit nervous, but I think it's correct. Is the behavior when RV is set tested?
> Is the behavior when RV is set tested?
I added the ability to directly specify patches in the unit tests and added tests for a patch with a stale RV (one that doesn't even match the current object) and a racing RV (one that matches the current object, but encounters a mismatch persisting the update). That test update is in a separate commit.
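For readers following along, here is a hypothetical sketch of the two scenarios being described. It is not the actual test table (the real unit tests use different field names and infrastructure); every identifier below is invented for illustration.

```go
package main

// patchRVTestCases illustrates the two directly-specified-patch scenarios:
// a stale resourceVersion and a racing resourceVersion. Field names and
// values are illustrative only, not the real test fixtures.
var patchRVTestCases = []struct {
	name      string
	patch     string // the patch body is specified directly, not computed from object deltas
	currentRV string // resourceVersion of the object currently in storage
	expectErr bool   // whether a conflict error is expected
}{
	{
		// Stale RV: the precondition cannot match the stored object, so the
		// request fails with a conflict immediately instead of retrying.
		name:      "stale resourceVersion",
		patch:     `{"metadata":{"resourceVersion":"1"},"spec":{"nodeName":"there"}}`,
		currentRV: "3",
		expectErr: true,
	},
	{
		// Racing RV: the precondition matches the current object, but a
		// concurrent write bumps the RV while the update is being persisted,
		// so the conflict is returned to the caller rather than retried
		// against a newer object.
		name:      "racing resourceVersion",
		patch:     `{"metadata":{"resourceVersion":"3"},"spec":{"nodeName":"there"}}`,
		currentRV: "3",
		expectErr: true,
	},
}

func main() {}
```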
@@ -549,8 +549,7 @@ func TestPatchResourceWithConflict(t *testing.T) {
 		startingPod: &example.Pod{},
 		changedPod:  &example.Pod{},
 		updatePod:   &example.Pod{},
-		expectedError: `Operation cannot be fulfilled on pods.example.apiserver.k8s.io "foo": existing 2, new 1`,
+		expectedPod: &example.Pod{},
this test computes and sends a patch of `{"spec":{"nodeName":"there"}}` (computed from delta(`startingPod`, `changedPod`)), which applies cleanly on top of `updatePod`, and should succeed in a retry, not fail (see the PR description for removal of the conflicts-with-delta-from-first-seen-current-object behavior).
As a follow-up, it would be good to refactor these tests to make it clearer what patch is actually being sent and tested.
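For reference, the patch described above and its clean application can be reproduced with the published strategicpatch helpers. This is a standalone sketch, not the test code: it uses `corev1.Pod` in place of the internal `example.Pod` type, and the contents of `updatePod` are invented.

```go
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

func main() {
	// startingPod is the object the client last saw; changedPod is what the
	// client wants. The only difference is spec.nodeName.
	startingPod := corev1.Pod{}
	changedPod := corev1.Pod{}
	changedPod.Spec.NodeName = "there"

	startingJSON, _ := json.Marshal(startingPod)
	changedJSON, _ := json.Marshal(changedPod)

	// delta(startingPod, changedPod) -> {"spec":{"nodeName":"there"}}
	patch, err := strategicpatch.CreateTwoWayMergePatch(startingJSON, changedJSON, corev1.Pod{})
	if err != nil {
		panic(err)
	}
	fmt.Println("patch:", string(patch))

	// updatePod stands in for the object currently in storage; here a
	// concurrent writer added a label, which does not overlap with the patch.
	updatePod := corev1.Pod{}
	updatePod.Labels = map[string]string{"updated": "true"}
	updateJSON, _ := json.Marshal(updatePod)

	// The patch applies cleanly on top of updatePod, so the request should
	// succeed on retry rather than fail with a conflict.
	patched, err := strategicpatch.StrategicMergePatch(updateJSON, patch, corev1.Pod{})
	if err != nil {
		panic(err)
	}
	fmt.Println("patched:", string(patched))
}
```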
Force-pushed from 8054065 to b526532
Done. Original commits are at https://github.com/liggitt/kubernetes/commits/archive/remove-patch-retry
I expect we'll see some. Internal patch retry just got a lot cheaper.
I'll note the bugs fixed, at least. If we can come up with a comprehensible version of "uncoordinated patch requests no longer fail randomly with conflict errors", I'm happy to include it.
LGTM (not lgtm-ing to let @lavalamp look again) /approve
/lgtm In the middle of the night I was worrying about what happens if someone used a precondition other than RV, but I think it is still fine; even in that case, the base object was effectively arbitrary.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: lavalamp, liggitt, wojtek-t
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
Having removed the two test workarounds for TestPatch flakes as part of this, I'll keep an eye on CI triage results over the next couple of days: https://storage.googleapis.com/k8s-gubernator/triage/index.html?pr=1&test=TestPatch
Minimal backport of kubernetes#63146. Fixes spurious patch errors for CRDs. Fixes patch errors for nodes when the watch cache has a persistently stale version of an object.
Automatic merge from submit-queue. Remove patch retry conflict detection. Minimal backport of #63146. Fixes #58002. Fixes spurious patch errors for CRDs. Fixes patch errors for nodes when the watch cache has a persistently stale version of an object.

```release-note
fixes spurious "meaningful conflict" error encountered by nodes attempting to update status, which could cause them to be considered unready
```
Minimal backport of kubernetes#63146. Fixes spurious patch errors for CRDs. Fixes patch errors for nodes when the watch cache has a persistently stale version of an object. PR: kubernetes#66171
xref #63104
This PR builds on #62868.
When the incoming patch specified a resourceVersion that failed as a precondition, the patch handler would uselessly retry 5 times. This PR collapses the patch retry loop onto GuaranteedUpdate, which immediately stops retrying in that case.
When the incoming patch did not specify a resourceVersion and persisting to etcd contended with other etcd updates, the retry would try to detect patch conflicts against deltas from the first 'current object' retrieved from etcd, and fail with a conflict error in that case. Given that the user did not provide any information about the starting version they expected their patch to apply to, this does not make sense, and it results in arbitrary conflict errors depending on when the patch was submitted relative to other changes made to the resource. This PR changes the patch application to be performed on the object retrieved from etcd identically on every attempt.
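With the conflict-with-first-seen-delta logic gone, the retry story reduces to: re-read, re-apply the same patch, try to write. Below is a deliberately simplified, self-contained toy model of that behavior; it is not the apiserver implementation, and all names are invented for illustration.

```go
package main

import "fmt"

// object stands in for a stored API object; the struct and the functions below
// are a toy model of GuaranteedUpdate-style retries, not real apiserver code.
type object struct {
	resourceVersion int
	nodeName        string
}

// guaranteedUpdate re-reads the current object and re-runs tryUpdate on every
// attempt, so the patch is always applied to a fresh read. A failed
// resourceVersion precondition aborts immediately: no retry can satisfy it.
func guaranteedUpdate(store *object, requiredRV int, tryUpdate func(current object) (object, error)) error {
	for attempt := 0; attempt < 5; attempt++ {
		current := *store // fresh read on every attempt
		if requiredRV != 0 && requiredRV != current.resourceVersion {
			return fmt.Errorf("conflict: required resourceVersion %d, current %d", requiredRV, current.resourceVersion)
		}
		updated, err := tryUpdate(current)
		if err != nil {
			return err
		}
		// A real optimistic write could fail here if a concurrent writer bumped
		// the resourceVersion, which would simply trigger the next attempt.
		updated.resourceVersion = current.resourceVersion + 1
		*store = updated
		return nil
	}
	return fmt.Errorf("too many conflicts")
}

func main() {
	store := &object{resourceVersion: 2}

	// No resourceVersion in the patch: it is applied to whatever is current,
	// and storage contention is retried transparently.
	_ = guaranteedUpdate(store, 0, func(current object) (object, error) {
		current.nodeName = "there" // the same patch, applied to the fresh read
		return current, nil
	})
	fmt.Printf("after patch: %+v\n", *store)

	// A stale resourceVersion precondition fails on the first attempt,
	// with no useless retries.
	err := guaranteedUpdate(store, 1, func(current object) (object, error) { return current, nil })
	fmt.Println("stale resourceVersion:", err)
}
```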
fixes #58017
SMP (strategic merge patch) is no longer computed for CRD objects; a JSON merge patch sketch for custom resources follows below.
fixes #42644
No special state is retained from the first attempt, so the patch handler correctly handles cached storage optimistically trying a cached object on the first attempt.
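Regarding the #58017 fix above: custom resources are unstructured, so there are no Go struct tags to drive a strategic merge patch, and they are served with plain JSON merge patch (RFC 7386) semantics instead. A minimal sketch using the evanphx/json-patch library; the `Widget` resource and its fields are invented for illustration.

```go
package main

import (
	"fmt"

	jsonpatch "github.com/evanphx/json-patch"
)

func main() {
	// The stored custom resource as raw JSON; unstructured objects carry no
	// struct tags, so strategic merge patch metadata is unavailable.
	current := []byte(`{"apiVersion":"example.com/v1","kind":"Widget","spec":{"color":"blue","size":1}}`)

	// The client's patch body, applied with RFC 7386 semantics: objects merge
	// recursively, scalars are replaced, and nulls delete keys.
	patch := []byte(`{"spec":{"size":2}}`)

	merged, err := jsonpatch.MergePatch(current, patch)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(merged))
	// {"apiVersion":"example.com/v1","kind":"Widget","spec":{"color":"blue","size":2}}
}
```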
/assign @lavalamp