
kubeadm: improve resiliency when conflicts arise when updating the kubeadm-config configmap #76821

Merged

Conversation

@ereslibre (Member) commented Apr 19, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
Add the functionality to support CreateOrMutateConfigMap and MutateConfigMap.

  • CreateOrMutateConfigMap will try to create a given ConfigMap object; if this ConfigMap
    already exists, a new version of the resource will be retrieved from the server and a
    mutator callback will be called on it. Then, an Update of the mutated object will be
    performed. If there's a conflict during this Update operation, retry until no conflict
    happens. On every retry the object is refreshed from the server to the latest version.

  • MutateConfigMap will try to get the latest version of the ConfigMap from the server,
    call the mutator callback and then try to Update the mutated object. If there's a
    conflict during this Update operation, retry until no conflict happens. On every retry
    the object is refreshed from the server to the latest version (see the sketch after
    this list).
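
Roughly, this maps onto client-go primitives as follows; a minimal sketch with illustrative (non-exported) names, not necessarily the exact kubeadm implementation, and using the context-free client-go call signatures of this era:

import (
	v1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clientset "k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/retry"
)

// ConfigMapMutator mutates the given ConfigMap in place.
type ConfigMapMutator func(*v1.ConfigMap) error

// createOrMutateConfigMap tries to create cm; if it already exists, it falls back to
// mutateConfigMap, which refreshes the object, mutates it and updates it, retrying on
// conflicts.
func createOrMutateConfigMap(client clientset.Interface, cm *v1.ConfigMap, mutator ConfigMapMutator) error {
	if _, err := client.CoreV1().ConfigMaps(cm.ObjectMeta.Namespace).Create(cm); err != nil {
		if !apierrors.IsAlreadyExists(err) {
			return err
		}
		return mutateConfigMap(client, cm.ObjectMeta, mutator)
	}
	return nil
}

// mutateConfigMap re-fetches the ConfigMap from the server on every attempt, applies the
// mutator and issues an Update; a 409 Conflict triggers a retry with the given backoff.
func mutateConfigMap(client clientset.Interface, meta metav1.ObjectMeta, mutator ConfigMapMutator) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		configMap, err := client.CoreV1().ConfigMaps(meta.Namespace).Get(meta.Name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if err := mutator(configMap); err != nil {
			return err
		}
		_, err = client.CoreV1().ConfigMaps(meta.Namespace).Update(configMap)
		return err
	})
}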

Which issue(s) this PR fixes:
Fixes kubernetes/kubeadm#1097

Special notes for your reviewer:
Alternative implementation to #76241, without the need to perform leader election.

Does this PR introduce a user-facing change?:

kubeadm: Improve resiliency when it comes to updating the `kubeadm-config` config map upon new control plane joins or resets. This allows for safe multiple control plane joins and/or resets.
@neolit123 (Member) left a comment

thanks a lot for your research on this topic @ereslibre
it seems so much better to not use leader election (LE) and locks for this.

my guess would be that we should add a unit test for this too?

2 review threads on cmd/kubeadm/app/util/apiclient/idempotency.go (outdated, resolved)
@neolit123 (Member) commented Apr 19, 2019

/priority important-soon
@kubernetes/sig-cluster-lifecycle-pr-reviews

@ereslibre
we should add a release note here too.

@rosti (Member) left a comment

Thanks @ereslibre !
I like this approach much better than the leader election method.

Review threads (outdated, resolved): cmd/kubeadm/app/phases/uploadconfig/uploadconfig.go, cmd/kubeadm/app/util/apiclient/idempotency.go (3 threads)

@timothysc timothysc self-assigned this Apr 19, 2019

2 review threads on cmd/kubeadm/app/util/apiclient/idempotency.go (outdated, resolved)
// to conflicts, and a retry will be issued if the configmap was modified on the server between the refresh and the update (while the mutation was
// taking place)
func MutateConfigMap(client clientset.Interface, cmmeta metav1.ObjectMeta, cmmutator ConfigMapMutator) error {
for {

@timothysc (Member) commented Apr 19, 2019

What are the max retries and timeout on this?

@ereslibre (Member, Author) commented Apr 19, 2019

Updated to use k8s.io/client-go/util/retry, by default:

wait.Backoff{
	Steps:    5,
	Duration: 10 * time.Millisecond,
	Factor:   1.0,
	Jitter:   0.1,
}

I'm worried about a fixed number of steps; ideally we should retry without a limit until we succeed (if it's a conflict error).
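
For illustration, an unbounded conflict-only retry would look roughly like this (a sketch of the idea only; the PR keeps the bounded wait.Backoff shown above):

import (
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
)

// retryUntilNoConflict keeps calling update until it either succeeds or fails with a
// non-conflict error; there is deliberately no limit on the number of attempts.
func retryUntilNoConflict(update func() error) error {
	for {
		err := update()
		if err == nil {
			return nil
		}
		if !apierrors.IsConflict(err) {
			return err
		}
		time.Sleep(10 * time.Millisecond)
	}
}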

@neolit123 (Member) commented Apr 19, 2019

we might want to use Steps: 0 but with a sensibly large cap e.g. Cap: 1 * time.Minute:
https://godoc.org/k8s.io/apimachinery/pkg/util/wait#Backoff
(or more than 1 minute)

EDIT: this is an interesting problem, though: what if one decides to add 100 CP nodes at the same time? We need to find a timeout that is long enough and makes sense in terms of scale.

@detiber (Member) commented Apr 19, 2019

I'd argue that someone adding 100 control plane nodes is doing something massively wrong.


@ereslibre (Member, Author) commented Apr 20, 2019

> so if a unit test passes for an excess number of concurrent clients, the real-world case should be good too. at least that's how i see it.

Oh, you propose to start an apiserver for the unit tests? I have been using the fake client (which cannot mimic real server-side conflicts; it only has reactors that can mimic a conflict returned by the server, so it's not testing any real concurrency). I can look into that, but as far as I remember starting an apiserver was flaky and in general not desired; I might be wrong though.

@neolit123 (Member) commented Apr 20, 2019

yeah, a real server is a bad idea. it flakes in parallel unit tests; there are also endpoint and port conflicts, etc. i was hoping that there is a fake API server that can be used here...

alternatively, we can have this as a test in test/e2e-kubeadm where a set of concurrent clients are trying to update the real ClusterStatus ConfigMap of the real server that the suite is being run on.
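
The shape of such a test could be something like the following (illustrative only; the ConfigMap name and keys are placeholders, and a real suite would likely use a scratch ConfigMap rather than the live kubeadm-config):

import (
	"fmt"
	"sync"
	"testing"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clientset "k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/retry"
)

// concurrentConfigMapWriters fires `writers` goroutines that each add their own key to
// the same ConfigMap, retrying on conflict, and then verifies that no update was lost.
// Note: with many writers the backoff likely needs more Steps than retry.DefaultRetry.
func concurrentConfigMapWriters(t *testing.T, client clientset.Interface, writers int) {
	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
				cm, err := client.CoreV1().ConfigMaps(metav1.NamespaceSystem).Get("kubeadm-config", metav1.GetOptions{})
				if err != nil {
					return err
				}
				if cm.Data == nil {
					cm.Data = map[string]string{}
				}
				cm.Data[fmt.Sprintf("writer-%d", i)] = "present"
				_, err = client.CoreV1().ConfigMaps(metav1.NamespaceSystem).Update(cm)
				return err
			})
			if err != nil {
				t.Errorf("writer %d failed: %v", i, err)
			}
		}(i)
	}
	wg.Wait()
	cm, err := client.CoreV1().ConfigMaps(metav1.NamespaceSystem).Get("kubeadm-config", metav1.GetOptions{})
	if err != nil {
		t.Fatal(err)
	}
	for i := 0; i < writers; i++ {
		if _, ok := cm.Data[fmt.Sprintf("writer-%d", i)]; !ok {
			t.Errorf("lost update from writer %d", i)
		}
	}
}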

@ereslibre (Member, Author) commented Apr 20, 2019

I think e2e tests are better for this purpose, because with a test server we would have to mimic a basic conflict response ourselves (version handling and so on); all in memory, but still, I think it would include some boilerplate code that we'll have to maintain in the end.

I will look into the e2e tests and see if they can be improved to include a multi-writer test. Do you want me to tackle that as part of this PR as well?

@neolit123 (Member) commented Apr 20, 2019

thanks. i think a separate PR is better.

@fabriziopandini merged a change recently that allows you to run this kubeadm suite on an existing cluster like so: make test-e2e-kubeadm
#75546

@ereslibre (Member, Author) commented Apr 20, 2019

I will work on that as a separate PR as soon as this one merges. Thank you!

@ereslibre ereslibre force-pushed the ereslibre:kubeadm-config-retry-on-conflict branch 2 times, most recently from 91fc66a to 3389c7a Apr 19, 2019

@timothysc (Member) left a comment

Definitely needs a multiple concurrent writer test.

@timothysc (Member) left a comment

/approve

lgtm on tests.

@k8s-ci-robot (Contributor) commented Apr 19, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ereslibre, timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rosti (Member) left a comment

Thanks @ereslibre ! Looks great!
I have a couple of suggestions for additional test cases, but I think that you have covered the most important ones.
What table testing gives us is the ability to more easily add new test cases for the func at hand (provided that the "table" was abstracted properly). I am fine with the way you chose.
Once you are happy with the tests, please remove the hold and ping us for final reviews.

@ereslibre (Member, Author) commented Apr 22, 2019

> Once you are happy with the tests, please remove the hold and ping us for final reviews.

Thank you @rosti, I will add the tests that you propose and also start working on the e2e tests in parallel, so maybe I can adapt this PR to @neolit123's concern about the backoff defaults once I gather some information from the e2e tests themselves. I have no strong opinion on whether this should merge as is (once the tests you mention are added), with the backoff adapted in another PR after checking the e2e behavior, or whether to work on both in parallel so the backoff gets fixed in this very same PR with our findings.

For now I'll work on both at the same time; if this one happens to merge I can follow up with another PR if the backoff defaults need any kind of adapting.

@ereslibre ereslibre force-pushed the ereslibre:kubeadm-config-retry-on-conflict branch 6 times, most recently from d346155 to 45f2fa9 Apr 22, 2019

@ereslibre (Member, Author) commented Apr 23, 2019

@kubernetes/sig-cluster-lifecycle-pr-reviews this is ready for a final pass.

I think the default backoff in this PR is reasonable for this to merge directly, but while I'm working on the e2e tests I will update this PR with my latest findings if it's still open.

@ereslibre (Member, Author) commented Apr 23, 2019

/retest

@ereslibre (Member, Author) commented Apr 23, 2019

Also, I believe this patch is safe enough to be backported to 1.14, what do you think? We have some use cases for 1.14 that are interested in multi master concurrent joins.

cc/ @stealthybox

@neolit123 (Member) left a comment

i have minimal nitpicks about copy editing and code formatting, but nothing outstanding.
wait.Backoff LGTM
thanks for the update @ereslibre

@rosti (Member) left a comment

Thanks @ereslibre !
Only a few tiny nits from me and it's otherwise LGTM.

@rosti (Member) commented Apr 23, 2019

> Also, I believe this patch is safe enough to be backported to 1.14, what do you think? We have some use cases for 1.14 that are interested in multi master concurrent joins.

I am 90% sure too, but the remaining 10% is making me worried.
In general, my rule is that only things that are marked as bug+critical-urgent should get backported. I think that it's safer for this to be tested a bit with the 1.15 master and eventually released then.
Also, people have lived with this problem until now and there are available workarounds.

Therefore, I am a bit hesitant about backporting.

@neolit123 (Member) commented Apr 23, 2019

the criteria for a "critical fix" are hard to define.

the workaround in previous versions is to manually sync the join of members.
this makes it difficult for users, but they are certainly not blocked.

@ereslibre ereslibre force-pushed the ereslibre:kubeadm-config-retry-on-conflict branch 2 times, most recently from b3478bd to 95f959e Apr 23, 2019

kubeadm: improve resiliency when conflicts arise when updating the kubeadm-config ConfigMap

Add the functionality to support `CreateOrMutateConfigMap` and `MutateConfigMap`.

* `CreateOrMutateConfigMap` will try to create a given ConfigMap object; if this ConfigMap
  already exists, a new version of the resource will be retrieved from the server and a
  mutator callback will be called on it. Then, an `Update` of the mutated object will be
  performed. If there's a conflict during this `Update` operation, retry until no conflict
  happens. On every retry the object is refreshed from the server to the latest version.

* `MutateConfigMap` will try to get the latest version of the ConfigMap from the server,
  call the mutator callback and then try to `Update` the mutated object. If there's a
  conflict during this `Update` operation, retry until no conflict happens. On every retry
  the object is refreshed from the server to the latest version.

Add unit tests for `MutateConfigMap`

* One test checks that in case of no conflicts, the update of the
  given ConfigMap happens without any issues.

* Another test mimics 5 consecutive CONFLICT responses when updating
  the given ConfigMap, whereas the sixth try succeeds (see the sketch below).
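
The conflict-mimicking part of such a test can be sketched with the fake clientset and a prepended reactor roughly as follows (illustrative; not necessarily the exact test code in this PR):

import (
	"errors"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	clientsetfake "k8s.io/client-go/kubernetes/fake"
	core "k8s.io/client-go/testing"
)

// newConflictingFakeClient returns a fake clientset whose first `conflicts` ConfigMap
// updates fail with a 409 Conflict; later updates fall through to the default object
// tracker and succeed.
func newConflictingFakeClient(conflicts int, objects ...runtime.Object) *clientsetfake.Clientset {
	client := clientsetfake.NewSimpleClientset(objects...)
	remaining := conflicts
	client.PrependReactor("update", "configmaps", func(action core.Action) (bool, runtime.Object, error) {
		if remaining > 0 {
			remaining--
			return true, nil, apierrors.NewConflict(
				schema.GroupResource{Group: "", Resource: "configmaps"},
				"kubeadm-config",
				errors.New("conflict mimicked by the fake client"))
		}
		// handled=false lets the default reactor perform the actual update.
		return false, nil, nil
	})
	return client
}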

@ereslibre ereslibre force-pushed the ereslibre:kubeadm-config-retry-on-conflict branch from 95f959e to bc8bafd Apr 23, 2019

@ereslibre (Member, Author) commented Apr 23, 2019

I think I addressed all your feedback, thanks!

@rosti (Member) commented Apr 23, 2019

Thanks again @ereslibre !
I think that this is now OK to be merged.

/lgtm
/hold cancel

@fejta-bot commented Apr 23, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@ereslibre (Member, Author) commented Apr 23, 2019

/retest

@k8s-ci-robot k8s-ci-robot merged commit 36ccff1 into kubernetes:master Apr 23, 2019

20 checks passed

cla/linuxfoundation: ereslibre authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-conformance-image-test: Skipped.
pull-kubernetes-cross: Skipped.
pull-kubernetes-dependencies: Job succeeded.
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-csi-serial: Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gce-storage-slow: Skipped.
pull-kubernetes-godeps: Skipped.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped.
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
pull-publishing-bot-validate: Skipped.
tide: In merge pool.

@ereslibre ereslibre deleted the ereslibre:kubeadm-config-retry-on-conflict branch Apr 23, 2019
