
kubeadm: wait for the etcd cluster to be available when growing it #72984

Merged
merged 1 commit on Jan 20, 2019
Conversation

Contributor

@ereslibre ereslibre commented Jan 16, 2019

What this PR does / why we need it:

When the etcd cluster grows, we need to explicitly wait for it to become
available. This avoids relying on later steps to do that waiting implicitly
when they try to reach the apiserver.
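For context, a minimal Go sketch of the flow this change enforces, based on the calls visible in the diff further down (AddMember followed by WaitForClusterAvailable). The interface, its simplified signatures, and the 8 × 5s values are assumptions for illustration, not the exact kubeadm source:

```go
package etcdjoin

import (
	"fmt"
	"time"
)

// etcdClusterClient captures just the two operations this PR relies on.
// Method names mirror the PR diff; the signatures are simplified here.
type etcdClusterClient interface {
	AddMember(name string, peerAddrs string) error
	WaitForClusterAvailable(retries int, retryInterval time.Duration) (bool, error)
}

// growEtcdAndWait shows the ordering this PR enforces: first grow the etcd
// cluster, then explicitly wait for it to report healthy before any later
// join step talks to the apiserver through it.
func growEtcdAndWait(client etcdClusterClient, name, peerAddrs string) error {
	if err := client.AddMember(name, peerAddrs); err != nil {
		return fmt.Errorf("failed to add etcd member: %v", err)
	}

	fmt.Println("[etcd] Waiting for the etcd cluster to be healthy")
	// 8 retries at a 5-second interval (~40s total) is the budget the
	// reviewers converge on later in this conversation.
	if _, err := client.WaitForClusterAvailable(8, 5*time.Second); err != nil {
		return fmt.Errorf("etcd cluster did not become available: %v", err)
	}
	return nil
}
```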

Which issue(s) this PR fixes:

Fixes kubernetes/kubeadm#1353

Does this PR introduce a user-facing change?:

kubeadm: explicitly wait for `etcd` to have grown when joining a new control plane

/kind bug

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 16, 2019
@k8s-ci-robot
Contributor

Hi @ereslibre. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 16, 2019
@k8s-ci-robot k8s-ci-robot added area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 16, 2019
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 17, 2019
@rosti
Contributor

rosti commented Jan 17, 2019

/ok-to-test
/priority critical-urgent

@k8s-ci-robot k8s-ci-robot added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 17, 2019
@rosti
Contributor

rosti commented Jan 17, 2019

/assign @fabriziopandini @timothysc @neolit123

Member

@neolit123 neolit123 left a comment

thanks for this @ereslibre
i think apart from the retry rate this is good.

also please add a release note instead of NONE
kubeadm: ......

etcdVolumeName  = "etcd-data"
certsVolumeName = "etcd-certs"
etcdHealthyCheckInterval = 1 * time.Second
Member

what do your tests reveal for the interval and n-retries @ereslibre ?

i think the 1 second rate might be too high. i would do something like 5 seconds, with 20 retries.
but let's gather more comments on this one before changing.

Contributor Author

In my environment it's succeeding around the third try, and that's in an environment that has only been alive for a few seconds, so I agree that 5 seconds looks reasonable (if the single-machine cluster is long-lived, more sync time would be needed).

I'll adapt the PR, thanks!

@@ -146,7 +147,7 @@ type Member struct {
}

// AddMember notifies an existing etcd cluster that a new member is joining
-func (c Client) AddMember(name string, peerAddrs string) ([]Member, error) {
+func (c *Client) AddMember(name string, peerAddrs string) ([]Member, error) {
Member

as a note/TODO: in a separate PR we should make all the client methods use pointer receivers.
Go encourages keeping receiver types consistent on a type.
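For illustration only (not the kubeadm code itself), a small sketch of the receiver consistency mentioned above, using a hypothetical client type:

```go
package example

// client is a hypothetical type used only to illustrate receiver consistency.
type client struct {
	endpoints []string
}

// Pointer receiver: the method can mutate the client it is called on.
func (c *client) addEndpoint(ep string) {
	c.endpoints = append(c.endpoints, ep)
}

// Keeping every method on *client, rather than mixing value and pointer
// receivers on the same type, is the "matching pattern" referred to above.
func (c *client) listEndpoints() []string {
	return c.endpoints
}
```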

Contributor

@rosti rosti left a comment

Thanks @ereslibre !
Overall it looks good, though we need to fix the AddMember call.

cmd/kubeadm/app/util/etcd/etcd.go (review thread, resolved)
cmd/kubeadm/app/util/etcd/etcd.go (review thread, outdated, resolved)
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 17, 2019
Contributor

@rosti rosti left a comment

Re-lgtm after things are back to normal.
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 17, 2019
@ereslibre
Contributor Author

/retest

Member

@fabriziopandini fabriziopandini left a comment

@ereslibre Great work! This is a candidate for a cherry pick!
Only a few small nits, mostly UX related.
Ping me when this is ready for approval.

cmd/kubeadm/app/phases/etcd/local.go (review thread, outdated, resolved)
cmd/kubeadm/app/phases/etcd/local.go (review thread, outdated, resolved)
@@ -121,6 +124,12 @@ func CreateStackedEtcdStaticPodManifestFile(client clientset.Interface, manifest
}

fmt.Printf("[etcd] Wrote Static Pod manifest for a local etcd instance to %q\n", kubeadmconstants.GetStaticPodFilepath(kubeadmconstants.Etcd, manifestDir))

fmt.Println("[etcd] Waiting for the etcd cluster to be healthy")
if _, err := etcdClient.WaitForClusterAvailable(etcdHealthyCheckRetries, etcdHealthyCheckInterval); err != nil {
Member

I don't like this function printing

    [util/etcd] Attempt timed out
    [util/etcd] Waiting 5s until next retry
    [util/etcd] Attempt timed out
    [util/etcd] Waiting 5s until next retry

IMO that output should be removed (or converted into log messages) in order to be consistent with all the other waiters in kubeadm.

However, considering that this requires adding "This can take up to ..." in every place where WaitForClusterAvailable is used, it goes beyond the scope of this PR, so please open an issue to track this as a todo/good first issue.

Contributor Author

I proposed using klog here too, but @rosti didn't want to address that change in this PR, only the message that can get long because it includes the endpoints. I agree with your point of view though, @fabriziopandini.

@rosti, wdyt? Should I change this now that @fabriziopandini has also raised the issue?

Contributor Author

Marked as resolved per the discussion with @fabriziopandini, leaving it as it was, as @rosti proposed.

Contributor

I do think that we need some sort of indication of why we are waiting another 5 seconds. This is tightly coupled with the UX of end users who run kubeadm directly on the command line. For that matter I am not a fan of klogging this. In my opinion it should go out via print.
On the other hand, we can certainly reduce the output here to a single, more descriptive message per retry.
However, as @fabriziopandini mentioned, this will require changes in a few more places, so it may be better done in another PR. We can file a backlog issue for now.

Member

i wanted to get more feedback on the 5-second interval and the 20 retries.
if the check usually passes in less than 5 seconds on average, possibly we can reduce the value?
also, 20 tries is a lot. in reality, we might reach the failed state much sooner.

Contributor Author

In the last run it took 4 retries (at a 5-second interval); this one was way off the charts, and with a clean environment :(

Member

i've mentioned this on slack:

we may want to keep the overall time under 40 seconds to match the kubelet timeout.
how about a 2-second interval with 20 retries.

Member

@fabriziopandini @rosti please give your stamp of approval for the above comment.

Contributor

@rosti rosti Jan 18, 2019

I like the 40 seconds idea, but let's keep the steps at 5 sec. Bear in mind that we have just written out the static pod spec, so the kubelet needs to detect it and spin it up, and etcd needs to become responsive. On some systems this can easily take more than 2 seconds.

Member

ok, @rosti is voting for 5 sec / 8 retries.
@fabriziopandini ?
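For reference, a hedged sketch of what the settings discussed above look like as Go constants; the names mirror those in the diff, but their final values and placement in kubeadm are assumptions here:

```go
package local

import "time"

// Both proposals keep the overall wait budget at roughly 40 seconds,
// matching the kubelet timeout mentioned above:
//   2s interval * 20 retries = 40s, or 5s interval * 8 retries = 40s.
const (
	etcdHealthyCheckInterval = 5 * time.Second
	etcdHealthyCheckRetries  = 8
)
```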

cmd/kubeadm/app/util/etcd/etcd.go (review thread, outdated, resolved)
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 18, 2019
@rosti
Contributor

rosti commented Jan 18, 2019

@ereslibre can you please re-run the update-gofmt.sh script to fix the verify test? Thanks!

@ereslibre
Contributor Author

/retest

@fabriziopandini
Member

Looking at the code I don't see a relation between the kubelet timeout for TLS bootstrap and the timeout this PR sets for waiting for the new etcd member to join the cluster.

That said, IMO 5s * 8 is reasonable for unblocking this fix and starting the cherry-picking process;
so, considering that there seems to be consensus on those settings from the other reviewers as well, and that the proposed solution is definitely an improvement over the current situation:
/approve
/lgtm

Let's continue the discussion in parallel (slack/at the next office hours meeting), possibly asking the broader SIG for more feedback from the field

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 20, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ereslibre, fabriziopandini

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 20, 2019
@k8s-ci-robot k8s-ci-robot merged commit f2b133d into kubernetes:master Jan 20, 2019
@MalloZup
Contributor

MalloZup commented Jan 20, 2019

We could also use some kind of backoff algorithm instead of a fixed retry interval. https://en.m.wikipedia.org/wiki/Exponential_backoff?wprov=sfla1
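As an illustration of that suggestion, a minimal sketch using the existing wait.ExponentialBackoff helper from k8s.io/apimachinery; isEtcdClusterAvailable is a hypothetical stand-in for whatever health check kubeadm actually performs:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// isEtcdClusterAvailable is a hypothetical placeholder for the real health
// check (for example, querying the etcd cluster status endpoints).
func isEtcdClusterAvailable() bool {
	return false
}

// waitForEtcdWithBackoff retries the health check with an exponentially
// growing delay instead of a fixed interval.
func waitForEtcdWithBackoff() error {
	backoff := wait.Backoff{
		Duration: 1 * time.Second, // initial delay between attempts
		Factor:   2.0,             // double the delay after each failed attempt
		Jitter:   0.1,             // small randomization to avoid thundering herds
		Steps:    6,               // up to 6 attempts; sleeps of ~1s, 2s, 4s, 8s, 16s between them
	}
	return wait.ExponentialBackoff(backoff, func() (bool, error) {
		if isEtcdClusterAvailable() {
			return true, nil // done, stop retrying
		}
		return false, nil // not ready yet, retry after the next backoff step
	})
}

func main() {
	if err := waitForEtcdWithBackoff(); err != nil {
		fmt.Println("etcd cluster did not become available:", err)
	}
}
```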

@ereslibre
Contributor Author

@fabriziopandini I also have the same feeling and discussed it with @neolit123 previously. So, just for the record, WaitForKubeletAndFunc basically waits for two things:

  1. WaitForHealthyKubelet, before some initial timeout (40 seconds)
  2. The function given to WaitForKubeletAndFunc

When either of these two (the given function or WaitForHealthyKubelet) returns, WaitForKubeletAndFunc returns. If the given function takes more than 40 seconds to finish, the sleep in WaitForHealthyKubelet times out and it actually checks that the kubelet is healthy.

I guess I would say WaitForKubeletAndFunc basically means "please run this function I give you; whether it fails or succeeds is fine, but if it takes more than 40 seconds to answer, then we check that the kubelet is running and healthy"
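For readers unfamiliar with that helper, a simplified, self-contained sketch of the pattern described above; it is not the actual kubeadm implementation, and both function names here are placeholders:

```go
package main

import (
	"fmt"
	"time"
)

// checkKubeletHealthy is a hypothetical stand-in for a real kubelet health
// check (for example, hitting its healthz endpoint).
func checkKubeletHealthy() error {
	return nil
}

// waitForFuncOrKubelet mimics the behaviour described above: run the given
// function, and if it has not returned within the timeout, fall back to
// verifying that the kubelet itself is healthy.
func waitForFuncOrKubelet(timeout time.Duration, f func() error) error {
	done := make(chan error, 1)
	go func() { done <- f() }()

	select {
	case err := <-done:
		// The given function finished first; success or failure, we return it.
		return err
	case <-time.After(timeout):
		// The function is taking too long; check the kubelet instead.
		return checkKubeletHealthy()
	}
}

func main() {
	err := waitForFuncOrKubelet(40*time.Second, func() error {
		// Hypothetical long-running join step, e.g. waiting for etcd to grow.
		time.Sleep(2 * time.Second)
		return nil
	})
	fmt.Println("result:", err)
}
```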

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

kubeadm join does not explicitly wait for etcd to have grown when joining secondary control plane
7 participants