Add timeout value to configure how long kubeadm init takes to timeout #1168

soggiest · 2018-10-12T17:41:27Z

FEATURE REQUEST

Add a kubeadm configuration option or CLI flag that determines how long kubeadm init waits before timing out. Cloud-based installations can fail if the API server's load balancer takes too long to recognize that an instance is listening on the appropriate port. This can potentially be mitigated by modifying the load balancer's health checks. However, an option to extend the kubeadm init timeout would be helpful.

Versions

kubeadm version (use kubeadm version): 1.11.3

Environment:

Kubernetes version (use kubectl version): 1.11.3
Cloud provider or hardware configuration: AWS
OS (e.g. from /etc/os-release): Ubuntu 16.04
Kernel (e.g. uname -a): #77-Ubuntu SMP

What happened?

kubeadm init runs were consistently failing on AWS with the following error: "timed out waiting for condition"
After each kubeadm run the Kubernetes control plane was working as expected. While investigating the cause of this error, I observed that the AWS ELB would only become active after kubeadm had timed out. Modifying the ELB health checks to be quicker alleviated the issue.

From a UX standpoint I should be able to set the init timeout from a config or CLI flag rather than worry about a race against time with my ELB.

What you expected to happen?

kubeadm init would wait long enough for the control plane to come up.

How to reproduce it (as minimally and precisely as possible)?

Region:
ap-southeast-1 (Singapore)

Configure an AWS EC2 Instance with:
AMI: Ubuntu 16.04 LTS
Size: m5.xlarge
Storage: 50GB
Tag: kubernetes.io/cluster/kubernetes: owned

Configure an ELB with:
Standard Health Checks Intervals, Health check against TCP:6443
TCP Passthrough 443 -> TCP 6443

Run kubeadm init with a kubeadm config that points to the ELB as the controlPlaneEndpoint
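
For reference, a minimal sketch of what such a config might look like in the v1alpha2 format read by kubeadm 1.11. The ELB hostname is a hypothetical placeholder and not a value from this report; port 443 matches the passthrough listener described above.

```yaml
# Sketch only: my-elb.example.com is an assumed, hypothetical ELB DNS name.
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.3
api:
  # Clients reach the API server through the ELB listener (TCP 443 -> TCP 6443).
  controlPlaneEndpoint: "my-elb.example.com:443"
```

With this setup, kubeadm init waits on an endpoint that only answers once the ELB health check has marked the instance healthy, which is where the race with the init timeout comes from.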


neolit123 · 2018-10-13T13:29:12Z

> From a UX standpoint I should be able to set the init timeout from a config or CLI flag rather than worry about a race against time with my ELB.

i agree with this.

/kind feature
/cc @kubernetes/sig-cluster-lifecycle

fabriziopandini · 2018-10-13T15:13:48Z

@timothysc opinions?

dixudx · 2018-10-15T12:44:26Z

I'd love to add this. +1.

timothysc · 2018-10-16T17:06:07Z

This has been requested a number of times, but the use case was on the other end of the spectrum.

@fabriziopandini could we roll this into the v1beta1 component(apiserver) migration work?

timothysc · 2018-10-16T17:07:41Z

/assign @soggiest
/assign @fabriziopandini
/assign @timothysc

fabriziopandini · 2018-10-16T19:15:46Z

@timothysc ok
Just to set expectations: I'm going to propose a generic solution for defining timeouts in the kubeadm config API (not in flags). The new settings will make configurable only the timeout above and, eventually, the same timeout in the kubeadm join workflow (not all the timeouts in kubeadm).

neolit123 · 2018-10-31T18:52:22Z

xref kubernetes/kubernetes#70480

neolit123 · 2018-11-13T04:19:22Z

timeout for the api server was added in the v1beta1 config:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/apis/kubeadm/v1beta1/types.go#L135
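
A minimal sketch of how the new setting might be used, assuming the v1beta1 ClusterConfiguration format. The field name `timeoutForControlPlane` is the JSON name of the timeout on the linked `APIServer` type. The ELB hostname and the 10-minute value are illustrative assumptions and do not come from this thread.

```yaml
# Sketch only: the hostname and the timeout value are assumed for illustration.
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.0
# Hypothetical ELB DNS name fronting the API server.
controlPlaneEndpoint: "my-elb.example.com:443"
apiServer:
  # How long kubeadm waits for the control plane to come up before giving up.
  timeoutForControlPlane: 10m0s
```

Passing a file like this with `kubeadm init --config <file>` should give a slow load balancer health check more time to mark the instance healthy before kubeadm gives up.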

k8s-ci-robot added the kind/feature label Oct 13, 2018

timothysc added the priority/important-soon label Oct 16, 2018

timothysc added this to the v1.13 milestone Oct 16, 2018

k8s-ci-robot assigned fabriziopandini, soggiest and timothysc Oct 16, 2018

fabriziopandini added the lifecycle/active label Oct 18, 2018

fabriziopandini mentioned this issue Oct 19, 2018

Kubeadm - Add timeouts to kubeadm config API kubernetes/kubernetes#70025

Closed

timothysc assigned rosti and unassigned timothysc and soggiest Oct 31, 2018

neolit123 closed this as completed Nov 13, 2018

Xnyle mentioned this issue Nov 28, 2018

kubeadm init fail 90% of the time on aarch64 (rock64) due to TLS timeouts kubernetes/kubernetes#71505

Closed
