Improve reliability of kube-proxy configmap updates (retry, block until pods are up) #3774

Merged: 8 commits, Mar 1, 2019

@tstromberg tstromberg commented Mar 1, 2019

Today I ran into "waiting for kube-proxy to be up for configmap update" and noticed that kube-proxy came up ~50 seconds after the current timeout expired, so I extended the timeout by 3 minutes.

I then set out to fix other related bugs by making sure that StartCluster/RestartCluster block until all of the pods are healthy. While this may lengthen start times on broken configs somewhat, it should result in less flakiness overall.

Example start:

📶  "waitaminute" IP address is 192.168.39.253
🐳  Configuring Docker as the container runtime ...
✨  Preparing Kubernetes environment ...
🚜  Pulling images required by Kubernetes v1.13.3 ...
🚀  Launching Kubernetes v1.13.3 using kubeadm ... 
⌛  Waiting for pods: apiserver proxy etcd dns controller scheduler storage-provisioner addon-manager
🔑  Configuring cluster permissions ...
🤔  Verifying component health .....
💗  kubectl is now configured to use "waitaminute"
🏄  Done! Thank you for using minikube!

Example restart:

🏃  Re-using the currently running kvm2 VM for "waitaminute" ...
⌛  Waiting for SSH access ...
📶  "waitaminute" IP address is 192.168.39.253
🐳  Configuring Docker as the container runtime ...
✨  Preparing Kubernetes environment ...
🚜  Pulling images required by Kubernetes v1.13.3 ...
🔄  Relaunching Kubernetes v1.13.3 using kubeadm ... 
⌛  Waiting for pods: apiserver proxy etcd dns controller scheduler storage-provisioner addon-manager
📯  Reconfiguring kube-proxy ...
🤔  Verifying component health .....
💗  kubectl is now configured to use "waitaminute"
🏄  Done! Thank you for using minikube!

Issues this PR may affect: #3031 #2726 #3765 #3511

@k8s-ci-robot added the cncf-cla: yes and size/M labels on Mar 1, 2019, then replaced size/M with size/L (a PR that changes 100-499 lines, ignoring generated files).
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tstromberg

@k8s-ci-robot added the approved label (approved by an approver from all required OWNERS files) on Mar 1, 2019.

@afbjorklund afbjorklund left a comment

Looks good, worked here!

⌛ Waiting for pods: apiserver proxy etcd dns controller scheduler storage-provisioner addon-manager


@balopat balopat left a comment

LGTM, thank you!

@tstromberg changed the title from "Start/Restart: block until system pods are healthy" to "Improve reliability of kube-proxy configmap updates (retry, block until pods are up)" on Mar 1, 2019
Labels: approved, cncf-cla: yes, size/L