kubeadm upgrade plan and kubeadm upgrade apply v1.10.1 both hang #755

Closed
danderson opened this issue Apr 16, 2018 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.

Comments

@danderson

danderson commented Apr 16, 2018

Is this a BUG REPORT or FEATURE REQUEST?

/kind bug

Choose one: BUG REPORT or FEATURE REQUEST

Versions

kubeadm version (use kubeadm version): 1.10.1

kubeadm version: &version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.1", GitCommit:"d4ab47518836c750f9949b9e0d387f20fb92260b", GitTreeState:"clean", BuildDate:"2018-04-12T14:14:26Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version): 1.10.0
  • Cloud provider or hardware configuration: Bare metal, x86_64 (Xeon D-1518)
  • OS (e.g. from /etc/os-release): Debian Testing
  • Kernel (e.g. uname -a): 4.15.0-2 (Debian)
  • Others:

What happened?

kubeadm upgrade plan hangs after discovering that the latest version is v1.10.1 (left running 10min, makes no further progress). Output before hanging is:

root@prod-01:~# kubeadm upgrade plan
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.10.0
[upgrade/versions] kubeadm version: v1.10.1
[upgrade/versions] Latest stable version: v1.10.1

Similarly, kubeadm upgrade apply v1.10.1 hangs before changing any manifests (control plane pods don't restart at all). Output before hanging:

root@prod-01:~# kubeadm upgrade apply v1.10.1
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade/version] You have chosen to change the cluster version to "v1.10.1"
[upgrade/versions] Cluster version: v1.10.0 , etcd 3.1.12
[upgrade/versions] kubeadm version: v1.10.1
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler]
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.10.1"...

What you expected to happen?

kubeadm should not hang indefinitely before doing things.
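One way to picture the expected behaviour is a health check that gives up after a deadline instead of blocking forever. The following is only a minimal sketch in Python, not kubeadm's code (kubeadm is written in Go); `run_with_deadline` and the lambda checks are hypothetical names invented for this illustration.

```python
import threading
import time


def run_with_deadline(check, timeout_s):
    """Run check() in a daemon thread and stop waiting after timeout_s seconds.

    Hypothetical helper for illustration only. The worker thread is not
    killed on timeout; it is simply abandoned, which is acceptable for a
    short-lived command-line tool that is about to exit with an error.
    """
    result = {}

    def worker():
        try:
            result["value"] = check()
        except Exception as exc:  # hand the failure back to the caller
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        raise TimeoutError(f"check did not finish within {timeout_s}s")
    if "error" in result:
        raise result["error"]
    return result["value"]


if __name__ == "__main__":
    # A check that returns promptly.
    print(run_with_deadline(lambda: "healthy", timeout_s=1.0))

    # A check that stalls, standing in for a client that never gets a reply.
    try:
        run_with_deadline(lambda: time.sleep(5), timeout_s=0.1)
    except TimeoutError as exc:
        print("gave up:", exc)
```

With a bound like this, a stalled pre-upgrade check would surface as an error message rather than a silent hang.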

How to reproduce it (as minimally and precisely as possible)?

Initialize a 1.10.0 cluster using kubeadm 1.10.0. Upgrade to kubeadm 1.10.1, and attempt to plan/execute an upgrade to 1.10.1.

Anything else we need to know?

At least two other people seem to have seen the exact same symptoms that I did:

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 16, 2018
@danderson
Author

I should also add: Kubernetes itself is working fine (control plane up, responsive, scheduling pods...). I also tried rebooting this machine in case there was any wedged state anywhere, but it didn't help.

@kubernetes/sig-cluster-lifecycle-bugs

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Apr 16, 2018

@timothysc
Member

@danderson Thank you for reporting. We are aware of 2 distinct upgrade bugs and are working to get a fix into 1.10.2. I'm going to close this issue as a dupe.

/cc @liztio @detiber @stealthybox

@stealthybox
Member

@timothysc this is currently undocumented in the other issues, but I was seeing this this morning with existing TLS clusters being unable to upgrade.

The root cause of this symptom is that the etcd client used for the pre-upgrade check doesn't support TLS.

kubernetes/kubernetes#62655 does address this case
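To check whether a cluster is likely in the affected group, one rough approach is to look at the flags etcd is started with (on a kubeadm node these normally live in the static pod manifest, /etc/kubernetes/manifests/etcd.yaml) and see whether the client listen URLs use https. The Python below is only a sketch under that assumption and is not how kubeadm itself detects TLS; `parse_flags`, `etcd_requires_tls`, and the sample argument lists are hypothetical and made up for illustration.

```python
from urllib.parse import urlsplit


def parse_flags(args):
    """Turn ['--name=value', ...] into {'name': 'value'}.

    Arguments that do not start with '--' (such as the binary name) are
    skipped; flags given without '=' map to an empty string.
    """
    flags = {}
    for arg in args:
        if not arg.startswith("--"):
            continue
        name, _, value = arg[2:].partition("=")
        flags[name] = value
    return flags


def etcd_requires_tls(args):
    """Return True if any --listen-client-urls entry uses the https scheme."""
    urls = parse_flags(args).get("listen-client-urls", "")
    return any(
        urlsplit(url.strip()).scheme == "https"
        for url in urls.split(",")
        if url.strip()
    )


if __name__ == "__main__":
    # Sample argument lists, invented for this example.
    plain = [
        "etcd",
        "--listen-client-urls=http://127.0.0.1:2379",
        "--data-dir=/var/lib/etcd",
    ]
    secured = [
        "etcd",
        "--listen-client-urls=https://127.0.0.1:2379",
        "--cert-file=/etc/kubernetes/pki/etcd/server.crt",
        "--key-file=/etc/kubernetes/pki/etcd/server.key",
    ]
    print(etcd_requires_tls(plain))    # False
    print(etcd_requires_tls(secured))  # True
```

A client that speaks plain HTTP to an endpoint in the second group would not be able to complete the pre-upgrade check, which matches the symptom described above.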

@vaizki

vaizki commented Apr 20, 2018

@danderson Thank you for reporting. We are aware of 2 distinct upgrade bugs and are working to get a fix into 1.10.2. I'm going to close this issue as a dupe.

I was unable to find the issues / PRs for these 2 upgrade bugs so that I could track them. Can someone reference them?

@stealthybox
Member

stealthybox commented Apr 22, 2018

@vaizki

vaizki commented Apr 29, 2018

This issue is gone with kubeadm 1.10.2: I just ran a successful upgrade on a cluster that had hit this issue with kubeadm 1.10.1.
