kubeadm upgrade from 1.9.6 to 1.10.0: converting v1.ConfigMap to v1alpha1.MasterConfiguration: API not present in src #61764

Closed
fgbreel opened this Issue Mar 27, 2018 · 19 comments

@fgbreel

fgbreel commented Mar 27, 2018

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
When upgrading from 1.9.6 to 1.10.0 using kubeadm, the following error is thrown:

[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade/config] FATAL: could not decode configuration: unable to decode config from bytes: v1alpha1.MasterConfiguration: KubeProxy: v1alpha1.KubeProxy: Config: v1alpha1.KubeProxyConfiguration: FeatureGates: ReadMapCB: expect { or n, but found ", error found in #10 byte of ...|reGates":"","healthz|..., bigger context ...|24h0m0s"},"enableProfiling":false,"featureGates":"","healthzBindAddress":"0.0.0.0:10256","hostnameOv|...

Running `kubectl -n kube-system get cm kubeadm-config -oyaml > kubeconfig.yaml` followed by `kubeadm config upload from-file --config=kubeconfig.yaml` gives a hint about the issue:

kubeadm config upload from-file --config=kubeconfig.yaml
unable to decode config from "kubeconfig.yaml" [converting (v1.ConfigMap) to (v1alpha1.MasterConfiguration): API not present in src]
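
For context, the decode error points at the kubeProxy.config section of the MasterConfiguration stored in the kubeadm-config ConfigMap. Reconstructed from the error message (a sketch only, not the full ConfigMap), the offending fragment looks roughly like this, with featureGates written as a string by kubeadm 1.9.x:

```yaml
# Fragment of the stored MasterConfiguration, reconstructed from the error
# output above; all other fields omitted.
kubeProxy:
  config:
    enableProfiling: false
    featureGates: ""                     # string form, rejected by the 1.10 decoder
    healthzBindAddress: "0.0.0.0:10256"
```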

What you expected to happen:
Successfully upgrade to 1.10.0.

How to reproduce it (as minimally and precisely as possible):
Execute:

kubeadm upgrade plan

This cluster already received the following upgrade paths:
1.8.3 (ok) -> 1.9.3 (ok) -> 1.9.6 (ok) -> 1.10.0 (failed)

Anything else we need to know?:

Environment:

Kubernetes version (use `kubectl version`):
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:55:54Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:13:31Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: bare metal
  • OS (e.g. from /etc/os-release):
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
  • Kernel (e.g. uname -a):
Linux k8s-master-0 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux
  • Install tools:
kubeadm version: &version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
  • Others:
@fgbreel

fgbreel commented Mar 27, 2018

@kubernetes/sig-cluster-lifecycle-bugs

@k8s-ci-robot

Contributor

k8s-ci-robot commented Mar 27, 2018

@fgbreel: Reiterating the mentions to trigger a notification:
@kubernetes/sig-cluster-lifecycle-bugs

In response to this:

@kubernetes/sig-cluster-lifecycle-bugs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@titou10titou10

titou10titou10 commented Mar 27, 2018

Same problem on RHEL v7.4, migrating from v1.9.6 to v1.10.0, following the instructions on how to upgrade here

  • Cloud provider or hardware configuration: bare metal
  • OS (e.g. from /etc/os-release):
NAME="Red Hat Enterprise Linux Server"
VERSION="7.4 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.4"
  • uname -a:
    Linux sldops0196 3.10.0-693.21.1.el7.x86_64 #1 SMP Fri Feb 23 18:54:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
@neyz

neyz commented Mar 28, 2018

I get the same error message upgrading a fresh 1.9.6 install to 1.10.0 on CoreOS

OS:
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1632.3.0
VERSION_ID=1632.3.0
BUILD_ID=2018-02-14-0338
PRETTY_NAME="Container Linux by CoreOS 1632.3.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

uname -a :
Linux k801 4.14.19-coreos #1 SMP Wed Feb 14 03:18:05 UTC 2018 x86_64 Intel Core Processor (Broadwell) GenuineIntel GNU/Linux

@liggitt

Member

liggitt commented Mar 28, 2018

from the 1.10 release notes:

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.10.md#before-upgrading

kube-proxy: feature gates are now specified as a map when provided via a JSON or YAML
KubeProxyConfiguration, rather than as a string of key-value pairs. For example:

KubeProxyConfiguration before:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates: "SupportIPVSProxyMode=true"
```

KubeProxyConfiguration after:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
```
@liggitt

Member

liggitt commented Mar 28, 2018

changed in #57962

@kubernetes/sig-cluster-lifecycle-bugs is kubeadm persisting alpha configuration types?

@berlinsaint

berlinsaint commented Mar 28, 2018

use `kubectl -n kube-system edit cm kubeadm-config`
and edit the featureGates field (the screenshot showing the exact edit is not preserved here),
then save and upgrade
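
Since the screenshot did not survive, here is a sketch of the edit it describes, based on the release note quoted earlier in the thread: under kubeProxy.config in the stored MasterConfiguration, the featureGates value changes from the 1.9.x string form to the map form that 1.10 expects.

```yaml
# Before the edit (written by kubeadm 1.9.x): featureGates is a string
featureGates: ""

# After the edit (expected by kubeadm 1.10): featureGates is a map; leave it
# empty unless you actually use gates such as SupportIPVSProxyMode
featureGates: {}
```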

@xiangpengzhao

Member

xiangpengzhao commented Mar 28, 2018

@liggitt what's our generic policy for API compatibility in such a case? I guess I should have made the change compatible :)

@liggitt

Member

liggitt commented Mar 28, 2018

@liggitt what's our generic policy for API compatibility in such a case?

alpha-level APIs carry a 0-release compatibility guarantee (see https://kubernetes.io/docs/reference/deprecation-policy/)

kubeadm should not be persisting/using alpha data by default without making it clear that a configuration that may not be supported in the next release is being used

@vistalba

vistalba commented Mar 28, 2018

Is the change suggested by @berlinsaint a permanent solution, or do I have to make some changes to be safe for the future?

@danderson

danderson commented Mar 29, 2018

I would note that, in my cluster, featureGates was the empty string, and that also failed to decode. Are you saying the entire featureGates field is alpha and unsupported, or just particular values?

Either way, the workaround upthread worked; a manual edit made kubeadm happy. If it's not too much work, I would humbly suggest having a fix in 1.10.1 for this time around, while y'all work out a policy for 1.11 and beyond?

@vistalba

vistalba commented Mar 29, 2018

@danderson On my cluster "featureGates" was also empty.

@xiangpengzhao

Member

xiangpengzhao commented Mar 29, 2018

I added the workaround provided by @berlinsaint to the 1.10 release notes. We may need a patch (if possible) for kubeadm to do this automatically, as suggested by @danderson. Further discussion on API policy in kubeadm may be needed as well.

@ReSearchITEng

ReSearchITEng commented Mar 29, 2018

If the config has `featureGates: ""`, replace it with `featureGates: {}`.
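
A non-interactive way to apply that replacement (a sketch only; it assumes the field appears literally as `featureGates: ""` in the stored MasterConfiguration, so review the modified YAML before applying it):

```sh
# Pull the ConfigMap, swap the string form for the empty map form, and apply
# it back. Inspect the intermediate output first if in doubt.
kubectl -n kube-system get cm kubeadm-config -o yaml \
  | sed 's/featureGates: ""/featureGates: {}/' \
  | kubectl apply -f -
```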

@papile

papile commented Mar 29, 2018

I too have featureGates set to "" and still get this issue. I can see halting if someone actually has alpha features enabled, but otherwise kubeadm should do what @ReSearchITEng says, as this will fail for everyone upgrading.

@haashah

haashah commented Apr 4, 2018

Faced the same problem. The workarounds by @berlinsaint and @ReSearchITEng both work, though I have no idea why. Hopefully this is fixed in kubeadm soon.

@timothysc

Member

timothysc commented Apr 6, 2018

We're working on a fix right now and trying to clean up some other legacy issues. Please use the kubeadm repo for logging the details.

/assign @timothysc
cc @liztio

Closing this issue in favor of: kubernetes/kubeadm#744

k8s-merge-robot added a commit that referenced this issue Apr 25, 2018

Merge pull request #61882 from xiangpengzhao/fix-changelog
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Add note on upgrading cluster by kubeadm.

**What this PR does / why we need it**:
Upgrading a cluster from 1.9.x to 1.10.0 with kubeadm fails due to the type change of `featureGates` in `KubeProxyConfiguration` done in #57962. This PR adds a note on what should be done before upgrading.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #61764

**Special notes for your reviewer**:
cc @kubernetes/sig-cluster-lifecycle-bugs @liggitt @fgbreel
Thanks @berlinsaint for the workaround!

We may need a patch (if possible) for kubeadm to do this automatically, as suggested by @danderson.

**Release note**:

```release-note
NONE
```
@ypsingh27

ypsingh27 commented Apr 27, 2018

Hello All,
I am trying to upgrade the Kubernetes cluster using `kubeadm upgrade apply v1.10.1`, and I get the error below:
root@vmaksa69901dzl # kubeadm upgrade apply v1.10.1
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade/version] You have chosen to change the cluster version to "v1.10.1"
[upgrade/versions] Cluster version: v1.9.3
[upgrade/versions] kubeadm version: v1.10.2
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler]
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.10.1"...
Static pod: kube-apiserver-vmaksa69901dzl hash: dc3749dffa0c124bd5c4964613658249
Static pod: kube-controller-manager-vmaksa69901dzl hash: 9e6798d0ba2ebe6747904b2195183c11
Static pod: kube-scheduler-vmaksa69901dzl hash: d38a1ee5cf80a84bc4f295d19b6874c2
[upgrade/etcd] Upgrading to TLS for etcd
Static pod: etcd-vmaksa69901dzl hash: 17c801f54fe8bd173a9f4810b4242bbb
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests729921876/etcd.yaml"
[certificates] Using the existing etcd/ca certificate and key.
[certificates] Using the existing etcd/server certificate and key.
[certificates] Using the existing etcd/peer certificate and key.
[certificates] Using the existing etcd/healthcheck-client certificate and key.
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/etcd.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests825459363/etcd.yaml"
[upgrade/staticpods] Not waiting for pod-hash change for component "etcd"
[upgrade/etcd] Waiting for etcd to become available
[util/etcd] Waiting 30s for initial delay
[util/etcd] Attempting to get etcd status 1/10
[util/etcd] Attempt failed with error: dial tcp [::1]:2379: getsockopt: connection refused
[util/etcd] Waiting 15s until next retry
[util/etcd] Attempting to get etcd status 2/10
[util/etcd] Attempt failed with error: dial tcp [::1]:2379: getsockopt: connection refused
[util/etcd] Waiting 15s until next retry
[util/etcd] Attempting to get etcd status 3/10
[util/etcd] Attempt failed with error: dial tcp [::1]:2379: getsockopt: connection refused
[util/etcd] Waiting 15s until next retry
[util/etcd] Attempting to get etcd status 4/10
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests729921876"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests729921876/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests729921876/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests729921876/kube-scheduler.yaml"
[upgrade/staticpods] The etcd manifest will be restored if component "kube-apiserver" fails to upgrade
[certificates] Using the existing etcd/ca certificate and key.
[certificates] Using the existing apiserver-etcd-client certificate and key.
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests825459363/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [timed out waiting for the condition]

can someone help with this issue?

@dvdmuckle

dvdmuckle commented Apr 30, 2018

@ypsingh27 check the logs for the API server using `docker logs` while the upgrade is happening. I suspect you might have run into the same issue I did, in which case I'd suggest creating a separate issue.
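
For reference, one way to do that while the upgrade runs (a sketch; container names and IDs differ per node, so adjust as needed):

```sh
# Find the kube-apiserver container on the master node and follow its logs
# while the upgrade is in progress.
docker ps | grep kube-apiserver
docker logs -f <container-id>   # substitute the ID printed by the command above
```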

dlipovetsky added a commit to platform9/ssh-provider that referenced this issue Jun 27, 2018

dlipovetsky added a commit to platform9/ssh-provider that referenced this issue Jun 27, 2018
