New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cherry-pick of #63495: kubeadm - fix upgrades with external etcd #63925

Merged
merged 2 commits into from May 18, 2018

Conversation

@detiber
Member

detiber commented May 16, 2018

What this PR does / why we need it:

Backports fixes for kubeadm upgrades with external etcd

Release note:

kubeadm - fix upgrades with external etcd
Support kubeadm upgrade with remote etcd cluster
Currently kubeadm only performs an upgrade if the etcd cluster is
colocated with the control plane node. As this is only one possible
configuration, kubeadm should support upgrades with etcd clusters
that are not local to the node.

Signed-off-by: Craig Tracey <craigtracey@gmail.com>

@k8s-ci-robot k8s-ci-robot requested review from kad and luxas May 16, 2018

@detiber detiber changed the title from kubeadm - backport fixes for external etcd upgrades to kubeadm - fix upgrades with external etcd May 16, 2018

@detiber detiber changed the title from kubeadm - fix upgrades with external etcd to cherry-pick of #63495: kubeadm - fix upgrades with external etcd May 16, 2018

@detiber

This comment has been minimized.

Member

detiber commented May 16, 2018

@MaciekPytel

This comment has been minimized.

Contributor

MaciekPytel commented May 16, 2018

@detiber This needs to merge by tomorrow EOD to make it into 1.10.3.

}{
{
name: "Up to date",
upgrades: []upgrade.Upgrade{},
expectedBytes: []byte(`Awesome, you're up-to-date! Enjoy!

This comment has been minimized.

@neolit123

neolit123 May 16, 2018

Member

huh, i wonder how the old code passed go fmt. it complains for me when not indenting the members property.

return "", fmt.Errorf("etcd cluster contains endpoints with mismatched versions: %v", versions)
} else {
clusterVersion = v
}

This comment has been minimized.

@neolit123

neolit123 May 16, 2018

Member

change only if see fit 👍

for _, v := range versions {
	if clusterVersion != "" && clusterVersion != v {
		return "", fmt.Errorf("etcd cluster contains endpoints with mismatched versions: %v", versions)
	}
	clusterVersion = v
}

This comment has been minimized.

@detiber

detiber May 17, 2018

Member

I added this to my PR for fixing the static pod upgrade regression in master.

},
},
externalEtcd: true,
expectedBytes: []byte(`External components that should be upgraded manually before you upgrade the control plane with 'kubeadm upgrade apply':

This comment has been minimized.

@neolit123

neolit123 May 16, 2018

Member

these byte comparisons without helper functions are scary, but we have them in the old test code, so 👍

@stealthybox

This comment has been minimized.

Contributor

stealthybox commented May 17, 2018

Ran into this when upgrading staticpod etcd:

root@vagrant:~# /vagrant/bin/cherry-63925_kubeadm upgrade apply v1.10.2
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade/version] You have chosen to change the cluster version to "v1.10.2"
[upgrade/versions] Cluster version: v1.9.7
[upgrade/versions] kubeadm version: v1.10.3-beta.0.50+29974a37bba8ad
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler]
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.10.2"...
Static pod: kube-apiserver-vagrant hash: 8995194791ca1350e37e54cdb9db6c44
Static pod: kube-controller-manager-vagrant hash: 305b23ead04fc3114ba6755a75d9d17d
Static pod: kube-scheduler-vagrant hash: 96944ce896a1ba4844bab386f40c0acc
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/apply] FATAL: etcd cluster is not healthy: etcdclient: no available endpoints

Are you able to reproduce?

I was able to upgrade an insecure external etcd successfully.

@stealthybox

@detiber -- I'm still very confused, but I was able to fix this problem with a small patch:

diff --git a/cmd/kubeadm/app/cmd/upgrade/apply.go b/cmd/kubeadm/app/cmd/upgrade/apply.go
index b7fc2c0443..ad6b420a73 100644
--- a/cmd/kubeadm/app/cmd/upgrade/apply.go
+++ b/cmd/kubeadm/app/cmd/upgrade/apply.go
@@ -35,7 +35,7 @@ import (
        "k8s.io/kubernetes/cmd/kubeadm/app/util/apiclient"
        configutil "k8s.io/kubernetes/cmd/kubeadm/app/util/config"
        dryrunutil "k8s.io/kubernetes/cmd/kubeadm/app/util/dryrun"
-       etcdutil "k8s.io/kubernetes/cmd/kubeadm/app/util/etcd"
+       // etcdutil "k8s.io/kubernetes/cmd/kubeadm/app/util/etcd"
        "k8s.io/kubernetes/pkg/api/legacyscheme"
        "k8s.io/kubernetes/pkg/util/version"
 )
@@ -269,9 +269,8 @@ func PerformStaticPodUpgrade(client clientset.Interface, waiter apiclient.Waiter
                return err
        }

-       // These are uninitialized because passing in the clients allow for mocking the client during testing
-       var oldEtcdClient, newEtdClient etcdutil.Client
-       return upgrade.StaticPodControlPlane(waiter, pathManager, internalcfg, etcdUpgrade, oldEtcdClient, newEtdClient)
+       return upgrade.StaticPodControlPlane(waiter, pathManager, internalcfg, etcdUpgrade, nil, nil)
+       // The arguments, oldEtcdClient and newEtdClient, are uninitialized because passing in the clients allow for mocking the client during testing
 }

 // DryRunStaticPodUpgrade fakes an upgrade of the control plane

original source: app/cmd/upgrade/apply.go#L273-L274

The client vars were previously being initialized with a zero-value (not nil).
This actually made the very important comparisons for those clients against nil unreachable from the cmd entrypoint.

I'm not sure how initialization and method behavior for these clients was ever working now that I've found this bug, but I'm going to take a break.

We definitely need to validate the behavior in all of the common cases again.

@stealthybox

This comment has been minimized.

Contributor

stealthybox commented May 17, 2018

/hold

kubeadm - fix external etcd upgrades
- Update upgrade plan output when configured for external etcd
  - Move etcd to a separate section and show available upgrades
@MaciekPytel

This comment has been minimized.

Contributor

MaciekPytel commented May 17, 2018

Added the milestone, so this moves faster through submit queue and make it before branch freeze.

@detiber

This comment has been minimized.

Member

detiber commented May 17, 2018

@stealthybox I've finished testing this against both static pod etcd and external etcd installations. ptal again.

@detiber

This comment has been minimized.

Member

detiber commented May 17, 2018

/test pull-kubernetes-e2e-kops-aws
/test pull-kubernetes-e2e-gce

@detiber

This comment has been minimized.

Member

detiber commented May 17, 2018

/test pull-kubernetes-verify

@stealthybox

ty @detiber 🌮
I've verified locally that this now works in both cases.
/lgtm

@k8s-merge-robot

This comment has been minimized.

Contributor

k8s-merge-robot commented May 17, 2018

[MILESTONENOTIFIER] Milestone Pull Request Needs Approval

@detiber @stealthybox @timothysc @kubernetes/sig-cluster-lifecycle-misc

Action required: This pull request must have the status/approved-for-milestone label applied by a SIG maintainer.

Pull Request Labels
  • sig/cluster-lifecycle: Pull Request will be escalated to these SIGs if needed.
  • priority/critical-urgent: Never automatically move pull request out of a release milestone; continually escalate to contributor and SIG through all available channels.
  • kind/bug: Fixes a bug discovered during the current release.
Help
@stealthybox

This comment has been minimized.

Contributor

stealthybox commented May 17, 2018

/hold cancel

@detiber

This comment has been minimized.

Member

detiber commented May 17, 2018

/test pull-kubernetes-e2e-gce

@k8s-ci-robot

This comment has been minimized.

Contributor

k8s-ci-robot commented May 17, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: detiber, stealthybox, timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@detiber

This comment has been minimized.

Member

detiber commented May 18, 2018

/test pull-kubernetes-e2e-kops-aws

@detiber

This comment has been minimized.

Member

detiber commented May 18, 2018

/test pull-kubernetes-e2e-gce

@k8s-merge-robot

This comment has been minimized.

Contributor

k8s-merge-robot commented May 18, 2018

Automatic merge from submit-queue.

@k8s-merge-robot k8s-merge-robot merged commit aa7f374 into kubernetes:release-1.10 May 18, 2018

17 of 18 checks passed

Submit Queue Required Github CI test is not green: pull-kubernetes-e2e-gce
Details
cla/linuxfoundation detiber authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Skipped
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gke Skipped
pull-kubernetes-e2e-kops-aws Job succeeded.
Details
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped
pull-kubernetes-local-e2e-containerized Skipped
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details

zoidbergwill added a commit to zoidbergwill/kubernetes that referenced this pull request Jun 22, 2018

Merge pull request kubernetes#63925 from detiber/external_etcd_upgrad…
…e_1_10 into release-1.9

Automatic merge from submit-queue.

cherry-pick of kubernetes#63495: kubeadm - fix upgrades with external etcd

**What this PR does / why we need it**:

Backports fixes for kubeadm upgrades with external etcd

**Release note**:
```release-note
kubeadm - fix upgrades with external etcd
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment