
Allow Kubernetes downgrade when restoring etcd snapshot #22232

Closed

dnoland1 opened this issue Aug 16, 2019 · 7 comments
Labels: kind/feature (Issues that represent larger new pieces of functionality, not enhancements to existing functionality)

Comments
@dnoland1
Contributor

What kind of request is this (question/bug/enhancement/feature request):
Feature request

Description
The Kubernetes upgrade documentation at https://rancher.com/docs/rancher/v2.x/en/cluster-admin/editing-clusters/#upgrading-kubernetes recommends backing up (taking an etcd snapshot) before doing an upgrade. This implies that if you want to revert an upgrade, you should be able to restore the etcd snapshot and return your Kubernetes cluster to the version it was running before the upgrade. However, if you attempt to restore the etcd snapshot, it will not revert the upgrade; it only restores the state of the etcd data, and Rancher will still attempt to upgrade the Kubernetes cluster.

Users should be able to revert an upgrade and return their cluster to the Kubernetes version that corresponds to when the etcd snapshot was taken. For example, a user is on Kubernetes v1.14.0, takes an etcd snapshot, then upgrades to v1.14.5. The user should be able to restore the etcd snapshot and downgrade Kubernetes to v1.14.0. This would involve reverting all Kubernetes components (kube-apiserver, kube-controller-manager, kube-proxy, kube-scheduler, and kubelet) to the previous version.
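For concreteness, a minimal way to check whether a restore actually rolled the cluster back, using only standard kubectl commands (nothing Rancher-specific; version numbers follow the example above):

```sh
# Control-plane view: after restoring a snapshot taken on v1.14.0, the
# expectation in this issue is that the server version reads v1.14.0
# again rather than the upgraded v1.14.5.
kubectl version --short

# Node view: the VERSION column reflects each node's kubelet version,
# which should also be rolled back if all components are reverted.
kubectl get nodes
```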

@ajfriesen

Ran into this today.

We wanted to test an upgrade. The upgrade failed due to a bug with kube-dns.
We had a snapshot and tried to restore it.

This did not work properly.

For the first few seconds, kubectl get nodes showed the Kubernetes version from the snapshot taken before the upgrade, but then the view suddenly switched to the new Kubernetes version we had wanted to upgrade to (for testing, at least).

Also, kubectl get nodes and the Rancher UI showed us different workers, which I wonder about.
We had workers in kubectl get nodes which did not appear in the Rancher UI.
I thought they could not technically differ.
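(Illustrative only, not from the original comment: one way to watch for the version flip described above, assuming shell access with kubectl configured against the cluster.)

```sh
# Poll the kubelet version reported by each node every few seconds while
# the restore runs; .status.nodeInfo.kubeletVersion is the standard Node
# status field that `kubectl get nodes` summarizes in its VERSION column.
watch -n 5 "kubectl get nodes -o custom-columns=NAME:.metadata.name,VERSION:.status.nodeInfo.kubeletVersion"
```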

@dnoland1 dnoland1 changed the title Allow downgrade when restoring etcd snapshot Allow Kubernetes downgrade when restoring etcd snapshot Oct 31, 2019
@chrisbulgaria

Yes - this would be a great enhancement!

@cloudnautique
Contributor

Distilling this down a bit:

  1. A user needs to be able to optionally take a snapshot when initiating the upgrade. This snapshot would be tied to a specific Kubernetes version.
  2. The user should have the option to roll back the upgrade. This means restoring the cluster, including all Kubernetes components and the etcd database, back to the pre-upgrade configuration.

Since etcd snapshots will have the Kubernetes version tied to them going forward, users should be able to see which version of k8s a backup was taken at.

@soumyalj

soumyalj commented Mar 2, 2020

Tested with the master-head branch.
Verified that the restoreRkeConfig field is added during etcd restore. We can restore both the K8s version and the cluster config, or just the K8s version. Cluster restore for the below combination of tests was done with a local backup:

[Screenshot: local backup test combinations, 2020-03-02]

Regression tests were also performed.
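(A rough sketch of driving this restore through the Rancher v3 API rather than the UI. The restoreRkeConfig field name comes from the comment above; the action name, endpoint shape, token format, and accepted values are assumptions here and should be checked against the API browser for your Rancher version.)

```sh
# Hypothetical example: restore an etcd backup and roll back only the
# Kubernetes version (as opposed to the full cluster config as well).
RANCHER_URL=https://rancher.example.com      # assumed Rancher server URL
API_TOKEN=token-xxxxx:yyyyyyyy               # hypothetical API token
CLUSTER_ID=c-abcde                           # hypothetical cluster id
BACKUP_ID=c-abcde:etcd-backup-1              # hypothetical etcd backup id

curl -sk -u "$API_TOKEN" \
  -H 'Content-Type: application/json' \
  -d "{\"etcdBackupId\": \"$BACKUP_ID\", \"restoreRkeConfig\": \"kubernetesVersion\"}" \
  "$RANCHER_URL/v3/clusters/$CLUSTER_ID?action=restoreFromEtcdBackup"
```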

@izaac
Contributor

izaac commented Mar 2, 2020

I tested the feature in 2.4 master-head with S3 backup enabled, covering the same test combinations as commented by @soumyalj in #22232 (comment).

Including the same regression tests.

@izaac
Contributor

izaac commented Mar 4, 2020

Rancher version 2.4, commit id 78ee11a (master-head, 03/03/2020)

  • Validated the upgrade scenarios with S3/Minio enabled.
  • Created a cluster as a Standard User to validate P1 cases with Minio backup storage.

Found issue #25744 while creating a cluster as a Standard user; that will be tracked separately.

cc @soumyalj

@soumyalj

soumyalj commented Mar 4, 2020

Tested with 2.4 master-head (2a7415a2a190).
Validated upgrade scenarios with snapshot restore after upgrade from v2.3.x to master-head.

@soumyalj soumyalj closed this as completed Mar 4, 2020
@zube zube bot removed the [zube]: Done label Oct 13, 2020