Rancher backup | Can't rollback (restore) Rancher from 2.6-head to 2.6.3 #36803
Comments
@superseb - Please look at this. Can this be fixed for 2.6.4 or do we need to push to 2.6.5? |
This is also seen when deploying v2.6.3, updating image to v2.6-head and going back to v2.6.3 after (just a rollback without restore)
Seems to be related to newer CRD versions introduced by a version bump. So far it seems to be caused by the CRDs that were updated with a new version (v1beta1): the rollback to v2.6.3 (or the restore) then wants to apply the CRDs from the previous version, which do not have v1beta1, and tries to merge/modify the existing ones. Some options that were thought of:
Some code examples: For testing purposes, I've used this to get the CRD in a state where a rollback would succeed (haven't tested with restore but should be the same):
While we are discussing/figuring out what we can do next, can QA validate the workaround above? |
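To make the CRD version mismatch described above easier to check, here is a minimal sketch (not the workaround from this thread). It assumes the failure comes from Kubernetes refusing a CRD update that drops a version still listed in the live CRD's status.storedVersions. The helper name missing_versions, the CRD name placeholder and the version lists other than v1beta1 are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print every version in the first list that is
# absent from the second list.
# $1: space-separated versions recorded on the live CRD (status.storedVersions)
# $2: space-separated versions defined by the CRD manifest about to be applied
missing_versions() {
  for v in $1; do
    case " $2 " in
      *" $v "*) ;;      # still defined by the incoming manifest
      *) echo "$v" ;;   # would be dropped by the rollback
    esac
  done
}

# Against a live cluster the first list could be gathered like this
# (the CRD name is a placeholder, the second list is illustrative):
#   stored=$(kubectl get crd "$CRD" -o jsonpath='{.status.storedVersions[*]}')
#   missing_versions "$stored" "v1alpha3 v1alpha4"
```

Any version it prints is one the older manifest no longer defines, so that CRD is a candidate for the rollback conflict.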
It seems that this issue only affects HA Rancher. I followed these steps to upgrade the Docker install of Rancher and then followed these steps to rollback the Docker install of Rancher and was successful with this. |
@superseb's comment has explained the issue and cause pretty well. However, the proposed fix (patching the CRD) does not work, so we have to (manually) clean up the cluster before restoring the backup. We have made two scripts to make it easier: one for cleaning up the cluster, and another for checking whether there are any Rancher-related resources left in the cluster. They are available here: https://github.com/rancherlabs/support-tools/tree/master/cleanup-rancher-k8s-resources As a result, we need to update the instructions for restoring a Rancher backup in the Docs to add the new requirement. Also, add it to the release notes. |
Thanks for the update Jack. @jtravee - please make sure this gets release noted with the information provided by Jack regarding the scripts. As far as the ticket, we will move it to 2.6.5 once you've release noted it. |
There are no fixes going into rancher or rancher-backup-restore. |
Cleanup script was hanging previously for namespaces having finalizers. @superseb provided a new fix rancherlabs/support-tools#160 |
Reopening the ticket: Rancher upgraded from 2.5.12 >> 2.6-head 85d6925
|
The fix is added to the above mentioned PR |
Validated the issue by
|
Keeping this issue open as the PR for the standalone cleanup and verify scripts is not merged. |
Moving this issue to to-test because the linked PR is merged. |
Closing the issue; validations are noted here: #36803 (comment) |
kubectl delete clusterregistrations -n fleet-default --all |
Rancher Server Setup
Describe the bug
Rancher server is not coming up active when rolling back (restoring) rancher to the backup taken in 2.6.3
To Reproduce
Official docs : https://rancher.com/docs/rancher/v2.6/en/installation/install-rancher-on-k8s/rollbacks/
kubectl create -f <file>
Result
The restore never completes, and the following error is logged in the rancher-backup pod (kubectl logs rancher-backup-6495d4976b-896b5 -n cattle-resources-system):
apiversion differences between 2.6.3 and 2.6-head
Expected Result
Rancher restores to the state from backup