Removing v1alpha4 cluster resources from backup #382

Merged
merged 1 commit into from Sep 8, 2023

@eliyamlevy eliyamlevy commented Sep 7, 2023

Issue

rancher/rancher#42631

Change

Removing the below code block from the default resourceSet:

- apiVersion: "cluster.x-k8s.io/v1alpha4"
  kindsRegexp: "."

Explanation

These resources should not be backed up. The selector was left in the default resourceSet after the new apiVersion was added in this commit; it is not clear why.

Because these resources remained in the backup, the problem stayed silent until the capi-webhook changes caused conversion webhooks to be called. It seems the conversion webhooks were not running correctly (due to this line) until the most recent version of Rancher, which is why we only ran into the error now.

Removing these unnecessary resources from the backup fixes the error: with no v1alpha4 objects in the backup, the restore has nothing that calls the now-active conversion webhooks.
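For reviewers, here is a rough sketch of the selector behaviour described above. It is a simplified model, not the operator's actual matching code: `is_selected`, `before`, and `after` are hypothetical names, the unanchored-regex semantics are an assumption, and the `v1beta1` entry is an assumed stand-in for "the new apiVersion".

```python
import re
from typing import Dict, List

Selector = Dict[str, str]


def is_selected(selectors: List[Selector], api_version: str, kind: str) -> bool:
    """Return True if any selector matches the given apiVersion and kind.

    Assumption: kindsRegexp is applied as an unanchored search, so "."
    matches every non-empty kind name.
    """
    return any(
        sel["apiVersion"] == api_version
        and re.search(sel["kindsRegexp"], kind) is not None
        for sel in selectors
    )


# Hypothetical excerpt of the default resourceSet before this change.
before: List[Selector] = [
    {"apiVersion": "cluster.x-k8s.io/v1alpha4", "kindsRegexp": "."},
    {"apiVersion": "cluster.x-k8s.io/v1beta1", "kindsRegexp": "."},  # assumed entry
]

# The change in this PR: drop the v1alpha4 selector, keep everything else.
after: List[Selector] = [
    sel for sel in before if sel["apiVersion"] != "cluster.x-k8s.io/v1alpha4"
]

if __name__ == "__main__":
    for name, selectors in (("before", before), ("after", after)):
        picked = is_selected(selectors, "cluster.x-k8s.io/v1alpha4", "Cluster")
        print(f"{name}: v1alpha4 Cluster selected = {picked}")
```

Under these assumptions, every v1alpha4 kind is selected before the change and none after it, while the other entries are unaffected.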

Testing considerations

Normal P0 test cases, mainly migrations with RKE2 and RKE1 downstream clusters, are all that need to be run to test this.

@eliyamlevy eliyamlevy requested a review from a team September 7, 2023 17:06
@prachidamle prachidamle requested review from prachidamle and removed request for a team September 7, 2023 18:43

prachidamle commented Sep 7, 2023

The resources "cluster.x-k8s.io/v1alpha4" are being removed and will not be backed up by Backup-restore operator version coming with 2.7.7 - But what if a user uses a previous released version to take the backup for a setup post 2.7.7 and tries to restore from there? @eliyamlevy

eliyamlevy commented Sep 7, 2023

@prachidamle

> The resources "cluster.x-k8s.io/v1alpha4" are being removed and will not be backed up by Backup-restore operator version coming with 2.7.7 - But what if a user uses a previous released version to take the backup for a setup post 2.7.7 and tries to restore from there? @eliyamlevy

Taking a backup of Rancher v2.7.7 with any previous version of backup-restore will result in migrations hitting the capi-webhook error. The solution is either to make sure that the 102.0.2+up3.1.2 chart version (the version with cluster.x-k8s.io/v1alpha4 resources removed from the resourceSet) is the one used for the backups or, if that is not possible, to delete all cluster.x-k8s.io/v1alpha4 resources from the backup tar before using it.
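As a rough illustration of the second option, something like the sketch below could strip those entries from an unencrypted `.tar.gz` backup. It is untested against real backups, and it assumes the v1alpha4 entries can be recognised by the substring `cluster.x-k8s.io#v1alpha4` in their archive path. List your own archive first and adjust the marker if the layout differs. `strip_v1alpha4` and `V1ALPHA4_MARKER` are hypothetical names, not part of the operator.

```python
import tarfile
from typing import List

# Assumption: v1alpha4 CAPI entries carry this substring in their archive
# path. Check a listing of your own backup before relying on it.
V1ALPHA4_MARKER = "cluster.x-k8s.io#v1alpha4"


def strip_v1alpha4(src: str, dst: str, marker: str = V1ALPHA4_MARKER) -> List[str]:
    """Copy the .tar.gz at src to dst, skipping members whose path contains marker.

    Returns the names of the skipped members so they can be reviewed.
    """
    skipped: List[str] = []
    with tarfile.open(src, "r:gz") as tin, tarfile.open(dst, "w:gz") as tout:
        for member in tin:
            if marker in member.name:
                skipped.append(member.name)
                continue
            # Only regular files have content to copy; directories and other
            # member types are written with their header alone.
            fileobj = tin.extractfile(member) if member.isfile() else None
            tout.addfile(member, fileobj)
    return skipped
```

Calling `strip_v1alpha4("backup.tar.gz", "backup-filtered.tar.gz")` (both file names are placeholders) writes a new archive and leaves the original untouched, so the returned list can be checked before the filtered file is used for a restore.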

@eliyamlevy eliyamlevy merged commit f2463f5 into rancher:release/v3.0 Sep 8, 2023
1 check passed