Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GEP-17] Add force restoring of shoots and backup entries #5123

Merged
merged 4 commits into from
Dec 20, 2021

Conversation

stoyanr
Copy link
Contributor

@stoyanr stoyanr commented Dec 6, 2021

How to categorize this PR?

/area control-plane-migration
/kind enhancement

What this PR does / why we need it:

  • Adds 2 new reconcilers to gardenlet to enable the forceful restoration of shoots and backup entries to the destination seed during control plane migration if the preparation for migration in the source seed is not finished after a certain grace period and is considered unlikely to succeed ("bad case" scenario).
  • Adds a new optional status field MigrationStartTime to Shoot and BackupEntry to be able to track how much time has elapsed since the migration of the shoot / backup entries to this seed started.
  • Adds a new feature gate ForceRestore to be able to enable / disable this forceful restoration. It depends on the feature gates UseDNSRecords and CopyEtcdBackupsDuringControlPlaneMigration being enabled.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

  • The force restore is disabled if either the source or the destination seed has owner checks disabled (via spec.setttings.ownerChecks.enabled=false). Owner checks are needed in the source, because otherwise the ownership change will not be detected, and in the destination, because otherwise the owner DNS record will not be created and the destination will not overtake the source as the owner.
  • The new reconcilers have been added to gardenlet, even though there are no strong reasons to prefer gardenlet over GCM. There are a few "weak" reasons:
    • There is a shoot controller in GCM but no backup entry controller, so moving the backup entry migration reconciler to GCM would result in adding some more boilerplate.
    • The controller in GCM could force the restoration to a destination seed that is also not reachable. gardenlet in such a seed would most likely also not be able to force the restoration (which is from my perspective better).

Release note:

If the `ForceRestore` feature gate is enabled, the shoot's restoration to the destination seed during control plane migration will be forced if the preparation for migration in the source seed is not finished after a certain grace period and is considered unlikely to succeed ("bad case" scenario).

@stoyanr stoyanr requested a review from a team as a code owner December 6, 2021 15:26
@gardener-robot gardener-robot added kind/api-change API change with impact on API users needs/second-opinion area/control-plane-migration Control plane migration related kind/enhancement Enhancement, improvement, extension size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 6, 2021
@stoyanr
Copy link
Contributor Author

stoyanr commented Dec 6, 2021

/cc @plkokanov @timebertt

@stoyanr
Copy link
Contributor Author

stoyanr commented Dec 7, 2021

Rebased on latest master and removed Draft.
/invite @rfranzke @timebertt @plkokanov

@plkokanov
Copy link
Contributor

/assign

@stoyanr
Copy link
Contributor Author

stoyanr commented Dec 9, 2021

/squash

@stoyanr
Copy link
Contributor Author

stoyanr commented Dec 18, 2021

@timebertt Thanks for your review, I answered / addressed your comments in a new commit. The one open point remaining is whether the controllers should be moved to GCM or not, let's clarify this and I will further adapt the PR if needed.

Copy link
Member

@timebertt timebertt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing and clarifying my comments!
/lgtm

@timebertt timebertt merged commit c227601 into gardener:master Dec 20, 2021
@@ -39,6 +39,7 @@ The following tables are a summary of the feature gates that you can set on diff
| WorkerPoolKubernetesVersion | `false` | `Alpha` | `1.35` | |
| CopyEtcdBackupsDuringControlPlaneMigration | `false` | `Alpha` | `1.37` | |
| SecretBindingProviderValidation | `false` | `Alpha` | `1.38` | |
| ForceRestore | `false` | `Alpha` | `1.38` | |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be 1.39, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ref #5190

@stoyanr stoyanr deleted the enh/add-shoot-migration-fallback branch December 20, 2021 14:17
krgostev pushed a commit to krgostev/gardener that referenced this pull request Apr 21, 2022
)

* Add force restoring of shoots and backup entries

* Disable force restoring if owner checks are disabled

* Add new controllers config to example gardenlet config

* Address code review comments.
krgostev pushed a commit to krgostev/gardener that referenced this pull request Jul 5, 2022
)

* Add force restoring of shoots and backup entries

* Disable force restoring if owner checks are disabled

* Add new controllers config to example gardenlet config

* Address code review comments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/control-plane-migration Control plane migration related kind/api-change API change with impact on API users kind/enhancement Enhancement, improvement, extension size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants