Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[gardenlet] Switch Shoot main controller to controller-runtime #7082

Merged
merged 12 commits into from
Dec 2, 2022

Conversation

timebertt
Copy link
Member

@timebertt timebertt commented Nov 25, 2022

How to categorize this PR?

/area dev-productivity scalability
/kind enhancement

What this PR does / why we need it:
Refactor the Shoot main controller to controller-runtime.
Co-Authored-By @rfranzke

Which issue(s) this PR fixes:
Part of #4251

Special notes for your reviewer:

This PR comes with a few behavioral changes:

  • the dedicated queue in the shoot controller for shoots that are registered as managed seeds is dropped
  • status.UID is always written when the operation starts, previously this was only done on the start of a reconciliation (i.e., not on deletion or migration)
  • the logic for skipping reconciliations during migration in the BackupEntry reconciler is changed and harmonized with the shoot controller (cc @plkokanov)

✅ In draft, until I have performed some manual tests for the special cases of shoot reconciliation (see docs)

@rfranzke and I are working on a follow-up PR for refactoring the corresponding logic of these special cases to make it more readable and testable.

Generally, we want to follow this cookbook while refactoring existing controllers:

  • Add documentation
  • Add integration test based on envtest (if not already present)
  • Switch controller to controller-runtime

Integration tests are skipped for this controller, as they can't be achieved with low effort while still providing meaningful benefits.

Release note:

NONE

timebertt and others added 6 commits November 25, 2022 14:43
- Precisely determine responsibility for next reconciliation
- Move central logic from `reconciler_migrate.go` to `reconciler.go`
- Move common preparation work (Cluster resource sync, operation initialization, early exits, ...) to the `prepareOperation` function

Greetings to all reviewers: Do not even try to review this. Otherwise, lay down to the floor, and ..., good luck.
@gardener-prow gardener-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. area/dev-productivity Developer productivity related (how to improve development) area/scalability Scalability related kind/enhancement Enhancement, improvement, extension labels Nov 25, 2022
@gardener-prow
Copy link
Contributor

gardener-prow bot commented Nov 25, 2022

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@gardener-prow gardener-prow bot added the cla: yes Indicates the PR's author has signed the cla-assistant.io CLA. label Nov 25, 2022
@gardener-prow gardener-prow bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 25, 2022
@timebertt
Copy link
Member Author

/test all

@rfranzke
Copy link
Member

/assign

Previously, this was done to prevent duplicated reconciliations.
With predicate.GenerationChangedPredicate, this is no longer necessary
as we only enqueue when metadata.generation changes.
@timebertt
Copy link
Member Author

/test all

@timebertt
Copy link
Member Author

@timebertt
Copy link
Member Author

timebertt commented Nov 28, 2022

While testing locally, I found the following problem:

gardenlet is unable to record events on shoots:

{"level":"error","ts":"2022-11-28T08:49:54.605Z","msg":"Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"e2e-hibernated.172bb3734939828b\", GenerateName:\"\", Namespace:\"garden-local\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Shoot\", Namespace:\"garden-local\", Name:\"e2e-hibernated\", UID:\"65ef8916-cd35-409c-8d08-cb0a2d57861f\", APIVersion:\"core.gardener.cloud/v1beta1\", ResourceVersion:\"324\", FieldPath:\"\"}, Reason:\"Reconciled\", Message:\"Reconciled Shoot cluster\", Source:v1.EventSource{Component:\"shoot-controller\", Host:\"\"}, FirstTimestamp:time.Date(2022, time.November, 28, 8, 49, 54, 602017419, time.Local), LastTimestamp:time.Date(2022, time.November, 28, 8, 49, 54, 602017419, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'events is forbidden: User \"system:serviceaccount:garden:gardenlet\" cannot create resource \"events\" in API group \"\" in the namespace \"garden-local\"' (will not retry!)\n","stacktrace":"k8s.io/client-go/tools/record.recordEvent\n\tk8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/record/event.go:267\nk8s.io/client-go/tools/record.recordToSink\n\tk8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/record/event.go:216\nk8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartRecordingToSink.func1\n\tk8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/record/event.go:194\nk8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher.func1\n\tk8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/record/event.go:314"}
...
{"level":"error","ts":"2022-11-28T08:53:52.724Z","msg":"Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"e2e-hibernated.172bb3aab99578a4\", GenerateName:\"\", Namespace:\"garden-local\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Shoot\", Namespace:\"garden-local\", Name:\"e2e-hibernated\", UID:\"65ef8916-cd35-409c-8d08-cb0a2d57861f\", APIVersion:\"core.gardener.cloud/v1beta1\", ResourceVersion:\"439\", FieldPath:\"\"}, Reason:\"Deleted\", Message:\"Deleted Shoot cluster\", Source:v1.EventSource{Component:\"shoot-controller\", Host:\"\"}, FirstTimestamp:time.Date(2022, time.November, 28, 8, 53, 52, 710293668, time.Local), LastTimestamp:time.Date(2022, time.November, 28, 8, 53, 52, 710293668, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'events is forbidden: User \"system:serviceaccount:garden:gardenlet\" cannot create resource \"events\" in API group \"\" in the namespace \"garden-local\"' (will not retry!)\n","stacktrace":"k8s.io/client-go/tools/record.recordEvent\n\tk8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/record/event.go:267\nk8s.io/client-go/tools/record.recordToSink\n\tk8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/record/event.go:216\nk8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartRecordingToSink.func1\n\tk8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/record/event.go:194\nk8s.io/client-go/tools/record.(*eventBroadcasterImpl).StartEventWatcher.func1\n\tk8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/record/event.go:314"}

Looking into it now.

@timebertt timebertt marked this pull request as ready for review November 28, 2022 11:16
@gardener-prow gardener-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 28, 2022
@gardener-prow gardener-prow bot requested a review from acumino November 28, 2022 11:16
@timebertt
Copy link
Member Author

This PR is ready. I have performed several manual tests for the special cases mentioned in the docs and fixed a few things accordingly.

@timebertt
Copy link
Member Author

@shafeeqes
Copy link
Contributor

Should pass now, #7064 is fixed.
/test pull-gardener-e2e-kind

Copy link
Contributor

@shafeeqes shafeeqes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR.
Had a combined look with @oliver-goetz and @ary1992.
Looks good in general. Just left some questions.
/assign @oliver-goetz @ary1992 @shafeeqes

@shafeeqes
Copy link
Contributor

/lgtm

@gardener-prow gardener-prow bot added the lgtm Indicates that a PR is ready to be merged. label Dec 2, 2022
Copy link
Member

@rfranzke rfranzke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment, I can fix it myself and then let's get this in. Thanks for making this ready @timebertt!

docs/concepts/gardenlet.md Outdated Show resolved Hide resolved
@gardener-prow gardener-prow bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 2, 2022
@rfranzke
Copy link
Member

rfranzke commented Dec 2, 2022

/test pull-gardener-integration
/lgtm
/approve

@gardener-prow gardener-prow bot added the lgtm Indicates that a PR is ready to be merged. label Dec 2, 2022
@gardener-prow
Copy link
Contributor

gardener-prow bot commented Dec 2, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rfranzke

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gardener-prow gardener-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 2, 2022
@gardener-prow
Copy link
Contributor

gardener-prow bot commented Dec 2, 2022

@timebertt: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-gardener-apidiff 9fb6b6e link false /test pull-gardener-apidiff

Full PR test history. Your PR dashboard. Command help for this repository.
Please help us cut down on flakes by linking this test failure to an open flake report or filing a new flake report if you can't find an existing one. Also see our testing guideline for how to avoid and hunt flakes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@rfranzke
Copy link
Member

rfranzke commented Dec 2, 2022

/test pull-gardener-integration

@gardener-prow gardener-prow bot merged commit c11c86a into gardener:master Dec 2, 2022
@timebertt timebertt deleted the refactor/shoot-ctrl-2 branch December 5, 2022 07:26
@rfranzke rfranzke changed the title [gardenlet] Switch Shoot main controller to controller-runtime [gardenlet] Switch Shoot main controller to controller-runtime Dec 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/dev-productivity Developer productivity related (how to improve development) area/scalability Scalability related cla: yes Indicates the PR's author has signed the cla-assistant.io CLA. kind/enhancement Enhancement, improvement, extension lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants