-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Federation] Add a worker queue to the generic sync controller. #44987
Conversation
Hi @perotinus. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@k8s-bot ok to test |
/approve LGTM. Will add lgtm label once you verify that e2e tests pass. The unit test failure seems to be for deleting a federated service. cc @shashidharatd hope we didnt introduce any flakiness with recent service controller changes. |
@nikhiljindal, yeah it is a flake in deleting a federated service. it has been showing up more frequently in recent builds. I am surprised on what causes this flakiness, and why it was not showing up before service controller refactoring. the delete finalizers part is almost untouched by the refactoring. |
Automatic merge from submit-queue Add wait for federated service deletion Fixes the flaky kubectl tests #44987 (comment), #45264 service deletion is not instantaneous in federation. The fix is same as #42674. We need the fix now for services since we recently fixed federation service controller so that it runs successfully now. cc @shashidharatd
@k8s-bot unit test this |
@nikhiljindal @shashidharatd @marun Can you someone PTAL and run the Federation e2e tests? This has been updated to have the worker method do redelivery, which makes the reconcile method a bit simpler. |
kind, namespacedName, err) | ||
s.deliver(namespacedName, 0, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
statusError will deliver this as s.deliver(namespacedName, 0, true)
which is a change in behavior. Is that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct: I'm inclined to believe that returning true before was a typo, since this is an error condition and diverging like this would have warranted at least a comment as to why.
I think this PR is not controversial and modulo avoiding returning errors from reconcile it should merge before #45374. Reviewed 2 of 2 files at r1. federation/pkg/federation-controller/sync/controller.go, line 242 at r1 (raw file):
(No action required) Would it make sense to centralize status rather than defining for each remaining controller? federation/pkg/federation-controller/sync/controller.go, line 258 at r1 (raw file):
I think that @shashidharatd's recent work with the services PR suggests having the reconcile handle errors internally to ensure traceability. federation/pkg/federation-controller/sync/controller.go, line 274 at r1 (raw file):
Is this just defensive? Seems like something better handled via testing. Comments from Reviewable |
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed. federation/pkg/federation-controller/sync/controller.go, line 242 at r1 (raw file): Previously, marun (Maru Newby) wrote…
Absolutely! Not sure where it should live, though: it seems like it could live in the sync package since it will eventually, ideally, be an implementation detail of the sync controller and not used elsewhere. federation/pkg/federation-controller/sync/controller.go, line 258 at r1 (raw file): Previously, marun (Maru Newby) wrote…
That seems reasonable. federation/pkg/federation-controller/sync/controller.go, line 274 at r1 (raw file): Previously, marun (Maru Newby) wrote…
Agreed. I copied this from the reconcile loop in the replicaset controller without thinking too much about it. There doesn't seem to be harm in leaving it, though it scares me a bit to think how this codepath would be triggered–it would require casting–so I think I'll remove it. Comments from Reviewable |
/lgtm Reviewed 1 of 1 files at r2. federation/pkg/federation-controller/sync/controller.go, line 242 at r1 (raw file): Previously, perotinus (Jonathan MacMillan) wrote…
Makes sense to me. Comments from Reviewable |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: marun, nikhiljindal, perotinus
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Automatic merge from submit-queue (batch tested with PRs 45453, 45307, 44987) |
case statusNeedsRecheck: | ||
s.deliver(*namespacedName, s.reviewDelay, false) | ||
case statusNotSynced: | ||
s.deliver(*namespacedName, s.reviewDelay, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't notice until I went to rebase on this, but shouldn't this be s.ClusterAvailableDelay?
This is in preparation for converting the ReplicaSet controller to be a generic sync controller.
This doesn't include support for multiple workers yet: it's not immediately obvious how to support the command-line flags for ReplicaSet (or, I suppose in general, how do TypeAdapters support external configuration via whatever flag mechanism we're using).
cc @marun
Release note: