Decouple TaintManager from NodeLifeCycleController (KEP-3902) #119208
Hi @atosatto. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test /sig scheduling
/milestone v1.28
/test pull-kubernetes-e2e-gce-cos-alpha-features
Thanks for the PR. The logic looks good. Could you add some integration tests to cover the following:
- if the feature gate is enabled, we are able to disable the taint manager from the controller manager
- if the feature gate is disabled, there is no way to disable the taint manager (the taint manager functions by default)
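The two cases requested above can be written down as a small truth table. The following is only a sketch of the expected behavior, not code from this PR: the function `taintEvictionRuns` and its parameters are hypothetical names introduced for illustration, and the model assumes that with the gate off the taint manager keeps running inside the node lifecycle controller.

```go
package main

import "fmt"

// taintEvictionRuns is a hypothetical model of the behavior the reviewer
// asks to be tested. It is not part of kube-controller-manager.
//
// gateEnabled:  the feature gate that separates the taint manager is on.
// userDisabled: the operator asked the controller manager to disable it.
func taintEvictionRuns(gateEnabled, userDisabled bool) bool {
	if !gateEnabled {
		// Assumption: with the gate off, the taint manager stays embedded
		// in the node lifecycle controller and cannot be switched off.
		return true
	}
	// With the gate on, it is a separate controller and can be disabled.
	return !userDisabled
}

func main() {
	cases := []struct {
		gateEnabled, userDisabled, want bool
	}{
		{gateEnabled: true, userDisabled: false, want: true},
		{gateEnabled: true, userDisabled: true, want: false},
		{gateEnabled: false, userDisabled: false, want: true},
		{gateEnabled: false, userDisabled: true, want: true},
	}
	for _, c := range cases {
		got := taintEvictionRuns(c.gateEnabled, c.userDisabled)
		fmt.Printf("gate=%t disabled=%t runs=%t ok=%t\n",
			c.gateEnabled, c.userDisabled, got, got == c.want)
	}
}
```

A real integration test would start the controller manager with each combination and observe whether tainted pods are evicted; the table only pins down which outcomes such a test should assert.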
Hey, I'm Anhelina from the Bug Triage Release team 1.28 👋. I wanted to reach out to discuss its progress and ensure it's on track for the v1.28 release, considering the Code Freeze is coming up on 01:00 UTC Wednesday 19th July 2023 / 18:00 PDT Tuesday 18th July 2023. Could you please provide an update on the current status of the issue? Is it planned for the v1.28 release? We are currently in Week 9, and Code Freeze is just over a week away. If there's anything I can do, please let me know :)
4edec75 to 12dca6b
@atosatto: 1 invalid OWNERS file
In response to this:
What type of PR is this?
/kind feature
What this PR does / why we need it:
This implements KEP-3902 by introducing a feature flag, `SeparateTaintManager`, that allows decoupling `taint-manager` from `node-lifecycle-controller`.
Which issue(s) this PR fixes:
Special notes for your reviewer:
Unlike the KEP proposal, the flag used to disable/enable the controller in KCM has been named `taint-eviction-controller` in order to preserve the existing convention of having all controller names end with `-controller`. This has been preferred to the `taint-manager-controller` and/or `taint-manager` alternatives. For compatibility with the KEP, a `taint-manager` legacy flag has been provided in the implementation.
Does this PR introduce a user-facing change?
Decouple TaintManager from NodeLifeCycleController (KEP-3902)
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
I've added some specific tests covering the controller registration in e9b28ea8923, which should cover this from the point of view of the controllers registered in kube-controller-manager. Is this the scenario you had in mind? If not, I am happy to add more tests.
/retest
/lgtm
LGTM label has been added. Git tree hash: f4a6fbd596ab190095e7a47e7a630f1b0e03b491
PR needs rebase.
1071a66 to 85347a2
2f323d4 to 62cde68
…roller (KEP-3902)
62cde68 to ccda2d6
Sorry about the last-minute changes to the controller registration in #120371.
The feature-gating approach has changed and needs to be updated. Otherwise LGTM.
@@ -556,6 +557,10 @@ func NewControllerDescriptors() map[string]*ControllerDescriptor {
	register(newResourceClaimControllerDescriptor())
	register(newLegacyServiceAccountTokenCleanerControllerDescriptor())
	register(newValidatingAdmissionPolicyStatusControllerDescriptor())
	if utilfeature.DefaultFeatureGate.Enabled(features.SeparateTaintEvictionController) {
With #120371 there is no need to define the gating here anymore, since the gating should be automatic (taken from the descriptor).
@atiratree not quite, this controller is on by default, since it's beta, so this check ensures with the feature gate off it's not turned on. Not sure how to handle that with your controller initiation changes. So I think we can leave it as is, for now, and fix in a followup.
this check has been moved to kubernetes/cmd/kube-controller-manager/app/controllermanager.go, lines 683 to 688 in b84e4a1:
	for _, featureGate := range controllerDescriptor.GetRequiredFeatureGates() {
		if !utilfeature.DefaultFeatureGate.Enabled(featureGate) {
			logger.Info("Controller is disabled by a feature gate", "controller", controllerName, "requiredFeatureGates", controllerDescriptor.GetRequiredFeatureGates())
			return nil, nil
		}
	}
and it should work properly even when it is beta and on by default and then turned off by the user.
So it is redundant to have it in this place as well. But I don't mind merging this PR first and fixing this later.
fyi, I am trying to enforce this in #121611
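To make the argument above concrete, here is a minimal, self-contained sketch of descriptor-based gating. It is not the kube-controller-manager code: `featureGates` is a hypothetical stand-in for `utilfeature.DefaultFeatureGate`, `controllerDescriptor` is reduced to the two fields the sketch needs, and `disabledBy` is an illustrative helper name.

```go
package main

import "fmt"

// featureGates is a hypothetical stand-in for the real feature gate
// registry: it maps a gate name to whether that gate is enabled.
type featureGates map[string]bool

// controllerDescriptor is a simplified, hypothetical descriptor that keeps
// only the fields needed to illustrate gating.
type controllerDescriptor struct {
	name                 string
	requiredFeatureGates []string
}

// disabledBy returns the first required gate that is turned off, or ""
// when every required gate is enabled (a logical AND over the list).
func disabledBy(d controllerDescriptor, gates featureGates) string {
	for _, g := range d.requiredFeatureGates {
		if !gates[g] {
			return g
		}
	}
	return ""
}

func main() {
	d := controllerDescriptor{
		name:                 "taint-eviction-controller",
		requiredFeatureGates: []string{"SeparateTaintEvictionController"},
	}
	// The gate is on by default in beta, but the user can still turn it off.
	for _, enabled := range []bool{true, false} {
		gates := featureGates{"SeparateTaintEvictionController": enabled}
		if g := disabledBy(d, gates); g != "" {
			fmt.Printf("%s: disabled by feature gate %s\n", d.name, g)
			continue
		}
		fmt.Printf("%s: started\n", d.name)
	}
}
```

Because the check reads the gate's current value rather than its default, a beta gate that the user switches off still prevents the controller from starting, which is the point made in the comment above.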
right, but you remove a controller explicitly, but when you also separately have a feature gate, which allows disabling the split, what then?
`requiredFeatureGates` field is meant as a replacement of this logic, and the other gated controllers were ported to it without any issue. It is possible to gate a controller with multiple feature gates, as you can see for example in `StorageVersionGarbageCollectorController`.
I made the simplest implementation I could. The implementation can be changed in the future if more custom decision logic (not just `AND`) is required. Since this controller has just one, it should be okay in this situation as well.
The reason behind this is simple: we want to reason about controllers without running them and have the gating information available declaratively. For example, we want to show available controllers in `kube-controller-manager --help`. We can then also show the gating information there.
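As a rough illustration of why declarative gating helps, the sketch below renders a help-style listing from descriptors alone, without starting any controller. Everything in it is hypothetical: the `descriptor` type, the `helpLines` function, the output format, and the `example-controller` entry with gates `GateA` and `GateB` are invented for the example and do not reflect the actual `--help` output.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// descriptor is a hypothetical, reduced controller descriptor.
type descriptor struct {
	name                 string
	requiredFeatureGates []string
}

// helpLines builds one line per controller, sorted by name, and appends
// the required feature gates when a controller declares any.
func helpLines(ds []descriptor) []string {
	lines := make([]string, 0, len(ds))
	for _, d := range ds {
		line := d.name
		if len(d.requiredFeatureGates) > 0 {
			line += " (requires feature gates: " +
				strings.Join(d.requiredFeatureGates, ", ") + ")"
		}
		lines = append(lines, line)
	}
	sort.Strings(lines)
	return lines
}

func main() {
	ds := []descriptor{
		{name: "taint-eviction-controller",
			requiredFeatureGates: []string{"SeparateTaintEvictionController"}},
		// Hypothetical controller gated by two gates (AND semantics).
		{name: "example-controller",
			requiredFeatureGates: []string{"GateA", "GateB"}},
		{name: "node-lifecycle-controller"},
	}
	for _, l := range helpLines(ds) {
		fmt.Println(l)
	}
}
```

An imperative `if utilfeature.DefaultFeatureGate.Enabled(...)` around `register(...)` cannot be listed this way, because a controller whose gate is off never reaches the descriptor map.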
I kind of warned this was going to happen #120371 (comment) 🙃
I have updated the #121611 PR and included a test to show that the gating works properly with this new controller. Please let me know which part of the test you think is insufficient. Or if you think it would be a good idea to implement a generic solution for future controllers that would require more complex gating.
// TestTaintEvictionControllerDeclaration ensures that it is possible to run taint-manager as a separated controller
// only when the SeparateTaintEvictionController feature is enabled
func TestTaintEvictionControllerDeclaration(t *testing.T) {
I think we can remove this test now, or just check the presence of `descriptor.requiredFeatureGates` if you really want to keep it.
Same thing as the other comment.
/lgtm
/approve
LGTM label has been added. Git tree hash: 17798ff8a66e3c744af8fdb388a29ff4bdc64c19
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: atosatto, Huang-Wei, soltysh, yuanchen8911 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/milestone v1.29
What type of PR is this?
/kind feature
What this PR does / why we need it:
This implements KEP-3902 by introducing a feature flag, `SeparateTaintManager`, that allows decoupling `taint-manager` from `node-lifecycle-controller`.
Which issue(s) this PR fixes:
kubernetes/enhancements#3902
Special notes for your reviewer:
Unlike the KEP proposal, the flag used to disable/enable the controller in KCM has been named `taint-eviction-controller` in order to preserve the existing convention of having all controller names end with `-controller`. This has been preferred to the `taint-manager-controller` and/or `taint-manager` alternatives. For compatibility with the KEP, a `taint-manager` legacy flag has been provided in the implementation.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: