New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-17433: Sync featuregate controller during the node config controller sync #3846
OCPBUGS-17433: Sync featuregate controller during the node config controller sync #3846
Conversation
1. Changes in the `nodes.config.openshift.io` object results in the create/update of `97-{master/worker}-generated-kubelet` machine configs 2. However, these changes are not rendered into the `master/worker` mcps if `98-{master/worker}-generated-kubelet` or `99-{master/worker}-generated-kubelet` machine configs exist due to precedence. 3. Hence, in order to bring in the change caused by nodes.config object's update, the respective sync functions of featuregate controller and the Kubelet config controller need to be called/executed in the syncNodeConfig function. 4. `syncKubeletConfig` function is already getting called and this code change introduces the calling of `syncFeatureHandler` function 5. Also added a minor fix in featuregate controller sync to apply the workerlatencyprofiles change only to the worker mcp Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
@sairameshv: This pull request references Jira Issue OCPBUGS-17433, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/jira refresh |
@sairameshv: This pull request references Jira Issue OCPBUGS-17433, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/jira refresh |
@sairameshv: This pull request references Jira Issue OCPBUGS-17433, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (schoudha@redhat.com), skipping review request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
/test okd-scos-images |
@LorbusChris pushed an older build to registry.ci.openshift.org/origin/scos-4.14:centos-stream-coreos-9 and the backing quay.io/okd/centos-stream-coreos-9 images to recover the io.openshift.build.versions=machine-os=... labels whose recent removals had lead to unknown version reference "machine-os". /retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions below:
@@ -177,6 +177,14 @@ func (ctrl *Controller) syncNodeConfigHandler(key string) error { | |||
} | |||
} | |||
|
|||
// syncing the featuregate controller | |||
features, err := ctrl.featLister.Get(ctrlcommon.ClusterFeatureInstanceName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work. This would also be an issue for kubeletconfigcontroller generating the 99-xx off a user config right? This cluster that had the bug only had 97 and 98, but due to how our hierarchy works we would need to also call sync kubelet config here (and then in features function, also call kubelet config sync too.
The alternative would be for features controller to also watch changes to nodes.config, and kubelet config controller to also watch for features and nodes.config.
Longer term though I think this hierarchy is probably not the best way to do this as it is not very scalable, but for now I think we should focus on fixing this issue since I suspect it will show up more often with the setting cgroups via nodes.config we added to 4.13
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally agree with your comment. The issue with Kubeconfigcontroller generating 99-XX has already been taken care as of now in the node config controller here
But going forward, we might have to come up with another approach to overcome this complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/hold For QE pre-merge verification. Feel free to unhold if that's not relevant here |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sairameshv, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/label qe-approved |
/hold cancel This change has been validated and hence ready for the merge. |
/retest |
@sairameshv: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@sairameshv: Jira Issue OCPBUGS-17433: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-17433 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
nodes.config.openshift.io
object results in the create/update of97-{master/worker}-generated-kubelet
machine configsmaster/worker
mcps if98-{master/worker}-generated-kubelet
or99-{master/worker}-generated-kubelet
machine configs exist due to precedence.syncKubeletConfig
function is already getting called and this code change introduces the calling ofsyncFeatureHandler
functionFixes: https://issues.redhat.com/browse/OCPBUGS-17433