OCPNODE-1717: Update config node spec while explicitly updating the cgroups mode #3793
sairameshv
commented
Jul 12, 2023
- Future OCP releases will default to cgroups v2.
- During upgrades, to honor cgroups v1 on an existing cluster, the node spec needs a reference that future code can rely on when deciding the cgroup version.
- Hence, this code updates the node config spec whenever cgroups v1 is explicitly set.
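The idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `NodeSpec`, `recordExplicitV1`, and `resolveCgroupMode` are hypothetical names, and the types only imitate the shape of the node config spec.

```go
package main

import "fmt"

// CgroupMode is a hypothetical stand-in for the cgroup mode field
// carried by the node config spec.
type CgroupMode string

const (
	CgroupModeEmpty CgroupMode = ""
	CgroupModeV1    CgroupMode = "v1"
	CgroupModeV2    CgroupMode = "v2"
)

// NodeSpec is a simplified, hypothetical model of the node config spec.
type NodeSpec struct {
	CgroupMode CgroupMode
}

// recordExplicitV1 writes v1 into the spec when v1 is the configured mode
// and the spec does not already say so. It reports whether the spec changed.
func recordExplicitV1(spec *NodeSpec, configured CgroupMode) bool {
	if configured != CgroupModeV1 || spec.CgroupMode == CgroupModeV1 {
		return false
	}
	spec.CgroupMode = CgroupModeV1
	return true
}

// resolveCgroupMode models what a later release could do: prefer the mode
// recorded in the spec, and fall back to the release default otherwise.
func resolveCgroupMode(spec NodeSpec, releaseDefault CgroupMode) CgroupMode {
	if spec.CgroupMode != CgroupModeEmpty {
		return spec.CgroupMode
	}
	return releaseDefault
}

func main() {
	spec := NodeSpec{}
	changed := recordExplicitV1(&spec, CgroupModeV1)
	fmt.Println("spec changed:", changed, "mode:", spec.CgroupMode)

	// A release that defaults to v2 still honors the recorded v1.
	fmt.Println("upgraded cluster:", resolveCgroupMode(spec, CgroupModeV2))
	// A spec with nothing recorded falls through to the new default.
	fmt.Println("unset spec:", resolveCgroupMode(NodeSpec{}, CgroupModeV2))
}
```

Under these assumptions, the recorded value is what lets a release with a different default tell "v1 was chosen" apart from "nothing was set".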
Skipping CI for Draft Pull Request.
/test all
/cc @yuqi-zhang @harche
/hold cancel
I think this path should work, but we would need to be a bit careful of how we update the nodes config.
So the expectation here is that, for upgraded clusters, the user would need to manually set cgroups to v2 via nodes.config, or delete the nodes.config object in the cluster (thereby defaulting back to the RHCOS default of v2)?
We should definitely document the behaviour change and how users should do this carefully. Let's also talk to the release team about edge blocking, just so they are ok with this solution.
Future OCP releases will default to cgroupsv2 During the upgrades, in order to honor the cgroupsv1 on the existing cluster, there should be a reference in the node spec for the future code to depend on deciding the cgroup version. Hence, this code updates the config node spec whenever the cgroupsv1 is explicitly set Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
@sairameshv: This pull request references OCPNODE-1717 which is a valid jira issue. In response to this:
Ok, based on the CI run I think the behaviours are as expected. We end up with duplicated kargs for now, but that shouldn't be an issue once we upgrade to 4.14.
/label cherry-pick-approved
/lgtm
/label backport-risk-assessed
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rphillips, sairameshv, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold @haircommander brought up a great point... Does this PR upgrade successfully to #3782?
Since this is a bit of an exception to the regular process (4.13 only) we may need to either:
In theory it should. Upon upgrade, the MCO will generate a new config without the kargs in the base, but since the kargs still exist in the 97-generated-kubeletconfig generated from the updated nodes.config object, they will end up in the render, and the change will be a no-op by itself. @sairameshv did you verify that path? I seem to recall you did.
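The no-op reasoning above can be illustrated with a toy model of rendering. This is only a sketch under stated assumptions: `render`, `effective`, and `sameSet` are hypothetical helpers, the real render logic in the MCO is more involved, and the two kernel arguments are used purely as illustrative values.

```go
package main

import (
	"fmt"
	"sort"
)

// render concatenates kernel arguments from each source in order.
// Duplicates are kept, matching the "duplicated kargs" observation.
func render(sources ...[]string) []string {
	var out []string
	for _, s := range sources {
		out = append(out, s...)
	}
	return out
}

// effective returns the sorted set of distinct kernel arguments.
func effective(kargs []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, k := range kargs {
		if !seen[k] {
			seen[k] = true
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

// sameSet reports whether two karg lists contain the same distinct arguments.
func sameSet(a, b []string) bool {
	ea, eb := effective(a), effective(b)
	if len(ea) != len(eb) {
		return false
	}
	for i := range ea {
		if ea[i] != eb[i] {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative cgroups v1 kernel arguments (assumed values for this sketch).
	v1 := []string{
		"systemd.unified_cgroup_hierarchy=0",
		"systemd.legacy_systemd_cgroup_controller=1",
	}

	// Before the upgrade: kargs come from both the base config and the
	// generated kubelet config, so the rendered list has duplicates.
	before := render(v1, v1)
	// After the upgrade: the base no longer carries them, but the
	// generated kubelet config still does.
	after := render(nil, v1)

	fmt.Println("rendered before:", len(before), "after:", len(after))
	fmt.Println("same effective kargs:", sameSet(before, after))
}
```

In this model the rendered list gets shorter after the upgrade while the set of distinct arguments stays the same, which is the sense in which the change is a no-op.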
Yes, this upgrade has been tested.
/hold cancel ok. thanks!
I created a payload using this PR (to 4.14) and performed
@sairameshv: all tests passed!
56c0dc0 into openshift:release-4.13
The following PR includes changes to save the cgroups context in the node config object: openshift/machine-config-operator#3793
OCP 4.14+ releases default the cgroups version to "v2".
These changes help restore the cgroups context during cluster upgrades.
Hence, clusters are required to upgrade to 4.13.6 before upgrading to 4.14.
Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>