New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-22933: [release-4.12]: vsphere: fix validation of CPUS and CoresPerSocket #7677
OCPBUGS-22933: [release-4.12]: vsphere: fix validation of CPUS and CoresPerSocket #7677
Conversation
I think the original intention behind the validation was to check that, in case `CoresPerSocket` was not set by the user, it would be checked against a default value (currently set to 4). However, that's not what is happening in practice, and both the check against the user-inputed value and the default were being performed. That means that using CPUS=2 and CoresPerSocket=2 as exemplified in the official docs [1] would result in a validation error: ``` failed to create install config: invalid "install-config.yaml" file: controlPlane.platform.vsphere.cpus: Invalid value: 2: numCPUs specified should be a multiple of cores per socket which is by default 4] ``` even though it's a valid configuration (albeit a pretty tiny compute node). I'm changing the code a bit to make it more clear what the defaults values are representing and fixing the checks so they are made only once with the correct values for CPUs and CoresPerSocket. [1] https://docs.openshift.com/container-platform/4.11/installing/installing_vsphere/installing-vsphere-installer-provisioned-customizations.html#installation-installer-provisioned-vsphere-config-yaml_installing-vsphere-installer-provisioned-customizations
/jira cherrypick OCPBUGS-6919 |
@r4f4: Jira Issue OCPBUGS-6919 has been cloned as Jira Issue OCPBUGS-22933. Will retitle bug to link to clone. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@r4f4: This pull request references Jira Issue OCPBUGS-22933, which is valid. The bug has been moved to the POST state. 6 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
"k8s.io/apimachinery/pkg/util/validation/field" | ||
|
||
"github.com/openshift/installer/pkg/types" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can get rid of this extra line
numCoresPerSocket = defaultCoresPerSocket | ||
} | ||
|
||
if numCoresPerSocket > numCPUs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user specified numCPUs to be 2 and the did not specify numCoresPerSocket, it would be given a value of 4. Then this check will fail. I think that is incorrect behavior, imho. When only one of those values is specified, then the other should be made equal to that? That way we don't inadvertently break check in line 49. Related, is the user allowed to specify odd numbers for numCPUs or numCoresPerSocket?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think that's a bug, it needs to be reported against master. I can't fix this in the backport.
} | ||
if numCoresPerSocket <= 0 { | ||
numCoresPerSocket = defaultCoresPerSocket | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Info message when a default value is in use esp when one value is specified by the user and the other is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you feel strongly about that, open a bug so we can fix in master first and then backport all the way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I see that this is a back-port.
/label cherry-pick-approved |
/retest-required |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sadasu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@r4f4: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
99a69e2
into
openshift:release-4.12
@r4f4: Jira Issue OCPBUGS-22933: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-22933 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[ART PR BUILD NOTIFIER] This PR has been included in build ose-installer-container-v4.12.0-202311140330.p0.g99a69e2.assembly.stream for distgit ose-installer. |
/cherry-pick release-4.11 |
@r4f4: #7677 failed to apply on top of branch "release-4.11":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Fix included in accepted release 4.12.0-0.nightly-2023-11-14-120414 |
I think the original intention behind the validation was to check that, in case
CoresPerSocket
was not set by the user, it would be checked against a default value (currently set to 4). However, that's not what is happening in practice, and both the check against the user-inputed value and the default were being performed.That means that using CPUS=2 and CoresPerSocket=2 as exemplified in the official docs [1] would result in a validation error:
even though it's a valid configuration (albeit a pretty tiny compute node).
I'm changing the code a bit to make it more clear what the defaults values are representing and fixing the checks so they are made only once with the correct values for CPUs and CoresPerSocket.
[1] https://docs.openshift.com/container-platform/4.11/installing/installing_vsphere/installing-vsphere-installer-provisioned-customizations.html#installation-installer-provisioned-vsphere-config-yaml_installing-vsphere-installer-provisioned-customizations