New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1796996: baremetal: map hardware profile to baremetal-operator default #2969
Conversation
We rely on vendored baremetal-operator code to handle various information about baremetal servers including BMC credentials and hardware profiles. The BMO uses the string 'unknown' as `hardware.DefaultProfileName`. As these string values are exposed to the user in the baremetal platform as part of the install-config, it felt awkward if a user wanted to explicitly use the default hardware profile to have to write the string 'unknown'. In golang, people are generally just writing the constant, 'hardware.DefaultProfileName`. So, in the terraform variables, we mapped the string value 'default' to mean that, but it wasn't done for creating the machines for workers. Now that you can deploy workers during installation as a day-1 operation, if you specify the string 'default' for workers, they stuck stuck in the 'match profile' state of the BMO. Keep in mind, these baremetalhosts are all defined in the 'hosts' section of the install config, but the profiles are now working differently depending on whether it's part of the control plane (i.e. created by terraform) or not. This PR ensures the behavior is the same in both instances, so you can use the string 'default' for workers and masters. fixes openshift#2705
/assign @hardys Hopefully the explanation is clear what's going on here. Maybe in retrospect this wasn't a good idea, but I think it has to be consistent. The masters mapping is done in installer/pkg/tfvars/baremetal/baremetal.go Lines 45 to 47 in 96d416e
|
/label platform/baremetal |
Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1448/ |
/cc karmab |
/retest |
/approve |
Looks good, I re-kicked the metal3ci as the previous run timed out before getting to deploy any workers |
Failure might also be because of the wrong interface name for metal3 openshift-metal3/dev-scripts#897 (comment) |
Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1449/ |
Another solution to this would be to fix the metal3 code. The current "default" value is based on the fact that the matching code to pick a profile based on inspection data hasn't been written. There's a completely different approach for that planned now, so we could drop the placeholder code that sets the value to "unknown" and add some better names for the profiles we do have so it's clear when any of them should be used explicitly. |
That's fine longer term, but I think we need something now. This is simple and can land today before dev freeze, and cna be backported to 4.3. |
@stbenjam: This pull request references Bugzilla bug 1796996, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc @dhellmann Would you mind /lgtm if you're ok with this change? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dhellmann, hardys The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
22 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
@stbenjam: All pull requests linked via external trackers have merged. Bugzilla bug 1796996 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@stbenjam: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Bug 1796996: baremetal: map hardware profile to baremetal-operator default
/cherry-pick release-4.3 |
@stbenjam: new pull request created: #3106 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
We rely on vendored baremetal-operator code to handle various
information about baremetal servers including BMC credentials and
hardware profiles. The BMO uses the string 'unknown' as
hardware.DefaultProfileName
.As these string values are exposed to the user in the baremetal platform
as part of the install-config, it felt awkward if a user wanted to
explicitly use the default hardware profile to have to write the string
'unknown'. In golang, people are generally just writing the constant,
'hardware.DefaultProfileName`. So, in the terraform variables, we mapped
the string value 'default' to mean that, but it wasn't done for creating
the machines for workers.
Now that you can deploy workers during installation as a day-1
operation, if you specify the string 'default' for workers, they get
stuck in the 'match profile' state of the BMO. Keep in mind, these
baremetalhosts are all defined in the 'hosts' section of the install
config, but the profiles are now working differently depending on
whether it's part of the control plane (i.e. created by terraform) or
not.
This PR ensures the behavior is the same in both instances, so you can
use the string 'default' for workers and masters.
fixes #2705