fix: OCPBUGS-86012: reset associatedVCenter in failure domain validation loop#10559
fix: OCPBUGS-86012: reset associatedVCenter in failure domain validation loop#10559chdeshpa-hue wants to merge 2 commits into
Conversation
…ion loop The associatedVCenter variable was declared outside the for loop that iterates over failure domains, causing it to retain a stale pointer from a previous iteration. When a failure domain referenced a non-existent vCenter server after a valid one, the nil check was skipped and validation passed silently. The installer then crashed during machine asset generation with a confusing error. Move the variable declaration inside the loop so it resets on each iteration. Co-authored-by: Cursor <cursoragent@cursor.com>
|
@chdeshpa-hue: This pull request references Jira Issue OCPBUGS-86012, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository: openshift/coderabbit/.coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
WalkthroughThis PR fixes a variable scoping issue in vCenter failure-domain validation and adds test coverage for the fix. The ChangesvCenter Validation Scoping
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 10 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @chdeshpa-hue. Thanks for your PR. I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/jira refresh |
|
@chdeshpa-hue: This pull request references Jira Issue OCPBUGS-86012, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/lgtm |
|
/retest |
|
@jcpowermac can this PR now be approved to merge? |
|
@chdeshpa-hue: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
associatedVCentervariable inside the failure domain validation loop so each failure domain is validated independentlyProblem
When an install-config has multiple vSphere failure domains and one of them references a vCenter server that does not exist in the
vcenterslist, the installer silently skips the validation error if a valid failure domain appears earlier in the list. The validation function remembers the vCenter it found for the previous failure domain, so when it processes the next one with a non-existent server, it still thinks a match was found. This causes:The installer then proceeds past validation and crashes during machine asset generation with:
This error gives no indication that the root cause is a wrong server name in a failure domain. The expected behavior is an early validation error:
The behavior is also order-dependent — swapping the order of failure domains in the YAML changes whether the bug triggers.
Fix
Move the
var associatedVCenter *vsphere.VCenterdeclaration from outside theforloop to inside the loop body invalidateFailureDomains(). This ensures the pointer resets to nil on each iteration, so every failure domain is validated against the vcenters list independently.Test plan
TestValidateFailureDomainStaleVCenter— confirms "server does not exist in vcenters" error is raised when an invalid server appears after a valid oneopenshift-install create manifests— should get a clear validation error pointing at the invalid server field instead of a late crash during machine generationMade with Cursor
Summary by CodeRabbit
Bug Fixes
Tests