ibmcloud: add preflight quota validation for VPC IPI installations #10589
asadawar wants to merge 1 commit into
Implement PlatformQuotaCheck for IBM Cloud VPC following the existing AWS/GCP pattern. Checks floating IP, security group, load balancer, and instance counts against known default limits before creating any infrastructure.

During 4.22 rc5 testing, an IBM Cloud IPI install failed 20 minutes in when the floating IP quota (40/40) was exhausted. The installer had already created instances, load balancers, security groups, and uploaded the RHCOS image before discovering the limit. All resources were orphaned and required manual cleanup.

With this change, the installer validates resource availability before provisioning and fails fast with a clear quota error.

Assisted-by: Claude Code
RFE-9374
Summary
- Implement `PlatformQuotaCheck` for IBM Cloud VPC following the existing AWS/GCP pattern
- Add list methods (`ListFloatingIPs`, `ListSecurityGroups`, `ListLoadBalancers`, `ListInstances`) to the IBM Cloud client

Why this approach
IBM Cloud VPC IPI installations currently have no preflight quota validation. The installer creates infrastructure (instances, load balancers, security groups, COS bucket, RHCOS image) over 15-20 minutes before discovering resource limits. When a limit is exceeded, the install fails late with orphaned resources that require `openshift-install destroy cluster` to clean up.

The `PlatformQuotaCheck` asset already has implementations for AWS (`pkg/quota/aws`), GCP (`pkg/quota/gcp`), and OpenStack. IBM Cloud was grouped under "no special provisioning requirements" at line 156 of `pkg/asset/quota/quota.go`. This change follows the same Constraints/Load/Check pattern.

IBM Cloud does not have a Service Quotas API like AWS. Usage is determined by counting existing resources via the VPC API and comparing the counts against known default limits. The limits are hardcoded, with a comment noting that they may vary by account. API failures during quota loading log a warning and skip the check rather than blocking the install.
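The count-and-compare behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `constraint`, `usageLoader`, `checkQuota`, and the `"floating-ip"` key are hypothetical names, and the loader function stands in for the client list calls. The only limit used is the floating IP value of 40 reported in this PR, which may differ per account.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// constraint is a hypothetical record of how many units of one resource
// type the install would create.
type constraint struct {
	Name     string
	Required int
}

// usageLoader stands in for the client list calls that count existing
// resources; the real client methods are not reproduced here.
type usageLoader func() (map[string]int, error)

// errAPIUnavailable simulates a failed list call in the demo below.
var errAPIUnavailable = errors.New("VPC API unavailable")

// checkQuota compares current usage plus required counts against limits.
// A load failure logs a warning and returns nil, so the install proceeds.
func checkQuota(load usageLoader, limits map[string]int, needs []constraint) error {
	usage, err := load()
	if err != nil {
		log.Printf("warning: skipping quota check: %v", err)
		return nil
	}
	var errs []error
	for _, c := range needs {
		limit, known := limits[c.Name]
		if !known {
			continue // no default limit recorded for this resource
		}
		if used := usage[c.Name]; used+c.Required > limit {
			errs = append(errs, fmt.Errorf("%s: need %d, but %d of %d already in use",
				c.Name, c.Required, used, limit))
		}
	}
	// errors.Join returns nil when no violations were collected.
	return errors.Join(errs...)
}

func main() {
	// Assumed default limit, taken from the 40/40 failure described in this PR.
	limits := map[string]int{"floating-ip": 40}
	needs := []constraint{{Name: "floating-ip", Required: 1}}

	full := func() (map[string]int, error) {
		return map[string]int{"floating-ip": 40}, nil
	}
	fmt.Println("quota exhausted:", checkQuota(full, limits, needs))

	down := func() (map[string]int, error) { return nil, errAPIUnavailable }
	fmt.Println("API failure:", checkQuota(down, limits, needs))
}
```

Returning nil on a load failure mirrors the stated choice to warn and continue: a transient API error should not block an install that might otherwise succeed.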
Floating IP count accounts for publish mode: `External` (default) needs an additional floating IP for the public API load balancer.

Cluster verification
During OCP 4.22 rc5 IPI testing on IBM Cloud VPC:
- Attempt 1 (without this change): Install ran for ~20 minutes creating 4 instances, 2 load balancers, 6 security groups, and uploading the RHCOS image before failing at floating IP assignment (40/40 quota). All resources orphaned.
- Attempt 3 (without this change): GPU worker (`gx3d-160x1792x8h200`) failed with `cannot_start_capacity`. Cluster installed but timed out because no workers joined for ingress.

Both would be caught in <30 seconds with this preflight check.
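As a worked illustration of Attempt 1, the sketch below shows how a 40/40 floating IP count would fail a preflight comparison before anything is created. The function names and the `base` parameter are hypothetical. The sketch assumes only what this description states: the `External` publish mode (the default) needs one more floating IP than `Internal`, and the limit observed in testing was 40.

```go
package main

import "fmt"

// floatingIPsRequired returns how many floating IPs an install would need.
// base is a hypothetical count for everything except the public API load
// balancer; External (the default, also assumed for an empty value) adds one.
func floatingIPsRequired(base int, publish string) int {
	if publish == "Internal" {
		return base
	}
	return base + 1
}

// checkFloatingIPs fails when the required count exceeds what is left
// under the limit.
func checkFloatingIPs(used, limit, required int) error {
	if available := limit - used; required > available {
		return fmt.Errorf("floating IP quota exceeded: need %d, %d available (%d/%d in use)",
			required, available, used, limit)
	}
	return nil
}

func main() {
	// Attempt 1 from the testing notes: 40 of 40 floating IPs already in use.
	required := floatingIPsRequired(0, "External")
	if err := checkFloatingIPs(40, 40, required); err != nil {
		fmt.Println("preflight failed:", err)
	}
}
```

Even with `base` set to zero, the single floating IP for the public API load balancer is enough to trip the check when the account is already at its limit.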
Test plan
- `go build ./pkg/quota/ibmcloud/... ./pkg/asset/quota/ibmcloud/... ./pkg/asset/quota/...` compiles
- `go vet` passes on all modified packages
- `gofmt` reports no formatting issues
- `go test ./pkg/quota/ibmcloud/...` passes (table-driven tests for constraint aggregation)
- `Pager.GetAllWithContext()` matching existing codebase pattern
- `VPCName` is set in install-config

RFE: https://issues.redhat.com/browse/RFE-9374