[release-4.22] OCPBUGS-83712: Use upgrades.Skippable for Gateway API upgrade test skip logic#31030
Conversation
…ip logic The Gateway API upgrade test was calling g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters. Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup, avoiding the goroutine panic. Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support). https://redhat.atlassian.net/browse/OCPBUGS-83267 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository: openshift/coderabbit/.coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: openshift-cherrypick-robot The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/jira refresh |
|
@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
there's a bug with cherry-pick where it still links to the parent so i must retitle: |
|
@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-83712, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@gcs278: This pull request references Jira Issue OCPBUGS-83712, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No e2e CI jobs are running, either it's a glitch or a bug. |
|
/pipeline required |
|
/test help |
|
Kicking off the e2e jobs manually: |
|
/test help |
|
@CodeRabbit out of these optional jobs, which ones should I test based on the logic changed in this PR? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainLet me look at the actual changes in this PR to give you a targeted recommendation. 🏁 Script executed: Length of output: 10594
Here are the most relevant optional jobs: 🔴 High priority — directly exercises the changed upgrade-test
|
| Job | Why |
|---|---|
e2e-aws-ovn-upgrade |
AWS/OVN upgrade; most common upgrade path, exercises Skip() logic |
e2e-aws-ovn-upgrade-rollback |
Exercises Skip() on both upgrade and rollback flows |
e2e-aws-ovn-single-node-upgrade |
Single-node variant; different topology may hit different skip conditions |
e2e-azure-ovn-upgrade |
Different cloud platform; validates shouldSkipGatewayAPITests platform-branch logic |
e2e-gcp-ovn-rt-upgrade |
GCP real-time upgrade path |
🟡 Medium priority — exercises the GatewayAPIWithoutOLM tech-preview feature gate path in gatewayapicontroller.go
| Job | Why |
|---|---|
e2e-gcp-ovn-techpreview |
Enables tech-preview feature gates including GatewayAPIWithoutOLM |
e2e-gcp-ovn-techpreview-serial-1of2 |
Serial variant of tech-preview on GCP |
e2e-gcp-ovn-techpreview-serial-2of2 |
Serial variant of tech-preview on GCP |
e2e-aws-ovn-single-node-techpreview |
Single-node + tech-preview on AWS |
e2e-aws-ovn-single-node-techpreview-serial |
Serial variant of single-node tech-preview |
⬇️ Low/skip — not relevant to this change
Everything else (etcd scaling, kube-apiserver-rollout, metal IPI, openstack, vsphere, cgroupsv2, FIPS, proxy, kubevirt, etc.) does not exercise upgrade skip logic or the Gateway API feature gate path, so they're unlikely to catch regressions from this change.
TL;DR: Run the 5 upgrade jobs for the primary fix, and the 5 techpreview jobs for the gatewayapicontroller.go refactor. The e2e-gcp-ovn-upgrade you already triggered from the required set covers the baseline upgrade path on GCP.
|
Taking Rabbit's suggestion: |
|
I'm seeing infra errors like: /retest |
|
infra failures |
|
@openshift-cherrypick-robot: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
I hate to do this, but better safe than sorry - as the reason this fix is needed is because I missed a nightly job that wasn't on the presubmits, and I really don't want to break 4.22 CI any further: /payload 4.22 nightly informing |
|
@gcs278: trigger 66 job(s) of type informing for the nightly release of OCP 4.22
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a6ad5290-3abe-11f1-9e74-d9e79748250e-0 |
|
@gcs278: trigger 66 job(s) of type informing for the nightly release of OCP 4.22
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b55af3b0-3abe-11f1-8959-c56508329bae-0 |
|
oh gosh i triggered 2: |
|
Let's do this again: I hate to do this, but better safe than sorry - as the reason this fix is needed is because I missed a nightly job that wasn't on the presubmits, and I really don't want to break 4.22 CI any further: /payload 4.22 nightly informing |
|
@gcs278: trigger 66 job(s) of type informing for the nightly release of OCP 4.22
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ec77afa0-3abe-11f1-8eb1-096054443738-0 |
This is an automated cherry-pick of #31000
/assign gcs278