
NO-JIRA: Improve test/e2e-aws-ovn-upgrade-paused for 4.22#78284

Open
hongkailiu wants to merge 1 commit into openshift:main from hongkailiu:fix-e2e-aws-ovn-upgrade-paused-followup

@hongkailiu
Member

@hongkailiu hongkailiu commented Apr 23, 2026

This is to follow up

Summary by CodeRabbit

  • Chores
    • Updated test configuration and scripts for nightly upgrade testing. Modified how test arguments are passed during upgrade procedures to improve consistency in test execution and monitor disabling behavior.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 23, 2026
@openshift-ci-robot
Contributor

@hongkailiu: This pull request explicitly references no jira issue.

Details

In response to this:

This is to follow up

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@hongkailiu
Member Author

/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused

@openshift-merge-bot
Contributor

@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 23, 2026

Walkthrough

The changes update CI operator configuration for OpenShift upgrade testing by introducing a new TEST_ARGS_2 environment variable in the test step registry and upgrade config. The corresponding upgrade test script removes the variable from its mid-upgrade invocation and changes cluster version query output formatting from jsonpath with newline to jsonpath-as-json.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test Registry Definition: ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-ref.yaml | Adds optional TEST_ARGS_2 environment variable to provide additional command-line arguments for a second invocation of openshift-test. |
| Test Configuration Assignment: ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml | Sets TEST_ARGS_2 to disable legacy-cvo-invariants monitor in the nightly 4.22 upgrade test. |
| Upgrade Script Implementation: ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh | Removes TEST_ARGS_2 variable usage from the upgrade_paused function's mid-upgrade invocation and changes cluster version query output format from jsonpath with newline to jsonpath-as-json. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (11 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: improving the e2e-aws-ovn-upgrade-paused test configuration for 4.22, which is reflected in all three modified files. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR contains only CI/CD configuration (YAML) and shell script changes, not Ginkgo test code. |
| Test Structure And Quality | ✅ Passed | This custom check is not applicable to the provided pull request. The check is designed to evaluate Ginkgo test code, but the PR only modifies CI/CD configuration files and a bash script, not Go test code. |
| Microshift Test Compatibility | ✅ Passed | This pull request does not add any new Ginkgo e2e tests. Changes are exclusively to CI operator configuration files and test invocation shell scripts. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR modifies CI infrastructure (configuration YAML and helper scripts) but does not add any new Ginkgo e2e test code. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR modifies CI/CD test configuration and scripts, not production deployment manifests or operator code that would define pod scheduling constraints or topology-aware configurations. |
| Ote Binary Stdout Contract | ✅ Passed | PR introduces only YAML configuration and bash orchestration script, not OTE binary Go code communicating via JSON stdout. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR modifies infrastructure configuration files but does not introduce new Ginkgo e2e test definitions. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-ref.yaml`:
- Around line 12-15: The second openshift-tests run-upgrade invocation uses only
${TEST_ARGS_2:-} which contradicts the documented “additional” semantics of
TEST_ARGS_2; update the call in
ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh so
the second run-upgrade passes the combined arguments (e.g., expand to use both
${TEST_ARGS} and ${TEST_ARGS_2:-} together) or alternatively rename/document
TEST_ARGS_2 as a replacement; specifically modify the second openshift-tests
run-upgrade invocation to include ${TEST_ARGS} ${TEST_ARGS_2:-} (or merge them
into a single variable) so long-lived disable lists aren’t dropped on the second
hop.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: be4a4478-8920-4210-abdb-26d31de8d3f9

📥 Commits

Reviewing files that changed from the base of the PR and between 4ccfbea and f4ad895.

📒 Files selected for processing (3)
  • ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml
  • ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh
  • ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-ref.yaml

Comment on lines +12 to +15
- name: TEST_ARGS_2
  default: ""
  documentation: |-
    Additional arguments to be passed to the 2nd call of 'openshift-test', e.g., when TEST_TYPE=upgrade-paused.
Contributor

@coderabbitai coderabbitai Bot Apr 23, 2026

⚠️ Potential issue | 🟠 Major

TEST_ARGS_2 is documented as additive, but upgrade-paused currently uses it as a replacement.

ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh passes only ${TEST_ARGS_2:-} to the second openshift-tests run-upgrade call, not ${TEST_ARGS} plus ${TEST_ARGS_2}. With the new contract here, jobs like ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml will apply the long paused-upgrade disable list only to the first control-plane hop and re-enable those monitors for the second hop. Either the second invocation should combine both vars, or this variable should be renamed/documented as a replacement instead of “additional”.

Suggested fix in ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh
-    openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" \
-        --to-image "${OPENSHIFT_UPGRADE1_RELEASE_IMAGE_OVERRIDE}" "${TEST_ARGS_2:-}" \
+    openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" "${TEST_ARGS:-}" \
+        --to-image "${OPENSHIFT_UPGRADE1_RELEASE_IMAGE_OVERRIDE}" "${TEST_ARGS_2:-}" \
         --options "${TEST_UPGRADE_OPTIONS-}" \
         --provider "${TEST_PROVIDER}" \
         -o "${ARTIFACT_DIR}/e2e.log" \
         --junit-dir "${ARTIFACT_DIR}/junit" &

Member Author

"Additional" is inherited from the documentation of the existing env var TEST_ARGS on lines 10-11.

It means additional to the existing args, such as --to-image, when invoking openshift-tests.

TEST_ARGS_2: The _2 suffix indicates it is applied only to the 2nd upgrade hop.

Contributor

@hongkailiu, thank you for the clarification! That makes the design clear:

  • "Additional" in both TEST_ARGS and TEST_ARGS_2 means extra arguments on top of the hardcoded flags (like --to-image, --options, --provider, etc.) passed to openshift-tests — not cumulative between the two variables.
  • TEST_ARGS_2 is scoped exclusively to the 2nd upgrade hop invocation, while TEST_ARGS applies to the primary invocation. They are independent, hop-specific argument sets.

My original concern was a misreading of the semantics. The current implementation and documentation are correct. Apologies for the noise!
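
To make the agreed semantics concrete, here is a minimal sketch of two hop-specific argument sets. It is not the code from openshift-e2e-test-commands.sh: `openshift_tests_stub`, the `--example-flag-*` values, the `all` suite name, and the `example.invalid` image references are hypothetical placeholders, and the stub only echoes its arguments instead of running an upgrade.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical stand-in for the real openshift-tests binary:
# it prints the command line it would have received.
openshift_tests_stub() {
    printf 'openshift-tests'
    printf ' %s' "$@"
    printf '\n'
}

# Illustrative values only; the real job sets these in its ci-operator config.
TEST_ARGS="--example-flag-a"
TEST_ARGS_2="--example-flag-b"

# Split each variable into words so that an empty value adds no argument.
read -r -a hop1_args <<< "${TEST_ARGS:-}"
read -r -a hop2_args <<< "${TEST_ARGS_2:-}"

# First invocation: hardcoded flags plus TEST_ARGS only.
hop1_cmd="$(openshift_tests_stub run-upgrade all \
    --to-image example.invalid/release:hop1 "${hop1_args[@]}")"

# Second invocation: hardcoded flags plus TEST_ARGS_2 only.
# TEST_ARGS is deliberately not repeated here.
hop2_cmd="$(openshift_tests_stub run-upgrade all \
    --to-image example.invalid/release:hop2 "${hop2_args[@]}")"

echo "$hop1_cmd"
echo "$hop2_cmd"
```

In this sketch, each variable is "additional" only relative to the hardcoded flags of its own invocation, which matches the clarification in this thread.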


📝 Learnings were identified and not saved (knowledge base disabled).

@openshift-merge-bot
Contributor

[REHEARSALNOTIFIER]
@hongkailiu: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name Repo Type Reason
pull-ci-openshift-ironic-image-main-okd-scos-e2e-aws-ovn openshift/ironic-image presubmit Registry content changed
pull-ci-openshift-ironic-image-release-4.21-okd-scos-e2e-aws-ovn openshift/ironic-image presubmit Registry content changed
pull-ci-openshift-multus-cni-main-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-main-okd-scos-e2e-aws-ovn openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-5.1-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-5.0-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.23-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.22-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.21-okd-scos-e2e-aws-ovn openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.21-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.20-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.19-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.18-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.17-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.16-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.15-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.14-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.13-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.12-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.11-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.10-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.9-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.8-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.7-e2e-aws openshift/multus-cni presubmit Registry content changed
pull-ci-openshift-multus-cni-release-4.6-e2e-aws openshift/multus-cni presubmit Registry content changed

A total of 17302 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

A full list of affected jobs can be found here
Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@openshift-ci openshift-ci Bot requested review from neisw and xueqzhan April 23, 2026 20:40
@openshift-ci
Contributor

openshift-ci Bot commented Apr 23, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hongkailiu
Once this PR has been reviewed and has the lgtm label, please assign dgoodwin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hongkailiu
Member Author

The previous rehearsal hit an install failure.

/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused

@openshift-merge-bot
Contributor

@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@hongkailiu
Member Author

hongkailiu commented Apr 24, 2026

The rehearsal still hit

: [Monitor:oc-adm-upgrade-status][sig-cli][OCPFeatureGate:UpgradeStatus] oc adm upgrade status snapshots reflect the cluster upgrade lifecycle

and this is the reason.

$ git branch -a --contains 2d60dffe5a572f3078f9ee8018a9d31e61eadf17 | grep origin
  remotes/origin/HEAD -> origin/main
  remotes/origin/main
  remotes/origin/release-4.23
  remotes/origin/release-5.0
  remotes/origin/release-5.1

The paused test is available only for even minor releases.

I have to backport it to 4.22. And I need a bug for it. I will convert https://redhat.atlassian.net/browse/OTA-1977 to a bug.
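
The branch check above can also be scripted per branch. The following is a minimal sketch under stated assumptions: it builds a throwaway repository rather than using openshift/origin, the branch names are reused only for illustration, and `contains` is a hypothetical helper built on `git merge-base --is-ancestor`.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Throwaway repository so the sketch is self-contained.
repo="$(mktemp -d)"
git -C "$repo" init -q

# Helper for commits that does not depend on global git identity settings.
commit() {
    git -C "$repo" -c user.name=example -c user.email=example@example.invalid \
        commit -q --allow-empty -m "$1"
}

commit "base"
# release-4.22 is cut before the fix lands, so it will not contain it.
git -C "$repo" branch release-4.22

commit "add paused-upgrade test fix"
fix="$(git -C "$repo" rev-parse HEAD)"
# release-4.23 is cut after the fix, so it contains it.
git -C "$repo" branch release-4.23

# Prints "yes" if branch $1 contains the fix commit, "no" otherwise.
contains() {
    if git -C "$repo" merge-base --is-ancestor "$fix" "$1"; then
        echo yes
    else
        echo no
    fi
}

in_422="$(contains release-4.22)"
in_423="$(contains release-4.23)"
echo "release-4.22: $in_422"
echo "release-4.23: $in_423"
```

A "no" for a release branch is the situation described here: the commit has to be backported before that release picks it up.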

What a bumpy road.


Update

Here we go: openshift/origin#31065

@hongkailiu
Member Author

/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused

@openshift-merge-bot
Contributor

@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@hongkailiu
Member Author

hongkailiu commented Apr 24, 2026

4.20.0-0.nightly-2026-04-22-115050

I need to wait for the new nightly.

@hongkailiu
Member Author

https://amd64.ocp.releases.ci.openshift.org/releasestream/4.22.0-0.nightly/release/4.22.0-0.nightly-2026-04-25-030944

This one has the backport.

/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused

@openshift-merge-bot
Contributor

@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci
Contributor

openshift-ci Bot commented Apr 25, 2026

@hongkailiu: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/rehearse/periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused f4ad895 link unknown /pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@hongkailiu
Member Author

Infrastructure failure in the previous run.

/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused

@openshift-merge-bot
Contributor

@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

2 participants