Fix test: e2e-aws-ovn-upgrade-paused#77842
hongkailiu wants to merge 5 commits into openshift:main
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 9 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (9 passed)
Actionable comments posted: 1
🧹 Nitpick comments (2)
ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh (1)
291-297: Add a comment explaining why only the first upgrade receives `TEST_ARGS`.
The asymmetric handling of `TEST_ARGS` (passed to the first upgrade but not the second or worker upgrade) is intentional per the PR description, but future maintainers won't understand this without context. Consider adding a brief comment explaining this is a workaround for binary version mismatch.
Additionally, the static analysis tool (SC2086) flags the unquoted `${TEST_ARGS:-}`. While this follows the existing pattern on line 335, quoting would be safer if any arguments contain spaces or special characters.
Suggested documentation and optional quoting fix
  echo "Starting control-plane upgrade to ${OPENSHIFT_UPGRADE0_RELEASE_IMAGE_OVERRIDE}"
+ # Pass TEST_ARGS only to first upgrade to work around binary version mismatch
+ # between 4.y test binary and 4.y-1 release (see PR `#77842`)
- openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" ${TEST_ARGS:-} \
+ openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" "${TEST_ARGS:-}" \
  --to-image "${OPENSHIFT_UPGRADE0_RELEASE_IMAGE_OVERRIDE}" \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh` around lines 291 - 297, Add an inline comment above the first openshift-tests run-upgrade invocation explaining that TEST_ARGS are intentionally passed only to the initial upgrade (not the subsequent or worker upgrade) as a workaround for binary/version mismatch between the test binary and the target cluster; then also fix the shell warning by quoting TEST_ARGS (change ${TEST_ARGS:-} to "${TEST_ARGS:-}") in the run-upgrade call so arguments with spaces/special chars are handled safely; refer to the openshift-tests run-upgrade invocation lines and ensure the comment mentions the version-mismatch/workaround rationale for future maintainers.
ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml (1)
32-34: Consider tracking the workaround with a TODO or linked issue.
The PR description indicates this is a temporary workaround until the 4.y-1 binary can be used for the first upgrade. Disabling 22+ monitors substantially reduces test coverage for the 4.20→4.21 upgrade phase. Adding a TODO comment or linking to a tracking issue would help ensure this workaround is revisited.
Suggested documentation
  env:
    TEST_UPGRADE_OPTIONS: ""
+   # TODO(JIRA-XXXX): Remove TEST_ARGS once 4.y-1 binary is used for first upgrade
    TEST_ARGS: --disable-monitor=etcd-log-analyzer,node-lifecycle,...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml` around lines 32 - 34, Add a short TODO comment or add a reference to a tracking issue next to the temporary workaround where monitors are disabled (in the TEST_ARGS environment value that lists the disabled monitors) so the change is clearly flagged for revisit; update the same YAML block that sets TEST_UPGRADE_OPTIONS/TEST_ARGS to include a one-line TODO or an issue URL describing this is a temporary workaround until the 4.y-1 binary is available and indicating who should follow up.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml`:
- Line 34: The TEST_ARGS contains a misspelled monitor name
"pod-network-avalibility"; update all occurrences of that string to the correct
"pod-network-availability" (e.g., in the TEST_ARGS value and other places where
"pod-network-avalibility" appears across the repo, roughly 8 locations) so
monitor names are consistent; search for the exact token
"pod-network-avalibility" and replace it with "pod-network-availability", then
run any config linter/tests that validate monitor names (the relevant token to
change is TEST_ARGS and the monitor name string).
---
Nitpick comments:
In
`@ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml`:
- Around line 32-34: Add a short TODO comment or add a reference to a tracking
issue next to the temporary workaround where monitors are disabled (in the
TEST_ARGS environment value that lists the disabled monitors) so the change is
clearly flagged for revisit; update the same YAML block that sets
TEST_UPGRADE_OPTIONS/TEST_ARGS to include a one-line TODO or an issue URL
describing this is a temporary workaround until the 4.y-1 binary is available
and indicating who should follow up.
In `@ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh`:
- Around line 291-297: Add an inline comment above the first openshift-tests
run-upgrade invocation explaining that TEST_ARGS are intentionally passed only
to the initial upgrade (not the subsequent or worker upgrade) as a workaround
for binary/version mismatch between the test binary and the target cluster; then
also fix the shell warning by quoting TEST_ARGS (change ${TEST_ARGS:-} to
"${TEST_ARGS:-}") in the run-upgrade call so arguments with spaces/special chars
are handled safely—refer to the openshift-tests run-upgrade invocation lines and
ensure the comment mentions the version-mismatch/workaround rationale for future
maintainers.
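A minimal sketch of how the unquoted and quoted scalar expansions differ, which may help when weighing the quoting suggestion. `count_args` is a hypothetical helper and the `TEST_ARGS` value is an assumed example, not the job's real configuration.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Assumed example value holding two flags, similar in shape to the job's TEST_ARGS.
TEST_ARGS="--disable-monitor=etcd-log-analyzer,node-lifecycle --retry-strategy=aggressive"

# Unquoted: word splitting yields the suite name plus two separate flags.
unquoted=$(count_args all ${TEST_ARGS:-})
# Quoted: both flags arrive as a single token.
quoted=$(count_args all "${TEST_ARGS:-}")

TEST_ARGS=""
# Unquoted and empty: the expansion disappears entirely.
empty_unquoted=$(count_args all ${TEST_ARGS:-})
# Quoted and empty: an empty-string argument is still passed.
empty_quoted=$(count_args all "${TEST_ARGS:-}")

echo "unquoted=${unquoted} quoted=${quoted} empty_unquoted=${empty_unquoted} empty_quoted=${empty_quoted}"
```

Under these assumptions, quoting the scalar is only equivalent to the unquoted form when `TEST_ARGS` holds exactly one flag; with several flags or an empty value the command line changes.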
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: ef0248c8-12b0-4bf8-88aa-8e8c6a8b532d
📒 Files selected for processing (2)
ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml
ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh
| cluster_profile: openshift-org-aws | ||
| env: | ||
| TEST_UPGRADE_OPTIONS: "" | ||
| TEST_ARGS: --disable-monitor=etcd-log-analyzer,node-lifecycle,on-prem-haproxy,on-prem-keepalived,initial-and-final-operator-log-scraper,apiserver-incluster-availability,kubelet-log-collector,audit-log-analyzer,metrics-endpoints-down,alert-summary-serializer,cpu-metric-collector,pod-network-avalibility,service-type-load-balancer-availability,ingress-availability,pathological-event-analyzer,legacy-test-framework-invariants,operator-state-analyzer,legacy-cvo-invariants,apiserver-external-availability,azure-metrics-collector,etcd-disk-metrics-intervals,termination-message-policy |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check for the correct spelling of the pod-network availability monitor
# Search for monitor names in the codebase
rg -n "pod-network-av" --type go --type yaml | head -30
Repository: openshift/release
Length of output: 3790
🏁 Script executed:
# Search for the correctly spelled "pod-network-availability"
rg "pod-network-availability" --type go --type yaml | head -20
Repository: openshift/release
Length of output: 43
🏁 Script executed:
# Search for monitor definitions to find authoritative names
rg "pod-network" --type go -A 2 -B 2 | head -50
Repository: openshift/release
Length of output: 43
🏁 Script executed:
# Search for monitor definitions or registry across all file types
rg "monitor.*pod-network" -i --max-count 10
Repository: openshift/release
Length of output: 4003
🏁 Script executed:
# Check if there's a monitoring or test framework file that lists valid monitors
fd -e go -e yaml -e json | xargs grep -l "monitor" | grep -i "monitor\|test\|framework" | head -10
Repository: openshift/release
Length of output: 915
🏁 Script executed:
# Look for where TEST_ARGS or disable-monitor is documented
rg "disable-monitor" -A 5 -B 5 --max-count 5 | head -40
Repository: openshift/release
Length of output: 5352
Possible typo: pod-network-avalibility should likely be pod-network-availability.
This misspelling appears consistently across at least 8 locations in the codebase, suggesting it was copied from an original source. The monitor name should use the correct English spelling "availability" rather than "avalibility".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml`
at line 34, The TEST_ARGS contains a misspelled monitor name
"pod-network-avalibility"; update all occurrences of that string to the correct
"pod-network-availability" (e.g., in the TEST_ARGS value and other places where
"pod-network-avalibility" appears across the repo, roughly 8 locations) so
monitor names are consistent; search for the exact token
"pod-network-avalibility" and replace it with "pod-network-availability", then
run any config linter/tests that validate monitor names (the relevant token to
change is TEST_ARGS and the monitor name string).
The typo comes from its definition.
That source should be fixed first.
Force-pushed from aedc166 to 4978391.
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
| steps: | ||
| cluster_profile: openshift-org-aws | ||
| env: | ||
| TEST_ARGS: --disable-monitor=etcd-log-analyzer,node-lifecycle,on-prem-haproxy,on-prem-keepalived,initial-and-final-operator-log-scraper,apiserver-incluster-availability,kubelet-log-collector,audit-log-analyzer,metrics-endpoints-down,alert-summary-serializer,cpu-metric-collector,pod-network-avalibility,service-type-load-balancer-availability,ingress-availability,pathological-event-analyzer,legacy-test-framework-invariants,operator-state-analyzer,legacy-cvo-invariants,apiserver-external-availability,azure-metrics-collector,etcd-disk-metrics-intervals,termination-message-policy |
If there were a disable-all knob, I would use it here.
I would also disable all other non-monitor tests, as long as the job still fails when the upgrade fails.
|
Still failed on the first upgrade but
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_release/77842/rehearse-77842-periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused/2044433136883863552/artifacts/e2e-aws-ovn-upgrade-paused/openshift-e2e-test/artifacts/junit/e2e-events_20260415-155730.json | jq '[.items[]|select(.locator.type=="Kind" and .locator.keys["clusterversion"]=="cluster" and .message.reason=="UpgradeStarted" and .source=="KubeEvent")]'
[
{
"level": "Info",
"source": "KubeEvent",
"locator": {
"type": "Kind",
"keys": {
"clusterversion": "cluster",
"hmsg": "659712c7dc",
"namespace": "openshift-cluster-version"
}
},
"message": {
"reason": "UpgradeStarted",
"cause": "",
"humanMessage": "version/ image/registry.build09.ci.openshift.org/ci-op-bhr33tp1/release@sha256:fe8c01d54f42fd4818aa0801d7087966f91304d447bc3fe9ceb1b97b9b9e40cb",
"annotations": {
"firstTimestamp": "0001-01-01T00:00:00Z",
"lastTimestamp": "0001-01-01T00:00:00Z",
"reason": "UpgradeStarted"
}
},
"from": "2026-04-15T16:03:24Z",
"to": "2026-04-15T16:03:24Z"
}
]
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_release/77842/rehearse-77842-periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused/2044433136883863552/artifacts/e2e-aws-ovn-upgrade-paused/openshift-e2e-test/build-log.txt | grep 'Starting control-plane upgrade to' -A 3
+ echo 'Starting control-plane upgrade to registry.build09.ci.openshift.org/ci-op-bhr33tp1/release@sha256:fe8c01d54f42fd4818aa0801d7087966f91304d447bc3fe9ceb1b97b9b9e40cb'
Starting control-plane upgrade to registry.build09.ci.openshift.org/ci-op-bhr33tp1/release@sha256:fe8c01d54f42fd4818aa0801d7087966f91304d447bc3fe9ceb1b97b9b9e40cb
+ wait 366
+ openshift-tests run-upgrade all --disable-monitor=etcd-log-analyzer,node-lifecycle,on-prem-haproxy,on-prem-keepalived,initial-and-final-operator-log-scraper,apiserver-incluster-availability,kubelet-log-collector,audit-log-analyzer,metrics-endpoints-down,alert-summary-serializer,cpu-metric-collector,pod-network-avalibility,service-type-load-balancer-availability,ingress-availability,pathological-event-analyzer,legacy-test-framework-invariants,operator-state-analyzer,legacy-cvo-invariants,apiserver-external-availability,azure-metrics-collector,etcd-disk-metrics-intervals,termination-message-policy --retry-strategy=aggressive --to-image registry.build09.ci.openshift.org/ci-op-bhr33tp1/release@sha256:fe8c01d54f42fd4818aa0801d7087966f91304d447bc3fe9ceb1b97b9b9e40cb --options '' --provider '{"type":"aws","region":"us-east-1","zone":"us-east-1d","multizone":true,"multimaster":true}' -o /logs/artifacts/e2e.log --junit-dir /logs/artifacts/junit
I0415 15:56:51.780793 366 factory.go:195] Registered Plugin "containerd" |
Force-pushed from 4978391 to 63b6df3.
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
🧹 Nitpick comments (1)
ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh (1)
291-291: Harden `TEST_ARGS` expansion for `run-upgrade` invocation
At Line 291, unquoted `${TEST_ARGS:-}` can trigger globbing/accidental word splitting. Please pass parsed args via an array to preserve intended tokens safely.
Proposed change
 function upgrade_paused() {
   set -x
+  local -a first_upgrade_args=()
+  if [[ -n "${TEST_ARGS:-}" ]]; then
+    read -r -a first_upgrade_args <<< "${TEST_ARGS}"
+  fi
   unset TEST_SUITE
   TARGET_RELEASES="${OPENSHIFT_UPGRADE_RELEASE_IMAGE_OVERRIDE:-}"
@@
-  openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" ${TEST_ARGS:-} \
+  openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" "${first_upgrade_args[@]}" \
   --to-image "${OPENSHIFT_UPGRADE0_RELEASE_IMAGE_OVERRIDE}" \
   --options "${TEST_UPGRADE_OPTIONS-}" \
   --provider "${TEST_PROVIDER}" \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh` at line 291, The unquoted ${TEST_ARGS:-} in the openshift-tests run-upgrade invocation can cause word-splitting and globbing; fix by converting TEST_ARGS into a safe array and expanding that array when calling run-upgrade. Specifically, parse TEST_ARGS into an array (e.g., read -r -a TEST_ARGS_ARRAY <<< "$TEST_ARGS" or otherwise build TEST_ARGS_ARRAY), then call openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" "${TEST_ARGS_ARRAY[@]:-}" so each intended token is preserved and no globbing occurs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh`:
- Line 291: The unquoted ${TEST_ARGS:-} in the openshift-tests run-upgrade
invocation can cause word-splitting and globbing; fix by converting TEST_ARGS
into a safe array and expanding that array when calling run-upgrade.
Specifically, parse TEST_ARGS into an array (e.g., read -r -a TEST_ARGS_ARRAY
<<< "$TEST_ARGS" or otherwise build TEST_ARGS_ARRAY), then call openshift-tests
run-upgrade "${TEST_UPGRADE_SUITE}" "${TEST_ARGS_ARRAY[@]:-}" so each intended
token is preserved and no globbing occurs.
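The array approach described here can be sketched roughly as follows. `show_args` and `parse_args` are hypothetical helpers and the flag values are assumed examples. The sketch also compares two ways of expanding a possibly empty array under `set -u`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# show_args is a hypothetical stand-in for the real command; it prints one token per line.
show_args() { printf '<%s>\n' "$@"; }

# parse_args splits a whitespace-separated string into the global array PARSED_ARGS.
# read -r -a splits on IFS whitespace and performs no pathname expansion.
parse_args() {
  PARSED_ARGS=()
  if [[ -n "${1:-}" ]]; then
    read -r -a PARSED_ARGS <<< "$1"
  fi
}

parse_args "--disable-monitor=a,b --retry-strategy=aggressive"
# Each array element becomes exactly one argument.
with_flags=$(show_args all ${PARSED_ARGS[@]+"${PARSED_ARGS[@]}"})

parse_args ""
# The ${arr[@]+...} form expands to nothing when the array is empty.
no_flags=$(show_args all ${PARSED_ARGS[@]+"${PARSED_ARGS[@]}"})
# The "${arr[@]:-}" form passes one empty-string argument instead.
padded=$(show_args all "${PARSED_ARGS[@]:-}")

printf '%s\n---\n%s\n---\n%s\n' "${with_flags}" "${no_flags}" "${padded}"
```

`read -r -a` does not interpret quotes inside the string, so an argument that itself contains spaces would still be split. That seems acceptable for comma-separated flag values like the ones in this job.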
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: feb25979-748a-473b-97ae-51ddfac7ed7c
📒 Files selected for processing (2)
ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml
ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh
Force-pushed from 63b6df3 to e033ce3.
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh`:
- Line 291: The unquoted expansion of TEST_ARGS in the openshift-tests
run-upgrade invocation can cause word-splitting; update the invocation that
currently uses openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}"
${TEST_ARGS:-} \ to safely expand TEST_ARGS—either quote the scalar as
"${TEST_ARGS:-}" or, preferably, convert TEST_ARGS to an array and expand it as
"${TEST_ARGS[@]:-}" so multi-flag values are preserved; adjust any places that
set or read TEST_ARGS accordingly (referencing TEST_ARGS and the openshift-tests
run-upgrade call).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 7ff68b06-4fc6-41d8-8588-1e8f82ca9cef
📒 Files selected for processing (2)
ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml
ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh
✅ Files skipped from review due to trivial changes (1)
- ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml
Force-pushed from e033ce3 to 2404cfc.
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
♻️ Duplicate comments (1)
ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh (1)
291-291: ⚠️ Potential issue | 🟡 Minor
Line 291: keep `TEST_ARGS` expansion safe (still unresolved).
`${TEST_ARGS:-}` is still unquoted in command context, so shell word-splitting/globbing can alter arguments unexpectedly. This is the same issue raised in prior review comments and is still present.
Proposed fix
 function upgrade_paused() {
   set -x
+  local -a first_upgrade_args=()
+  if [[ -n "${TEST_ARGS:-}" ]]; then
+    read -r -a first_upgrade_args <<< "${TEST_ARGS}"
+  fi
   unset TEST_SUITE
@@
-  openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" ${TEST_ARGS:-} \
+  openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" "${first_upgrade_args[@]}" \
   --to-image "${OPENSHIFT_UPGRADE0_RELEASE_IMAGE_OVERRIDE}" \
#!/bin/bash
set -euo pipefail
file="ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh"
echo "Inspecting changed block:"
nl -ba "$file" | sed -n '286,296p'
echo
echo "Finding unquoted TEST_ARGS expansions in command invocations:"
rg -n '\$\{TEST_ARGS:-\}' "$file"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh` at line 291, The command invocation openshift-tests run-upgrade "${TEST_UPGRADE_SUITE}" ${TEST_ARGS:-} should quote the TEST_ARGS expansion to prevent word-splitting/globbing: replace the unquoted ${TEST_ARGS:-} with a quoted expansion (e.g. "${TEST_ARGS:-}") in the openshift-tests run-upgrade invocation so TEST_ARGS is passed safely; if TEST_ARGS is meant to carry multiple distinct arguments, refactor to use an array variable instead and expand it as "${TEST_ARGS[@]}".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh`:
- Line 291: The command invocation openshift-tests run-upgrade
"${TEST_UPGRADE_SUITE}" ${TEST_ARGS:-} should quote the TEST_ARGS expansion to
prevent word-splitting/globbing: replace the unquoted ${TEST_ARGS:-} with a
quoted expansion (e.g. "${TEST_ARGS:-}") in the openshift-tests run-upgrade
invocation so TEST_ARGS is passed safely; if TEST_ARGS is meant to carry
multiple distinct arguments, refactor to use an array variable instead and
expand it as "${TEST_ARGS[@]}".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 094f63b9-40ea-4209-a2de-e83f9fb7fdcb
📒 Files selected for processing (2)
ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml
ci-operator/step-registry/openshift/e2e/test/openshift-e2e-test-commands.sh
✅ Files skipped from review due to trivial changes (1)
- ci-operator/config/openshift/release/openshift-release-main__nightly-4.22-upgrade-from-stable-4.20.yaml
|
Progress! The failure looks like a bug in the paused-pool case. |
Force-pushed from 2404cfc to c17d3df.
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
The failure is on the worker upgrade. The two master upgrades succeeded. Progress. I am not sure if the failure is legit or not. |
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Force-pushed from a4c6432 to 24fa9f8.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: hongkailiu
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.20-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade-paused periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Force-pushed from 24fa9f8 to 3821c1b.
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.20-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade-paused periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
The failure on the 4.20 job looks unrelated to this pull.
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.20-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Force-pushed from 3821c1b to c50517e.
|
The 4.22 job failed, but our fix seems to be working.
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.20-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade-paused periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Workers are upgraded.
|
The 4.20 job has two failures, run1 and run2. Both failed on
However, we used to have successful runs such as run3. I cannot explain what made the difference. Did the test suddenly become too flaky? |
|
The test suite is too flaky.
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.20-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade-paused periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Some COs may go Progressing before the upgrade to 4.22 (i.e., while still on 4.21), and the exception for this has been removed from the 4.22 test binary.
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/retest |
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@hongkailiu: The following tests failed, say
|
/pj-rehearse periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused |
|
@hongkailiu: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
The binary of the target release 4.y is used in each upgrade in the
`e2e-aws-ovn-upgrade-paused` e2e test. This limitation of the test configuration may cause unexpected failures of the job, e.g.,
do so but it is unintuitive in my opinion.
Then, the 4.y test with upgrade from 4.y-2 might fail because the fix is not
available in 4.y-1.
Ideally, we should use the 4.y-1 binary for the first upgrade. Until that is
implemented, we use this change as a workaround to fix the test nightly-4.22-upgrade-from-stable-4.20.
Many monitors have been disabled in the first upgrade to avoid failures.
This should not affect product quality because release 4.y-1 has tests
that perform only the single upgrade from 4.y-2 to 4.y-1.
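A rough sketch of the workaround's shape, assuming the first upgrade receives `TEST_ARGS` and the second does not. `run_upgrade` is a hypothetical stub standing in for `openshift-tests run-upgrade`, and the image names and monitor list are placeholders, not the job's real values.

```shell
#!/usr/bin/env bash
set -euo pipefail

# run_upgrade is a hypothetical stub for `openshift-tests run-upgrade`;
# it only records the arguments it was called with.
CALLS=()
run_upgrade() { CALLS+=("$*"); }

# Placeholder inputs; in the real job these come from the CI configuration.
TEST_ARGS="--disable-monitor=etcd-log-analyzer,node-lifecycle"
FIRST_IMAGE="example.invalid/release:4.y-1"
SECOND_IMAGE="example.invalid/release:4.y"

# First upgrade (4.y-2 to 4.y-1): pass TEST_ARGS so the listed monitors are disabled.
# The unquoted expansion is deliberate so that several flags would split into tokens.
# shellcheck disable=SC2086
run_upgrade all ${TEST_ARGS:-} --to-image "${FIRST_IMAGE}"

# Second upgrade (4.y-1 to 4.y): no TEST_ARGS, so all monitors stay enabled.
run_upgrade all --to-image "${SECOND_IMAGE}"

printf '%s\n' "${CALLS[@]}"
```

Only the first call carries the `--disable-monitor` flag, which matches the intent that the 4.y binary runs its full monitor set once the cluster is being upgraded to 4.y.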