no-jira: checking operator ready status comprehensive #80136
@bmeng: This pull request explicitly references no jira issue.
Walkthrough
The PR enhances the ROSA cluster readiness check script with a multi-stage verification pipeline. It introduces a
Changes
Cluster Readiness Verification
🎯 2 (Simple) | ⏱️ ~10 minutes
Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error)
✅ Passed checks (14 passed)
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bmeng
[REHEARSALNOTIFIER]
A total of 303 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.
🧹 Nitpick comments (1)
ci-operator/step-registry/rosa/cluster/wait-ready/operators/rosa-cluster-wait-ready-operators-commands.sh (1)
59-59: 💤 Low value
Simplify arithmetic expansion syntax.
The `${}` expansions are unnecessary within `$(( ))` arithmetic context. ShellCheck flags this as style issue SC2004.
📝 Suggested simplification
- record_cluster "timers" "co_wait_time" $(( "${end_time}" - "${start_time}" ))
+ record_cluster "timers" "co_wait_time" $(( end_time - start_time ))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/rosa/cluster/wait-ready/operators/rosa-cluster-wait-ready-operators-commands.sh` at line 59, Replace the unnecessary ${} expansions inside the arithmetic context in the echo statement: update the expression in the echo call that prints the duration (the line containing "All cluster operators done progressing after $(( ${end_time} - ${start_time} )) seconds") to use $(( end_time - start_time )) instead of $(( ${end_time} - ${start_time} )); keep the rest of the message unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: dc6e9544-cdf8-4341-8ef8-2970bb791fcf
📒 Files selected for processing (2)
ci-operator/step-registry/rosa/cluster/wait-ready/operators/rosa-cluster-wait-ready-operators-commands.sh
ci-operator/step-registry/rosa/cluster/wait-ready/operators/rosa-cluster-wait-ready-operators-ref.yaml
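For readers unfamiliar with SC2004, the nitpick in this review can be illustrated with a small sketch. The `elapsed_seconds` helper is hypothetical and is not part of the script under review; it only shows that variable names inside `$(( ))` are expanded without `$` or `${}`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): prints end_time - start_time.
elapsed_seconds() {
  local start_time=$1
  local end_time=$2
  # Inside $(( )), bare variable names are evaluated as integers,
  # so ${end_time} and ${start_time} are not needed (ShellCheck SC2004).
  echo "$(( end_time - start_time ))"
}

# Possible usage in a wait loop (assumed, not taken from the script):
#   start_time=$(date +%s)
#   ... wait for operators ...
#   end_time=$(date +%s)
#   echo "done after $(elapsed_seconds "$start_time" "$end_time") seconds"
```

The result is the same as with the `${}` form; the change is purely stylistic.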
/pj-rehearse periodic-ci-openshift-online-rosa-e2e-main-periodics-rosa-classic-sts-e2e-stable-4-21
@bmeng: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
@bmeng: all tests passed!
Hey Bo, I looked at how other Prow jobs handle this. The standard
No DS/RS checks, no certificate checks, no CVO check. That's the accepted pattern across all of Prow. DS/RS/certificate issues surface through CO conditions, since the owning CO reports Degraded when those sub-resources are unhealthy.
I think adding the Available and Degraded checks (to match the OCP e2e pattern) makes sense, but the CVO check is redundant since CVO availability is a prerequisite for COs reporting correctly.
The CAMO PR (#557) already merged and is validated on staging using the same three CO conditions. On ci-rosa-s-4ao6, it correctly detected all 34 COs stable and configured PD while the osd-cluster-ready Job was still stuck 2h+ later.
I'd suggest simplifying this PR to just add the Available and Degraded checks to match the OCP e2e standard, and skip the CVO check.
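A minimal sketch of the three-condition check discussed in this comment (Available=True, Progressing=False, Degraded=False), for illustration only. The `unstable_operators` helper and its input format are assumptions, not code from this PR or from the CAMO PR; it reads one operator per line as `<name> <Available> <Progressing> <Degraded>` and prints the names that are not stable.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): reads lines of
#   <name> <Available> <Progressing> <Degraded>
# on stdin and prints every operator that is not
# Available=True, Progressing=False, Degraded=False.
# A line with a missing condition is also reported as unstable.
unstable_operators() {
  awk '$2 != "True" || $3 != "False" || $4 != "False" { print $1 }'
}

# Possible way to feed it from a live cluster (assumed invocation):
#   oc get clusteroperators -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.conditions[?(@.type=="Available")].status}{" "}{.status.conditions[?(@.type=="Progressing")].status}{" "}{.status.conditions[?(@.type=="Degraded")].status}{"\n"}{end}' \
#     | unstable_operators
```

An empty result means every operator satisfies all three conditions; any printed name is an operator that still needs attention.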
Check operators ready status with more checkpoints
Summary by CodeRabbit
This PR enhances the ROSA cluster operator readiness check step in the OpenShift CI infrastructure to perform comprehensive multi-stage validation of operator health status.
What's changing:
The `rosa-cluster-wait-ready-operators` step, which is used to verify ROSA clusters are ready for testing, now implements a more thorough readiness verification workflow:
Previous behavior: The step only checked if cluster operators finished progressing (Progressing=false) and immediately branched to error handling if a timeout occurred.
New behavior: The step now validates operator health through sequential checks:
Key improvements:
`check_failed` flag to track failures across all stages rather than handling only the progressing timeout
The step documentation in the ref.yaml file has also been updated to clearly describe these multi-stage validation requirements for ROSA clusters.
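The `check_failed` pattern described in this summary could look roughly like the sketch below. Only the flag name comes from the PR; the `run_stages` function and the stage names in the usage comment are hypothetical. Each stage runs regardless of earlier failures, and the flag records whether any of them failed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not the PR's code): run every stage command,
# remember any failure in check_failed, and report it at the end.
run_stages() {
  local check_failed=0
  local stage
  for stage in "$@"; do
    if ! "$stage"; then
      echo "stage failed: ${stage}" >&2
      check_failed=1   # keep going so later stages still report their status
    fi
  done
  return "$check_failed"
}

# Possible usage, with stage functions defined elsewhere (names assumed):
#   run_stages wait_progressing check_available check_degraded || exit 1
```

Compared with exiting at the first timeout, this surfaces every failing stage in a single run, at the cost of a longer wait when an early stage has already failed.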