Dnm hcp runs for 422#79362
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository YAML (base), Central YAML (inherited) Review profile: CHILL Plan: Enterprise Run ID: ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (2)
WalkthroughUpdates perfscale CI job definitions (switch CLUSTER_SECTOR to perf4, add cron on rosa_hcp-4.21, add GC on 4.22, remove three test refs) and enhances hypershift infra script to apply monitoring config, wait for reconciliation, restart prometheus-k8s, and verify pod placement on infra nodes. ChangesPerfscale CI Configuration Updates
Prometheus Infra Migration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 10 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-120nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/uncc @mohit-sheth @rsevilla87 |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/openshift-qe/hypershift-infra/openshift-qe-hypershift-infra-commands.sh`:
- Around line 107-121: The current check can set VERIFY_SUCCESS=true when no
prometheus pods were evaluated because ALL_ON_INFRA defaults to true; modify the
for node in $(oc get pods -n openshift-monitoring -o wide | grep -i
"prometheus-k8s-" | grep -i running | awk '{print$7}') loop to track whether any
pods were found (e.g., increment a POD_COUNT or set a FOUND_PROM_POD flag inside
the loop) and after the loop only set VERIFY_SUCCESS=true and break if BOTH
ALL_ON_INFRA is true and the pod count/FOUND_PROM_POD indicates at least one pod
was evaluated; ensure you update the conditional that currently checks [
"$ALL_ON_INFRA" = true ] to require the additional non-zero pod condition.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 2ff3fde4-3dc1-47ab-8104-03514ededc9e
⛔ Files ignored due to path filters (2)
ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yamlis excluded by!ci-operator/jobs/**ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-presubmits.yamlis excluded by!ci-operator/jobs/**
📒 Files selected for processing (3)
ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.21-nightly-x86.yamlci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__rosa_hcp-4.22-nightly-x86.yamlci-operator/step-registry/openshift-qe/hypershift-infra/openshift-qe-hypershift-infra-commands.sh
| while [ $RETRY -lt $MAX_RETRIES ]; do | ||
| ALL_ON_INFRA=true | ||
| for node in $(oc get pods -n openshift-monitoring -o wide | grep -i "prometheus-k8s-" | grep -i running | awk '{print$7}'); do | ||
| if [[ $(oc get nodes --no-headers -l node-role.kubernetes.io/infra | awk '{print$1}' | grep -w "$node") != "" ]]; then | ||
| log "$(date) - prometheus pod on $node (infra node) ✓" | ||
| else | ||
| log "$(date) - WARNING: prometheus pod on $node is NOT an infra node" | ||
| ALL_ON_INFRA=false | ||
| fi | ||
| done | ||
|
|
||
| if [ "$ALL_ON_INFRA" = true ]; then | ||
| log "$(date) - All prometheus-k8s pods are on infra nodes ✓" | ||
| VERIFY_SUCCESS=true | ||
| break |
There was a problem hiding this comment.
Avoid false success when no Prometheus pods are matched.
Line 118 can mark verification successful even when zero pods were evaluated, because ALL_ON_INFRA starts as true and the loop may not run. This can incorrectly pass migration validation.
Proposed fix
while [ $RETRY -lt $MAX_RETRIES ]; do
ALL_ON_INFRA=true
- for node in $(oc get pods -n openshift-monitoring -o wide | grep -i "prometheus-k8s-" | grep -i running | awk '{print$7}'); do
+ mapfile -t PROM_NODES < <(oc get pods -n openshift-monitoring -o wide | awk '/prometheus-k8s-/ && tolower($3)=="running" {print $7}')
+
+ if [ "${`#PROM_NODES`[@]}" -eq 0 ]; then
+ log "$(date) - WARNING: no running prometheus-k8s pods found yet"
+ ALL_ON_INFRA=false
+ fi
+
+ for node in "${PROM_NODES[@]}"; do
if [[ $(oc get nodes --no-headers -l node-role.kubernetes.io/infra | awk '{print$1}' | grep -w "$node") != "" ]]; then
log "$(date) - prometheus pod on $node (infra node) ✓"
else
log "$(date) - WARNING: prometheus pod on $node is NOT an infra node"
ALL_ON_INFRA=false
fi
done📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| while [ $RETRY -lt $MAX_RETRIES ]; do | |
| ALL_ON_INFRA=true | |
| for node in $(oc get pods -n openshift-monitoring -o wide | grep -i "prometheus-k8s-" | grep -i running | awk '{print$7}'); do | |
| if [[ $(oc get nodes --no-headers -l node-role.kubernetes.io/infra | awk '{print$1}' | grep -w "$node") != "" ]]; then | |
| log "$(date) - prometheus pod on $node (infra node) ✓" | |
| else | |
| log "$(date) - WARNING: prometheus pod on $node is NOT an infra node" | |
| ALL_ON_INFRA=false | |
| fi | |
| done | |
| if [ "$ALL_ON_INFRA" = true ]; then | |
| log "$(date) - All prometheus-k8s pods are on infra nodes ✓" | |
| VERIFY_SUCCESS=true | |
| break | |
| while [ $RETRY -lt $MAX_RETRIES ]; do | |
| ALL_ON_INFRA=true | |
| mapfile -t PROM_NODES < <(oc get pods -n openshift-monitoring -o wide | awk '/prometheus-k8s-/ && tolower($3)=="running" {print $7}') | |
| if [ "${`#PROM_NODES`[@]}" -eq 0 ]; then | |
| log "$(date) - WARNING: no running prometheus-k8s pods found yet" | |
| ALL_ON_INFRA=false | |
| fi | |
| for node in "${PROM_NODES[@]}"; do | |
| if [[ $(oc get nodes --no-headers -l node-role.kubernetes.io/infra | awk '{print$1}' | grep -w "$node") != "" ]]; then | |
| log "$(date) - prometheus pod on $node (infra node) ✓" | |
| else | |
| log "$(date) - WARNING: prometheus pod on $node is NOT an infra node" | |
| ALL_ON_INFRA=false | |
| fi | |
| done | |
| if [ "$ALL_ON_INFRA" = true ]; then | |
| log "$(date) - All prometheus-k8s pods are on infra nodes ✓" | |
| VERIFY_SUCCESS=true | |
| break |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-qe/hypershift-infra/openshift-qe-hypershift-infra-commands.sh`
around lines 107 - 121, The current check can set VERIFY_SUCCESS=true when no
prometheus pods were evaluated because ALL_ON_INFRA defaults to true; modify the
for node in $(oc get pods -n openshift-monitoring -o wide | grep -i
"prometheus-k8s-" | grep -i running | awk '{print$7}') loop to track whether any
pods were found (e.g., increment a POD_COUNT or set a FOUND_PROM_POD flag inside
the loop) and after the loop only set VERIFY_SUCCESS=true and break if BOTH
ALL_ON_INFRA is true and the pod count/FOUND_PROM_POD indicates at least one pod
was evaluated; ensure you update the conditional that currently checks [
"$ALL_ON_INFRA" = true ] to require the additional non-zero pod condition.
2316b75 to
085ae44
Compare
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-120nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-120nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-120nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
085ae44 to
077b269
Compare
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.21-nightly-x86-control-plane-120nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-120nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-control-plane-120nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
077b269 to
fd9292c
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Sandeepyadav93 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
fd9292c to
c86e396
Compare
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-node-density-heavy-24nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
[REHEARSALNOTIFIER]
Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-rosa_hcp-4.22-nightly-x86-node-density-heavy-24nodes-onperfsector |
|
@Sandeepyadav93: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@Sandeepyadav93: job(s): either don't exist or were not found to be affected, and cannot be rehearsed |
|
@Sandeepyadav93: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This PR updates OpenShift CI job definitions for ROSA HCP performance testing and enhances a Hypershift infra helper script used when rebalancing Prometheus onto infra nodes.
CI / job changes (ocp-qe-perfscale-ci)
Hypershift infra script changes (ci-operator step)
Other notes