Add benchmark-runner Prow CI step-registry and test jobs#77965
Conversation
|
Hi @arpsharm. Thanks for your PR. I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
WalkthroughAdds a benchmark-runner step to the step-registry (image, command, env, resources, metadata, OWNERS), an entrypoint script to run workloads and collect artifacts, a test-step YAML matrix targeting OCP 4.22 nightly, and many presubmit Prow jobs on cluster build03 for the new scenarios. ChangesBenchmark Runner CI Pipeline
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 12✅ Passed checks (12 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml (1)
3-6: Pinningtag: latestmakes runs non-reproducible.Using
latestmeans each CI invocation may pull a different image, which can cause hard-to-diagnose flakes and makes failures impossible to reproduce. Consider using anImageStreamTagpinned to a specific version or a digest, and bumping it deliberately.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml` around lines 3 - 6, The from_image entry currently uses an unpinned tag ("tag: latest") which makes CI runs non-reproducible; update the from_image block (the from_image key and its nested tag field) to reference a pinned image (either a specific ImageStreamTag like "name: benchmark-runner tag: v1.2.3" or an immutable digest) and document the version bump process so the benchmark-runner image is updated deliberately rather than using "latest".ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh (3)
32-41: Trap function is invoked viatrap; shellcheck SC2329 is a false positive.
benchmark_runner_debugis registered on line 41 viatrap ... EXIT, so shellcheck's "never invoked" info is a false positive and safe to ignore. Optionally add# shellcheck disable=SC2329above the function to silence it.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh` around lines 32 - 41, Add a shellcheck disable for SC2329 above the benchmark_runner_debug function to silence the false-positive "never invoked" warning because the function is registered by trap EXIT; specifically place a comment like "# shellcheck disable=SC2329" immediately before the benchmark_runner_debug() definition (the function referenced by trap benchmark_runner_debug EXIT) so the linter knows the invocation via trap is intentional.
9-12:KUBEADMIN_PASSWORDis exported but unused by this script.If
main.pyreads it from the environment, this is fine — otherwise it's dead code. Worth a brief inline comment noting which consumer needs it, to avoid accidental removal later.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh` around lines 9 - 12, The script exports KUBEADMIN_PASSWORD from SHARED_DIR if present but doesn't use it itself; either add a brief inline comment stating which consumer reads it (e.g., main.py reads KUBEADMIN_PASSWORD from the environment) or remove the export if truly unused. Locate the export of KUBEADMIN_PASSWORD and add the comment referencing main.py (or adjust the code to pass the password directly to the consumer) so future maintainers know why the environment variable is set.
44-44: Silencing all errors on namespace create also hides real failures.
2>/dev/null || truewill mask genuine problems (RBAC denied, API server unreachable, quota, etc.) and the subsequentoc/python calls will just fail opaquely. Prefer an idempotent apply or explicit "AlreadyExists" filtering:♻️ Proposed refactor
-oc create namespace benchmark-runner 2>/dev/null || true +oc get namespace benchmark-runner >/dev/null 2>&1 || \ + oc create namespace benchmark-runner🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh` at line 44, The current line "oc create namespace benchmark-runner 2>/dev/null || true" silences all errors and hides real failures; replace it with an idempotent check/create pattern or an apply-based approach so genuine errors surface — e.g., first check for existence with an "oc get namespace benchmark-runner" and only call "oc create namespace benchmark-runner" if it does not exist, or use "oc apply" with a namespace manifest; update the script where that exact command appears so RBAC, connectivity, quota and API errors are not suppressed.ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.json (1)
19-19: Add trailing newline.File lacks a newline at EOF, which is a common lint nit in this repo.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.json` at line 19, The file ends with a closing brace "}" but is missing a trailing newline; add a single newline character at end-of-file in redhat-performance-benchmark-runner-ref.metadata.json so the file terminates with a newline (i.e., ensure the file ends with "\n" after the final "}" to satisfy repo lint rules).ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yaml (1)
13-87: Heavy duplication across the five test entries.The five test blocks differ only in
asandWORKLOAD; every other field (cluster_profile, BASE_DOMAIN, COMPUTE_NODE_*, USER_TAGS, ref, workflow) is identical. ci-operator config doesn't support YAML anchors in a way that survives prowgen round-trips, so this is accepted repetition — just flagging that any future change (e.g. instance type bump or BASE_DOMAIN rename) must be applied in five places. Consider a follow-up to parameterize via the step registry chain if this grows further.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yaml` around lines 13 - 87, The five nearly identical test steps (as: benchmark-runner-step-stressng-pod, benchmark-runner-step-uperf-pod, benchmark-runner-step-hammerdb-pod-mariadb-ephemeral, benchmark-runner-step-hammerdb-pod-postgres-ephemeral, benchmark-runner-step-hammerdb-pod-mssql-ephemeral) only differ by the as name and the WORKLOAD env value; to fix, consolidate into a single reusable parameterized step in the step registry (or a shared template) that defines cluster_profile, BASE_DOMAIN, COMPUTE_NODE_REPLICAS/TYPE, CONTROL_PLANE_INSTANCE_TYPE, USER_TAGS, test ref: redhat-performance-benchmark-runner and workflow: ipi-aws, and then create minimal step entries that reference that template while supplying the unique WORKLOAD and as values; update references to use the template variable for WORKLOAD so future changes (e.g., BASE_DOMAIN or instance types) are made in one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh`:
- Around line 48-56: Because the script runs under set -e, a non-zero exit from
the python invocation will abort before rc=$? and the ARTIFACT_DIR collection
runs; change the python3.14 /benchmark_runner/main/main.py invocation to capture
the exit code without triggering exit (for example run the command with a
conditional or append "|| true"), then set rc=$? and SCRIPT_EXIT_CODE=$rc and
run the existing ARTIFACT_DIR/benchmark-runner-debug collection block; reference
the python3.14 /benchmark_runner/main/main.py invocation, the rc and
SCRIPT_EXIT_CODE assignments, and ARTIFACT_DIR/benchmark-runner-debug in your
change so the artifact-gathering always executes even when the Python process
fails.
---
Nitpick comments:
In
`@ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yaml`:
- Around line 13-87: The five nearly identical test steps (as:
benchmark-runner-step-stressng-pod, benchmark-runner-step-uperf-pod,
benchmark-runner-step-hammerdb-pod-mariadb-ephemeral,
benchmark-runner-step-hammerdb-pod-postgres-ephemeral,
benchmark-runner-step-hammerdb-pod-mssql-ephemeral) only differ by the as name
and the WORKLOAD env value; to fix, consolidate into a single reusable
parameterized step in the step registry (or a shared template) that defines
cluster_profile, BASE_DOMAIN, COMPUTE_NODE_REPLICAS/TYPE,
CONTROL_PLANE_INSTANCE_TYPE, USER_TAGS, test ref:
redhat-performance-benchmark-runner and workflow: ipi-aws, and then create
minimal step entries that reference that template while supplying the unique
WORKLOAD and as values; update references to use the template variable for
WORKLOAD so future changes (e.g., BASE_DOMAIN or instance types) are made in one
place.
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh`:
- Around line 32-41: Add a shellcheck disable for SC2329 above the
benchmark_runner_debug function to silence the false-positive "never invoked"
warning because the function is registered by trap EXIT; specifically place a
comment like "# shellcheck disable=SC2329" immediately before the
benchmark_runner_debug() definition (the function referenced by trap
benchmark_runner_debug EXIT) so the linter knows the invocation via trap is
intentional.
- Around line 9-12: The script exports KUBEADMIN_PASSWORD from SHARED_DIR if
present but doesn't use it itself; either add a brief inline comment stating
which consumer reads it (e.g., main.py reads KUBEADMIN_PASSWORD from the
environment) or remove the export if truly unused. Locate the export of
KUBEADMIN_PASSWORD and add the comment referencing main.py (or adjust the code
to pass the password directly to the consumer) so future maintainers know why
the environment variable is set.
- Line 44: The current line "oc create namespace benchmark-runner 2>/dev/null ||
true" silences all errors and hides real failures; replace it with an idempotent
check/create pattern or an apply-based approach so genuine errors surface —
e.g., first check for existence with an "oc get namespace benchmark-runner" and
only call "oc create namespace benchmark-runner" if it does not exist, or use
"oc apply" with a namespace manifest; update the script where that exact command
appears so RBAC, connectivity, quota and API errors are not suppressed.
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.json`:
- Line 19: The file ends with a closing brace "}" but is missing a trailing
newline; add a single newline character at end-of-file in
redhat-performance-benchmark-runner-ref.metadata.json so the file terminates
with a newline (i.e., ensure the file ends with "\n" after the final "}" to
satisfy repo lint rules).
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml`:
- Around line 3-6: The from_image entry currently uses an unpinned tag ("tag:
latest") which makes CI runs non-reproducible; update the from_image block (the
from_image key and its nested tag field) to reference a pinned image (either a
specific ImageStreamTag like "name: benchmark-runner tag: v1.2.3" or an
immutable digest) and document the version bump process so the benchmark-runner
image is updated deliberately rather than using "latest".
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: dec56fd0-467b-4576-9660-e5eba8ab30b8
📒 Files selected for processing (6)
ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yamlci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yamlci-operator/step-registry/redhat-performance/benchmark-runner/OWNERSci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.shci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.jsonci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml
b60069f to
4c31423
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh (2)
43-44: Namespace creation runs even whenKUBECONFIGwasn't set.If
/var/run/secrets/ci.openshift.io/multi-stage/kubeconfigisn't present (lines 15–18),KUBECONFIGis never exported and thisoc create namespacesilently fails against no cluster (swallowed by|| true). The subsequent Python run will then also be misconfigured, and failures will be harder to diagnose. Consider failing fast when no kubeconfig is found, or at minimum logging a clear warning:♻️ Proposed refactor
if [[ -f /var/run/secrets/ci.openshift.io/multi-stage/kubeconfig ]]; then cp /var/run/secrets/ci.openshift.io/multi-stage/kubeconfig /tmp/kubeconfig export KUBECONFIG=/tmp/kubeconfig +else + echo "ERROR: no multi-stage kubeconfig found; benchmark-runner needs a test cluster" >&2 + exit 1 fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh` around lines 43 - 44, The script currently runs oc create namespace benchmark-runner even when KUBECONFIG was never exported (missing /var/run/secrets/ci.openshift.io/multi-stage/kubeconfig), which hides failures; update the bootstrap logic to check for the kubeconfig file and either export KUBECONFIG or fail fast with a clear error before running oc create namespace (or at minimum emit a clear warning referencing KUBECONFIG and /var/run/secrets/ci.openshift.io/multi-stage/kubeconfig); ensure the check is placed before the oc create namespace invocation so the script exits non‑zero or logs an explicit message when the kubeconfig is absent.
24-27: Strip trailing newline when loadingBASE_DOMAIN.
cat /secret/base_domainpreserves any trailing newline in the secret file, which can silently break downstream consumers that concatenateBASE_DOMAINinto hostnames/URLs. Prefer a read that trims it:♻️ Proposed refactor
- if [[ -s /secret/base_domain ]]; then - BASE_DOMAIN=$(cat /secret/base_domain) - export BASE_DOMAIN - fi + if [[ -s /secret/base_domain ]]; then + BASE_DOMAIN=$(<"/secret/base_domain") + BASE_DOMAIN="${BASE_DOMAIN%$'\n'}" + export BASE_DOMAIN + fiThe same concern applies to
KUBEADMIN_PASSWORDat line 10 if any consumer does literal string matching.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh` around lines 24 - 27, Replace the raw cat assignments that preserve trailing newlines by reading the secrets with a newline-trimming read or equivalent so variables don't include a trailing newline; specifically change the BASE_DOMAIN assignment to use a newline-trimming read (e.g. read -r BASE_DOMAIN < /secret/base_domain) and export BASE_DOMAIN, and apply the same pattern to KUBEADMIN_PASSWORD (read -r KUBEADMIN_PASSWORD < /secret/kubeadmin_password) to ensure neither variable contains a trailing newline before they are used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh`:
- Around line 43-44: The script currently runs oc create namespace
benchmark-runner even when KUBECONFIG was never exported (missing
/var/run/secrets/ci.openshift.io/multi-stage/kubeconfig), which hides failures;
update the bootstrap logic to check for the kubeconfig file and either export
KUBECONFIG or fail fast with a clear error before running oc create namespace
(or at minimum emit a clear warning referencing KUBECONFIG and
/var/run/secrets/ci.openshift.io/multi-stage/kubeconfig); ensure the check is
placed before the oc create namespace invocation so the script exits non‑zero or
logs an explicit message when the kubeconfig is absent.
- Around line 24-27: Replace the raw cat assignments that preserve trailing
newlines by reading the secrets with a newline-trimming read or equivalent so
variables don't include a trailing newline; specifically change the BASE_DOMAIN
assignment to use a newline-trimming read (e.g. read -r BASE_DOMAIN <
/secret/base_domain) and export BASE_DOMAIN, and apply the same pattern to
KUBEADMIN_PASSWORD (read -r KUBEADMIN_PASSWORD < /secret/kubeadmin_password) to
ensure neither variable contains a trailing newline before they are used.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 6e7f3c26-2231-4470-ae44-29da587b6edb
📒 Files selected for processing (6)
ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yamlci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yamlci-operator/step-registry/redhat-performance/benchmark-runner/OWNERSci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.shci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.jsonci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml
✅ Files skipped from review due to trivial changes (4)
- ci-operator/step-registry/redhat-performance/benchmark-runner/OWNERS
- ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.json
- ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yaml
- ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- ci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yaml
4c31423 to
78fc331
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh (2)
23-33:/secret/base_domainsilently overrides the step-levelBASE_DOMAINenv.The test step in
redhat-performance-benchmark-runner-main__test-step.yamlalready setsBASE_DOMAIN: rhdh-perfscale.devcluster.openshift.com, and the ref mounts the credentials secret at/secret. If the secret happens to contain abase_domainkey, this block will transparently replace the intended value with no log line indicating the override, which can make misconfigurations hard to diagnose. Consider (a) only setting it when unset (: "${BASE_DOMAIN:=$(<"/secret/base_domain")}"), or (b) logging which source won.Also note: line 29's
BASE_DOMAIN="${BASE_DOMAIN%$'\n'}"is redundant —$(<file)already strips trailing newlines, and the<fileredirection on line 28 doesn't include them either.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh` around lines 23 - 33, The current block unconditionally replaces BASE_DOMAIN from /secret/base_domain which can silently override a step-level value; change the logic in the vault-secret handling so you only set BASE_DOMAIN if it is not already set (use a conditional default assignment for BASE_DOMAIN when empty) or explicitly log which source wins before overriding, and remove the redundant newline-trimming operation (the BASE_DOMAIN="${BASE_DOMAIN%$'\n'}" line) because command substitution / file redirection already strips trailing newlines; update references to BASE_DOMAIN and the /secret/base_domain read to reflect this safe behavior and add a clear log line when the secret provides the value.
48-48: Namespace creation masks all errors, not just AlreadyExists.
oc create namespace benchmark-runner 2>/dev/null || trueswallows legitimate failures (RBAC denial, API server unreachable, etc.), so the script will proceed to run Python even when it can't talk to the cluster. Prefer an idempotent apply, or at least only suppress theAlreadyExistscase:♻️ Suggested change
-# Ensure benchmark-runner namespace exists on the TEST cluster -oc create namespace benchmark-runner 2>/dev/null || true +# Ensure benchmark-runner namespace exists on the TEST cluster (idempotent) +oc apply -f - <<EOF +apiVersion: v1 +kind: Namespace +metadata: + name: benchmark-runner +EOF🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh` at line 48, The current line `oc create namespace benchmark-runner 2>/dev/null || true` hides all errors; replace it with an idempotent apply or only suppress the AlreadyExists case. For example, replace the command with an apply using a namespace manifest piped to oc (e.g., `cat <<EOF | oc apply -f -` with apiVersion: v1, kind: Namespace, metadata.name: benchmark-runner EOF) or, if keeping create, only ignore the AlreadyExists message by filtering stdout/stderr (e.g., run `oc create namespace benchmark-runner 2>&1 | grep -v 'AlreadyExists' || true`) so RBAC/network errors still surface. Ensure the change updates the existing `oc create namespace benchmark-runner` invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh`:
- Around line 23-33: The current block unconditionally replaces BASE_DOMAIN from
/secret/base_domain which can silently override a step-level value; change the
logic in the vault-secret handling so you only set BASE_DOMAIN if it is not
already set (use a conditional default assignment for BASE_DOMAIN when empty) or
explicitly log which source wins before overriding, and remove the redundant
newline-trimming operation (the BASE_DOMAIN="${BASE_DOMAIN%$'\n'}" line) because
command substitution / file redirection already strips trailing newlines; update
references to BASE_DOMAIN and the /secret/base_domain read to reflect this safe
behavior and add a clear log line when the secret provides the value.
- Line 48: The current line `oc create namespace benchmark-runner 2>/dev/null ||
true` hides all errors; replace it with an idempotent apply or only suppress the
AlreadyExists case. For example, replace the command with an apply using a
namespace manifest piped to oc (e.g., `cat <<EOF | oc apply -f -` with
apiVersion: v1, kind: Namespace, metadata.name: benchmark-runner EOF) or, if
keeping create, only ignore the AlreadyExists message by filtering stdout/stderr
(e.g., run `oc create namespace benchmark-runner 2>&1 | grep -v 'AlreadyExists'
|| true`) so RBAC/network errors still surface. Ensure the change updates the
existing `oc create namespace benchmark-runner` invocation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 32b5c075-791f-4b07-92cc-ec7a880c0fde
📒 Files selected for processing (6)
ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yamlci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yamlci-operator/step-registry/redhat-performance/benchmark-runner/OWNERSci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.shci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.jsonci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml
✅ Files skipped from review due to trivial changes (3)
- ci-operator/step-registry/redhat-performance/benchmark-runner/OWNERS
- ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.json
- ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- ci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yaml
- ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yaml
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-benchmark-runner-step-hammerdb-pod-mariadb-ephemeral |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@arpsharm: needs-ok-to-test label found, no rehearsals will be run |
|
/ok-to-test |
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-benchmark-runner-step-hammerdb-pod-mariadb-ephemeral |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
1 similar comment
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-benchmark-runner-step-hammerdb-pod-postgres-ephemeral |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
78fc331 to
7814328
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh (1)
27-30: Avoid silently overriding the job’sBASE_DOMAIN.Line 27 makes
/secret/base_domaintake precedence over the explicitBASE_DOMAINset by the test-step config, so a stale secret can redirect the runner to the wrong cluster domain. Prefer using the secret only as a fallback, or fail on mismatch if it must be authoritative.Proposed fallback behavior
- if [[ -s /secret/base_domain ]]; then + if [[ -z "${BASE_DOMAIN:-}" && -s /secret/base_domain ]]; then BASE_DOMAIN=$(<"/secret/base_domain") BASE_DOMAIN="${BASE_DOMAIN%$'\n'}" export BASE_DOMAIN fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh` around lines 27 - 30, The script currently always overrides BASE_DOMAIN from /secret/base_domain which can silently redirect jobs; change the logic in the block that reads /secret/base_domain so the secret is used only as a fallback or validated: if BASE_DOMAIN is already set and non-empty (the value from test-step config), read /secret/base_domain and if it differs fail early with a clear error (or log and exit), otherwise leave the existing BASE_DOMAIN intact; if BASE_DOMAIN is unset/empty, read the secret and export it as before. Ensure this behavior is applied where BASE_DOMAIN is referenced in the script (the existing if [[ -s /secret/base_domain ]] block).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yaml`:
- Around line 103-141: The pod specs mount /secrets/gcs (volumeMount with
mountPath /secrets/gcs and name: gcs-credentials) but the volumes: list is
missing a corresponding volume entry; add a volumes entry named
"gcs-credentials" that references the correct Kubernetes secret (e.g.
secretName: gcs-credentials or the repo's GCS publisher secret) and, if needed,
include an items mapping to ensure service-account.json is present; apply the
same addition to every affected job's volumes block that has the /secrets/gcs
mount.
---
Nitpick comments:
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh`:
- Around line 27-30: The script currently always overrides BASE_DOMAIN from
/secret/base_domain which can silently redirect jobs; change the logic in the
block that reads /secret/base_domain so the secret is used only as a fallback or
validated: if BASE_DOMAIN is already set and non-empty (the value from test-step
config), read /secret/base_domain and if it differs fail early with a clear
error (or log and exit), otherwise leave the existing BASE_DOMAIN intact; if
BASE_DOMAIN is unset/empty, read the secret and export it as before. Ensure this
behavior is applied where BASE_DOMAIN is referenced in the script (the existing
if [[ -s /secret/base_domain ]] block).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 5fc333b6-5f28-407f-8df5-b78dcc7e64cf
📒 Files selected for processing (6)
ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yamlci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yamlci-operator/step-registry/redhat-performance/benchmark-runner/OWNERSci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.shci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.jsonci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml
✅ Files skipped from review due to trivial changes (3)
- ci-operator/step-registry/redhat-performance/benchmark-runner/OWNERS
- ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.metadata.json
- ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yaml
| volumeMounts: | ||
| - mountPath: /etc/boskos | ||
| name: boskos | ||
| readOnly: true | ||
| - mountPath: /secrets/ci-pull-credentials | ||
| name: ci-pull-credentials | ||
| readOnly: true | ||
| - mountPath: /secrets/gcs | ||
| name: gcs-credentials | ||
| readOnly: true | ||
| - mountPath: /secrets/manifest-tool | ||
| name: manifest-tool-local-pusher | ||
| readOnly: true | ||
| - mountPath: /etc/pull-secret | ||
| name: pull-secret | ||
| readOnly: true | ||
| - mountPath: /etc/report | ||
| name: result-aggregator | ||
| readOnly: true | ||
| serviceAccountName: ci-operator | ||
| volumes: | ||
| - name: boskos | ||
| secret: | ||
| items: | ||
| - key: credentials | ||
| path: credentials | ||
| secretName: boskos-credentials | ||
| - name: ci-pull-credentials | ||
| secret: | ||
| secretName: ci-pull-credentials | ||
| - name: manifest-tool-local-pusher | ||
| secret: | ||
| secretName: manifest-tool-local-pusher | ||
| - name: pull-secret | ||
| secret: | ||
| secretName: registry-pull-credentials | ||
| - name: result-aggregator | ||
| secret: | ||
| secretName: result-aggregator |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="ci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yaml"
python - <<'PY'
from pathlib import Path
path = Path("ci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yaml")
jobs = []
current = None
pending_mount = False
in_volumes = False
for line in path.read_text().splitlines():
if line.startswith(" - agent: kubernetes"):
if current:
jobs.append(current)
current = {"name": "<unknown>", "mounts": set(), "volumes": set()}
pending_mount = False
in_volumes = False
continue
if current is None:
continue
if line.startswith(" name: pull-ci-"):
current["name"] = line.split(":", 1)[1].strip()
if line.startswith(" volumes:"):
in_volumes = True
pending_mount = False
continue
if line.startswith(" - mountPath:"):
pending_mount = True
continue
if pending_mount and line.startswith(" name:"):
current["mounts"].add(line.split(":", 1)[1].strip().strip('"'))
pending_mount = False
continue
if in_volumes:
if line.startswith(" - name:"):
current["volumes"].add(line.split(":", 1)[1].strip().strip('"'))
elif line.startswith(" trigger:"):
in_volumes = False
if current:
jobs.append(current)
failed = False
for job in jobs:
missing = sorted(job["mounts"] - job["volumes"])
for name in missing:
print(f"{job['name']}: missing volume for volumeMount {name!r}")
failed = True
raise SystemExit(1 if failed else 0)
PYRepository: openshift/release
Length of output: 1219
Add the missing gcs-credentials volume to all affected jobs.
Each of eight new jobs mounts name: gcs-credentials and uses --gcs-upload-secret=/secrets/gcs/service-account.json, but the volumes list never defines gcs-credentials. Kubernetes will reject these pod specs before the jobs start.
Proposed fix pattern
Apply this to each affected job's volumes: block:
volumes:
+ - name: gcs-credentials
+ secret:
+ secretName: gce-sa-credentials-gcs-publisher
- name: boskos
secret:
items:Verify the exact secret name against the repo's generated ci-operator jobs if this repo uses a different GCS publisher secret.
Also applies to: 188-226, 273-311, 358-396, 443-481, 528-566, 613-651
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@ci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yaml`
around lines 103 - 141, The pod specs mount /secrets/gcs (volumeMount with
mountPath /secrets/gcs and name: gcs-credentials) but the volumes: list is
missing a corresponding volume entry; add a volumes entry named
"gcs-credentials" that references the correct Kubernetes secret (e.g.
secretName: gcs-credentials or the repo's GCS publisher secret) and, if needed,
include an items mapping to ensure service-account.json is present; apply the
same addition to every affected job's volumes block that has the /secrets/gcs
mount.
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-benchmark-runner-step-stressng-vm |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
540c244 to
72214bc
Compare
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-benchmark-runner-all |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
72214bc to
591e992
Compare
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-benchmark-runner-uperf-vm |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-benchmark-runner-uperf-vm |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-benchmark-runner-sysbench-pod |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-sysbench-pod |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh`:
- Line 27: The log prints the bastion endpoint (${BASTION_ADDRESS}) which
exposes sensitive infrastructure details; update the echo at the SOCKS proxy
startup (the line that currently says "API not directly reachable, starting
SOCKS proxy through ${BASTION_ADDRESS}") to a generic message without the
variable (e.g., "API not directly reachable, starting SOCKS proxy") so no
bastion address or other secrets are output to logs; ensure any other uses of
BASTION_ADDRESS in this script are not echoed or logged either.
- Around line 30-32: Guard reading the secret file "elasticsearch_port" before
using it in the SOCKS fallback path: check that the secret file exists and is
readable (e.g., test -r "/secret/elasticsearch_port") and only then read into
ES_PORT and strip the newline; if the file is missing or unreadable, skip the
ssh -D/ -L proxy setup that uses ES_PORT (or fall back to a sensible
default/explicit error), ensuring the ssh invocation with SSH_ARGS,
BASTION_ADDRESS and SOCKS_PORT is only executed when ES_PORT is successfully
set.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: a6f644b0-f855-4a44-9cc0-a46c15c83dc5
⛔ Files ignored due to path filters (1)
ci-operator/jobs/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main-presubmits.yamlis excluded by!ci-operator/jobs/**
📒 Files selected for processing (3)
ci-operator/config/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-main__test-step.yamlci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.shci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-ref.yaml
| cp /secret/jh_priv_ssh_key /tmp/bastion_key | ||
| chmod 600 /tmp/bastion_key | ||
| if ! curl -sk --max-time 5 "${CLUSTER_SERVER}/version" &>/dev/null; then | ||
| echo "API not directly reachable, starting SOCKS proxy through ${BASTION_ADDRESS}" |
There was a problem hiding this comment.
Avoid printing bastion endpoint details in logs.
Line 27 logs ${BASTION_ADDRESS}. Please keep the message generic to avoid exposing infrastructure endpoint details.
🔧 Proposed fix
- echo "API not directly reachable, starting SOCKS proxy through ${BASTION_ADDRESS}"
+ echo "API not directly reachable, starting SOCKS proxy through bastion host"As per coding guidelines, "Protect sensitive information in step registry scripts - never echo or print passwords, tokens, API keys, cluster URLs, or kubeconfig contents".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| echo "API not directly reachable, starting SOCKS proxy through ${BASTION_ADDRESS}" | |
| echo "API not directly reachable, starting SOCKS proxy through bastion host" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh`
at line 27, The log prints the bastion endpoint (${BASTION_ADDRESS}) which
exposes sensitive infrastructure details; update the echo at the SOCKS proxy
startup (the line that currently says "API not directly reachable, starting
SOCKS proxy through ${BASTION_ADDRESS}") to a generic message without the
variable (e.g., "API not directly reachable, starting SOCKS proxy") so no
bastion address or other secrets are output to logs; ensure any other uses of
BASTION_ADDRESS in this script are not echoed or logged either.
| ES_PORT=$(<"/secret/elasticsearch_port") | ||
| ES_PORT="${ES_PORT%$'\n'}" | ||
| ssh ${SSH_ARGS} root@"${BASTION_ADDRESS}" -fNT -D "${SOCKS_PORT}" -L "127.0.0.1:${ES_PORT}:localhost:${ES_PORT}" |
There was a problem hiding this comment.
Guard elasticsearch_port before reading it in the SOCKS fallback path.
At Line 30, the file is read unconditionally; with set -e, a missing secret exits immediately and obscures why proxy setup failed.
🔧 Proposed fix
- ES_PORT=$(<"/secret/elasticsearch_port")
- ES_PORT="${ES_PORT%$'\n'}"
+ if [[ ! -s /secret/elasticsearch_port ]]; then
+ echo "ERROR: missing /secret/elasticsearch_port for SOCKS proxy setup" >&2
+ exit 1
+ fi
+ ES_PORT=$(<"/secret/elasticsearch_port")
+ ES_PORT="${ES_PORT%$'\n'}"
ssh ${SSH_ARGS} root@"${BASTION_ADDRESS}" -fNT -D "${SOCKS_PORT}" -L "127.0.0.1:${ES_PORT}:localhost:${ES_PORT}"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ES_PORT=$(<"/secret/elasticsearch_port") | |
| ES_PORT="${ES_PORT%$'\n'}" | |
| ssh ${SSH_ARGS} root@"${BASTION_ADDRESS}" -fNT -D "${SOCKS_PORT}" -L "127.0.0.1:${ES_PORT}:localhost:${ES_PORT}" | |
| if [[ ! -s /secret/elasticsearch_port ]]; then | |
| echo "ERROR: missing /secret/elasticsearch_port for SOCKS proxy setup" >&2 | |
| exit 1 | |
| fi | |
| ES_PORT=$(<"/secret/elasticsearch_port") | |
| ES_PORT="${ES_PORT%$'\n'}" | |
| ssh ${SSH_ARGS} root@"${BASTION_ADDRESS}" -fNT -D "${SOCKS_PORT}" -L "127.0.0.1:${ES_PORT}:localhost:${ES_PORT}" |
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 32-32: Double quote to prevent globbing and word splitting.
(SC2086)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/redhat-performance/benchmark-runner/redhat-performance-benchmark-runner-commands.sh`
around lines 30 - 32, Guard reading the secret file "elasticsearch_port" before
using it in the SOCKS fallback path: check that the secret file exists and is
readable (e.g., test -r "/secret/elasticsearch_port") and only then read into
ES_PORT and strip the newline; if the file is missing or unreadable, skip the
ssh -D/ -L proxy setup that uses ES_PORT (or fall back to a sensible
default/explicit error), ensuring the ssh invocation with SSH_ARGS,
BASTION_ADDRESS and SOCKS_PORT is only executed when ES_PORT is successfully
set.
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-uperf-vm |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
[REHEARSALNOTIFIER]
A total of 31 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs. A full list of affected jobs can be found here Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
/pj-rehearse pull-ci-redhat-performance-benchmark-runner-main-test-step-sysbench-pod |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@arpsharm: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@arpsharm: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: arpsharm, ebattat, jhutar The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Summary
Prow CI step-registry and test jobs for benchmark-runner workloads on pre-provisioned bare-metal cluster.
redhat-performance-benchmark-runnerTest plan