OSAC-854: add nightly vmaas snapshot build job #79377

Open
omer-vishlitzky wants to merge 2 commits into openshift:main from omer-vishlitzky:osac-854-snapshot-nightly

@omer-vishlitzky
Contributor

@omer-vishlitzky omer-vishlitzky commented May 17, 2026

Summary

OSAC presubmit E2E jobs boot OpenShift clusters from pre-built snapshots using
cluster-tool, which brings cluster boot time down from ~2 hours to ~6 minutes.
Currently these snapshots are built manually. This PR automates that with a
nightly periodic job.

What this adds

  • osac-project-cluster-tool-snapshot step-registry ref: installs cluster-tool
    on the provisioned machine, snapshots the running OSAC cluster, and pushes the
    resulting OCI image to quay.io
  • osac-project-cluster-tool-snapshot-vmaas workflow: chains the full pipeline —
    acquire baremetal machine, provision OCP via assisted-installer, install OSAC via
    setup.sh, then snapshot and push
  • snapshot-vmaas periodic job: runs the workflow nightly at 2am UTC from
    osac-test-infra, keeping the snapshot current with the latest osac-installer

Flow

ofcir-acquire → assisted-ofcir-setup → assisted-common-pre → osac-project-installer → cluster-tool snapshot → push to quay.io

The pushed snapshot image is what all e2e-vmaas presubmit jobs pull via
CLUSTER_TOOL_FLAVOR_IMAGE to boot clusters in ~6 minutes.

Test plan

  • Rehearse via /test snapshot-vmaas
  • Verify snapshot image appears in quay.io
  • Verify existing boot step can pull and boot from the new snapshot
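
The second test-plan item could be checked from a shell along these lines. This is a minimal sketch, not part of the PR: it assumes skopeo is installed, `snapshot_image_exists` is a hypothetical helper name, and the image reference in the usage comment is a placeholder rather than the real snapshot location.

```shell
# snapshot_image_exists IMAGE
# Hypothetical helper: succeeds if the registry serves a manifest for IMAGE.
# Assumes skopeo is installed and the repository is readable with the current credentials.
snapshot_image_exists() {
    local image=$1
    skopeo inspect "docker://${image}" >/dev/null 2>&1
}

# Usage (placeholder image reference, not the one used by the PR):
#   snapshot_image_exists quay.io/example/osac-vmaas:latest && echo "snapshot present"
```

The helper only confirms that the image can be inspected; it says nothing about whether a cluster boots from it, which is the third item in the test plan.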

Summary

This PR updates OpenShift CI (openshift/release) configuration for the osac-project/osac-test-infra repository to add an automated nightly pipeline that builds and publishes VMAAS cluster snapshots used by OSAC E2E jobs.

What changed (practical effect)

  • New periodic test job: snapshot-vmaas (ci-operator/config/osac-project/osac-test-infra/osac-project-osac-test-infra-main.yaml)

    • Runs nightly at 02:00 UTC (cron: 0 2 * * *)
    • capabilities: [intranet], cluster_profile: packet-assisted
    • Invokes workflow: osac-project-cluster-tool-snapshot-vmaas
    • Supplies ASSISTED_CONFIG (provisions a SNO-like assisted cluster: OLM operators cnv,lvm; NUM_MASTERS=1; NUM_WORKERS=0; MASTER_MEMORY=65536; 2×200GB disks; MASTER_CPU=24; OPENSHIFT_VERSION=4.20)
  • New workflow: osac-project-cluster-tool-snapshot-vmaas (ci-operator/step-registry/osac-project/cluster-tool/snapshot-vmaas/osac-project-cluster-tool-snapshot-vmaas-workflow.yaml)

    • Orchestrates: ofcir-acquire → assisted-ofcir-setup → assisted-common-pre → osac-project-installer → osac-project-cluster-tool-snapshot → post gather/release steps
    • Uses cluster_profile: packet-assisted, CLUSTERTYPE: assisted_large_el9, allow_best_effort_post_steps: true
    • Purpose: provision assisted baremetal cluster, run OSAC setup, snapshot cluster and push OCI image to a registry
  • New reusable step-ref and commands to perform snapshot and push:

    • Step YAML: ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-ref.yaml (defines timeouts, resources, vault mounts, env vars CLUSTER_TOOL_COMMIT and SNAPSHOT_REGISTRY)
    • Commands script: osac-project-cluster-tool-snapshot-commands.sh
      • SSHs to the provisioned ci_machine, downloads cluster-tool at provided commit, initializes it, discovers the running test-infra cluster, creates a snapshot flavored "osac-vmaas", logs into the target registry using Quay creds from Vault (/var/run/vault/osac-quay-creds), and pushes the snapshot via cluster-tool push
      • Targets a quay.io registry by default; the step sets grace_period 10m and timeout 2h
  • Ownership/metadata:

    • Added OWNERS entries and workflow metadata mapping with approvers/reviewers set to osac-cicd for the new step-registry and workflow files.
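
The commands script described above hands its inputs to the remote machine as positional arguments of `bash -s`, with the script body supplied through a quoted heredoc. The sketch below reproduces that pattern locally so it can be tried without SSH. `run_with_args`, `LOCAL_ONLY_VAR`, and the argument values are illustrative assumptions, not names from the PR.

```shell
# run_with_args is a hypothetical stand-in for: ssh ... ci_machine bash -s ARGS <<'EOF'
# It runs the same kind of script in a local child bash instead of over SSH.
run_with_args() {
    # The quoted delimiter stops the calling shell from expanding $1, $2, etc.
    bash -s "$@" <<'SCRIPT_EOF'
set -euo pipefail
# Inside the child shell, inputs are only visible as positional parameters.
commit=$1
registry=$2
echo "commit=${commit} registry=${registry} local_only=${LOCAL_ONLY_VAR:-unset}"
SCRIPT_EOF
}

LOCAL_ONLY_VAR="set in the calling shell, but not exported"
run_with_args abc123 quay.io/example/snapshots
```

Because the variable is not exported, the child shell reports it as unset, which is why each value the script needs is passed explicitly. One difference to keep in mind: over real SSH the arguments are joined into a single command string and re-parsed by the remote shell, so values containing spaces would need extra quoting.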

Impact / motivation

  • Produces nightly VMAAS cluster snapshot OCI images consumed by e2e-vmaas presubmit jobs via CLUSTER_TOOL_FLAVOR_IMAGE. This reduces test cluster boot time from ~2 hours to ~6 minutes and keeps snapshots aligned with osac-installer changes.

Testing notes

  • Intended rehearsal via: /test snapshot-vmaas
  • Verification: snapshot image appears in the configured quay.io registry and existing boot steps can pull and boot from the pushed snapshot.

Robot feedback

  • openshift-ci-robot confirmed JIRA OSAC-854 reference.
  • Robot warned the referenced JIRA task lacks the expected target version (expected 5.0.0).

Add cluster-tool snapshot step and workflow that provisions a
baremetal cluster via assisted-installer, installs OSAC, snapshots
the cluster, and pushes the OCI image to quay.io. Runs nightly
at 2am UTC.
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 17, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented May 17, 2026

@omer-vishlitzky: This pull request references OSAC-854 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Contributor

coderabbitai Bot commented May 17, 2026

Walkthrough

This PR adds a new snapshot-vmaas test job that captures assisted-installer cluster snapshots and publishes them to an OCI registry. The change introduces the snapshot step script, workflow orchestration, metadata/OWNERS entries, and test job registration across CI operator config and step-registry.

Changes

Cluster Snapshot Workflow

Layer / File(s) and summary

  • Snapshot step implementation
    • Files: ci-operator/step-registry/osac-project/cluster-tool/snapshot/OWNERS, ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-ref.metadata.json, ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-ref.yaml, ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh
    • Summary: Step registry defines the snapshot executable: a bash script that reads Quay credentials from Vault, downloads cluster-tool from GitHub, discovers a running cluster via virsh, creates a snapshot for the vmaas flavor, logs into the target registry, and pushes the OCI image.
  • Snapshot vmaas workflow
    • Files: ci-operator/step-registry/osac-project/cluster-tool/snapshot-vmaas/OWNERS, ci-operator/step-registry/osac-project/cluster-tool/snapshot-vmaas/osac-project-cluster-tool-snapshot-vmaas-workflow.metadata.json, ci-operator/step-registry/osac-project/cluster-tool/snapshot-vmaas/osac-project-cluster-tool-snapshot-vmaas-workflow.yaml
    • Summary: Workflow orchestrates cluster provisioning (ofcir acquire, assisted-installer setup), snapshot creation, and resource cleanup. Configures packet-assisted cluster profile and assisted_large_el9 CLUSTERTYPE for a baremetal deployment.
  • Test job registration
    • Files: ci-operator/config/osac-project/osac-test-infra/osac-project-osac-test-infra-main.yaml
    • Summary: Adds snapshot-vmaas test job scheduled daily at 02:00 UTC with intranet capability, injecting ASSISTED_CONFIG and invoking the osac-project-cluster-tool-snapshot-vmaas workflow.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

lgtm, rehearsals-ack

Suggested reviewers

  • danmanor
  • akshaynadkarni
  • jhernand
🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
Check name: status and explanation

  • Title check: ✅ Passed. The title accurately and specifically describes the main change: adding a nightly job for VMAAS snapshot builds.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Stable And Deterministic Test Names: ✅ Passed. Check designed for Ginkgo test definitions (Go test files). PR adds only CI/CD config files and bash scripts - no Go tests.
  • Test Structure And Quality: ✅ Passed. Custom check requires Ginkgo test code review. PR contains zero Go test files. All changes are CI/CD infrastructure: YAML configs, bash scripts, JSON metadata. Check is not applicable.
  • Microshift Test Compatibility: ✅ Passed. Custom check is not applicable. PR adds CI/CD infrastructure (YAML configs, Bash scripts, metadata files) but no Ginkgo e2e tests. Check applies only to new Go test code using Ginkgo patterns.
  • Single Node Openshift (Sno) Test Compatibility: ✅ Passed. PR adds CI/CD infrastructure (YAML, shell scripts, metadata) but no Ginkgo e2e tests. Custom check only applies when new Ginkgo tests are added. Not applicable.
  • Topology-Aware Scheduling Compatibility: ✅ Passed. No topology issues. Changes are CI/CD pipeline configs (step-registry, job configs, scripts), not deployment manifests or operator code. No scheduling constraints found.
  • Ote Binary Stdout Contract: ✅ Passed. PR adds CI/CD configuration and infrastructure helper scripts. No OTE binaries or test code is present. Check is not applicable to this repository's CI configuration context.
  • Ipv6 And Disconnected Network Test Compatibility: ✅ Passed. PR adds CI/CD infrastructure (YAML configs, bash scripts) to automate container snapshots, not new Ginkgo e2e tests. Custom check applies only to Ginkgo tests.
  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.

@openshift-ci
Contributor

openshift-ci Bot commented May 17, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: omer-vishlitzky

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 17, 2026
@openshift-ci openshift-ci Bot requested review from akshaynadkarni and danmanor May 17, 2026 19:36
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh`:
- Around line 50-55: The podman login call currently passes the password with -p
which exposes QUAY_PASS in the process list; change it to use --password-stdin
and pipe the password into podman login instead of using -p. Locate the podman
login invocation (the line using podman login --root ... "$(echo ${REGISTRY} |
cut -d/ -f1)" -u "${QUAY_USER}" -p "${QUAY_PASS}") and replace the -p usage by
supplying QUAY_PASS via stdin (keep the --root, registry host extraction, and -u
"${QUAY_USER}" as-is) so the password is not visible in process arguments.
- Around line 35-37: Download of the cluster-tool binary should include SHA256
verification: download to a temp path (instead of writing directly to
/usr/local/bin/cluster-tool), ensure CLUSTER_TOOL_SHA256 environment variable is
present, compute the SHA256 of the downloaded file (e.g., via sha256sum or
shasum -a 256), compare it to CLUSTER_TOOL_SHA256 and exit non‑zero on mismatch,
only move the verified file into /usr/local/bin/cluster-tool and then run chmod
+x on that path; update the curl -> temp file step and references to COMMIT and
/usr/local/bin/cluster-tool accordingly so the file is never executed or
installed unless the checksum matches.
- Around line 15-25: The script currently enables xtrace around the ssh
invocation and passes QUAY_PASS as a positional argument (QUAY_PASS and the
timeout/ssh invocation block), which risks leaking the password; change this by
saving the current xtrace state (e.g., store "$(set +x; false || true)" or check
$- for xtrace), then disable xtrace before reading/using QUAY_PASS and before
the ssh command that includes "${QUAY_PASS}", and finally restore the original
xtrace state immediately after; apply the identical save/disable/restore pattern
around the podman login invocation (the podman login block that uses QUAY_PASS)
so credentials are never printed to CI logs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 51bedd0b-c514-4e65-9a76-664080c5dcb6

📥 Commits

Reviewing files that changed from the base of the PR and between 16e4c03 and 227a9f9.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/osac-project/osac-test-infra/osac-project-osac-test-infra-main-periodics.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (8)
  • ci-operator/config/osac-project/osac-test-infra/osac-project-osac-test-infra-main.yaml
  • ci-operator/step-registry/osac-project/cluster-tool/snapshot-vmaas/OWNERS
  • ci-operator/step-registry/osac-project/cluster-tool/snapshot-vmaas/osac-project-cluster-tool-snapshot-vmaas-workflow.metadata.json
  • ci-operator/step-registry/osac-project/cluster-tool/snapshot-vmaas/osac-project-cluster-tool-snapshot-vmaas-workflow.yaml
  • ci-operator/step-registry/osac-project/cluster-tool/snapshot/OWNERS
  • ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh
  • ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-ref.metadata.json
  • ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-ref.yaml

Comment on lines +15 to +25
set +x
QUAY_PASS=$(cat /var/run/vault/osac-quay-creds/password)
set -x

timeout -s 9 90m ssh -F "${SHARED_DIR}/ssh_config" ci_machine bash -s \
    "${CLUSTER_TOOL_COMMIT}" \
    "${SNAPSHOT_REGISTRY}" \
    "${FLAVOR_NAME}" \
    "${QUAY_USER}" \
    "${QUAY_PASS}" \
    <<'REMOTE_EOF'

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

head -50 ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh

Repository: openshift/release

Length of output: 1570


🏁 Script executed:

tail -30 ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh

Repository: openshift/release

Length of output: 1221


🏁 Script executed:

# Let me verify the exact line numbers and check the context around the credential handling
wc -l ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh

Repository: openshift/release

Length of output: 172


🏁 Script executed:

cat -n ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh

Repository: openshift/release

Length of output: 2382


🏁 Script executed:

# Verify the fix syntax is correct by checking bash parameter expansion behavior
bash -c 'set -o nounset; was_xtrace=0; [[ "$-" == *x* ]] && was_xtrace=1; echo "was_xtrace=$was_xtrace"; set -x; echo "After set -x"; [[ "$-" == *x* ]] && was_xtrace=1; echo "was_xtrace=$was_xtrace"; (( was_xtrace )) && echo "Would re-enable xtrace"'

Repository: openshift/release

Length of output: 272


Prevent Quay password leakage through xtrace during SSH invocation.

Line 17 enables xtrace, and the ssh command on lines 19–24 passes QUAY_PASS as an argument. With xtrace enabled, bash expands and outputs the full command including the password to CI logs before execution. Save the xtrace state before disabling it for credential operations, then restore it afterward, rather than always re-enabling.

Suggested fix
- set +x
- QUAY_PASS=$(cat /var/run/vault/osac-quay-creds/password)
- set -x
+was_xtrace=0
+[[ "$-" == *x* ]] && was_xtrace=1
+set +x
+QUAY_PASS=$(cat /var/run/vault/osac-quay-creds/password)

 timeout -s 9 90m ssh -F "${SHARED_DIR}/ssh_config" ci_machine bash -s \
     "${CLUSTER_TOOL_COMMIT}" \
     "${SNAPSHOT_REGISTRY}" \
     "${FLAVOR_NAME}" \
     "${QUAY_USER}" \
     "${QUAY_PASS}" \
     <<'REMOTE_EOF'
@@
 REMOTE_EOF
+
+(( was_xtrace )) && set -x

Per coding guidelines, step registry command scripts should avoid accidental disclosure of passwords, tokens, API keys, and cluster URLs via xtrace. Apply the same pattern to the podman login invocation at lines 51–55.
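
The save/disable/restore idea can also be packaged as a pair of small functions so it is applied the same way around every credential-handling block. This is a sketch under stated assumptions: `xtrace_pause`, `xtrace_resume`, and `XTRACE_WAS_ON` are hypothetical names, not part of the PR or of any guideline, and bash is assumed.

```shell
# xtrace_pause: remember whether xtrace is currently on, then switch it off.
# The state is kept in the hypothetical global XTRACE_WAS_ON.
xtrace_pause() {
    # "$-" holds the active shell option flags; an "x" means xtrace is enabled.
    case "$-" in
        *x*) XTRACE_WAS_ON=1 ;;
        *)   XTRACE_WAS_ON=0 ;;
    esac
    set +x
}

# xtrace_resume: re-enable xtrace only if it was on before xtrace_pause ran.
xtrace_resume() {
    if [[ "${XTRACE_WAS_ON:-0}" == 1 ]]; then
        set -x
    fi
}

# Usage: every line that expands a secret sits between the two calls.
#   xtrace_pause
#   QUAY_PASS=$(cat /var/run/vault/osac-quay-creds/password)
#   ... commands that reference "${QUAY_PASS}" ...
#   xtrace_resume
```

A wrapper of the form `run_quietly cmd "${SECRET}"` would not be enough here, because with xtrace enabled the shell prints the fully expanded call line before the wrapper body starts running. That is why the sketch pauses tracing in a separate statement first.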

Comment on lines +35 to +37
curl -fsSL "https://raw.githubusercontent.com/omer-vishlitzky/cluster-tool/${COMMIT}/cluster-tool" \
    -o /usr/local/bin/cluster-tool
chmod +x /usr/local/bin/cluster-tool

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find the file
find . -name "osac-project-cluster-tool-snapshot-commands.sh" -type f

Repository: openshift/release

Length of output: 171


🏁 Script executed:

#!/bin/bash
# Read the file to verify the exact content at lines 35-37
if [ -f "ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh" ]; then
  echo "=== File content (full) ==="
  wc -l ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh
  echo ""
  echo "=== Lines 30-45 for context ==="
  sed -n '30,45p' ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh
fi

Repository: openshift/release

Length of output: 899


🏁 Script executed:

#!/bin/bash
# Check for any environment variable definitions or setup related to CLUSTER_TOOL_SHA256
# Search across the directory structure for related references
rg -i "cluster_tool_sha256|CLUSTER_TOOL_SHA256" --type sh

Repository: openshift/release

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Search for the COMMIT variable definition and usage context
rg -i "COMMIT" ci-operator/step-registry/osac-project/cluster-tool/snapshot/ -A 2 -B 2

Repository: openshift/release

Length of output: 3880


🏁 Script executed:

#!/bin/bash
# Check the ref.yaml file to see what env variables are defined
cat ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-ref.yaml

Repository: openshift/release

Length of output: 976


🏁 Script executed:

#!/bin/bash
# Check if there are any related steps that might define CLUSTER_TOOL_SHA256 or similar patterns
rg "SHA256|sha256|checksum" ci-operator/step-registry/osac-project/cluster-tool/ --type sh

Repository: openshift/release

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Look at the full context of the commands.sh file to understand the overall flow
cat -n ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh

Repository: openshift/release

Length of output: 2382


Add integrity verification for downloaded cluster-tool binary.

Lines [35-37] download and execute a binary from a mutable GitHub ref (main by default) without checksum verification. This poses a supply-chain risk, particularly in a step with mounted credentials that could be compromised if the tool is hijacked.

Add SHA256 verification before execution:

Suggested hardening
 curl -fsSL "https://raw.githubusercontent.com/omer-vishlitzky/cluster-tool/${COMMIT}/cluster-tool" \
     -o /usr/local/bin/cluster-tool
+echo "${CLUSTER_TOOL_SHA256}  /usr/local/bin/cluster-tool" | sha256sum -c -
 chmod +x /usr/local/bin/cluster-tool

Note: Requires defining CLUSTER_TOOL_SHA256 as an environment variable in the step reference.
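
A slightly stricter variant downloads to a temporary path and only installs the file once the digest matches, so an unverified binary never lands in /usr/local/bin. The sketch below assumes GNU coreutils (sha256sum, install). `install_verified` is a hypothetical helper, and CLUSTER_TOOL_SHA256 is the variable proposed in this review, which the step reference does not define today.

```shell
# install_verified SRC EXPECTED_SHA256 DEST
# Hypothetical helper: copies SRC to DEST with mode 0755 only if the
# SHA-256 digest of SRC equals EXPECTED_SHA256; otherwise returns non-zero.
install_verified() {
    local src=$1 expected=$2 dest=$3 actual
    # sha256sum prints "<digest>  <path>"; keep only the digest field.
    actual=$(sha256sum "$src" | cut -d' ' -f1)
    if [[ "$actual" != "$expected" ]]; then
        echo "checksum mismatch for ${src}: expected ${expected}, got ${actual}" >&2
        return 1
    fi
    install -m 0755 "$src" "$dest"
}

# Usage inside the step (URL and variables as in the snippet under review):
#   tmp=$(mktemp)
#   curl -fsSL "https://raw.githubusercontent.com/omer-vishlitzky/cluster-tool/${COMMIT}/cluster-tool" -o "$tmp"
#   install_verified "$tmp" "${CLUSTER_TOOL_SHA256}" /usr/local/bin/cluster-tool
```

The pinned digest has to be updated together with the commit it belongs to, otherwise every run fails at the verification step.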

Comment on lines +50 to +55
echo "=== Authenticating to registry ==="
set +x
podman login --root /home/cluster-tool/containers/storage \
    "$(echo ${REGISTRY} | cut -d/ -f1)" \
    -u "${QUAY_USER}" -p "${QUAY_PASS}"
set -x

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and examine lines around 50-55
if [ -f "ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh" ]; then
  echo "=== File exists, showing lines 40-65 for context ==="
  sed -n '40,65p' "ci-operator/step-registry/osac-project/cluster-tool/snapshot/osac-project-cluster-tool-snapshot-commands.sh"
else
  echo "File not found, checking what exists in the directory"
  find . -path "*/osac-project/cluster-tool/snapshot/*-commands.sh" 2>/dev/null || echo "No matching files found"
fi

Repository: openshift/release

Length of output: 1022


🌐 Web query:

podman login --password-stdin option documentation

💡 Result:

The --password-stdin option for the podman login command is used to provide a registry password via standard input (STDIN) rather than through command-line arguments or interactive prompts [1][2]. This approach is commonly used in automated scripts and CI/CD pipelines to enhance security by avoiding the exposure of passwords in process lists or shell history [1][3].

Usage examples:

  1. Using a pipe: echo $PASSWORD | podman login -u --password-stdin <registry_url> [1][3]
  2. Using a file redirection: podman login -u --password-stdin < password.txt <registry_url> [1][3]

When using this option, Podman reads the password from STDIN and uses it to authenticate with the specified registry [1][2]. If no registry is specified, Podman defaults to the first registry listed in the [registries.search] section of the registries.conf configuration file [1][2]. Credentials are typically stored in the default authentication file, located at ${XDG_RUNTIME_DIR}/containers/auth.json [1][2].

Citations:


🏁 Script executed:

# Check if printf is available and test the approach
which printf
echo "Testing printf syntax for password piping:"
QUAY_PASS="test_password"
# Verify the printf syntax doesn't add newline
printf '%s' "${QUAY_PASS}" | wc -c
echo "Password length:"
echo -n "${QUAY_PASS}" | wc -c

Repository: openshift/release

Length of output: 189


Use --password-stdin instead of passing password via -p flag.

The current approach on line 54 passes the password as a command-line argument, so it is visible to anyone who can inspect the process list. Use stdin-based authentication instead, in line with the coding guidelines that sensitive data must not appear in process arguments.

Suggested fix
 set +x
-podman login --root /home/cluster-tool/containers/storage \
-    "$(echo ${REGISTRY} | cut -d/ -f1)" \
-    -u "${QUAY_USER}" -p "${QUAY_PASS}"
+printf '%s' "${QUAY_PASS}" | podman login --root /home/cluster-tool/containers/storage \
+    "$(echo "${REGISTRY}" | cut -d/ -f1)" \
+    -u "${QUAY_USER}" --password-stdin
 set -x
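
The same fix can be written as two small functions, which also replaces the `echo | cut` host extraction with a parameter expansion. This is a sketch only: `registry_host` and `registry_login` are hypothetical names, the `--root` option from the original call is left out for brevity, and the caller is still expected to have xtrace disabled around the call.

```shell
# registry_host REF
# Prints the registry host of an image or repository reference,
# i.e. everything before the first "/": quay.io/org/repo -> quay.io
registry_host() {
    printf '%s\n' "${1%%/*}"
}

# registry_login REF USER PASSWORD
# Hypothetical wrapper: feeds the password to podman on stdin.
# printf is a shell builtin, so the password is not passed as an
# argument to any newly executed program.
registry_login() {
    local registry=$1 user=$2 pass=$3
    printf '%s' "$pass" |
        podman login "$(registry_host "$registry")" -u "$user" --password-stdin
}

# Usage:
#   registry_login "${REGISTRY}" "${QUAY_USER}" "${QUAY_PASS}"
```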
Pass the same SNO cluster specs (CNV, LVM, 64GB RAM, 2x200GB disks,
24 vCPUs) that the existing periodic E2E jobs use. Without this,
assisted-installer would provision a cluster with wrong defaults.
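
As a quick unit check of the memory figure: the job's ASSISTED_CONFIG sets MASTER_MEMORY=65536. Assuming that value is expressed in MiB (the PR text does not state the unit), it works out as follows. `mib_to_gib` is a hypothetical helper used only for this illustration.

```shell
# mib_to_gib MIB
# Hypothetical helper: integer conversion from MiB to GiB (1 GiB = 1024 MiB).
mib_to_gib() {
    echo $(( $1 / 1024 ))
}

mib_to_gib 65536   # prints 64
```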
@omer-vishlitzky omer-vishlitzky force-pushed the osac-854-snapshot-nightly branch from 0e0c663 to 9f38a20 Compare May 17, 2026 21:09
@openshift-merge-bot
Contributor

[REHEARSALNOTIFIER]
@omer-vishlitzky: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

  • Test name: periodic-ci-osac-project-osac-test-infra-main-snapshot-vmaas
    • Repo: N/A
    • Type: periodic
    • Reason: Periodic changed

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@omer-vishlitzky
Contributor Author

/pj-rehearse periodic-ci-osac-project-osac-test-infra-main-snapshot-vmaas

@openshift-merge-bot
Contributor

@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@omer-vishlitzky
Contributor Author

/pj-rehearse periodic-ci-osac-project-osac-test-infra-main-snapshot-vmaas

@openshift-merge-bot
Contributor

@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@omer-vishlitzky
Contributor Author

/pj-rehearse periodic-ci-osac-project-osac-test-infra-main-snapshot-vmaas

@openshift-merge-bot
Contributor

@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci
Contributor

openshift-ci Bot commented May 18, 2026

@omer-vishlitzky: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

  • Test name: ci/rehearse/periodic-ci-osac-project-osac-test-infra-main-snapshot-vmaas
    • Commit: 9f38a20
    • Details: link
    • Required: unknown
    • Rerun command: /pj-rehearse periodic-ci-osac-project-osac-test-infra-main-snapshot-vmaas

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@omer-vishlitzky
Contributor Author

/retest

@omer-vishlitzky
Contributor Author

/pj-rehearse periodic-ci-osac-project-osac-test-infra-main-snapshot-vmaas

@openshift-merge-bot
Contributor

@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

2 participants