WIP: Add k3s-based CAPZ e2e test (no IPI cluster needed) by marek-veber · Pull Request #79611 · openshift/release

marek-veber · 2026-05-21T14:55:37Z

Summary

Adds a new capz-k3s-e2e Prow job that uses k3s as the management cluster instead of provisioning a full IPI OpenShift cluster
k3s runs directly inside the test pod via nested_podman capabilities (SYS_ADMIN + NET_ADMIN)
Saves ~40 minutes of IPI provisioning time and reduces Azure compute costs

Purpose

Draft PR to validate whether nested_podman provides enough capabilities for k3s to start inside a Prow test pod.

New files

ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh — starts k3s, deploys controllers, runs e2e tests
ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-ref.yaml — step ref with nested_podman: true
ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-workflow.yaml — workflow without IPI

Why k3s instead of Kind?

Kind-in-Podman fails because containerd inside the Kind node can't do overlay mounts three levels deep (Docker → Podman → containerd → overlay fails with: failed to mount rootfs component: invalid argument). k3s runs everything as a single process with --snapshotter=native, avoiding nested container issues entirely.

🤖 Generated with Claude Code

Summary by CodeRabbit

This PR adds a lightweight Prow CI job for stolostron/capi-tests: capz-k3s-e2e. Instead of provisioning a full IPI OpenShift cluster, the job starts a k3s management cluster inside the test pod (requires nested_podman with SYS_ADMIN + NET_ADMIN), deploys CAPI/CAPZ/ASO controllers into it, and runs CAPZ e2e phases 01–08. The goal is to avoid IPI provisioning (~40 minutes saved) and reduce Azure compute costs. The PR is a draft to validate whether nested_podman provides sufficient capabilities for k3s to start reliably inside Prow test pods.

Practical impact / what changed

Registers a new Prow job capz-k3s-e2e in stolostron/capi-tests (cluster_profile: openshift-org-azure, WORKLOAD_CLUSTER_NAMESPACE: capz-test-prow) with a 4h job timeout.
Adds a step script that:
- installs and starts k3s (default K3S_VERSION v1.31.6+k3s1) with --snapshotter=native, traefik disabled, kubeconfig written to /tmp (mode 644), and kubelet eviction args;
- manages lifecycle with a cleanup trap, waits up to ~300s for the k3s node Ready state, and exits on failure;
- configures the test suite to use k3s (DEPLOY_CHARTS=true, USE_K8S=false), installs gotestsum (pinned), and runs the e2e suite via gotestsum producing ${ARTIFACT_DIR}/junit-e2e.xml with a 150m test timeout.
- (Commit fix) downloads k3s to /tmp instead of /usr/local/bin to accommodate non-root, read-only CI pods.
Adds a step reference capz-test-k3s-e2e that enables nested_podman, requests 4 CPU and 8Gi memory, sets a 3h step timeout, exposes K3S_VERSION and WORKLOAD_CLUSTER_NAMESPACE env defaults, and documents the nested-podman requirement.
Adds a workflow capz-test-k3s-e2e that runs the step and posts capz-test-summary; workflow docs note the ~40m savings and nested-podman requirement.
Adds OWNERS and metadata JSON files for the new step/workflow.
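
The readiness wait described in the list above (poll for up to ~300s, exit on failure) could look roughly like the following. This is a minimal sketch, not the PR's actual script: the helper names `wait_until` and `k3s_node_ready` are hypothetical, and the probe assumes `kubectl` is on PATH and that `K3S_KUBECONFIG` points at the kubeconfig written by k3s.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: poll a command until it succeeds or the timeout expires.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args...]
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( SECONDS + timeout ))
  while (( SECONDS < deadline )); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "${interval}"
  done
  return 1
}

# Hypothetical probe: succeeds once a node line reports the Ready status.
# The leading space in the pattern keeps "NotReady" from matching.
k3s_node_ready() {
  KUBECONFIG="${K3S_KUBECONFIG}" kubectl get nodes --no-headers | grep -q ' Ready'
}
```

A call such as `wait_until 300 5 k3s_node_ready || { echo "[k3s] FATAL: not ready"; exit 1; }` would then mirror the 300s limit mentioned above; the 5s interval is an arbitrary choice.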

Repositories / CI surface affected

ci-operator configuration and step-registry for stolostron/capi-tests: new job, step script, step ref, workflow, OWNERS, and metadata.

Rationale

Kind-in-Podman fails due to nested overlay mount issues when containerd runs inside Kind nodes; k3s runs as a single process with --snapshotter=native, avoiding nested overlay problems and making an in-pod management cluster feasible.

Status / notes

Draft PR for validating nested_podman capability in Prow pods.
Step timeout: 3h; job timeout: 4h. Requested resources: 4 CPU, 8Gi memory. gotestsum pinned in the step (per script).

openshift-ci · 2026-05-21T14:55:41Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai · 2026-05-21T14:56:04Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

Adds a k3s-based CAPZ e2e test: a command script to bootstrap/run k3s and tests, a step reference with nested-podman and resources, a workflow to run/post the step, CI job registration in stolostron config, and an OWNERS file for the step registry.

Changes

CAPZ k3s e2e testing

Layer / File(s)	Summary
k3s e2e step reference and docs `ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-ref.yaml`	Defines `capz-test-k3s-e2e` step ref with `nested_podman: true`, resource requests (4 CPU / 8Gi), env defaults (`K3S_VERSION`, `WORKLOAD_CLUSTER_NAMESPACE`), timeout, and capability notes (SYS_ADMIN/NET_ADMIN).
k3s e2e test step (commands) `ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh`	Bootstraps an in-pod k3s server (start flags, kubeconfig), polls for node readiness, cleans up the server on exit, installs gotestsum, and runs the e2e suite producing a JUnit XML artifact.
Workflow orchestration `ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-workflow.yaml`	Adds `capz-test-k3s-e2e` workflow that runs the step and posts `capz-test-summary`, with docs describing k3s-based no-IPI execution and ~40-minute saving vs IPI.
CAPZ CI job registration `ci-operator/config/stolostron/capi-tests/stolostron-capi-tests-main.yaml`	Registers `capz-k3s-e2e` job with `cluster_profile: openshift-org-azure`, `WORKLOAD_CLUSTER_NAMESPACE: capz-test-prow`, `build_root.project_image.dockerfile_path: Dockerfile.prow`, release targets for OCP 4.19, default resource requests, and a 4h timeout.
OWNERS and metadata `ci-operator/step-registry/capz/test/k3s-e2e/OWNERS`, `ci-operator/step-registry/capz/test/k3s-e2e/*.metadata.json`	Adds approvers and reviewers: `radekcap`, `marek-veber`, and `mzazrivec`, and metadata files linking owners to the step and workflow.

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels: ok-to-test, rehearsals-ack

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (11 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically summarizes the main change: adding a k3s-based CAPZ e2e test as an alternative to IPI cluster provisioning.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	This PR adds CI/prow configuration and shell scripts only—no Ginkgo test definitions with dynamic names. Test code is sourced externally from CAPZ repository.
Test Structure And Quality	✅ Passed	PR adds Prow CI configuration and setup scripts, not Ginkgo test code. Custom check for Ginkgo test quality is not applicable as no test code is present.
Microshift Test Compatibility	✅ Passed	PR adds only CI/Prow infrastructure configuration and shell scripts, not Ginkgo e2e tests. The custom check applies only when Ginkgo tests are added.
Single Node Openshift (Sno) Test Compatibility	✅ Passed	PR adds CI infrastructure for k3s-based CAPZ tests but no new Ginkgo e2e tests. Test code sourced from capi-tests repository, not added here.
Topology-Aware Scheduling Compatibility	✅ Passed	PR adds only CI/test infrastructure files, not deployment manifests or operators. Check applies to production deployments, not ephemeral CI infrastructure.
Ote Binary Stdout Contract	✅ Passed	The OTE Binary Stdout Contract check applies only to Go test binaries. This PR contains only YAML configuration, shell scripts, JSON metadata, and OWNERS files—no Go test code.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	This PR adds only CI/infrastructure configuration files and setup scripts, not new Ginkgo e2e test code. No It()/Describe() test definitions are added. The actual tests run from capi-tests repo.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

marek-veber · 2026-05-21T14:57:19Z

/test all

openshift-merge-bot · 2026-05-21T14:58:34Z

@marek-veber, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: Workflow capz-test-k3s-e2e contains non-existent reference capz-test-teardown

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh (1)

52-64: ⚡ Quick win

Surface k3s diagnostics on readiness timeout.

When k3s fails to become ready, the script exits with only a fatal message and no diagnostics, which will make debugging the very thing this PR is validating much harder. Consider capturing k3s server output to a file and dumping it (plus kubectl get nodes/kubectl get pods -A) into ${ARTIFACT_DIR} before exiting.

🔧 Proposed change

 k3s server \
   --disable=traefik \
   --snapshotter=native \
   --write-kubeconfig="${K3S_KUBECONFIG}" \
   --write-kubeconfig-mode=644 \
   --kubelet-arg="eviction-hard=imagefs.available<1%,nodefs.available<1%" \
   --kubelet-arg="eviction-minimum-reclaim=imagefs.available=1%,nodefs.available=1%" \
-  &
+  >"${ARTIFACT_DIR}/k3s-server.log" 2>&1 &
 K3S_PID=$!
...
 if [ "${READY}" != true ]; then
   echo "[k3s] FATAL: k3s did not become ready within 300s"
+  KUBECONFIG="${K3S_KUBECONFIG}" kubectl get nodes -o wide >"${ARTIFACT_DIR}/k3s-nodes.txt" 2>&1 || true
+  KUBECONFIG="${K3S_KUBECONFIG}" kubectl get pods -A -o wide >"${ARTIFACT_DIR}/k3s-pods.txt" 2>&1 || true
   exit 1
 fi

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh`
around lines 52 - 64, When the readiness loop using
KUBECONFIG="${K3S_KUBECONFIG}" and the READY flag times out, capture diagnostics
before exiting: redirect the k3s server logs into a file under ${ARTIFACT_DIR}
(e.g., k3s-server.log), and run KUBECONFIG="${K3S_KUBECONFIG}" kubectl get nodes
--no-headers and kubectl get pods -A, saving their outputs into separate files
in ${ARTIFACT_DIR}; place this dump in the failure branch where the script
currently checks if [ "${READY}" != true ] and before the echo/exit so artifacts
are preserved for debugging.
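
One way to keep the diagnostics dump from masking the original failure is a small wrapper that never fails itself. The following is only a sketch: `capture_to_artifact` is a hypothetical name, and it assumes `ARTIFACT_DIR` is already set by the CI environment.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: run a command and save its stdout and stderr to a file
# under ARTIFACT_DIR. A failing command is tolerated so that collecting
# diagnostics never aborts the script before the real error is reported.
# Usage: capture_to_artifact <file-name> <command> [args...]
capture_to_artifact() {
  local name="$1"
  shift
  "$@" >"${ARTIFACT_DIR}/${name}" 2>&1 || true
}
```

In the failure branch this could be used as `KUBECONFIG="${K3S_KUBECONFIG}" capture_to_artifact k3s-nodes.txt kubectl get nodes -o wide` before `exit 1`.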

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ci-operator/config/stolostron/capi-tests/stolostron-capi-tests-main.yaml`:
- Around line 19-57: The CI config added two new test entries ("capz-e2e" and
"capz-k3s-e2e") under tests which requires regenerating downstream ProwJob
artifacts; run the repository's generation target (make update) to regenerate
files under ci-operator/jobs/, review and stage the changed generated artifacts,
and commit those updated artifacts alongside the change that introduced the new
tests so the new capz-e2e and capz-k3s-e2e jobs are present in the committed
outputs.

In `@ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh`:
- Line 16: The script currently enables xtrace via the literal line "set -o
xtrace"; remove that default and ensure the script starts with the required safe
shell flags "set -euo pipefail" instead, and if you need -x for debugging, gate
enabling xtrace behind a DEBUG (or CI_DEBUG) environment variable (e.g., test if
DEBUG is set and then call "set -o xtrace"), making sure kubeconfig handling and
kubectl/k3s calls no longer run with -x by default.

---

Nitpick comments:
In `@ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh`:
- Around line 52-64: When the readiness loop using
KUBECONFIG="${K3S_KUBECONFIG}" and the READY flag times out, capture diagnostics
before exiting: redirect the k3s server logs into a file under ${ARTIFACT_DIR}
(e.g., k3s-server.log), and run KUBECONFIG="${K3S_KUBECONFIG}" kubectl get nodes
--no-headers and kubectl get pods -A, saving their outputs into separate files
in ${ARTIFACT_DIR}; place this dump in the failure branch where the script
currently checks if [ "${READY}" != true ] and before the echo/exit so artifacts
are preserved for debugging.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 2b9b4e37-2673-4efd-a25f-80be41b10a9e

📥 Commits

Reviewing files that changed from the base of the PR and between 5403786 and b283938.

📒 Files selected for processing (4)

ci-operator/config/stolostron/capi-tests/stolostron-capi-tests-main.yaml
ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh
ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-ref.yaml
ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-workflow.yaml

coderabbitai · 2026-05-21T15:01:49Z

+export USE_KUBECONFIG="${K3S_KUBECONFIG}"
+
+source openshift-ci/capz-test-env.sh
+set -o xtrace


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid enabling xtrace by default.

Per coding guidelines, step-registry command scripts should run with set -euo pipefail and only enable -x (here set -o xtrace) when actively debugging. Since this script also exports/handles a k3s kubeconfig path and runs kubectl/k3s commands, leaving xtrace on by default risks leaking command arguments into job logs and increases log noise. Consider removing it (or gating it behind a debug flag) before merging out of WIP.

🔧 Proposed change

-source openshift-ci/capz-test-env.sh
-set -o xtrace
+source openshift-ci/capz-test-env.sh
+# Enable tracing only when debugging:
+# set -o xtrace

As per coding guidelines: "Step registry script files must use set -euo pipefail (without -x) as default and only enable -x when actively debugging".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-set -o xtrace
+source openshift-ci/capz-test-env.sh
+# Enable tracing only when debugging:
+# set -o xtrace

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh` at line 16, The script currently enables xtrace via the literal line "set -o xtrace"; remove that default and ensure the script starts with the required safe shell flags "set -euo pipefail" instead, and if you need -x for debugging, gate enabling xtrace behind a DEBUG (or CI_DEBUG) environment variable (e.g., test if DEBUG is set and then call "set -o xtrace"), making sure kubeconfig handling and kubectl/k3s calls no longer run with -x by default.
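
The debug gate suggested in this comment might be sketched as follows. The variable name `DEBUG` is taken from the comment's own suggestion; the function name `enable_xtrace_if_debug` and the exact value `true` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: turn on command tracing only when DEBUG=true.
# With DEBUG unset or set to anything else, tracing stays off, so command
# arguments are not echoed into the job log by default.
enable_xtrace_if_debug() {
  if [[ "${DEBUG:-false}" == "true" ]]; then
    set -o xtrace
  fi
}

enable_xtrace_if_debug
```

Because `set -o` changes apply to the whole shell, calling the function once near the top of the script is enough.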

marek-veber · 2026-05-21T16:29:08Z

/test all

openshift-ci · 2026-05-21T16:33:02Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marek-veber

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~ci-operator/config/stolostron/capi-tests/OWNERS~~ [marek-veber]
~~ci-operator/jobs/stolostron/capi-tests/OWNERS~~ [marek-veber]
~~ci-operator/step-registry/capz/test/OWNERS~~ [marek-veber]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-merge-bot · 2026-05-21T16:40:32Z

@marek-veber, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

couldn't prepare candidate: couldn't checkout base SHA 96c469107b7ac535f2a4021db6a44ef6e14b0ec2: error checking out "96c469107b7ac535f2a4021db6a44ef6e14b0ec2": exit status 128 fatal: unable to read tree (96c469107b7ac535f2a4021db6a44ef6e14b0ec2)

marek-veber · 2026-05-21T16:41:41Z

/test all

marek-veber · 2026-05-21T16:53:21Z

/retest

marek-veber · 2026-05-21T17:03:25Z

/pj-rehearse capz-k3s-e2e

openshift-merge-bot · 2026-05-21T17:03:29Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-merge-bot · 2026-05-21T17:06:23Z

@marek-veber: job(s): capz-k3s-e2e either don't exist or were not found to be affected, and cannot be rehearsed

marek-veber · 2026-05-21T17:16:44Z

/pj-rehearse pull-ci-stolostron-capi-tests-main-capz-k3s-e2e

openshift-merge-bot · 2026-05-21T17:16:47Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

… teardown ref

- Add missing OWNERS file for k3s-e2e step registry
- Add grace_period to ref (required when using trap)
- Remove non-existent capz-test-teardown reference
- Remove non-existent capz-e2e job (missing step registrations)
- Fix cluster profile: azure4 -> openshift-org-azure
- Remove undeclared DEPLOY_CHARTS env override (set in script)
- Regenerate ci-operator config and job configs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

marek-veber · 2026-05-21T17:21:06Z

/pj-rehearse pull-ci-stolostron-capi-tests-main-capz-k3s-e2e

openshift-merge-bot · 2026-05-21T17:21:09Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

marek-veber · 2026-05-21T18:59:37Z

/pj-rehearse pull-ci-stolostron-capi-tests-main-capz-k3s-e2e

openshift-merge-bot · 2026-05-21T18:59:40Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

marek-veber · 2026-05-22T08:17:28Z

/pj-rehearse pull-ci-stolostron-capi-tests-main-capz-k3s-e2e

openshift-merge-bot · 2026-05-22T08:17:31Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

CI pods run as non-root with read-only /usr/local/bin, causing curl write error (exit 23). Use /tmp and prepend to PATH. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh`:
- Around line 32-34: Replace downloading to and prepending the world-writable
/tmp with a secure dedicated temp directory and make the download fail fast:
create a private dir via mktemp -d, download K3S from K3S_URL into that dir
using curl with fail-on-error flags (e.g. -f/--fail and -L), make the downloaded
binary executable (the file referenced as /tmp/k3s in the diff), and update PATH
to include only that temp dir (not /tmp) so the temp dir precedes system bins;
ensure you also clean up the temp dir on exit or error.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: d62306fe-879d-47bc-8ee4-b5a8d49cf0f6

📥 Commits

Reviewing files that changed from the base of the PR and between 66db1f3 and c78533b.

📒 Files selected for processing (1)

ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh

coderabbitai · 2026-05-22T10:42:28Z

+curl -sLo /tmp/k3s "${K3S_URL}"
+chmod +x /tmp/k3s
+export PATH="/tmp:${PATH}"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid prepending world-writable /tmp to PATH.

On line 34, putting /tmp first in PATH creates a command-shadowing risk for tools invoked later (such as kubectl or go), because any user able to write to /tmp could plant a binary with the same name. Use a dedicated temp dir and fail fast on HTTP errors.

🔧 Proposed change

-curl -sLo /tmp/k3s "${K3S_URL}"
-chmod +x /tmp/k3s
-export PATH="/tmp:${PATH}"
+K3S_BIN_DIR="$(mktemp -d)"
+K3S_BIN="${K3S_BIN_DIR}/k3s"
+curl -fsSLo "${K3S_BIN}" "${K3S_URL}"
+chmod 0755 "${K3S_BIN}"
+export PATH="${K3S_BIN_DIR}:${PATH}"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-curl -sLo /tmp/k3s "${K3S_URL}"
-chmod +x /tmp/k3s
-export PATH="/tmp:${PATH}"
+K3S_BIN_DIR="$(mktemp -d)"
+K3S_BIN="${K3S_BIN_DIR}/k3s"
+curl -fsSLo "${K3S_BIN}" "${K3S_URL}"
+chmod 0755 "${K3S_BIN}"
+export PATH="${K3S_BIN_DIR}:${PATH}"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/capz/test/k3s-e2e/capz-test-k3s-e2e-commands.sh` around lines 32 - 34, Replace downloading to and prepending the world-writable /tmp with a secure dedicated temp directory and make the download fail fast: create a private dir via mktemp -d, download K3S from K3S_URL into that dir using curl with fail-on-error flags (e.g. -f/--fail and -L), make the downloaded binary executable (the file referenced as /tmp/k3s in the diff), and update PATH to include only that temp dir (not /tmp) so the temp dir precedes system bins; ensure you also clean up the temp dir on exit or error.
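
The suggested change does not show the cleanup step that the prompt asks for. A minimal sketch of the whole pattern, under stated assumptions, follows: `make_k3s_bin_dir` and `fetch_binary` are hypothetical helper names, `K3S_URL` is assumed to be defined elsewhere in the script, and the `trap` shown here would replace any EXIT trap already installed, so a script that already has a cleanup trap would need to fold the `rm -rf` into it instead.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: create a private directory for the k3s binary and print
# its path. mktemp -d creates the directory with owner-only permissions.
make_k3s_bin_dir() {
  mktemp -d "${TMPDIR:-/tmp}/k3s-bin.XXXXXX"
}

# Hypothetical helper: download a URL to a destination and make it executable.
# -f fails on HTTP errors, -sS stays quiet but still prints errors,
# -L follows redirects.
fetch_binary() {
  local url="$1" dest="$2"
  curl -fsSL -o "${dest}" "${url}"
  chmod 0755 "${dest}"
}

K3S_BIN_DIR="$(make_k3s_bin_dir)"
# Remove the private directory when the script exits, on success or failure.
trap 'rm -rf "${K3S_BIN_DIR}"' EXIT
# Only the private directory is prepended, never /tmp itself.
export PATH="${K3S_BIN_DIR}:${PATH}"
```

The download itself would then be `fetch_binary "${K3S_URL}" "${K3S_BIN_DIR}/k3s"`.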

marek-veber · 2026-05-22T14:11:10Z

/pj-rehearse pull-ci-stolostron-capi-tests-main-capz-k3s-e2e

openshift-merge-bot · 2026-05-22T14:11:13Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

CI pods run as random non-root UID. k3s requires --rootless flag in this case. Also install slirp4netns dependency, use writable data-dir, and detect early if the k3s process dies. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
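
The "detect early if the k3s process dies" part of this commit is not shown in the conversation. One common way to do it, sketched here with hypothetical names (`process_alive`, `wait_ready_or_die`) and not taken from the PR, is to check the background PID with `kill -0` on every iteration of the readiness loop.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds while the given PID still exists.
# kill -0 sends no signal; it only checks that the process can be signalled.
process_alive() {
  kill -0 "$1" 2>/dev/null
}

# Hypothetical helper: poll a readiness command, but stop immediately if the
# watched process has exited.
# Returns 0 when ready, 1 on timeout, 2 when the process died.
# Usage: wait_ready_or_die <pid> <timeout-seconds> <command> [args...]
wait_ready_or_die() {
  local pid="$1" timeout="$2"
  shift 2
  local deadline=$(( SECONDS + timeout ))
  while (( SECONDS < deadline )); do
    if ! process_alive "${pid}"; then
      return 2
    fi
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
  done
  return 1
}
```

With the server started in the background and its PID saved (for example in `K3S_PID`, as in the script excerpt quoted earlier), a return code of 2 lets the script fail within about a second instead of waiting out the full timeout.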

marek-veber · 2026-05-22T21:36:29Z

/pj-rehearse pull-ci-stolostron-capi-tests-main-capz-k3s-e2e

openshift-merge-bot · 2026-05-22T21:36:35Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-merge-bot · 2026-05-22T21:39:42Z

[REHEARSALNOTIFIER]
@marek-veber: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name	Repo	Type	Reason
pull-ci-stolostron-capi-tests-main-capz-k3s-e2e	stolostron/capi-tests	presubmit	Presubmit changed

marek-veber · 2026-05-22T22:07:50Z

/pj-rehearse abort

openshift-merge-bot · 2026-05-22T22:07:53Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

marek-veber · 2026-05-22T22:09:04Z

/pj-rehearse pull-ci-stolostron-capi-tests-main-capz-k3s-e2e

openshift-merge-bot · 2026-05-22T22:09:07Z

@marek-veber: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-ci · 2026-05-22T22:26:34Z

@marek-veber: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/rehearse/stolostron/capi-tests/main/capz-k3s-e2e	`921cac9`	link	unknown	`/pj-rehearse pull-ci-stolostron-capi-tests-main-capz-k3s-e2e`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci Bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) May 21, 2026

openshift-ci Bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) May 21, 2026

coderabbitai Bot reviewed May 21, 2026

marek-veber and others added 3 commits May 21, 2026 19:20

Add k3s-based CAPZ e2e test (no IPI cluster needed)

8c39dee

Add generated step registry metadata for k3s-e2e

66db1f3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

marek-veber force-pushed the capz-k3s-e2e branch from 8395458 to 66db1f3 May 21, 2026 17:20

fix: download k3s to /tmp instead of /usr/local/bin

c78533b

CI pods run as non-root with read-only /usr/local/bin, causing curl write error (exit 23). Use /tmp and prepend to PATH. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed May 22, 2026

-curl -sLo /tmp/k3s "${K3S_URL}"
-chmod +x /tmp/k3s
-export PATH="/tmp:${PATH}"
+K3S_BIN_DIR="$(mktemp -d)"
+K3S_BIN="${K3S_BIN_DIR}/k3s"
+curl -fsSLo "${K3S_BIN}" "${K3S_URL}"
+chmod 0755 "${K3S_BIN}"
+export PATH="${K3S_BIN_DIR}:${PATH}"
