
CNTRLPLANE-3364: ci implementation for vault deployment #78947

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from sandeepknd:vault-install
May 9, 2026

Conversation

@sandeepknd
Contributor

@sandeepknd sandeepknd commented May 6, 2026

CI implementation for Vault deployment.
Execution log from a local test run:

$ sh etcd-encryption-vault-install-commands.sh
=========================================
Vault Enterprise Installation via Helm
=========================================
Version: 2.0.0-ent
Namespace: vault-kms

Helm already installed: v3.19.0+g3d8990f

Creating namespace vault-kms...
namespace/vault-kms created
Adding restricted SCC for Vault service account...
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:restricted added: "vault"
Creating Vault license secret from mounted credential...
secret/vault-license created
Adding HashiCorp Helm repository...
"hashicorp" already exists with the same configuration, skipping
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "hashicorp" chart repository
Update Complete. ⎈Happy Helming!⎈

Installing Vault Enterprise v2.0.0-ent in dev mode...
Release "vault" does not exist. Installing it now.
NAME: vault
LAST DEPLOYED: Thu May  7 19:19:57 2026
NAMESPACE: vault-kms
STATUS: deployed
REVISION: 1
NOTES:
Thank you for installing HashiCorp Vault!

Now that you have deployed Vault, you should look over the docs on using
Vault with Kubernetes available here:

https://developer.hashicorp.com/vault/docs


Your release is named vault. To learn more about the release, try:

  $ helm status vault
  $ helm get manifest vault
Waiting for Vault pod to be ready...
pod/vault-0 condition met
=========================================
Vault Enterprise Installation Complete
=========================================

Summary:
  - Namespace: vault-kms
  - Version: 2.0.0-ent
  - Service: vault.vault-kms.svc:8200
  - Pod: vault-0 (Ready)

Next step: Run etcd-encryption-vault-configure to configure Vault for KMS


$ sh ../vault-configure/etcd-encryption-vault-configure-commands.sh
=========================================
Vault Configuration for KMS
=========================================
Namespace: vault-kms

Configuring Vault for KMS...

Enabling transit secret engine...
Success! Enabled the transit secrets engine at: transit/
Creating transit encryption key...
Key                       Value
---                       -----
allow_plaintext_backup    false
auto_rotate_period        0s
deletion_allowed          false
derived                   false
exportable                false
imported_key              false
keys                      map[1:1778161850]
latest_version            1
min_available_version     0
min_decryption_version    1
min_encryption_version    0
name                      kms-key
supports_decryption       true
supports_derivation       true
supports_encryption       true
supports_signing          false
type                      aes256-gcm96
Enabling AppRole authentication...
Success! Enabled approle auth method at: approle/
Creating KMS policy...
Success! Uploaded policy: kms-policy
Creating AppRole role...
Success! Data written to: auth/approle/role/kms-plugin
Retrieving AppRole credentials...
Creating vault-credentials secret...
secret/vault-credentials created
Vault credentials saved to vault-credentials secret

=========================================
Vault Configuration Complete
=========================================

Summary:
  - Vault Service: vault.vault-kms.svc:8200
  - Credentials Secret: vault-credentials (namespace: vault-kms)
  - Transit Key: kms-key
  - ROLE_ID: f63cf679-e665-9826-d708-7a8517b3a3a5
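
The log above reports that a policy named kms-policy was uploaded but does not show its contents. As a rough sketch only, a policy for a transit-backed KMS key could be rendered like this. The helper name `render_kms_policy` is hypothetical, and the assumption that the policy grants only encrypt and decrypt on the `transit/` mount is mine, not something confirmed by this PR.

```shell
# Hypothetical helper: print an HCL policy that allows encrypt and decrypt
# on a single transit key. The real kms-policy from the PR is not shown in
# the log, so treat these paths and capabilities as an assumption.
render_kms_policy() {
  key_name=$1
  cat <<EOF
path "transit/encrypt/${key_name}" {
  capabilities = ["update"]
}

path "transit/decrypt/${key_name}" {
  capabilities = ["update"]
}
EOF
}
```

The output could be piped to `vault policy write kms-policy -`, which reads the policy from standard input.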

This PR adds CI automation to deploy HashiCorp Vault Enterprise for KMS-backed etcd-encryption testing used by the cluster-kube-apiserver-operator jobs in the openshift/release repository.

What changed (practical impact)

  • Adds a reusable CI step ref etcd-encryption-vault-install that installs and configures Vault Enterprise via Helm (ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml). The step exposes env vars to control Vault version, chart version, namespace, image repo, and the KMS transit key name, and documents prerequisites and outputs (service endpoint + secret fields).
  • Implements the step with etcd-encryption-vault-install-commands.sh which:
    • Ensures Helm is available, adds the HashiCorp Helm repo, creates the Vault namespace, and applies required SCC for the vault service account.
    • Optionally creates a vault-license secret from a mounted license file (/var/run/vault-license/license) when present.
    • Generates Helm values and installs the HashiCorp Vault Enterprise chart, waits for pods, initializes/unseals Vault (if needed), enables the transit engine, creates a transit KMS key, configures AppRole auth and policy, and writes role-id, secret-id, root-token and unseal-key into a vault-credentials secret for consuming tests.
  • Adds OWNERS for the new step (approvers/reviewers: jianlinliu, wangke19) and matching metadata JSON to expose the step ref.
  • Updates the cluster-kube-apiserver-operator CI config (ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml) to add the etcd-encryption-vault-install test ref before the operator e2e test in the e2e-gcp-operator-encryption-kms job so Vault is deployed/configured prior to e2e execution. The job file includes a commented placeholder for mounting a vault-license credential (mount path /var/run/vault-license) — the license mount is present as a comment/placeholder rather than an active credentials entry.

Why this matters

  • Enables in-CI provisioning of Vault Enterprise so the kube-apiserver operator’s KMS encryption e2e tests can run against a Vault-backed KMS without manual setup.
  • Centralizes Vault setup in a reusable step and persists credentials for consuming tests.
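
For illustration, a consuming test step might read one field of the persisted secret roughly as follows. This is a minimal sketch that assumes `oc` is already logged in to the cluster and that the field names match the ones listed above (role-id, secret-id, root-token, unseal-key); the helper name `read_vault_credential` is hypothetical.

```shell
# Hypothetical helper: print one decoded field of the vault-credentials secret.
# $1 is the field name, $2 is the namespace (defaults to vault-kms, the
# namespace shown in the execution log).
read_vault_credential() {
  field=$1
  namespace=${2:-vault-kms}
  # Secret data is base64-encoded, so decode it after extraction.
  oc get secret vault-credentials -n "$namespace" \
    -o "jsonpath={.data.${field}}" | base64 -d
}
```

A test could then call, for example, `ROLE_ID=$(read_vault_credential role-id)`.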

Files added/modified

  • Added: ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml
  • Added: ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh
  • Added: ci-operator/step-registry/etcd-encryption/vault-install/OWNERS
  • Added: ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.metadata.json
  • Modified: ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml (inserted step ref; license credential mount present as a commented placeholder; reordered test refs so etcd-encryption-vault-install is listed before existing openshift-e2e-test ref)

Notes & follow-ups

  • The step expects a Vault Enterprise license to be made available; the script will create a vault-license secret from /var/run/vault-license/license if mounted. The job currently contains only a commented credential placeholder — a real test-credentials entry must be added to provide the license for fully automated runs.
  • openshift-ci-robot validated the JIRA reference CNTRLPLANE-3364 but warned the referenced Jira is missing a target version for the target branch (expected "5.0.0"). Update the Jira target version to satisfy repo checks.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 6, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented May 6, 2026

@sandeepknd: This pull request references CNTRLPLANE-3364 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

ci implementation for vault deployment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Contributor

coderabbitai Bot commented May 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a new ci-operator step and script to install/configure Vault Enterprise via Helm for etcd-encryption KMS e2e, adds OWNERS/metadata for the step, and wires the step into the e2e-gcp-operator-encryption-kms-ote job (adds a commented vault-license mount and places etcd-encryption-vault-install before the e2e test).

Changes

Vault Installation for KMS E2E Testing

| Layer / File(s) | Summary |
| --- | --- |
| Step Definition: ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml | Adds a step reference that runs a Helm-based Vault Enterprise install script, with resource requests, env defaults (VAULT_VERSION, VAULT_CHART_VERSION, VAULT_NAMESPACE, VAULT_IMAGE_REPOSITORY, VAULT_KMS_KEY_NAME), documentation, prerequisites, and outputs (produces vault-credentials secret and service endpoint). |
| Installation Script: ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh | New executable script: strict shell options, exports KUBECONFIG, idempotency checks for namespace/Helm release, bootstraps Helm if missing, creates namespace & restricted SCC, optionally creates vault-license secret from mounted file, generates Helm values, installs hashicorp/vault chart, waits for pods Ready, runs vault status, initializes/unseals if needed, enables transit, creates transit key/AppRole/policy, retrieves role_id/secret_id/root-token/unseal-key and saves them into a vault-credentials secret. |
| Owners / Metadata: ci-operator/step-registry/etcd-encryption/vault-install/OWNERS, .../etcd-encryption-vault-install-ref.metadata.json | Adds approvers and reviewers entries (jianlinliu, wangke19) and metadata mapping the ref YAML to owners. |
| Job Integration: ci-operator/config/openshift/cluster-kube-apiserver-operator/.../openshift-cluster-kube-apiserver-operator-main.yaml | In the e2e-gcp-operator-encryption-kms-ote job, inserts a commented credentials mount placeholder for a vault-license secret and updates test references to list etcd-encryption-vault-install before openshift-e2e-test. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CI as CI Job
  participant Script as Vault Install Script
  participant Helm as Helm
  participant K8s as Kubernetes API
  participant VaultPod as Vault Pod
  participant E2E as e2e Test Job

  CI->>Script: run etcd-encryption-vault-install
  Script->>Helm: add repo and install Vault chart
  Helm->>K8s: create namespace and resources
  K8s->>VaultPod: schedule Vault pods
  Script->>VaultPod: wait ready and run status
  Script->>VaultPod: init/unseal, enable transit, create key and AppRole
  Script->>K8s: create vault-credentials secret
  CI->>E2E: start openshift-e2e-test after vault-install
  E2E->>VaultPod: use credentials to run KMS tests

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Topology-Aware Scheduling Compatibility | ❓ Inconclusive | No result was produced after verification. Marking as INCONCLUSIVE. | Re-run the check or adjust instructions to produce a final result. |

✅ Passed checks (11 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | This PR contains no Ginkgo test files. Changes include YAML configuration, bash scripts, and metadata for CI/CD infrastructure, not Go test code with test declarations. |
| Test Structure And Quality | ✅ Passed | Check not applicable: PR contains no Ginkgo test code. Files changed are shell scripts, YAML configs, and metadata files only. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. All changes are CI infrastructure (YAML configs, Bash scripts, and metadata). Check only applies when tests are added. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR does not add any Ginkgo e2e tests. Changes are CI infrastructure setup (Vault installation script and step configuration) only. The custom check for SNO test compatibility is not applicable. |
| Ote Binary Stdout Contract | ✅ Passed | Not applicable. PR contains only CI configuration YAML, bash setup scripts, and metadata; no OTE binaries or Go test code. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR adds Bash setup script and CI infrastructure for Vault installation, not Ginkgo e2e tests. Custom check only applies to new Ginkgo test code with IPv4/connectivity issues. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing CI configuration for Vault deployment across multiple new and modified files. |

@openshift-ci openshift-ci Bot requested review from ingvagabund and xueqzhan May 6, 2026 19:51
@openshift-merge-bot
Contributor

@sandeepknd, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: file /var/tmp/gitrepo2532661634/ci-operator/step-registry/etcd-encryption/vault-install/vault-install-commands.sh has incorrect prefix. Prefix should be etcd-encryption-vault-install
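
The error above suggests that step files must start with a prefix derived from their directory under ci-operator/step-registry. A local check before pushing could look roughly like the sketch below. The rule "replace `/` with `-`" is inferred from this one error message rather than taken from the registry loader, and both helper names are hypothetical.

```shell
# Hypothetical helper: derive the expected file-name prefix from a step
# directory path relative to ci-operator/step-registry.
# Assumption: the prefix is the relative path with "/" replaced by "-".
expected_prefix() {
  printf '%s\n' "$1" | tr '/' '-'
}

# Return 0 when the file name starts with "<prefix>-", 1 otherwise.
# Intended for step files such as *-commands.sh and *-ref.yaml; files like
# OWNERS are outside the scope of this sketch.
has_expected_prefix() {
  step_dir=$1
  file_name=$2
  prefix=$(expected_prefix "$step_dir")
  case "$file_name" in
    "$prefix"-*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With these helpers, `has_expected_prefix etcd-encryption/vault-install vault-install-commands.sh` fails, which matches the file name reported in the error.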
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

readOnly: true

injector:
  enabled: false
Contributor Author

By default, it is enabled.
When enabled, the Helm chart also deploys:

  • vault-agent-injector Deployment (separate pod)
  • MutatingWebhookConfiguration (webhook server)
  • Service for the webhook
  • Additional RBAC resources

We don't need any of these.

Contributor

Yeah, we can disable that.
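
As a sketch of how the values file could carry this setting, the helper below writes a minimal values file for the hashicorp/vault chart with the injector turned off. The helper name `write_vault_values` is hypothetical, the image repository and tag are supplied by the caller, and the real script in this PR may generate a larger file.

```shell
# Hypothetical helper: write a minimal Helm values file for the
# hashicorp/vault chart. $1 is the output file, $2 the image repository,
# $3 the image tag. Only the keys discussed in this thread are included.
write_vault_values() {
  values_file=$1
  image_repo=$2
  image_tag=$3
  cat > "$values_file" <<EOF
injector:
  enabled: false   # skip the agent injector Deployment, webhook, Service and RBAC
server:
  image:
    repository: "${image_repo}"
    tag: "${image_tag}"
EOF
}
```

The resulting file could then be passed to Helm with `-f`, for example `helm upgrade --install vault hashicorp/vault -n vault-kms -f /tmp/values.yaml`.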

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/etcd-encryption/vault-install/vault-install-commands.sh`:
- Around line 15-31: The preflight uses helm (helm list and helm -n checks)
before ensuring Helm is present, so guard or install Helm first: ensure command
-v helm is checked and, if missing, perform the Helm bootstrap (the block that
sets HELM_VERSION and downloads helm) before any use of helm, or wrap the
installed-release check (the helm list / INSTALLED_VERSION extraction lines)
with a command -v helm guard; update the logic around VAULT_NAMESPACE/helm list
and the INSTALLED_VERSION extraction to only run after Helm is confirmed
available.
- Around line 176-253: The current if uses a piped `oc exec ... vault status |
grep -q 'Initialized.*false'` which treats any `oc exec` failure as "already
initialized"; change it to first run the status command into a variable or
capture its exit code (e.g., STATUS_OUT=$(oc exec vault-0 -n
"${VAULT_NAMESPACE}" -- vault status 2>&1) || { echo "vault status failed:
$STATUS_OUT"; exit 1; }) and then test the content with grep (grep -q
'Initialized.*false' <<<"$STATUS_OUT") so you fail fast on `oc exec` failures
and only branch on the parsed status. Use the same VAULT_NAMESPACE/vault-0
references from the existing check.
- Around line 31-37: The script currently downloads Helm with curl (the curl
command that writes /tmp/helm.tar.gz using HELM_VERSION) and extracts it without
integrity checks; update the flow to also download the corresponding SHA256
checksum (from get.helm.sh for the same HELM_VERSION) and verify the archive
before extraction — e.g., fetch the .sha256 (or checksums file), validate
/tmp/helm.tar.gz with sha256sum -c (or compare computed sha256sum) and exit
non‑zero on mismatch, then proceed to tar -xzf, mv, chmod, etc.; ensure the
verification step occurs between the curl download and the tar extraction so a
failed checksum prevents execution of the downloaded Helm binary.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/vault-install-ref.yaml`:
- Line 40: Update the documentation entry that currently states "Creates
vault-license secret from VAULT_LICENSE env var (if provided)" to reflect the
actual input mechanism: the step reads the license from a mounted file at
/var/run/vault-license/license rather than from the VAULT_LICENSE environment
variable; reference the actual file path (/var/run/vault-license/license) and
the secret name vault-license in the docs so users mount the secret correctly
for the vault-install step (vault-install-ref.yaml).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: f2e8d297-f13d-42be-a402-dfe9cba705d8

📥 Commits

Reviewing files that changed from the base of the PR and between 47cf68f and 2044302.

📒 Files selected for processing (3)
  • ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml
  • ci-operator/step-registry/etcd-encryption/vault-install/vault-install-commands.sh
  • ci-operator/step-registry/etcd-encryption/vault-install/vault-install-ref.yaml

Comment on lines +31 to +37
curl -fsSL "https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz" -o /tmp/helm.tar.gz
tar -xzf /tmp/helm.tar.gz -C /tmp
mkdir -p /tmp/bin
mv /tmp/linux-amd64/helm /tmp/bin/helm
chmod +x /tmp/bin/helm
export PATH="/tmp/bin:$PATH"
rm -rf /tmp/helm.tar.gz /tmp/linux-amd64
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

cat -n "ci-operator/step-registry/etcd-encryption/vault-install/vault-install-commands.sh" | sed -n '25,45p'

Repository: openshift/release

Length of output: 769


🌐 Web query:

Helm download integrity verification SHA256 checksum get.helm.sh

💡 Result:

To verify the integrity of Helm binaries downloaded from get.helm.sh using SHA256 checksums, follow these steps based on official documentation and practices [1][2].

1. Download the binary from get.helm.sh, e.g., for Linux amd64 latest version (check https://github.com/helm/helm/releases for current version, e.g., v3.20.2 as of April 2026 [3][4]):
   curl -LO https://get.helm.sh/helm-v3.20.2-linux-amd64.tar.gz [5][6]
2. Download the corresponding SHA256 checksum file:
   curl -LO https://get.helm.sh/helm-v3.20.2-linux-amd64.tar.gz.sha256 [5][6]
3. Verify the checksum (the .sha256 file contains the raw hex digest):
   sha256sum -c helm-v3.20.2-linux-amd64.tar.gz.sha256sum  # or use the .sha256sum file if preferred [7]
   Or manually:
   computed=$(sha256sum helm-v3.20.2-linux-amd64.tar.gz | cut -d' ' -f1); expected=$(cat helm-v3.20.2-linux-amd64.tar.gz.sha256); if [ "$computed" = "$expected" ]; then echo OK; else echo FAIL; fi [1][5]

The official install scripts (get-helm-3 or get-helm-4) automatically download from get.helm.sh and verify the SHA256 checksum before unpacking [5][6][2]. Example:
   curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
   chmod 700 get_helm.sh
   ./get_helm.sh

For full supply chain security, also verify GPG signatures (.asc files from GitHub releases) using Helm's KEYS file [8][9][10]. Note: Chart provenance verification uses separate 'helm verify' for .prov files, not binary downloads [11][12].

Citations:


🏁 Script executed:

curl -fsSL "https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3" | head -100

Repository: openshift/release

Length of output: 3563


🏁 Script executed:

curl -fsSL "https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3" 2>/dev/null | grep -A 20 "verifyChecksum"

Repository: openshift/release

Length of output: 1501


🏁 Script executed:

curl -fsSL "https://get.helm.sh/helm-v3.14.0-linux-amd64.tar.gz.sha256" 2>/dev/null

Repository: openshift/release

Length of output: 126


Downloaded Helm binary should be integrity-verified before execution.

The script downloads and executes Helm without checksum verification, which is a CI supply-chain risk. SHA256 checksums are available from get.helm.sh for verification.

Suggested hardening
 curl -fsSL "https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz" -o /tmp/helm.tar.gz
+curl -fsSL "https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz.sha256" -o /tmp/helm.tar.gz.sha256
+expected_sum=$(cat /tmp/helm.tar.gz.sha256)
+actual_sum=$(sha256sum /tmp/helm.tar.gz | awk '{print $1}')
+if [ "$actual_sum" != "$expected_sum" ]; then echo "Checksum verification failed"; exit 1; fi
 tar -xzf /tmp/helm.tar.gz -C /tmp
 ...
-rm -rf /tmp/helm.tar.gz /tmp/linux-amd64
+rm -rf /tmp/helm.tar.gz /tmp/helm.tar.gz.sha256 /tmp/linux-amd64

@sandeepknd sandeepknd changed the title CNTRLPLANE-3364: ci implementation for vault deployment [WIP]CNTRLPLANE-3364: ci implementation for vault deployment May 6, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 6, 2026
@openshift-merge-bot
Contributor

@sandeepknd, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: name of reference in file /var/tmp/gitrepo18086756/ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml should be etcd-encryption-vault-install

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (4)
ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh (3)

176-176: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat vault status failures as failures, not as “already initialized.”

If Line 176 cannot execute vault status, the condition is simply false and the script drops into Lines 250-252, skipping init and AppRole setup as though Vault were ready. Capture the status output first, fail on a non-zero oc exec, and branch only on the parsed result.

Also applies to: 250-252
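
One possible shape for that check is sketched below. It relies on two assumptions that should be verified against the actual environment: that `vault status` exits 0 when unsealed, 2 when sealed (which includes a fresh, uninitialized server) and 1 on error, and that `oc exec` passes the remote exit code through. The helper name `vault_needs_init` is hypothetical.

```shell
# Hypothetical helper: decide whether Vault still needs initialization.
# Prints "true" or "false"; returns non-zero if the status could not be read.
# Assumption: exit codes 0 (unsealed) and 2 (sealed) both mean that
# `vault status` ran and produced output worth parsing.
vault_needs_init() {
  namespace=$1
  status_rc=0
  status_out=$(oc exec vault-0 -n "$namespace" -- vault status 2>&1) || status_rc=$?
  case "$status_rc" in
    0|2) ;;
    *)
      echo "vault status failed (rc=${status_rc}): ${status_out}" >&2
      return 1
      ;;
  esac
  # Branch only on the parsed output, never on the exec failure itself.
  if printf '%s\n' "$status_out" | grep -q 'Initialized.*false'; then
    echo true
  else
    echo false
  fi
}
```

Treating exit code 2 as a readable status matters here, because a plain "fail on any non-zero exit" guard would abort on a sealed server before initialization could run.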

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
at line 176, The current check runs `oc exec ... vault status` directly in a
conditional so any `oc exec` failure is treated as "not needing init"; instead,
first run and capture the command output and exit status (e.g. capture into a
variable like vault_status by invoking `oc exec vault-0 -n "${VAULT_NAMESPACE}"
-- vault status 2>&1`) and if the `oc exec` returns non-zero, log the output and
exit non-zero; only when `oc exec` succeeds parse the captured vault_status
(using `grep -q 'Initialized.*false'` or equivalent) to decide whether to
initialize and configure AppRole. Apply the same pattern for the other
occurrence around lines 250-252.

15-20: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bootstrap Helm before the installed-release preflight.

Line 18 calls helm list before Lines 28-37 ensure that helm exists. If the namespace already exists on an image without Helm, this check falls through and the later helm install can collide with an existing release.
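
A small guard along these lines could run before the first Helm call. It is only a sketch: `ensure_tool` is a hypothetical name, and the bootstrap function passed to it stands in for whatever download logic the script already has.

```shell
# Hypothetical helper: make sure a CLI tool exists before it is first used.
# $1 is the tool name, $2 is the name of a function that installs it.
ensure_tool() {
  tool=$1
  bootstrap_fn=$2
  if command -v "$tool" >/dev/null 2>&1; then
    return 0
  fi
  # Tool is missing: run the bootstrap and fail fast if it does not succeed.
  "$bootstrap_fn" || return 1
  # Re-check so a bootstrap that silently did nothing is still caught.
  command -v "$tool" >/dev/null 2>&1
}
```

Calling something like `ensure_tool helm bootstrap_helm` at the top of the script, where `bootstrap_helm` is a placeholder for the existing download block, would keep the later `helm list` preflight from running without a Helm binary.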

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 15 - 20, The preflight calls helm (helm list) while the script only
bootstraps/installs Helm later, so move or duplicate the Helm availability check
so helm is verified before any helm commands run: ensure a check like command -v
helm (or reuse the existing Helm bootstrap logic currently at lines 28-37) is
executed prior to the block that runs helm list/sets INSTALLED_VERSION for
VAULT_NAMESPACE, and if helm is missing fail-fast or bootstrap it first so
subsequent helm list/helm install calls cannot collide on systems without the
helm binary.

31-37: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Verify the Helm tarball before extracting it.

Line 31 downloads a Helm archive and Line 32 extracts it immediately. Helm’s own install guidance says to verify the downloaded binary before unpacking it, so skipping that step leaves an avoidable supply-chain gap in CI. (helm.sh)
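
As a reusable form of that check, a helper like the following could sit between the download and the extraction. It is a minimal sketch that assumes GNU `sha256sum` is available in the step image; the name `verify_sha256` is hypothetical, and the expected digest would come from a trusted source such as the published checksum file.

```shell
# Hypothetical helper: succeed only if the SHA-256 of a file matches the
# expected hex digest. Assumes `sha256sum` (GNU coreutils) is on PATH.
verify_sha256() {
  archive=$1
  expected_sum=$2
  actual_sum=$(sha256sum "$archive" | awk '{print $1}')
  if [ "$actual_sum" != "$expected_sum" ]; then
    echo "checksum mismatch for ${archive}" >&2
    return 1
  fi
}
```

Used as `verify_sha256 /tmp/helm.tar.gz "$expected" || exit 1`, a mismatch stops the script before the archive is unpacked.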

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 31 - 37, The script currently downloads the Helm tarball (using
HELM_VERSION) and extracts it immediately; update it to verify the downloaded
archive before extraction by fetching or providing a trusted checksum/signature
and validating the tarball (for example download the official .sha256 or .asc
for helm-v${HELM_VERSION}-linux-amd64.tar.gz or require a HELM_CHECKSUM env
var), compare the computed sha256 (or verify GPG signature) against the trusted
value, and abort with a non-zero exit if verification fails; only run tar -xzf
/tmp/helm.tar.gz after successful verification, and keep the existing
cleanup/move/chmod steps unchanged.
ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml (1)

40-40: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the license-input docs to match the implementation.

Line 40 still says the step reads VAULT_LICENSE from the environment, but the script actually consumes a mounted file at /var/run/vault-license/license. That mismatch is likely to send users toward the wrong job configuration.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml`
at line 40, Update the step description in
etcd-encryption-vault-install-ref.yaml so it matches the implementation: replace
the text that says the step reads VAULT_LICENSE from the environment with a note
that it reads the license from the mounted file /var/run/vault-license/license
and creates the vault-license secret from that file; ensure the description
references the mounted path (/var/run/vault-license/license) and the resulting
secret name (vault-license) so users configure the job to mount the license file
rather than setting VAULT_LICENSE.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`:
- Around line 45-59: The script currently masks all failures from namespace and
secret creation by appending "|| true" to the oc create commands; update the
namespace creation (oc create namespace "${VAULT_NAMESPACE}") and Vault license
secret creation (oc create secret generic vault-license --from-file=... -n
"${VAULT_NAMESPACE}") to be idempotent instead of swallowing errors: either
detect and ignore only the "AlreadyExists" case by checking oc output/exit code
and failing on other errors, or replace each create with the safe pattern oc
create --dry-run=client -o yaml ... | oc apply -f - so that legitimate
API/permission errors propagate while repeated runs remain safe. Ensure
references to VAULT_NAMESPACE and the vault-license secret are preserved.

---

Duplicate comments:
In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`:
- Line 176: The current check runs `oc exec ... vault status` directly in a
conditional so any `oc exec` failure is treated as "not needing init"; instead,
first run and capture the command output and exit status (e.g. capture into a
variable like vault_status by invoking `oc exec vault-0 -n "${VAULT_NAMESPACE}"
-- vault status 2>&1`) and if the `oc exec` returns non-zero, log the output and
exit non-zero; only when `oc exec` succeeds parse the captured vault_status
(using `grep -q 'Initialized.*false'` or equivalent) to decide whether to
initialize and configure AppRole. Apply the same pattern for the other
occurrence around lines 250-252.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: f031c158-7e12-4072-b441-3dcab1f8e93b

📥 Commits

Reviewing files that changed from the base of the PR and between 2044302 and 439cf87.

📒 Files selected for processing (3)
  • ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (5)
ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh (4)

15-24: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Move Helm bootstrap before Helm-dependent preflight.

Line 18 invokes helm list before Line 28 verifies or installs Helm. If Helm is absent but the namespace exists, the preflight cannot reliably detect an existing release and can fall through into a conflicting install path.

Suggested minimal fix
-# Check if Vault is already installed
-if oc get namespace "${VAULT_NAMESPACE}" &>/dev/null; then
-  echo "INFO: ${VAULT_NAMESPACE} namespace already exists"
-  if helm list -n "${VAULT_NAMESPACE}" 2>/dev/null | grep -q vault; then
+## Ensure helm is available first
+if ! command -v helm &> /dev/null; then
+  echo "Installing Helm..."
+  HELM_VERSION="3.14.0"
+  curl -fsSL "https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz" -o /tmp/helm.tar.gz
+  tar -xzf /tmp/helm.tar.gz -C /tmp
+  mkdir -p /tmp/bin
+  mv /tmp/linux-amd64/helm /tmp/bin/helm
+  chmod +x /tmp/bin/helm
+  export PATH="/tmp/bin:$PATH"
+  rm -rf /tmp/helm.tar.gz /tmp/linux-amd64
+fi
+
+# Check if Vault is already installed
+if oc get namespace "${VAULT_NAMESPACE}" &>/dev/null; then
+  echo "INFO: ${VAULT_NAMESPACE} namespace already exists"
+  if helm list -n "${VAULT_NAMESPACE}" 2>/dev/null | grep -q '^vault[[:space:]]'; then
     echo "INFO: Vault is already installed"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 15 - 24, The helm-dependent preflight (the block using "helm list
-n \"${VAULT_NAMESPACE}\"" and reading INSTALLED_VERSION) runs before Helm is
verified/installed; move or invoke the Helm bootstrap/ensure step so Helm is
installed/available before any use of "helm list" to detect existing releases.
Concretely, ensure the Helm install/verification routine is executed prior to
the code that checks "if helm list -n \"${VAULT_NAMESPACE}\" ...", so that the
commands using helm and the variable INSTALLED_VERSION run only after Helm is
present.

31-37: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Verify downloaded Helm archive integrity before extraction.

Line 31 downloads and Line 32 extracts a remote binary without checksum validation, which is a CI supply-chain risk.

Suggested hardening
   HELM_VERSION="3.14.0"
   curl -fsSL "https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz" -o /tmp/helm.tar.gz
+  curl -fsSL "https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz.sha256" -o /tmp/helm.tar.gz.sha256
+  expected_sum="$(cat /tmp/helm.tar.gz.sha256)"
+  actual_sum="$(sha256sum /tmp/helm.tar.gz | awk '{print $1}')"
+  if [[ "${actual_sum}" != "${expected_sum}" ]]; then
+    echo "ERROR: Helm checksum verification failed"
+    exit 1
+  fi
   tar -xzf /tmp/helm.tar.gz -C /tmp
@@
-  rm -rf /tmp/helm.tar.gz /tmp/linux-amd64
+  rm -rf /tmp/helm.tar.gz /tmp/helm.tar.gz.sha256 /tmp/linux-amd64
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 31 - 37, The script downloads and extracts Helm (curl to
/tmp/helm.tar.gz and tar -xzf) without verifying integrity; update the block
that references HELM_VERSION and /tmp/helm.tar.gz to fetch and verify a
checksum/signature before extraction: download the corresponding checksum (and
ideally a .asc signature) from get.helm.sh, verify the tarball using sha256sum
(or gpg --verify if signature provided), abort with a clear error if
verification fails, and only then proceed to tar -xzf, mv /tmp/linux-amd64/helm,
chmod +x and export PATH.

176-253: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when vault status command fails.

Current branching treats status-command failures as “already initialized” (else path), which can skip required init/configuration silently.

Suggested control-flow fix
-if oc exec vault-0 -n "${VAULT_NAMESPACE}" -- vault status 2>&1 | grep -q 'Initialized.*false'; then
+STATUS_OUT="$(oc exec vault-0 -n "${VAULT_NAMESPACE}" -- vault status 2>&1)" || {
+  echo "ERROR: vault status failed: ${STATUS_OUT}"
+  exit 1
+}
+if grep -q 'Initialized.*false' <<<"${STATUS_OUT}"; then
   echo "Initializing Vault..."
@@
 else
   echo "Vault is already initialized"
   echo "Skipping initialization and configuration"
 fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 176 - 253, The current check uses "oc exec ... -- vault status |
grep -q 'Initialized.*false'" and treats any failure of the status command as
the else-branch (already initialized); change this to fail fast: run the status
command and capture its output/exit status first (the same oc exec ... -- vault
status 2>&1 invocation used above), if the oc exec call fails (non-zero exit)
print the status output/error and exit non-zero, otherwise inspect the captured
output for "Initialized.*false" to decide whether to initialize (then proceed to
run INIT_OUTPUT and the existing init/unseal/config steps that set INIT_OUTPUT,
UNSEAL_KEY, ROOT_TOKEN, etc.). Ensure references to the status check use the
captured variable (e.g., STATUS_OUTPUT) and that failure causes immediate exit
instead of falling through to the else branch.
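
One caveat to the capture-then-branch approach: Vault's documentation describes `vault status` as exiting 0 when unsealed, 1 on error, and 2 when sealed, and an uninitialized Vault reports as sealed. Treating every non-zero exit as fatal could therefore abort exactly the case that needs initialization. The sketch below separates the two concerns. `classify_vault_status` is a hypothetical helper, and whether `oc exec` passes the remote exit code through unchanged in this CI image is an assumption worth validating.

```shell
# classify_vault_status RC OUTPUT
# Prints one of: uninitialized, initialized, error.
classify_vault_status() {
  rc="$1"
  out="$2"
  # Assumption: 0 (unsealed) and 2 (sealed) both mean the status query itself
  # worked; any other exit code is treated as a failed query.
  case "$rc" in
    0|2) ;;
    *) echo "error"; return 0 ;;
  esac
  if printf '%s\n' "$out" | grep -q 'Initialized.*false'; then
    echo "uninitialized"
  elif printf '%s\n' "$out" | grep -q 'Initialized.*true'; then
    echo "initialized"
  else
    # The query succeeded but the output was not recognizable.
    echo "error"
  fi
}

# Hypothetical usage in the install script:
#   rc=0
#   out="$(oc exec vault-0 -n "${VAULT_NAMESPACE}" -- vault status 2>&1)" || rc=$?
#   case "$(classify_vault_status "$rc" "$out")" in
#     uninitialized) echo "Initializing Vault..." ;;
#     initialized)   echo "Vault is already initialized" ;;
#     *)             echo "ERROR: vault status failed: $out" >&2; exit 1 ;;
#   esac
```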

47-59: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t mask namespace/secret creation failures with || true.

Lines 47 and 57-59 currently suppress all API/auth errors, not just AlreadyExists. That can push the job into later failures with less actionable diagnostics.

Suggested idempotent pattern
-oc create namespace "${VAULT_NAMESPACE}" || true
+oc create namespace "${VAULT_NAMESPACE}" --dry-run=client -o yaml | oc apply -f -

@@
-  oc create secret generic vault-license \
+  oc create secret generic vault-license \
     --from-file=license=/var/run/vault-license/license \
-    -n "${VAULT_NAMESPACE}" || true
+    -n "${VAULT_NAMESPACE}" \
+    --dry-run=client -o yaml | oc apply -f -
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 47 - 59, Replace the unconditional "|| true" that masks failures on
resource creation with an idempotent check: for namespace creation, replace oc
create namespace "${VAULT_NAMESPACE}" || true with either oc get namespace
"${VAULT_NAMESPACE}" >/dev/null 2>&1 || oc create namespace "${VAULT_NAMESPACE}"
(and exit non‑zero on unexpected errors), and for the Vault license secret
replace oc create secret generic vault-license
--from-file=license=/var/run/vault-license/license -n "${VAULT_NAMESPACE}" ||
true with a safe pattern such as oc create secret ... || { if ! oc get secret
vault-license -n "${VAULT_NAMESPACE}" >/dev/null 2>&1; then echo "Failed
creating vault-license secret"; exit 1; fi } or generate a secret manifest and
use oc apply -f - so only AlreadyExists is tolerated; update the commands
referencing oc create namespace "${VAULT_NAMESPACE}" and oc create secret
generic vault-license --from-file=license=/var/run/vault-license/license -n
"${VAULT_NAMESPACE}" accordingly.
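
As an alternative to the dry-run/apply pattern, the "tolerate only AlreadyExists" option could look roughly like this. It is a sketch under the assumption that the API server's error text contains the token `AlreadyExists` (as in `Error from server (AlreadyExists): ...`); `create_if_absent` is a hypothetical helper name.

```shell
# create_if_absent CMD [ARGS...]
# Runs the given create command. Succeeds if the command succeeds or if the
# error output reports AlreadyExists; any other failure is printed and
# propagated with the original exit code.
create_if_absent() {
  rc=0
  # Capture stderr only; stdout of the create command is discarded.
  err="$("$@" 2>&1 >/dev/null)" || rc=$?
  if [ "$rc" -eq 0 ]; then
    return 0
  fi
  case "$err" in
    *AlreadyExists*) return 0 ;;
  esac
  echo "ERROR: '$*' failed: $err" >&2
  return "$rc"
}

# Hypothetical usage:
#   create_if_absent oc create namespace "${VAULT_NAMESPACE}" || exit 1
```

Note the trade-off: this keeps an existing vault-license secret untouched on reruns, whereas `oc apply` would update it from the mounted file.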
ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml (1)

40-40: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align license-input docs with actual implementation.

Line 40 says VAULT_LICENSE env var, but the script consumes /var/run/vault-license/license. This mismatch can lead to incorrect step setup.

Suggested doc correction
-    - Creates vault-license secret from VAULT_LICENSE env var (if provided)
+    - Creates vault-license secret from mounted credential at /var/run/vault-license/license (if present)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml`
at line 40, The docs line "Creates vault-license secret from VAULT_LICENSE env
var (if provided)" is inconsistent with the implementation which reads the
license from the file path /var/run/vault-license/license; update the
documentation text in the license-input section to state that the step creates
the vault-license secret from the file at /var/run/vault-license/license (or
alternatively change the implementation to read the VAULT_LICENSE env var if you
prefer env-based input), and ensure the doc references the exact path
(/var/run/vault-license/license) and the secret name "vault-license" so the
description matches the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml`:
- Around line 475-479: Uncomment and enable the Vault license credential mount
so the job actually uses the licensed enterprise setup: remove the leading '#'
from the credentials block (the entries with mount_path: /var/run/vault-license,
name: vault-license, namespace: test-credentials) so the secret is mounted into
the pod, and/or add a conditional check around the Vault install step to only
run when that secret exists in the test-credentials namespace; ensure the job
verifies presence of the vault-license secret before proceeding to the Vault
install step to avoid running install without the license.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`:
- Around line 67-68: Remove the blanket suppression ("|| true") on the helm repo
add call and make the repo-add idempotent: first check for the hashicorp repo in
helm repo list (look up the "helm repo add hashicorp" invocation in the script),
if absent run "helm repo add hashicorp https://helm.releases.hashicorp.com" and
fail the step on error, otherwise skip adding; then run "helm repo update" only
after ensuring the add succeeded or the repo already existed. This avoids
swallowing real errors while keeping the operation idempotent.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: a9a40cfc-4335-4262-8d07-905129f3a4aa

📥 Commits

Reviewing files that changed from the base of the PR and between 439cf87 and 55bea22.

📒 Files selected for processing (3)
  • ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml

@openshift-ci openshift-ci Bot added the do-not-merge/invalid-owners-file label May 6, 2026

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (7)
ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh (5)

45-59: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t mask oc create failures with || true.

These paths currently treat every namespace/license-secret creation error as success, which pushes real cluster or permission failures into later, harder-to-debug breakage. Make them idempotent instead of unconditional success.

Suggested fix
 echo "Creating namespace ${VAULT_NAMESPACE}..."
-oc create namespace "${VAULT_NAMESPACE}" || true
+oc create namespace "${VAULT_NAMESPACE}" --dry-run=client -o yaml | oc apply -f -
@@
 if [ -f "/var/run/vault-license/license" ]; then
   echo "Creating Vault license secret from mounted credential..."
   oc create secret generic vault-license \
     --from-file=license=/var/run/vault-license/license \
-    -n "${VAULT_NAMESPACE}" || true
+    -n "${VAULT_NAMESPACE}" \
+    --dry-run=client -o yaml | oc apply -f -
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 45 - 59, The script currently masks failures by appending "|| true"
to the oc create commands; remove those silencers and make operations
idempotent: replace the unconditional oc create namespace "${VAULT_NAMESPACE}"
and oc create secret generic vault-license ... -n "${VAULT_NAMESPACE}" (as
invoked when /var/run/vault-license/license exists) with checks or use
idempotent commands (e.g., test for namespace existence via oc get namespace
"${VAULT_NAMESPACE}" or use oc apply for declarative resources, and for the
secret either use oc get secret vault-license -n "${VAULT_NAMESPACE}" and
update/create accordingly or oc apply with a generated manifest) so that real
errors surface while still allowing repeated runs to succeed; ensure commands
referencing VAULT_NAMESPACE, the namespace creation block, the license file
check, and the "oc create secret generic vault-license" logic are updated
accordingly.

176-253: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when vault status fails.

This condition still treats oc exec ... vault status failures as “already initialized”, so transient pod/exec errors can skip init and KMS setup entirely. Capture the command output first, fail on non-zero exit, and only then branch on the parsed state.

Suggested fix
-if oc exec vault-0 -n "${VAULT_NAMESPACE}" -- vault status 2>&1 | grep -q 'Initialized.*false'; then
+STATUS_OUT="$(oc exec vault-0 -n "${VAULT_NAMESPACE}" -- vault status 2>&1)" || {
+  echo "ERROR: vault status failed"
+  echo "${STATUS_OUT}"
+  exit 1
+}
+
+if grep -q 'Initialized.*false' <<<"${STATUS_OUT}"; then
   echo "Initializing Vault..."
   ...
-else
+elif grep -q 'Initialized.*true' <<<"${STATUS_OUT}"; then
   echo "Vault is already initialized"
   echo "Skipping initialization and configuration"
+else
+  echo "ERROR: unable to determine Vault initialization state"
+  echo "${STATUS_OUT}"
+  exit 1
 fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 176 - 253, The current check runs oc exec ... vault status and
treats any failure as “already initialized”; change it to capture the command
output and exit code, fail fast on non-zero, then parse the output to decide
initialization: run VAULT_STATUS_OUTPUT="$(oc exec vault-0 -n
"${VAULT_NAMESPACE}" -- vault status 2>&1)" and check its exit status (if
non-zero, echo the output and exit 1), then test VAULT_STATUS_OUTPUT with grep
-q 'Initialized.*false' to proceed with the existing INIT_OUTPUT / UNSEAL_KEY /
ROOT_TOKEN initialization and subsequent KMS setup (roles ROLE_ID, SECRET_ID,
vault-credentials creation) only when the parsed state indicates uninitialized.

30-37: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Verify the downloaded Helm archive before extracting it.

The bootstrap path executes a downloaded tarball without an integrity check. Fail closed on checksum mismatch before tar -xzf.

Suggested hardening
   HELM_VERSION="3.14.0"
   curl -fsSL "https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz" -o /tmp/helm.tar.gz
+  curl -fsSL "https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz.sha256" -o /tmp/helm.tar.gz.sha256
+  expected_sum="$(tr -d '\n' </tmp/helm.tar.gz.sha256)"
+  actual_sum="$(sha256sum /tmp/helm.tar.gz | awk '{print $1}')"
+  [[ "${actual_sum}" == "${expected_sum}" ]] || { echo "ERROR: Helm checksum verification failed"; exit 1; }
   tar -xzf /tmp/helm.tar.gz -C /tmp
   mkdir -p /tmp/bin
   mv /tmp/linux-amd64/helm /tmp/bin/helm
   chmod +x /tmp/bin/helm
   export PATH="/tmp/bin:$PATH"
-  rm -rf /tmp/helm.tar.gz /tmp/linux-amd64
+  rm -rf /tmp/helm.tar.gz /tmp/helm.tar.gz.sha256 /tmp/linux-amd64
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 30 - 37, The script downloads and extracts a Helm tarball without
verifying integrity; update the block around HELM_VERSION, the curl download to
/tmp/helm.tar.gz, and the tar extraction to first download the published
checksum for helm-v${HELM_VERSION}-linux-amd64.tar.gz, compute the local file's
SHA256 (or use sha256sum -c) and compare it to the published checksum, and fail
(exit non‑zero) if the checksums do not match before running tar -xzf
/tmp/helm.tar.gz and moving the helm binary; ensure you reference HELM_VERSION
and the /tmp/helm.tar.gz paths so the verification step runs right after
download and before extraction.

15-20: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bootstrap Helm before the installed-release preflight.

helm list is used here before Helm is guaranteed to exist, so an existing namespace can fall through to a later helm install failure instead of taking the clean early-exit path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 15 - 20, The script checks for an existing Vault namespace and
immediately runs "helm list" (and later uses helm) before ensuring Helm is
available; update the oc namespace check block around VAULT_NAMESPACE to first
verify Helm is installed/initialized (or bail with a clear message) before
calling "helm list -n \"${VAULT_NAMESPACE}\"", and ensure the subsequent
INSTALLED_VERSION assignment that uses "helm list -n \"${VAULT_NAMESPACE}\" -o
json | jq -r ..." only runs after that check; reference the helm check and the
VAULT_NAMESPACE/INSTALLED_VERSION usages to locate where to add the preflight
Helm bootstrap or a presence check and clear error exit.
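
If the fail-fast variant is preferred over bootstrapping, a presence check along these lines would be enough. This is a minimal sketch; `require_cmd` is a hypothetical helper, not something the script defines today.

```shell
# require_cmd NAME
# Returns non-zero with a clear message when NAME is not on PATH.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "ERROR: required command '$1' not found in PATH" >&2
    return 1
  fi
}

# Hypothetical usage, placed before the first `helm list` call:
#   require_cmd helm || exit 1
```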

67-68: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make Helm repo setup idempotent without swallowing errors.

helm repo add ... || true hides real repo-add failures and makes helm repo update look healthy even when the repository was never added correctly.

Suggested fix
-helm repo add hashicorp https://helm.releases.hashicorp.com || true
+helm repo add hashicorp https://helm.releases.hashicorp.com --force-update
 helm repo update
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`
around lines 67 - 68, The helm repo add line currently swallows errors with "||
true", which can mask real failures; replace this by running helm repo add
hashicorp https://helm.releases.hashicorp.com and if it fails, detect whether
the failure is due to the repo already existing (accept that) otherwise
propagate the error and exit non-zero; alternatively check for the presence of
the repo first using helm repo list (or helm repo add then inspect stderr for
"already exists") and only suppress the specific "already exists" case before
running helm repo update so real add failures are not hidden.
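
The "check first, then add" option might be sketched as follows. `ensure_helm_repo` is a hypothetical helper; the sketch assumes `helm repo list` prints a header row followed by one repository per line, with the name in the first column. The execution log in the PR description also shows Helm reporting `"hashicorp" already exists with the same configuration, skipping` on a repeated add, so on that Helm version the `|| true` may simply be unnecessary.

```shell
# ensure_helm_repo NAME URL
# Adds the repository only when it is not already configured, then refreshes
# the index. Real failures from `helm repo add` are propagated.
ensure_helm_repo() {
  name="$1"
  url="$2"
  # `helm repo list` errors out when no repositories exist; treat that as an
  # empty list. NR > 1 skips the header row.
  repos="$(helm repo list 2>/dev/null | awk 'NR > 1 {print $1}')"
  if printf '%s\n' "$repos" | grep -qxF "$name"; then
    echo "Helm repo $name already configured"
  else
    helm repo add "$name" "$url" || return 1
  fi
  helm repo update
}

# Hypothetical usage:
#   ensure_helm_repo hashicorp https://helm.releases.hashicorp.com || exit 1
```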
ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml (1)

40-40: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the docs to match the actual license input.

The step implementation reads a mounted file at /var/run/vault-license/license, not a VAULT_LICENSE environment variable, so this line still documents the wrong wiring.

Suggested fix
-    - Creates vault-license secret from VAULT_LICENSE env var (if provided)
+    - Creates vault-license secret from mounted credential at /var/run/vault-license/license (if present)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml`
at line 40, Update the documentation string "Creates vault-license secret from
VAULT_LICENSE env var (if provided)" to reflect the actual input method used by
the step: read the license from the mounted file /var/run/vault-license/license
(not an environment variable). Locate the description in
etcd-encryption-vault-install-ref.yaml (the summary/description field containing
that exact sentence) and replace it with a sentence stating that the step
creates the vault-license secret from the mounted file at
/var/run/vault-license/license (and note "if provided" if optional).
ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml (1)

475-481: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The Vault step is enabled, but the license mount is still disabled.

With etcd-encryption-vault-install enabled and the vault-license credentials block still commented out, this job won’t actually exercise the intended licensed enterprise setup. Either wire the secret now or gate the step until that credential exists.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml`
around lines 475 - 481, The job enables the etcd-encryption-vault-install test
but the vault-license secret mount is still commented out so the step won't run
with a license; either uncomment and populate the credentials block (name:
vault-license, namespace: test-credentials, mount_path: /var/run/vault-license)
so the job has the credential, or gate the test entry by only adding the ref:
etcd-encryption-vault-install when the vault-license secret exists (e.g., add a
conditional/feature flag or move the test ref behind the same TODO guard) so the
job and the vault-license credential stay in sync.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`:
- Around line 45-59: The script currently masks failures by appending "|| true"
to the oc create commands; remove those silencers and make operations
idempotent: replace the unconditional oc create namespace "${VAULT_NAMESPACE}"
and oc create secret generic vault-license ... -n "${VAULT_NAMESPACE}" (as
invoked when /var/run/vault-license/license exists) with checks or use
idempotent commands (e.g., test for namespace existence via oc get namespace
"${VAULT_NAMESPACE}" or use oc apply for declarative resources, and for the
secret either use oc get secret vault-license -n "${VAULT_NAMESPACE}" and
update/create accordingly or oc apply with a generated manifest) so that real
errors surface while still allowing repeated runs to succeed; ensure commands
referencing VAULT_NAMESPACE, the namespace creation block, the license file
check, and the "oc create secret generic vault-license" logic are updated
accordingly.
- Around line 176-253: The current check runs oc exec ... vault status and
treats any failure as “already initialized”; change it to capture the command
output and exit code, fail fast on non-zero, then parse the output to decide
initialization: run VAULT_STATUS_OUTPUT="$(oc exec vault-0 -n
"${VAULT_NAMESPACE}" -- vault status 2>&1)" and check its exit status (if
non-zero, echo the output and exit 1), then test VAULT_STATUS_OUTPUT with grep
-q 'Initialized.*false' to proceed with the existing INIT_OUTPUT / UNSEAL_KEY /
ROOT_TOKEN initialization and subsequent KMS setup (roles ROLE_ID, SECRET_ID,
vault-credentials creation) only when the parsed state indicates uninitialized.
- Around line 30-37: The script downloads and extracts a Helm tarball without
verifying integrity; update the block around HELM_VERSION, the curl download to
/tmp/helm.tar.gz, and the tar extraction to first download the published
checksum for helm-v${HELM_VERSION}-linux-amd64.tar.gz, compute the local file's
SHA256 (or use sha256sum -c) and compare it to the published checksum, and fail
(exit non‑zero) if the checksums do not match before running tar -xzf
/tmp/helm.tar.gz and moving the helm binary; ensure you reference HELM_VERSION
and the /tmp/helm.tar.gz paths so the verification step runs right after
download and before extraction.
- Around line 15-20: The script checks for an existing Vault namespace and
immediately runs "helm list" (and later uses helm) before ensuring Helm is
available; update the oc namespace check block around VAULT_NAMESPACE to first
verify Helm is installed/initialized (or bail with a clear message) before
calling "helm list -n \"${VAULT_NAMESPACE}\"", and ensure the subsequent
INSTALLED_VERSION assignment that uses "helm list -n \"${VAULT_NAMESPACE}\" -o
json | jq -r ..." only runs after that check; reference the helm check and the
VAULT_NAMESPACE/INSTALLED_VERSION usages to locate where to add the preflight
Helm bootstrap or a presence check and clear error exit.
- Around line 67-68: The helm repo add line currently swallows errors with "||
true", which can mask real failures; replace this by running helm repo add
hashicorp https://helm.releases.hashicorp.com and if it fails, detect whether
the failure is due to the repo already existing (accept that) otherwise
propagate the error and exit non-zero; alternatively check for the presence of
the repo first using helm repo list (or helm repo add then inspect stderr for
"already exists") and only suppress the specific "already exists" case before
running helm repo update so real add failures are not hidden.
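
The `vault status` check described for lines 176-253 could look roughly like the sketch below. It is only an illustration, not the PR's script: `vault_needs_init` is a hypothetical helper, and the status command is passed in as arguments so it can be stubbed. One assumption worth flagging: `vault status` is documented to exit 0 when unsealed, 2 when sealed, and 1 on error, and an uninitialized Vault reports as sealed, so the sketch treats only exit codes other than 0 and 2 as failures instead of failing on every non-zero exit.

```shell
# Hypothetical helper (not from the PR): decide whether Vault still needs
# `vault operator init`. The status command is passed as arguments; in the
# step it would presumably be:
#   vault_needs_init oc exec vault-0 -n "${VAULT_NAMESPACE}" -- vault status
# Returns 0 = not initialized, 1 = already initialized, 2 = status failed.
vault_needs_init() {
  local output
  local rc=0
  # Capture both the output and the exit code instead of discarding them.
  output="$("$@" 2>&1)" || rc=$?
  # Exit 0 (unsealed) and 2 (sealed) are valid states; anything else is an error.
  if [ "${rc}" -ne 0 ] && [ "${rc}" -ne 2 ]; then
    echo "vault status failed (exit ${rc}): ${output}" >&2
    return 2
  fi
  # Parse the reported state rather than inferring it from the exit code.
  if printf '%s\n' "${output}" | grep -Eq '^Initialized[[:space:]]+false'; then
    return 0
  fi
  return 1
}
```

A caller would run the init and KMS setup only when the helper returns 0, skip it on 1, and abort on 2.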

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 092f64e7-e4cd-4b09-ad22-79ed95bedea9

📥 Commits

Reviewing files that changed from the base of the PR and between 55bea22 and ab67ef8.

📒 Files selected for processing (4)
  • ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml
  • ci-operator/step-registry/etcd-encryption/vault-install/OWNERS
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh`:
- Around line 156-160: The oc wait invocation uses two --selector flags but only
the last is honored; change the call in the
etcd-encryption-vault-install-commands.sh script so the label filters are
combined into a single --selector argument (e.g., join
app.kubernetes.io/name=vault and component=server with a comma) while keeping
the same namespace and timeout flags so the wait correctly matches pods with
both labels.
- Around line 205-206: The unquoted VAULT_KMS_KEY_NAME in the oc exec command
can split into multiple words if it contains spaces, breaking vault write;
update the argument passed to vault write to quote the variable so it's passed
as a single path component (e.g., use transit/keys/"${VAULT_KMS_KEY_NAME}" or
transit/keys/${VAULT_KMS_KEY_NAME} with the whole argument wrapped in quotes)
when invoking oc exec and vault write (reference VAULT_KMS_KEY_NAME and the oc
exec ... vault write invocation).
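
The selector fix above could be sketched as follows. This is a minimal illustration, not the PR's code: `join_selectors` is a hypothetical helper, and `WAIT_TIMEOUT` in the commented call stands in for whatever timeout the script already uses.

```shell
# Hypothetical helper (not from the PR): join label filters into a single
# comma-separated selector, so that every label must match.
join_selectors() {
  local IFS=','
  # "$*" joins the arguments using the first character of IFS.
  printf '%s\n' "$*"
}

SELECTOR="$(join_selectors app.kubernetes.io/name=vault component=server)"
echo "${SELECTOR}"

# The wait call would then pass one --selector flag (not executed here):
#   oc wait pod -n "${VAULT_NAMESPACE}" --selector="${SELECTOR}" \
#     --for=condition=Ready --timeout="${WAIT_TIMEOUT}"
# The same care applies to the key path, which should be quoted as a whole:
#   vault write -f "transit/keys/${VAULT_KMS_KEY_NAME}"
```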
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 427ed5d9-9212-49a6-9124-7a7038a554d4

📥 Commits

Reviewing files that changed from the base of the PR and between ab67ef8 and c453e6c.

📒 Files selected for processing (5)
  • ci-operator/config/openshift/cluster-kube-apiserver-operator/openshift-cluster-kube-apiserver-operator-main.yaml
  • ci-operator/step-registry/etcd-encryption/vault-install/OWNERS
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-commands.sh
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.metadata.json
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml
✅ Files skipped from review due to trivial changes (3)
  • ci-operator/step-registry/etcd-encryption/vault-install/OWNERS
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.metadata.json
  • ci-operator/step-registry/etcd-encryption/vault-install/etcd-encryption-vault-install-ref.yaml

@openshift-ci openshift-ci Bot removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label May 7, 2026
@sandeepknd sandeepknd force-pushed the vault-install branch 4 times, most recently from 6c5c4a8 to 59692c1 Compare May 7, 2026 05:09
@openshift-ci openshift-ci Bot added the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label May 7, 2026
@sandeepknd
Contributor Author

/pj-rehearse pull-ci-openshift-cluster-kube-apiserver-operator-main-e2e-gcp-operator-encryption-kms-ote

1 similar comment
@sandeepknd
Contributor Author

/pj-rehearse pull-ci-openshift-cluster-kube-apiserver-operator-main-e2e-gcp-operator-encryption-kms-ote

@sandeepknd
Contributor Author

/pj-rehearse pull-ci-openshift-cluster-kube-apiserver-operator-main-e2e-gcp-operator-encryption-kms-ote

@openshift-merge-bot
Contributor

@sandeepknd: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

--set server.image.repository="${VAULT_IMAGE_REPOSITORY}" \
--set server.image.tag="${VAULT_VERSION}" \
--set injector.enabled=false \
--set 'server.extraEnvironmentVars.VAULT_LICENSE_PATH=/vault/license/license' \
Contributor Author

@sandeepknd sandeepknd May 8, 2026

The HashiCorp Vault Helm chart supports setting the VAULT_LICENSE env var directly, as shown below:

--set "server.extraEnvironmentVars.VAULT_LICENSE=02MV4UU43BK5HGY..." 

However, this is not recommended because it exposes the license key.

So the approach followed here is:
The license mounted by the CI operator at /var/run/vault/tests-private-account/kms-vault-license is wrapped into a secret in the vault-kms namespace, which in turn is mounted in the Vault pod and used during installation.
- The license is stored securely in a secret.
- It is not visible in the Helm values or in the pod's environment variable listing.
- The mount is read-only.
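
The secret-wrapping step described above could be sketched roughly as follows. This is a minimal sketch, not the PR's actual script: `create_license_secret` and its arguments are hypothetical, and the `--dry-run=client -o yaml | oc apply -f -` pipeline is just one common way to make the creation repeatable without masking errors.

```shell
# Hypothetical helper (not from the PR): wrap a mounted license file into a
# secret that the Vault pod can reference.
#   $1 = path to the mounted license file
#   $2 = target namespace (vault-kms in this PR)
#   $3 = secret name
create_license_secret() {
  local license_file="$1"
  local namespace="$2"
  local secret_name="$3"
  # Refuse to create an empty secret if the credential was not mounted.
  if [ ! -s "${license_file}" ]; then
    echo "license file ${license_file} is missing or empty" >&2
    return 1
  fi
  # Render the manifest client-side and apply it, so a rerun updates the
  # secret instead of failing because it already exists. The function's
  # status is that of `oc apply`; callers that also want a failing
  # `oc create` to propagate would enable `set -o pipefail`.
  oc create secret generic "${secret_name}" \
    --from-file=license="${license_file}" \
    -n "${namespace}" --dry-run=client -o yaml \
    | oc apply -n "${namespace}" -f -
}
```

The license value never appears on the command line or in Helm values here; only the file path and the secret name do.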

Contributor

why the extra env?
you just need to set the secret name and it will mount / inject it automatically:
https://github.com/hashicorp/vault-helm/blob/main/templates/server-statefulset.yaml#L147

Contributor Author

Excellent point. I will address it once the ongoing pj-rehearse test finishes. It will further reduce the extra volume mounts and, more importantly, it doesn't expose the raw license string because it references the secret.

Contributor

Thanks Claude, don't forget to wire all of Sandeep's reward points to Thomas.

@sandeepknd
Contributor Author

/pj-rehearse pull-ci-openshift-cluster-kube-apiserver-operator-main-e2e-gcp-operator-encryption-kms-ote

@openshift-merge-bot
Contributor

@sandeepknd: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@sandeepknd
Contributor Author

The e2e test passes and Vault has been installed successfully. The logs below are from the build cluster on which the Prow test ran.

$ oc get po -n vault-kms
NAME      READY   STATUS    RESTARTS   AGE
vault-0   1/1     Running   0          112m

/ $ vault -v
Vault v2.0.0+ent (f71c0251abe59d87152bb89e726a025f53a45ddc), built 2026-04-13T16:49:10Z

$ vault status
Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    1
Threshold       1
Version         2.0.0+ent
Build Date      2026-04-13T16:49:10Z
Storage Type    inmem
Cluster Name    vault-cluster-18dabffa
Cluster ID      c5e3f2a0-2409-fd78-6a9b-d016ec831662
HA Enabled      false

$ vault login (with root)
Token (will be hidden): 
Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.

Key                  Value
---                  -----
token                root
token_accessor       5dEIfcagz2orTnChNNDfEDlg
token_duration       ∞
token_renewable      false
token_policies       ["root"]
identity_policies    []
policies             ["root"]

$ vault list transit/keys
Keys
----
kms-key

Comment on lines +66 to +69
--set 'server.extraEnvironmentVars.VAULT_DISABLE_USER_LOCKOUT=true' \
--set "server.extraSecretEnvironmentVars[0].envName=VAULT_LICENSE" \
--set "server.extraSecretEnvironmentVars[0].secretName=${VAULT_LICENSE_SECRET_NAME}" \
--set "server.extraSecretEnvironmentVars[0].secretKey=license" \
Contributor

@tjungblu tjungblu May 8, 2026

Suggested change
- --set 'server.extraEnvironmentVars.VAULT_DISABLE_USER_LOCKOUT=true' \
- --set "server.extraSecretEnvironmentVars[0].envName=VAULT_LICENSE" \
- --set "server.extraSecretEnvironmentVars[0].secretName=${VAULT_LICENSE_SECRET_NAME}" \
- --set "server.extraSecretEnvironmentVars[0].secretKey=license" \
+ --set server.enterpriseLicense.secretName="${VAULT_LICENSE_SECRET_NAME}" \
+ --set server.enterpriseLicense.secretKey=license \

Contributor Author

I added this one, VAULT_DISABLE_USER_LOCKOUT=true, in response to your previous comment.

Contributor

ah sorry, that you can keep, obviously

Contributor

you could theoretically also do this in the config step:

$ vault auth tune -user-lockout-disable=true approle/

but let's leave it here for now
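
For reference, the license wiring this thread converges on could be collected in a small helper like the one below. This is only a sketch: `vault_license_helm_flags` is a hypothetical name, and the flags simply restate the values quoted in the suggestion above (`server.enterpriseLicense.secretName`, `server.enterpriseLicense.secretKey`) plus the `VAULT_DISABLE_USER_LOCKOUT` variable that was kept.

```shell
# Hypothetical helper (not from the PR): print the license-related Helm
# flags discussed in this thread, one per line, for a given secret name.
vault_license_helm_flags() {
  local secret_name="$1"
  printf '%s\n' \
    "--set=server.extraEnvironmentVars.VAULT_DISABLE_USER_LOCKOUT=true" \
    "--set=server.enterpriseLicense.secretName=${secret_name}" \
    "--set=server.enterpriseLicense.secretKey=license"
}

# Assumed invocation (chart and release names are assumptions, not executed
# here); the flags contain no spaces, so plain word splitting is safe:
#   helm upgrade --install vault hashicorp/vault -n "${VAULT_NAMESPACE}" \
#     $(vault_license_helm_flags "${VAULT_LICENSE_SECRET_NAME}")
```

Only the secret name and key are passed to Helm, so the license itself stays out of the release values.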

@openshift-merge-bot
Contributor

[REHEARSALNOTIFIER]
@sandeepknd: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-cluster-kube-apiserver-operator-main-e2e-gcp-operator-encryption-kms-ote | openshift/cluster-kube-apiserver-operator | presubmit | Ci-operator config changed |
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@sandeepknd
Contributor Author

/pj-rehearse pull-ci-openshift-cluster-kube-apiserver-operator-main-e2e-gcp-operator-encryption-kms-ote

@openshift-merge-bot
Contributor

@sandeepknd: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@tjungblu
Contributor

tjungblu commented May 8, 2026

/lgtm

@openshift-ci openshift-ci Bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 8, 2026
@ardaguclu
Member

This PR looks good to me too. The other requested changes can be added in followup PRs. Great work.
/lgtm

@openshift-ci openshift-ci Bot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 8, 2026
@sandeepknd
Contributor Author

/rehearsals-ack

@sandeepknd
Contributor Author

/pj-rehearse ack

@openshift-merge-bot
Contributor

@sandeepknd: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot openshift-merge-bot Bot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label May 8, 2026
@sandeepknd
Contributor Author

/approve

@openshift-ci
Contributor

openshift-ci Bot commented May 9, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ardaguclu, sandeepknd, tjungblu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 9, 2026
@openshift-ci
Contributor

openshift-ci Bot commented May 9, 2026

@sandeepknd: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit 46ec20d into openshift:main May 9, 2026
17 checks passed
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. rehearsals-ack Signifies that rehearsal jobs have been acknowledged

6 participants