Skip to content

CNTRLPLANE-3364: utility for vault deployer#2189

Closed
sandeepknd wants to merge 1 commit intoopenshift:masterfrom
sandeepknd:deploy_vault
Closed

CNTRLPLANE-3364: utility for vault deployer#2189
sandeepknd wants to merge 1 commit intoopenshift:masterfrom
sandeepknd:deploy_vault

Conversation

@sandeepknd
Copy link
Copy Markdown

@sandeepknd sandeepknd commented Apr 28, 2026

Introduces a Kubernetes test deployer for local HashiCorp Vault Enterprise deployment, init/unseal, transit AppRole KMS configuration, unit and optional integration tests.

$ INTEGRATION_TEST=true VAULT_LICENSE=$(cat vault.hclic) go test -v -run TestVaultDeployerIntegration
=== RUN   TestVaultDeployerIntegration
    vault_kube_kms_deployer_test.go:62: Deploying Vault...
    vault_kube_kms_deployer_test.go:63: Deploying Vault Enterprise in namespace "vault-kms-test-20260430-194817" (replicas: 1)
    vault_kube_kms_deployer_test.go:63: Applied vault_namespace.yaml (changed=true)
    vault_kube_kms_deployer_test.go:63: Applied vault_serviceaccount.yaml (changed=true)
    vault_kube_kms_deployer_test.go:63: Applied vault_scc_rolebinding.yaml (changed=true)
    vault_kube_kms_deployer_test.go:63: Applied vault_role.yaml (changed=true)
    vault_kube_kms_deployer_test.go:63: Applied vault_role_binding.yaml (changed=true)
    vault_kube_kms_deployer_test.go:63: Applied vault_configmap.yaml (changed=true)
    vault_kube_kms_deployer_test.go:63: Applied vault_service.yaml (changed=true)
    vault_kube_kms_deployer_test.go:63: Created vault-license secret from VAULT_LICENSE env var
    vault_kube_kms_deployer_test.go:63: Applied vault_deployment.yaml (changed=true)
    vault_kube_kms_deployer_test.go:63: Waiting for Vault deployment to be ready...
    vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
    vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
    vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
    vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
    vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
    vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
    vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
    vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=1 available=1
    vault_kube_kms_deployer_test.go:63: Reading vault credentials from secret vault-kms-test-20260430-194817/vault-credentials
    vault_kube_kms_deployer_test.go:63: Vault AppRole credentials loaded from secret
    vault_kube_kms_deployer_test.go:63: Vault Enterprise deployed. Service: http://vault.vault-kms-test-20260430-194817.svc:8200
    vault_kube_kms_deployer_test.go:72: Namespace vault-kms-test-20260430-194817 created
    vault_kube_kms_deployer_test.go:79: Deployment ready: 1/1 replicas
    vault_kube_kms_deployer_test.go:92: Vault credentials secret verified
    vault_kube_kms_deployer_test.go:94: Integration test passed!
--- PASS: TestVaultDeployerIntegration (21.53s)
PASS
ok  	github.com/openshift/library-go/test/library/encryption/kms	21.553s

$ INTEGRATION_TEST=true go test -v -run TestVaultCleanup
=== RUN   TestVaultCleanup
    vault_kube_kms_deployer_test.go:139: Found namespace to cleanup: vault-kms-test-20260430-194343
    vault_kube_kms_deployer_test.go:145: Deleting namespace vault-kms-test-20260430-194343...
    vault_kube_kms_deployer_test.go:155: Verifying namespace deletion...
    vault_kube_kms_deployer_test.go:166: Waiting for namespace deletion... (1/30)
    vault_kube_kms_deployer_test.go:166: Waiting for namespace deletion... (2/30)
    vault_kube_kms_deployer_test.go:166: Waiting for namespace deletion... (3/30)
    vault_kube_kms_deployer_test.go:159: Namespace successfully deleted
    vault_kube_kms_deployer_test.go:160: Cleanup test passed!
--- PASS: TestVaultCleanup (15.22s)
PASS
ok  	github.com/openshift/library-go/test/library/encryption/kms	15.264s

Summary by CodeRabbit

  • New Features
    • Added an automated Vault-on-Kubernetes deployer for integration testing: deploys Vault, handles license input, waits for readiness, initializes and unseals Vault, configures transit KMS and AppRole auth, stores credentials, and provides cleanup and getters.
  • Tests
    • Added unit and integration tests covering configuration/defaults, manifest templating, license scenarios, deployment modes, AppRole credential handling with masked logging, deployment/validation, and cleanup.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 28, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a Vault Enterprise Kubernetes test deployer and accompanying tests: deploys Vault into a namespace, reads/creates license secret, initializes and unseals Vault, configures transit KMS and AppRole auth, stores AppRole credentials, exposes getters and Cleanup, and includes unit and integration tests.

Changes

Cohort / File(s) Summary
Vault Kubernetes Deployer
test/library/encryption/kms/vault_kube_kms_deployer.go
New VaultDeployer and VaultConfig implementation: templated manifests, apply/wait deploy flow, license reading/secret creation, exec helpers (SPDY, heredoc/env injection, redaction), Vault init/unseal, transit engine and AppRole setup, persistence of secrets, getters and Cleanup.
Vault Deployer Tests
test/library/encryption/kms/vault_kube_kms_deployer_test.go
New unit and integration tests: config and default validation, manifest templating checks, license read cases, getter assertions, deployment-mode manifest differences (emptyDir vs PVC), plus gated integration tests for deploy, readiness, credential handling, and cleanup/ClusterRoleBinding removal.

Sequence Diagram

sequenceDiagram
    actor TestHarness as Test Harness
    participant K8sAPI as Kubernetes API
    participant VaultPod as Vault Pod
    participant VaultCLI as Vault CLI (in Pod)

    TestHarness->>K8sAPI: Apply shared manifests & deployment YAML
    K8sAPI->>VaultPod: Create Pod, Service, RBAC, PVC/emptyDir
    TestHarness->>K8sAPI: Poll Pod status until Running+Ready
    K8sAPI-->>TestHarness: Pod Running+Ready
    TestHarness->>VaultCLI: Exec `vault operator init -format=json`
    VaultCLI-->>TestHarness: Return root_token + unseal_keys
    TestHarness->>VaultCLI: Exec `vault operator unseal` (use threshold keys)
    VaultCLI-->>TestHarness: Vault unsealed
    TestHarness->>VaultCLI: Enable transit, create key, enable AppRole, write policy, create role
    VaultCLI-->>TestHarness: Return role_id & secret_id
    TestHarness->>K8sAPI: Create/update secrets (vault-root-token, vault-license, approle creds)
    TestHarness->>K8sAPI: Delete namespace (Cleanup)
    K8sAPI-->>TestHarness: Namespace and resources removed
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

🚥 Pre-merge checks | ✅ 9 | ❌ 3

❌ Failed checks (3 inconclusive)

Check name Status Explanation Resolution
Test Structure And Quality ❓ Inconclusive Test files mentioned in PR summary could not be located in the repository, preventing evaluation of test structure and quality. Ensure the PR branch is properly checked out and test files are accessible to evaluate Ginkgo compliance and assertion quality.
Topology-Aware Scheduling Compatibility ❓ Inconclusive PR adds Vault Kubernetes deployer utility but actual manifest templates and implementation files are not accessible for detailed topology-aware scheduling analysis. Provide access to vault_kube_kms_deployer.go and deployment manifest templates to verify proper handling of SNO, Two-Node, and HyperShift topologies.
Ipv6 And Disconnected Network Test Compatibility ❓ Inconclusive Test files vault_kube_kms_deployer.go and vault_kube_kms_deployer_test.go could not be located in the repository at the specified path. Verify the actual file paths of the new tests in the pull request and examine their contents for IPv4 assumptions and external connectivity requirements.
✅ Passed checks (9 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test function names in the Vault deployer test suite are static and deterministic with no dynamic information, timestamps, UUIDs, or generated identifiers.
Microshift Test Compatibility ✅ Passed Tests are unit tests using testing.T, not Ginkgo e2e tests. Code uses only standard Kubernetes APIs compatible with MicroShift.
Single Node Openshift (Sno) Test Compatibility ✅ Passed This PR adds standard Go unit and integration tests (testing.T), not Ginkgo e2e tests, so the SNO compatibility check for Ginkgo e2e tests is not applicable.
Ote Binary Stdout Contract ✅ Passed Files are in kms package with proper logging via testing.TB.Logf(), no process-level entry points, and no stdout contract violations.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title "CNTRLPLANE-3364: utility for vault deployer" clearly and concisely summarizes the main change—adding a new Vault deployer utility for Kubernetes tests.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 28, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (5)
test/library/encryption/kms/GUIDE.md (1)

65-65: Add language specifiers to fenced code blocks.

Multiple code blocks are missing language specifiers, which affects syntax highlighting and violates markdownlint rules. Add appropriate language identifiers (e.g., text, plaintext, or specific languages).

Affected lines: 65, 117, 161, 190, 220, 232, 242, 261, 290.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/GUIDE.md` at line 65, Several fenced code blocks
in GUIDE.md lack language specifiers (affecting syntax highlighting and
markdownlint). Edit the affected fenced code blocks (those around the examples
referenced) and add appropriate language identifiers such as text, plaintext, or
the specific language (e.g., bash, json) as appropriate; search for
triple-backtick fences in GUIDE.md (examples near the noted sections) and update
each opening ``` to include the correct specifier to satisfy markdownlint and
enable highlighting.
test/library/encryption/kms/vault_kube_kms_deployer.go (3)

330-372: Consider removing or documenting unused applyManifests function.

The applyManifests function appears unused since deployment now goes through Helm (installVaultWithHelm). If this is intentional dead code for future use, add a comment explaining its purpose. Otherwise, consider removing it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 330 -
372, The applyManifests method on VaultDeployer appears to be dead/unused now
that installVaultWithHelm handles deployment; either remove the entire
applyManifests implementation (and any references to vaultSharedManifestFiles,
vaultDeploymentManifestFile, wrapAssetWithTemplateData) to avoid stale code, or
add a concise doc comment above func (d *VaultDeployer) applyManifests
explaining why it is retained (e.g., "kept for tests/legacy/manual K8s manifest
application; not used by installVaultWithHelm") and mark it as intentional dead
code so reviewers understand its purpose.

152-153: Replace arbitrary sleep with proper readiness check.

The time.Sleep(5 * time.Second) is a code smell. Consider polling Vault's health endpoint until it's ready to accept commands.

♻️ Suggested improvement
-	// Wait a bit for Vault to be ready to accept commands
-	time.Sleep(5 * time.Second)
+	// Wait for Vault to be ready to accept commands
+	if err := d.waitForVaultReady(ctx); err != nil {
+		return fmt.Errorf("Vault not ready to accept commands: %w", err)
+	}

Add a new method that polls the Vault status endpoint.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 152 -
153, Replace the arbitrary time.Sleep(5 * time.Second) with a proper readiness
poll: implement a new helper method (e.g., waitForVaultReady or pollVaultHealth)
that repeatedly queries Vault's health endpoint (GET /v1/sys/health) with a
timeout and backoff, checks the response status/body for readiness (initialized
and not sealed / not standby or HTTP 200), and returns an error on timeout; then
call this helper from where time.Sleep is currently used (replace time.Sleep
with waitForVaultReady) so the deployer waits deterministically for Vault to
accept commands.

711-714: Remove unused escapeForShell function.

The escapeForShell function is defined but never used in the codebase.

-// escapeForShell escapes a string for safe use in shell commands
-func escapeForShell(s string) string {
-	return strings.ReplaceAll(s, "'", "'\\''")
-}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 711 -
714, Remove the unused helper function escapeForShell by deleting its entire
definition (the escapeForShell(s string) function) and also remove any
now-unused imports (e.g., "strings") that were only required for that function
so the package still builds cleanly; if you prefer to keep similar behavior,
replace any ad-hoc shell-escaping usages with a single shared utility and ensure
it's actually referenced, otherwise simply remove the function and tidy imports.
test/library/encryption/kms/vault_kube_kms_deployer_test.go (1)

524-532: Consider using exponential backoff for namespace deletion polling.

The fixed 1-second sleep interval with 30 iterations is acceptable but could benefit from using wait.PollUntilContextTimeout for consistency with other polling patterns in this codebase.

♻️ Suggested improvement
err = wait.PollUntilContextTimeout(ctx, 1*time.Second, 30*time.Second, true, func(ctx context.Context) (bool, error) {
    _, err := kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace, metav1.GetOptions{})
    if apierrors.IsNotFound(err) {
        return true, nil
    }
    return false, nil
})
if err != nil {
    t.Error("✗ Namespace should be deleted after cleanup")
} else {
    t.Logf("✓ Namespace %s successfully deleted", testNamespace)
}

This requires importing apierrors "k8s.io/apimachinery/pkg/api/errors" and "k8s.io/apimachinery/pkg/util/wait".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 524
- 532, Replace the manual for-loop polling in the test that checks namespace
deletion with Kubernetes' wait.PollUntilContextTimeout for consistency: instead
of the loop around kubeClient.CoreV1().Namespaces().Get and time.Sleep, call
wait.PollUntilContextTimeout with a 1s interval and 30s timeout, use
kubeClient.CoreV1().Namespaces().Get inside the predicate and return true when
apierrors.IsNotFound(err), otherwise false; ensure you import
"k8s.io/apimachinery/pkg/util/wait" and apierrors
"k8s.io/apimachinery/pkg/api/errors" and update the test assertion to fail on a
returned error from PollUntilContextTimeout and log success when it returns nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/library/encryption/kms/GUIDE.md`:
- Line 79: Replace hardcoded user-specific absolute paths in the GUIDE.md
instructions (e.g., the "cd
/home/skundu/automation/library-go/test/library/encryption/kms" command and
similar occurrences referenced at lines 146, 156, 185, and 209) with either
relative paths (like "cd ../../test/library/encryption/kms") or a generic
placeholder (like "cd
/path/to/automation/library-go/test/library/encryption/kms") so the guide is
portable; update each matching "cd" or path string in the document accordingly
and ensure any example commands use the placeholder or relative path
consistently.
- Line 34: The Docker image tag in the GUIDE.md snippet references a
non-existent Vault Enterprise version
"docker.io/hashicorp/vault-enterprise:2.0.0-ent"; update that tag to an existing
release (e.g., replace "docker.io/hashicorp/vault-enterprise:2.0.0-ent" with
"docker.io/hashicorp/vault-enterprise:1.21.4+ent") so the documentation uses a
valid Vault Enterprise image.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 580-592: Replace the custom contains and findSubstring helpers
with the stdlib strings.Contains: remove the contains and findSubstring
functions and add import "strings"; then update the two call sites that
currently call contains(contentStr, "emptyDir") and contains(contentStr,
"persistentVolumeClaim") to use strings.Contains(contentStr, "...") instead so
the code uses the standard, tested implementation.
- Around line 358-360: The test currently prints sensitive AppRole credentials
(roleID and secretID) via t.Logf; instead redact or abbreviate these values
before logging (e.g., show only first/last N chars or replace with a fixed mask)
and/or remove the logs entirely. Update the t.Logf calls that reference roleID
and secretID in vault_kube_kms_deployer_test.go to use a redaction helper or
inline substring masking so the logs no longer emit full credentials while
preserving enough info for debugging.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 441-448: The current branch in the error path of initializeVault
returns hard-coded dummy credentials when d.isVaultInitialized(ctx, podName) is
true which will break subsequent unsealVault and configureVaultForKMS calls;
change this behavior to fail fast instead: when isVaultInitialized(ctx, podName)
returns true, return a non-nil error describing "Vault already initialized" (do
not return "EXISTING_ROOT_TOKEN" or dummy keys), so callers (e.g., unsealVault,
configureVaultForKMS) will not attempt to use invalid credentials; update any
callers or tests expecting the dummy return to handle the error path
accordingly.
- Around line 302-307: Replace the hardcoded temp path with a unique temp file
created via os.CreateTemp and write valuesContent into it (instead of using the
literal "/tmp/vault-values.yaml"); update uses of valuesFile to use the returned
*os.File (valuesFile.Name() where the path string is needed) and ensure you
still defer os.Remove(valuesFile.Name()) and close the file; look for the
valuesFile variable and the code that writes valuesContent to update the write,
close, and remove logic accordingly.
- Around line 493-505: The unseal loop hardcodes a threshold of 3; replace that
with the real threshold returned by Vault init (or a central constant) so
changes don't break unsealing: obtain the threshold value from the init result
used to populate unsealKeys (e.g., use initResp.Threshold or the struct that
produced unsealKeys) and loop for j := 0; j < threshold && j < len(unsealKeys);
j++ when calling d.execInPod; also validate threshold <= len(unsealKeys) and
return a clear error if not.
- Around line 172-173: Remove the line that logs the Vault root token to test
output: stop calling d.t.Logf("Vault root token: %s", rootToken). Keep or adjust
the address log (d.t.Logf("Vault address: http://%s.%s.svc:8200",
WellKnownVaultServiceName, d.config.Namespace)) if needed, and instead store or
pass rootToken via a secure test secret/variable or return it from the deployer
object without printing; reference the rootToken identifier and the d.t.Logf
call in vault_kube_kms_deployer.go when making the change.
- Around line 284-287: The current image tag parsing using strings.Split and
index arithmetic is fragile; replace that expression with logic that finds the
last ':' only if it appears after the last '/' so ports (registry:port) are
ignored and images without tags default to "latest": compute lastSlash :=
strings.LastIndex(d.config.Image, "/"), lastColon :=
strings.LastIndex(d.config.Image, ":"), then if lastColon > lastSlash set tag :=
d.config.Image[lastColon+1:] and base := d.config.Image[:lastColon] else set tag
:= "latest" and base := d.config.Image; use these base and tag variables instead
of the existing strings.Split/TrimSuffix expression to build the YAML values
(referencing d.config.Image and the current split/trim expression in the diff).

---

Nitpick comments:
In `@test/library/encryption/kms/GUIDE.md`:
- Line 65: Several fenced code blocks in GUIDE.md lack language specifiers
(affecting syntax highlighting and markdownlint). Edit the affected fenced code
blocks (those around the examples referenced) and add appropriate language
identifiers such as text, plaintext, or the specific language (e.g., bash, json)
as appropriate; search for triple-backtick fences in GUIDE.md (examples near the
noted sections) and update each opening ``` to include the correct specifier to
satisfy markdownlint and enable highlighting.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 524-532: Replace the manual for-loop polling in the test that
checks namespace deletion with Kubernetes' wait.PollUntilContextTimeout for
consistency: instead of the loop around kubeClient.CoreV1().Namespaces().Get and
time.Sleep, call wait.PollUntilContextTimeout with a 1s interval and 30s
timeout, use kubeClient.CoreV1().Namespaces().Get inside the predicate and
return true when apierrors.IsNotFound(err), otherwise false; ensure you import
"k8s.io/apimachinery/pkg/util/wait" and apierrors
"k8s.io/apimachinery/pkg/api/errors" and update the test assertion to fail on a
returned error from PollUntilContextTimeout and log success when it returns nil.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 330-372: The applyManifests method on VaultDeployer appears to be
dead/unused now that installVaultWithHelm handles deployment; either remove the
entire applyManifests implementation (and any references to
vaultSharedManifestFiles, vaultDeploymentManifestFile,
wrapAssetWithTemplateData) to avoid stale code, or add a concise doc comment
above func (d *VaultDeployer) applyManifests explaining why it is retained
(e.g., "kept for tests/legacy/manual K8s manifest application; not used by
installVaultWithHelm") and mark it as intentional dead code so reviewers
understand its purpose.
- Around line 152-153: Replace the arbitrary time.Sleep(5 * time.Second) with a
proper readiness poll: implement a new helper method (e.g., waitForVaultReady or
pollVaultHealth) that repeatedly queries Vault's health endpoint (GET
/v1/sys/health) with a timeout and backoff, checks the response status/body for
readiness (initialized and not sealed / not standby or HTTP 200), and returns an
error on timeout; then call this helper from where time.Sleep is currently used
(replace time.Sleep with waitForVaultReady) so the deployer waits
deterministically for Vault to accept commands.
- Around line 711-714: Remove the unused helper function escapeForShell by
deleting its entire definition (the escapeForShell(s string) function) and also
remove any now-unused imports (e.g., "strings") that were only required for that
function so the package still builds cleanly; if you prefer to keep similar
behavior, replace any ad-hoc shell-escaping usages with a single shared utility
and ensure it's actually referenced, otherwise simply remove the function and
tidy imports.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4c97b637-7eae-4794-9ec6-8a265f46c606

📥 Commits

Reviewing files that changed from the base of the PR and between 0722f82 and e60a6cd.

📒 Files selected for processing (3)
  • test/library/encryption/kms/GUIDE.md
  • test/library/encryption/kms/vault_kube_kms_deployer.go
  • test/library/encryption/kms/vault_kube_kms_deployer_test.go

Comment thread test/library/encryption/kms/GUIDE.md Outdated
Comment thread test/library/encryption/kms/GUIDE.md Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

♻️ Duplicate comments (7)
test/library/encryption/kms/GUIDE.md (1)

78-79: ⚠️ Potential issue | 🟡 Minor

Replace user-specific absolute paths with portable placeholders.

These paths are environment-specific and make the guide hard to follow for others.

Also applies to: 145-146, 155-156, 184-185, 208-209

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/GUIDE.md` around lines 78 - 79, The guide
contains user-specific absolute paths like
"/home/skundu/automation/library-go/test/library/encryption/kms" (and the other
occurrences at the noted locations) which should be replaced with portable
placeholders; update the GUIDE.md entries to use a repo-root or environment
variable placeholder (e.g., <REPO_ROOT> or $PROJECT_ROOT or relative paths like
./test/library/encryption/kms or ~/) and add a brief note explaining how to
substitute the placeholder for local users so the instructions are
environment-agnostic.
test/library/encryption/kms/vault_kube_kms_deployer.go (5)

286-287: ⚠️ Potential issue | 🟠 Major

Image repo/tag parsing is brittle for registries with ports and untagged images.

The current split/trim approach can build invalid repo/tag combinations.

🔧 Safer parsing using last ':' after last '/'
+	lastSlash := strings.LastIndex(d.config.Image, "/")
+	lastColon := strings.LastIndex(d.config.Image, ":")
+	imageRepo := d.config.Image
+	imageTag := "latest"
+	if lastColon > lastSlash {
+		imageRepo = d.config.Image[:lastColon]
+		imageTag = d.config.Image[lastColon+1:]
+	}
...
-`, strings.TrimSuffix(d.config.Image, ":"+strings.Split(d.config.Image, ":")[len(strings.Split(d.config.Image, ":"))-1]),
-   strings.Split(d.config.Image, ":")[len(strings.Split(d.config.Image, ":"))-1])
+`, imageRepo, imageTag)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 286 -
287, The image repo/tag assembly is brittle where the code uses strings.Split on
d.config.Image; replace that logic with parsing that finds the last ':' only if
it occurs after the last '/' (i.e., use strings.LastIndex and
strings.LastIndexByte or similar) to separate repo and tag safely for images
with registry ports or untagged images; update the expression that builds the
repo and tag from d.config.Image accordingly (refer to the use of d.config.Image
in vault_kube_kms_deployer.go and the surrounding repo/tag assembly) so that if
no tag is present you treat the whole string as the repo and if a ':' is part of
the registry host (before the last '/') you do not split there.

443-446: ⚠️ Potential issue | 🔴 Critical

Fail fast when Vault is already initialized instead of returning fake credentials.

Returning "EXISTING_ROOT_TOKEN" and dummy keys makes downstream unseal/config steps fail with misleading errors.

🔧 Suggested fix
 		if d.isVaultInitialized(ctx, podName) {
-			d.t.Logf("Vault already initialized, skipping initialization")
-			return "EXISTING_ROOT_TOKEN", []string{"KEY1", "KEY2", "KEY3"}, nil
+			return "", nil, fmt.Errorf("vault already initialized; existing credentials must be provided")
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 443 -
446, The current branch in the function that checks d.isVaultInitialized(ctx,
podName) must not return fake credentials ("EXISTING_ROOT_TOKEN", ["KEY1",...])
— instead fail fast by returning an explicit error so callers know Vault is
already initialized; modify the block in the method that calls
d.isVaultInitialized to return a non-nil error (e.g., ErrVaultAlreadyInitialized
or fmt.Errorf with context including podName) and no fake token/keys, so
unseal/config callers receive a clear failure instead of misleading dummy
values.

302-307: ⚠️ Potential issue | 🟠 Major

Use a unique temp values file to avoid parallel-test collisions.

Line 303 uses a fixed /tmp/vault-values.yaml; concurrent runs can overwrite each other.

🔧 Use `os.CreateTemp` and pass `valuesFile.Name()`
-	valuesFile := "/tmp/vault-values.yaml"
-	if err := os.WriteFile(valuesFile, []byte(valuesContent), 0644); err != nil {
+	valuesFile, err := os.CreateTemp("", "vault-values-*.yaml")
+	if err != nil {
+		return fmt.Errorf("failed to create values file: %w", err)
+	}
+	defer os.Remove(valuesFile.Name())
+	if _, err := valuesFile.WriteString(valuesContent); err != nil {
+		valuesFile.Close()
 		return fmt.Errorf("failed to write values file: %w", err)
 	}
-	defer os.Remove(valuesFile)
+	if err := valuesFile.Close(); err != nil {
+		return fmt.Errorf("failed to close values file: %w", err)
+	}
...
-		"--values", valuesFile,
+		"--values", valuesFile.Name(),

Also applies to: 313-313

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 302 -
307, The code currently uses a fixed filename string for valuesFile
("/tmp/vault-values.yaml") which can collide in parallel tests; replace that
with a unique temp file created via os.CreateTemp (e.g., f, err :=
os.CreateTemp("", "vault-values-*.yaml")), then use f.Name() instead of the
hardcoded path when writing valuesContent (or write directly to f), close the
file (f.Close()) and defer os.Remove(f.Name()) to cleanup; also apply the same
change to the other occurrence around valuesFile usage (the later os.Remove
call) so both places reference the temp file name returned by CreateTemp.

172-173: ⚠️ Potential issue | 🔴 Critical

Remove root token from test logs.

Line 172 logs full admin credentials to CI/test output. This is a direct secret leak.

🔧 Minimal fix
 d.t.Logf("HashiCorp Vault Enterprise deployed and configured successfully!")
-d.t.Logf("Vault root token: %s", rootToken)
 d.t.Logf("Vault address: http://%s.%s.svc:8200", WellKnownVaultServiceName, d.config.Namespace)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 172 -
173, The test currently logs the Vault root token via d.t.Logf("Vault root
token: %s", rootToken) which exposes secrets; remove that log or replace it with
a non-secret indicator (e.g., log that a root token was generated without
printing its value or print a redacted/hashed form). Update the code around the
d.t.Logf calls (the lines referencing rootToken, WellKnownVaultServiceName, and
d.config.Namespace) so only non-sensitive info is emitted (keep the Vault
address log if needed, but do not print rootToken).

493-500: ⚠️ Potential issue | 🟠 Major

Unseal loop hardcodes threshold and can panic on short key slices.

Line 494 assumes exactly 3 keys and directly indexes unsealKeys[j].

🔧 Guard key length and centralize threshold
+const vaultKeyThreshold = 3
...
-	for j := 0; j < 3; j++ {
+	if len(unsealKeys) < vaultKeyThreshold {
+		return fmt.Errorf("insufficient unseal keys: got %d, need %d", len(unsealKeys), vaultKeyThreshold)
+	}
+	for j := 0; j < vaultKeyThreshold; j++ {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 493 -
500, The unseal loop currently hardcodes 3 and indexes unsealKeys[j], which can
panic; introduce a central unsealThreshold variable/constant (e.g.,
unsealThreshold := 3), validate that len(unsealKeys) >= unsealThreshold before
the loop and return/log an error if not, and iterate using for j := 0; j <
unsealThreshold; j++ so the unsealCmd construction that references unsealKeys[j]
is safe; alternatively compute n := min(len(unsealKeys), unsealThreshold) and
loop to n if you prefer graceful partial unseal.
test/library/encryption/kms/vault_kube_kms_deployer_test.go (1)

358-360: ⚠️ Potential issue | 🟠 Major

Do not print full AppRole credentials in test logs.

Line 359 and Line 360 expose reusable credentials in CI logs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 358
- 360, The test currently logs full AppRole credentials using t.Logf with roleID
and secretID which exposes reusable secrets in CI; change the test to stop
printing them (remove or comment out the two t.Logf lines that reference roleID
and secretID) or replace them with non-sensitive placeholders/masked output
(e.g., show only last 4 chars) and, if needed, write full values to a secure
test artifact or secret store instead; update references in the test function
where roleID/secretID are logged to use the masked form or no logging.
🧹 Nitpick comments (1)
test/library/encryption/kms/vault_kube_kms_deployer.go (1)

380-393: Readiness gate should check container readiness, not just Pod phase.

Returning success on Running can race with init/config commands and increase flakes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 380 -
393, The readiness check currently returns success when pod.Status.Phase ==
corev1.PodRunning which can race with container init/config steps; update the
wait callback used in wait.PollUntilContextTimeout to verify container readiness
instead of just Pod phase by inspecting the Pod's readiness condition or
container statuses (e.g., look for ConditionType "Ready" with Status ==
corev1.ConditionTrue or ensure all entries in pod.Status.ContainerStatuses have
Ready == true); modify the block that fetches the Pod (the code using
d.kubeClient.CoreV1().Pods(...).Get and the return evaluating pod.Status.Phase)
to return success only when the Pod is truly ready (use pod.Status.Conditions or
pod.Status.ContainerStatuses for the check) while preserving the existing
NotFound handling and logging for podName and namespace.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/library/encryption/kms/GUIDE.md`:
- Line 65: The markdown file contains multiple fenced code blocks missing
language identifiers (MD040); update each triple-backtick fence to include an
appropriate language token (e.g., ```bash, ```text, ```go, or ```yaml) so the
linter stops flagging them; search for the backtick fences in GUIDE.md and
replace each opening ``` with the correct fenced language for the snippet
content (examples: shell commands -> bash, config -> yaml, plain output -> text,
Go snippets -> go).
- Around line 172-175: Update the GUIDE.md examples (currently showing
real-looking secrets in the vault_kube_kms_deployer_test.go excerpts) to redact
sensitive values — replace literal ROLE_ID, SECRET_ID and any root token strings
with placeholders like <REDACTED_ROLE_ID>, <REDACTED_SECRET_ID>,
<REDACTED_TOKEN> and change the example output blocks accordingly; additionally
add a brief explicit note next to the examples discouraging any plaintext
secret/token logging and instructing readers to use secure secret stores or
masked logging for AppRole credentials and Vault tokens.
- Around line 229-230: The guide's integration test invocation is inconsistent:
it tells readers to run with "-args -integration" while actual gating checks the
environment variable INTEGRATION_TEST=true; update the GUIDE.md text to either
instruct users to export/set INTEGRATION_TEST=true (e.g., export
INTEGRATION_TEST=true && go test ./... -run Test... ) or change the test gating
to honor the "-args -integration" flag; reference and update the occurrences
that mention "-args -integration" and the gating check INTEGRATION_TEST=true so
both the docs and the test gating are consistent.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 525-529: The namespace-delete loop incorrectly treats any error
from kubeClient.CoreV1().Namespaces().Get as successful deletion; change the
check inside the loop (the block around kubeClient.CoreV1().Namespaces().Get and
the deleted boolean) to only set deleted = true and break when
apierrors.IsNotFound(err) is true, otherwise continue retrying on
transient/errors (or log/record the non-notfound error) and sleep briefly
between attempts; import k8s.io/apimachinery/pkg/api/errors as apierrors and
update the logic in the test function surrounding that loop so non-NotFound
errors are not considered successful deletions.
- Around line 328-345: The test currently ignores failures from the oc exec
command and JSON parsing because it only proceeds when err == nil and otherwise
does nothing, which can hide failures; update the block that reads statusOutput
and unmarshals into vaultStatus so that if err from the oc exec call is non-nil
you call t.Fatalf (or t.Fatalf-like) with the error and raw statusOutput, and if
json.Unmarshal returns an error you also fail the test with a clear message
including the unmarshal error and the raw output; use the existing symbols
statusOutput, vaultStatus, and the surrounding error check (the if err == nil)
to locate the code and replace the silent-skipping behavior with explicit
t.Fatalf/t.Errorf on oc exec or JSON parse failures while keeping the subsequent
initialized/sealed assertions.
- Around line 274-278: The test currently calls t.Fatalf after
deployer.Deploy(ctx), which can prevent the later explicit cleanup call from
running; register the cleanup early with t.Cleanup so it always executes on test
termination (even after FailNow). Move the cleanup logic currently executed at
Line 366 into a t.Cleanup(func(){ ... }) registered before calling
deployer.Deploy(ctx) (and likewise replace the other end-of-test explicit
cleanup at 364-368 with t.Cleanup), ensuring the cleanup closure references the
same resources/namespaces and uses the existing deployer/teardown helpers.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 648-649: The test currently logs full command and arguments via
d.t.Logf("Executing in pod %s/%s: %v", d.config.Namespace, podName, command)
which can leak sensitive materials (unseal keys, root tokens, AppRole secrets);
change these logs to avoid printing raw command contents by either removing the
command from the log, replacing it with a non-sensitive summary (e.g.,
"executing vault-related command"), or pass the command through a sanitizer/mask
function that redacts tokens/keys before logging; update all similar usages (the
other d.t.Logf calls around the pod execution path that reference podName,
d.config.Namespace, and command) so no sensitive payloads are emitted.
- Around line 108-113: The constructor NewVaultDeployer currently dereferences
config without checking for nil; update it to guard against a nil *VaultConfig
by returning an error or creating a default VaultConfig when config == nil
before using fields like config.Namespace, ensuring VaultDeployer is constructed
safely; reference the NewVaultDeployer function and the VaultConfig and
VaultDeployer types and set defaults (e.g., WellKnownVaultNamespace) only after
verifying config is non-nil.

---

Duplicate comments:
In `@test/library/encryption/kms/GUIDE.md`:
- Around line 78-79: The guide contains user-specific absolute paths like
"/home/skundu/automation/library-go/test/library/encryption/kms" (and the other
occurrences at the noted locations) which should be replaced with portable
placeholders; update the GUIDE.md entries to use a repo-root or environment
variable placeholder (e.g., <REPO_ROOT> or $PROJECT_ROOT or relative paths like
./test/library/encryption/kms or ~/) and add a brief note explaining how to
substitute the placeholder for local users so the instructions are
environment-agnostic.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 358-360: The test currently logs full AppRole credentials using
t.Logf with roleID and secretID which exposes reusable secrets in CI; change the
test to stop printing them (remove or comment out the two t.Logf lines that
reference roleID and secretID) or replace them with non-sensitive
placeholders/masked output (e.g., show only last 4 chars) and, if needed, write
full values to a secure test artifact or secret store instead; update references
in the test function where roleID/secretID are logged to use the masked form or
no logging.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 286-287: The image repo/tag assembly is brittle where the code
uses strings.Split on d.config.Image; replace that logic with parsing that finds
the last ':' only if it occurs after the last '/' (i.e., use strings.LastIndex
and strings.LastIndexByte or similar) to separate repo and tag safely for images
with registry ports or untagged images; update the expression that builds the
repo and tag from d.config.Image accordingly (refer to the use of d.config.Image
in vault_kube_kms_deployer.go and the surrounding repo/tag assembly) so that if
no tag is present you treat the whole string as the repo and if a ':' is part of
the registry host (before the last '/') you do not split there.
- Around line 443-446: The current branch in the function that checks
d.isVaultInitialized(ctx, podName) must not return fake credentials
("EXISTING_ROOT_TOKEN", ["KEY1",...]) — instead fail fast by returning an
explicit error so callers know Vault is already initialized; modify the block in
the method that calls d.isVaultInitialized to return a non-nil error (e.g.,
ErrVaultAlreadyInitialized or fmt.Errorf with context including podName) and no
fake token/keys, so unseal/config callers receive a clear failure instead of
misleading dummy values.
- Around line 302-307: The code currently uses a fixed filename string for
valuesFile ("/tmp/vault-values.yaml") which can collide in parallel tests;
replace that with a unique temp file created via os.CreateTemp (e.g., f, err :=
os.CreateTemp("", "vault-values-*.yaml")), then use f.Name() instead of the
hardcoded path when writing valuesContent (or write directly to f), close the
file (f.Close()) and defer os.Remove(f.Name()) to cleanup; also apply the same
change to the other occurrence around valuesFile usage (the later os.Remove
call) so both places reference the temp file name returned by CreateTemp.
- Around line 172-173: The test currently logs the Vault root token via
d.t.Logf("Vault root token: %s", rootToken) which exposes secrets; remove that
log or replace it with a non-secret indicator (e.g., log that a root token was
generated without printing its value or print a redacted/hashed form). Update
the code around the d.t.Logf calls (the lines referencing rootToken,
WellKnownVaultServiceName, and d.config.Namespace) so only non-sensitive info is
emitted (keep the Vault address log if needed, but do not print rootToken).
- Around line 493-500: The unseal loop currently hardcodes 3 and indexes
unsealKeys[j], which can panic; introduce a central unsealThreshold
variable/constant (e.g., unsealThreshold := 3), validate that len(unsealKeys) >=
unsealThreshold before the loop and return/log an error if not, and iterate
using for j := 0; j < unsealThreshold; j++ so the unsealCmd construction that
references unsealKeys[j] is safe; alternatively compute n :=
min(len(unsealKeys), unsealThreshold) and loop to n if you prefer graceful
partial unseal.

---

Nitpick comments:
In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 380-393: The readiness check currently returns success when
pod.Status.Phase == corev1.PodRunning which can race with container init/config
steps; update the wait callback used in wait.PollUntilContextTimeout to verify
container readiness instead of just Pod phase by inspecting the Pod's readiness
condition or container statuses (e.g., look for ConditionType "Ready" with
Status == corev1.ConditionTrue or ensure all entries in
pod.Status.ContainerStatuses have Ready == true); modify the block that fetches
the Pod (the code using d.kubeClient.CoreV1().Pods(...).Get and the return
evaluating pod.Status.Phase) to return success only when the Pod is truly ready
(use pod.Status.Conditions or pod.Status.ContainerStatuses for the check) while
preserving the existing NotFound handling and logging for podName and namespace.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fa265f17-afd0-4ce3-9a35-b60c3175a479

📥 Commits

Reviewing files that changed from the base of the PR and between e60a6cd and 2adf31b.

📒 Files selected for processing (3)
  • test/library/encryption/kms/GUIDE.md
  • test/library/encryption/kms/vault_kube_kms_deployer.go
  • test/library/encryption/kms/vault_kube_kms_deployer_test.go

Comment thread test/library/encryption/kms/GUIDE.md Outdated
Comment thread test/library/encryption/kms/GUIDE.md Outdated
Comment thread test/library/encryption/kms/GUIDE.md Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
@sandeepknd sandeepknd force-pushed the deploy_vault branch 3 times, most recently from 4d952ac to a19ea1d Compare April 28, 2026 18:52
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (12)
test/library/encryption/kms/GUIDE.md (3)

77-79: ⚠️ Potential issue | 🟡 Minor

Replace user-specific absolute paths with portable placeholders.

These /home/skundu/... examples are only valid on one workstation and make the guide hard to follow elsewhere.

Also applies to: 143-147, 155-158, 191-194, 215-218

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/GUIDE.md` around lines 77 - 79, Replace
user-specific absolute paths like
'/home/skundu/automation/library-go/test/library/encryption/kms' in GUIDE.md
with portable placeholders (e.g., $REPO_ROOT or ~/path/to/repo) so examples work
on any machine; update all occurrences including the other noted locations
(around the ranges referenced: 143-147, 155-158, 191-194, 215-218) and ensure
the placeholder is documented once near the top of the guide so readers know
what to substitute.

65-71: ⚠️ Potential issue | 🟡 Minor

Add language identifiers to the fenced code blocks.

Markdown lint is already flagging these fences with MD040.

Also applies to: 116-122, 167-187, 198-211, 228-230, 240-242, 250-252, 269-271, 298-300

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/GUIDE.md` around lines 65 - 71, The GUIDE.md
contains multiple fenced code blocks (for example the file-tree block listing
vault_kube_kms_deployer.go, vault_kube_kms_deployer_test.go, vault.hclic,
GUIDE.md) that lack language identifiers and trigger MD040; update each fenced
block in GUIDE.md to include an appropriate language tag (e.g., ```text or
```bash for file trees/commands, ```yaml or ```json for config snippets, ```sh
for shell commands) so the markdown linter stops flagging them.

179-181: ⚠️ Potential issue | 🟠 Major

Do not normalize plaintext secret logging in the guide.

The sample output and usage examples still print full ROLE_ID/SECRET_ID, and the security section explicitly says those credentials are logged. That encourages exactly the behavior we should avoid in CI and local troubleshooting.

Also applies to: 363-365, 485-488

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/GUIDE.md` around lines 179 - 181, The guide
currently prints full plaintext credentials in its sample output (the blocks
that list "ROLE_ID" and "SECRET_ID" with example UUIDs); replace those concrete
UUIDs with non-sensitive placeholders or masked values (e.g.,
"<REDACTED_ROLE_ID>" or "f4ce0bd3-****-****-****-804d01246a4f") and update all
similar sample blocks (other occurrences that list ROLE_ID/SECRET_ID) so
examples no longer show full secrets but still illustrate format; ensure the
text describing that credentials are logged is updated to recommend redaction
and not show real values.
test/library/encryption/kms/vault_kube_kms_deployer.go (5)

32-35: ⚠️ Potential issue | 🟠 Major

Verify the default Vault image tag exists.

If docker.io/hashicorp/vault-enterprise:2.0.0-ent is not a published tag, every deployment will fail with ImagePullBackOff. Please pin WellKnownVaultImage to a real Vault Enterprise release and keep the guide/examples aligned with it.

Does the container image tag `docker.io/hashicorp/vault-enterprise:2.0.0-ent` exist, and what Vault Enterprise image tags are currently published by HashiCorp?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 32 - 35,
The WellKnownVaultImage constant currently points to
docker.io/hashicorp/vault-enterprise:2.0.0-ent which may not be a published tag
causing ImagePullBackOff; verify available HashiCorp Vault Enterprise tags and
update the WellKnownVaultImage constant to a confirmed, published release (e.g.,
a specific enterprise version tag), and update any accompanying guide/examples
to match that confirmed tag so deployments use a real image; locate and change
the WellKnownVaultImage symbol in vault_kube_kms_deployer.go and ensure any
documentation or tests referencing the old tag are updated accordingly.

108-123: ⚠️ Potential issue | 🟠 Major

Guard config before applying defaults.

NewVaultDeployer(..., nil) will panic here before the test gets a usable failure.

🔧 Proposed fix
 func NewVaultDeployer(t testing.TB, kubeClient kubernetes.Interface, restConfig *rest.Config, config *VaultConfig) *VaultDeployer {
 	t.Helper()
+	if config == nil {
+		config = &VaultConfig{}
+	}
 
 	// Set defaults
 	if config.Namespace == "" {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 108 -
123, The NewVaultDeployer function should guard against a nil config before
applying defaults: check if the config parameter is nil at the start of
NewVaultDeployer and call t.Fatalf (or t.Fatal) with a clear error message
(e.g., "config is nil") to fail the test early; then proceed with the existing
default assignments for config.Namespace, config.Image, config.Replicas, and
config.Mode. This ensures NewVaultDeployer will not panic when called with nil
and provides a useful test failure.

301-306: ⚠️ Potential issue | 🟡 Minor

Use a unique temp file for Helm values.

A fixed /tmp/vault-values.yaml will collide across parallel runs and can make one test install another test's values.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 301 -
306, The code writes Helm values to a fixed path valuesFile :=
"/tmp/vault-values.yaml" which can collide in parallel runs; replace this with a
unique temp file (e.g., use os.CreateTemp or ioutil.TempFile to create a file
like "vault-values-*.yaml"), write valuesContent to that temp file (using the
temp file's Name()), and defer removal using the temp file's Name() (ensure
defer os.Remove(tempName) so cleanup targets the created file). Keep the same
error handling around writing (wrap and return fmt.Errorf on failure).

439-445: ⚠️ Potential issue | 🔴 Critical

Fail fast when Vault is already initialized.

Returning dummy credentials lets the flow continue into unsealVault and configureVaultForKMS with invalid auth, which turns a recoverable precondition into a confusing downstream failure.

🐛 Proposed fix
 	if err != nil {
 		// Check if already initialized
 		if d.isVaultInitialized(ctx, podName) {
-			d.t.Logf("Vault already initialized, skipping initialization")
-			return "EXISTING_ROOT_TOKEN", []string{"KEY1", "KEY2", "KEY3"}, nil
+			return "", nil, fmt.Errorf("Vault is already initialized and existing credentials cannot be recovered")
 		}
 		return "", nil, fmt.Errorf("failed to initialize Vault: %w", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 439 -
445, The code currently returns dummy credentials when d.isVaultInitialized(ctx,
podName) is true, which lets unsealVault and configureVaultForKMS proceed with
invalid auth; instead modify the initialization branch in the function calling
d.execInPod so that when d.isVaultInitialized(ctx, podName) returns true it
fails fast by returning an error (e.g., wrapped with context like "vault already
initialized") rather than "EXISTING_ROOT_TOKEN" and dummy keys; update callers
of this function to handle the error path (so unsealVault and
configureVaultForKMS are not invoked with invalid credentials) and keep
references to d.execInPod and d.isVaultInitialized to locate and change the
logic.

614-630: ⚠️ Potential issue | 🔴 Critical

Stop printing Vault secrets to the test log.

This still emits AppRole credentials directly and logs raw command output from init/unseal/auth flows. Those logs can contain the root token, unseal keys, role IDs, and secret IDs.

Also applies to: 646-694

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 614 -
630, The test currently prints sensitive Vault credentials (AppRole Role ID and
Secret ID) to the test log; stop emitting secrets by removing or redacting the
d.t.Logf calls that print roleID and secretID and ensure d.execInPodWithEnv is
not logging raw command output; store the values in d.config.AppRoleRoleID and
d.config.AppRoleSecretID but never log them (or log only a masked version like
"<redacted>" or first/last 4 chars), and apply the same change to the other
secret-emitting blocks referenced around the 646-694 region so no raw Vault
tokens, unseal keys, role IDs, or secret IDs are written to logs.
test/library/encryption/kms/vault_kube_kms_deployer_test.go (4)

325-345: ⚠️ Potential issue | 🟠 Major

Make the status check fail when oc exec or JSON parsing fails.

Right now both failure modes are silently ignored, so the test can report success without ever proving Vault was initialized or unsealed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 325
- 345, The status check currently ignores failures from
exec.CommandContext(statusCmd).Output() and json.Unmarshal(statusOutput,
&vaultStatus); update the block around initCheckCmd/statusCmd/statusOutput so
that if statusCmd.Output() returns an error you call t.Fatalf (or
t.Errorf+return) with the exec error, and if json.Unmarshal fails you also fail
the test with the parse error; keep the subsequent checks on
vaultStatus.Initialized and vaultStatus.Sealed but ensure they run only after
successful output and unmarshal so the test cannot report success when oc exec
or JSON parsing failed.

522-526: ⚠️ Potential issue | 🟡 Minor

Only treat NotFound as successful namespace deletion.

This loop currently marks any API error as success, so a transient read failure can make the cleanup test pass while the namespace still exists.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 522
- 526, The loop currently treats any error from
kubeClient.CoreV1().Namespaces().Get as a successful deletion; change it to only
mark deleted when the error is a NotFound error by using
apierrors.IsNotFound(err) (k8s.io/apimachinery/pkg/api/errors.IsNotFound) and
break in that case, while for other non-nil errors continue retrying (or
log/record transient errors) so transient read failures don't cause a
false-positive deletion; update the logic around the Get call and ensure the
appropriate apierrors import is added.

358-360: ⚠️ Potential issue | 🟠 Major

Redact the AppRole credentials in test output.

These values end up in CI logs and are usable until cleanup revokes access.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 358
- 360, The test currently prints sensitive AppRole credentials via t.Logf using
the variables roleID and secretID; instead redact or truncate these values
before logging to avoid exposing secrets in CI logs. Modify the t.Logf calls
that reference roleID and secretID to either log a constant like "[REDACTED]" or
a safely truncated form (e.g., first/last 4 chars with asterisks) so the
variables roleID and secretID are never emitted in full to test output.

271-278: ⚠️ Potential issue | 🟠 Major

Register cleanup before deployment so failures do not leak namespaces.

If anything after Deploy calls t.Fatalf, this test leaves the namespace behind. If the intent is to preserve successful deployments for manual inspection, a t.Cleanup that only runs on failure would keep that behavior without leaking failed runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 271
- 278, Register a t.Cleanup before calling deployer.Deploy so namespaces are
removed on test failure: create the ctx/cancel as before, then call
t.Cleanup(func(){ if t.Failed() { /* call deployer cleanup method, e.g.,
deployer.Delete(ctx) or deployer.Teardown(ctx) */ } }); then proceed to t.Log
and if err := deployer.Deploy(ctx) { t.Fatalf(...) } so a failed test won't leak
the namespace.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/library/encryption/kms/GUIDE.md`:
- Around line 36-50: The guide incorrectly asserts compatibility with vanilla
Kubernetes but the implementation shells out to oc and binds OpenShift
privileged SCC; update the steps mentioning "Kubernetes Cluster (OpenShift or
vanilla Kubernetes)", the command examples referencing kubectl, and the "oc CLI"
section to scope the guide explicitly to OpenShift only (or else remove any
claims/examples that reference kubectl or generic Kubernetes). Ensure the text
and examples consistently require oc and note the SCC requirement so readers
know OpenShift is mandatory.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 247-285: The Helm values string built in valuesContent ignores the
caller-provided VaultConfig fields; update the generator in
vault_kube_kms_deployer.go to use d.config (VaultConfig) values instead of
hardcoded ones: set dataStorage.size from d.config.StorageSize,
dataStorage.storageClass from d.config.StorageClass (use null when empty), set
standalone.enabled when d.config.Mode == "standalone" and ha.enabled when
d.config.Mode == "ha"/"replicated", and set server.replicas from
d.config.Replicas; keep the existing image repository/tag logic using
d.config.Image but ensure the other fields (size, storageClass,
standalone.enabled, ha.enabled, replicas) are interpolated from d.config so
Deploy respects the public VaultConfig.
- Around line 659-673: The code builds envPrefix by concatenating raw env values
into a shell command (envPrefix, fullCmd) which allows shell metacharacters to
break the command; fix it by properly shell-quoting or escaping each env value
before adding to envPrefix (replace the current envPrefix += fmt.Sprintf("%s=%s
", e.Name, e.Value) with a call that escapes e.Value for POSIX sh), keep using
shellJoin for the command and preserve the heredoc logic for stdin, then pass
the safe fullCmd to ocArgs ("sh", "-c", fullCmd).

---

Duplicate comments:
In `@test/library/encryption/kms/GUIDE.md`:
- Around line 77-79: Replace user-specific absolute paths like
'/home/skundu/automation/library-go/test/library/encryption/kms' in GUIDE.md
with portable placeholders (e.g., $REPO_ROOT or ~/path/to/repo) so examples work
on any machine; update all occurrences including the other noted locations
(around the ranges referenced: 143-147, 155-158, 191-194, 215-218) and ensure
the placeholder is documented once near the top of the guide so readers know
what to substitute.
- Around line 65-71: The GUIDE.md contains multiple fenced code blocks (for
example the file-tree block listing vault_kube_kms_deployer.go,
vault_kube_kms_deployer_test.go, vault.hclic, GUIDE.md) that lack language
identifiers and trigger MD040; update each fenced block in GUIDE.md to include
an appropriate language tag (e.g., ```text or ```bash for file trees/commands,
```yaml or ```json for config snippets, ```sh for shell commands) so the
markdown linter stops flagging them.
- Around line 179-181: The guide currently prints full plaintext credentials in
its sample output (the blocks that list "ROLE_ID" and "SECRET_ID" with example
UUIDs); replace those concrete UUIDs with non-sensitive placeholders or masked
values (e.g., "<REDACTED_ROLE_ID>" or "f4ce0bd3-****-****-****-804d01246a4f")
and update all similar sample blocks (other occurrences that list
ROLE_ID/SECRET_ID) so examples no longer show full secrets but still illustrate
format; ensure the text describing that credentials are logged is updated to
recommend redaction and not show real values.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 325-345: The status check currently ignores failures from
exec.CommandContext(statusCmd).Output() and json.Unmarshal(statusOutput,
&vaultStatus); update the block around initCheckCmd/statusCmd/statusOutput so
that if statusCmd.Output() returns an error you call t.Fatalf (or
t.Errorf+return) with the exec error, and if json.Unmarshal fails you also fail
the test with the parse error; keep the subsequent checks on
vaultStatus.Initialized and vaultStatus.Sealed but ensure they run only after
successful output and unmarshal so the test cannot report success when oc exec
or JSON parsing failed.
- Around line 522-526: The loop currently treats any error from
kubeClient.CoreV1().Namespaces().Get as a successful deletion; change it to only
mark deleted when the error is a NotFound error by using
apierrors.IsNotFound(err) (k8s.io/apimachinery/pkg/api/errors.IsNotFound) and
break in that case, while for other non-nil errors continue retrying (or
log/record transient errors) so transient read failures don't cause a
false-positive deletion; update the logic around the Get call and ensure the
appropriate apierrors import is added.
- Around line 358-360: The test currently prints sensitive AppRole credentials
via t.Logf using the variables roleID and secretID; instead redact or truncate
these values before logging to avoid exposing secrets in CI logs. Modify the
t.Logf calls that reference roleID and secretID to either log a constant like
"[REDACTED]" or a safely truncated form (e.g., first/last 4 chars with
asterisks) so the variables roleID and secretID are never emitted in full to
test output.
- Around line 271-278: Register a t.Cleanup before calling deployer.Deploy so
namespaces are removed on test failure: create the ctx/cancel as before, then
call t.Cleanup(func(){ if t.Failed() { /* call deployer cleanup method, e.g.,
deployer.Delete(ctx) or deployer.Teardown(ctx) */ } }); then proceed to t.Log
and if err := deployer.Deploy(ctx) { t.Fatalf(...) } so a failed test won't leak
the namespace.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 32-35: The WellKnownVaultImage constant currently points to
docker.io/hashicorp/vault-enterprise:2.0.0-ent which may not be a published tag
causing ImagePullBackOff; verify available HashiCorp Vault Enterprise tags and
update the WellKnownVaultImage constant to a confirmed, published release (e.g.,
a specific enterprise version tag), and update any accompanying guide/examples
to match that confirmed tag so deployments use a real image; locate and change
the WellKnownVaultImage symbol in vault_kube_kms_deployer.go and ensure any
documentation or tests referencing the old tag are updated accordingly.
- Around line 108-123: The NewVaultDeployer function should guard against a nil
config before applying defaults: check if the config parameter is nil at the
start of NewVaultDeployer and call t.Fatalf (or t.Fatal) with a clear error
message (e.g., "config is nil") to fail the test early; then proceed with the
existing default assignments for config.Namespace, config.Image,
config.Replicas, and config.Mode. This ensures NewVaultDeployer will not panic
when called with nil and provides a useful test failure.
- Around line 301-306: The code writes Helm values to a fixed path valuesFile :=
"/tmp/vault-values.yaml" which can collide in parallel runs; replace this with a
unique temp file (e.g., use os.CreateTemp or ioutil.TempFile to create a file
like "vault-values-*.yaml"), write valuesContent to that temp file (using the
temp file's Name()), and defer removal using the temp file's Name() (ensure
defer os.Remove(tempName) so cleanup targets the created file). Keep the same
error handling around writing (wrap and return fmt.Errorf on failure).
- Around line 439-445: The code currently returns dummy credentials when
d.isVaultInitialized(ctx, podName) is true, which lets unsealVault and
configureVaultForKMS proceed with invalid auth; instead modify the
initialization branch in the function calling d.execInPod so that when
d.isVaultInitialized(ctx, podName) returns true it fails fast by returning an
error (e.g., wrapped with context like "vault already initialized") rather than
"EXISTING_ROOT_TOKEN" and dummy keys; update callers of this function to handle
the error path (so unsealVault and configureVaultForKMS are not invoked with
invalid credentials) and keep references to d.execInPod and d.isVaultInitialized
to locate and change the logic.
- Around line 614-630: The test currently prints sensitive Vault credentials
(AppRole Role ID and Secret ID) to the test log; stop emitting secrets by
removing or redacting the d.t.Logf calls that print roleID and secretID and
ensure d.execInPodWithEnv is not logging raw command output; store the values in
d.config.AppRoleRoleID and d.config.AppRoleSecretID but never log them (or log
only a masked version like "<redacted>" or first/last 4 chars), and apply the
same change to the other secret-emitting blocks referenced around the 646-694
region so no raw Vault tokens, unseal keys, role IDs, or secret IDs are written
to logs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e393d53d-7da5-4f53-ac9a-0a1bbe81ab11

📥 Commits

Reviewing files that changed from the base of the PR and between 2adf31b and 4d952ac.

📒 Files selected for processing (3)
  • test/library/encryption/kms/GUIDE.md
  • test/library/encryption/kms/vault_kube_kms_deployer.go
  • test/library/encryption/kms/vault_kube_kms_deployer_test.go

Comment thread test/library/encryption/kms/GUIDE.md Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (16)
test/library/encryption/kms/vault_kube_kms_deployer.go (10)

108-113: ⚠️ Potential issue | 🔴 Critical

Guard config before dereferencing in constructor.

Line 112 dereferences config unconditionally; NewVaultDeployer(..., nil) will panic.

🔧 Suggested fix
 func NewVaultDeployer(t testing.TB, kubeClient kubernetes.Interface, restConfig *rest.Config, config *VaultConfig) *VaultDeployer {
 	t.Helper()
+	if config == nil {
+		config = &VaultConfig{}
+	}
 
 	// Set defaults
 	if config.Namespace == "" {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 108 -
113, The constructor NewVaultDeployer dereferences config without guarding for
nil; add a nil-check at the start of the function (e.g., if config == nil {
config = &VaultConfig{} }) before accessing config.Namespace so calling
NewVaultDeployer(..., nil) does not panic, then proceed to set defaults and
return the *VaultDeployer as before.

301-306: ⚠️ Potential issue | 🟡 Minor

Use a unique temp file for Helm values.

Lines 302-306 use a fixed /tmp/vault-values.yaml, which can race in parallel test runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 301 -
306, The code writes Helm values to a fixed path in valuesFile which can race in
parallel tests; replace the fixed "/tmp/vault-values.yaml" usage with a unique
temp file created via os.CreateTemp (or ioutil.TempFile), write valuesContent
into that temp file (using the returned *os.File or calling os.WriteFile on
temp.Name()), close the file if necessary, and defer removal using defer
os.Remove(temp.Name()); update references to valuesFile to use the temp file
name and preserve the intended file permissions when writing.

492-499: ⚠️ Potential issue | 🔴 Critical

Unseal loop can panic and is coupled to a magic threshold.

Line 498 indexes unsealKeys[j] with j < 3 and no length guard. If fewer keys are returned, this panics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 492 -
499, The unseal loop in vault_kube_kms_deployer.go indexes unsealKeys with a
hardcoded limit (for j := 0; j < 3) which can panic if fewer keys are returned;
change it to compute a threshold = min(3, len(unsealKeys)) (or read a
vaultThreshold constant) and iterate for j := 0; j < threshold; j++; if
len(unsealKeys) < 3 return an explicit error (or fail the test) before the loop
so the code doesn't panic and the failure is clear; update references to
unsealKeys and unsealCmd accordingly.

647-648: ⚠️ Potential issue | 🔴 Critical

Command/request logging leaks sensitive Vault material.

These logs can print init outputs, unseal flows, and secret-bearing command context to CI logs.

Also applies to: 689-694

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 647 -
648, The test currently logs full command and output (d.t.Logf("Executing in pod
%s/%s: %v", d.config.Namespace, podName, command)), which can leak Vault
init/unseal outputs and secrets; change these calls to avoid printing command or
output contents—e.g. log only non-sensitive context (d.config.Namespace and
podName) or a generic message like "Executing command in pod %s/%s" and remove
or redact the 'command' and any captured output; apply the same change to the
similar logging calls around the other occurrence (lines 689-694) so no
secret-bearing command context is written to CI logs.

615-616: ⚠️ Potential issue | 🟠 Major

Do not log AppRole credentials in plaintext.

Lines 615 and 630 expose role_id and secret_id in test logs.

Also applies to: 629-630

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 615 -
616, Remove plaintext logging of AppRole credentials: stop calling d.t.Logf with
the raw roleID and secretID values and either omit those logs entirely or print
only redacted forms (e.g., show length or last 4 chars with the rest replaced by
asterisks). Update the places that currently log roleID and secretID (calls that
pass roleID and secretID to d.t.Logf) to use redaction or remove the log entries
so credentials are not exposed in test output.

441-445: ⚠️ Potential issue | 🔴 Critical

Fail fast when Vault is already initialized; do not return dummy credentials.

Returning "EXISTING_ROOT_TOKEN" and fake keys causes downstream auth/unseal/configuration failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 441 -
445, The current branch returns dummy credentials when d.isVaultInitialized(ctx,
podName) is true, which breaks downstream auth/unseal/config; change the
early-exit to fail fast by returning a clear error instead of
"EXISTING_ROOT_TOKEN" and fake keys: replace the return of dummy values with an
explicit error (e.g., fmt.Errorf or errors.New) signaling "vault already
initialized" from the same block that calls d.isVaultInitialized; if real
credentials can be retrieved, call the appropriate retrieval method instead of
fabricating tokens and ensure callers of this function handle the error path.

659-663: ⚠️ Potential issue | 🟠 Major

Shell command composition injects unescaped env values.

Line 662 concatenates raw env values into sh -c; metacharacters can alter command execution.

🔒 Suggested fix
 	for _, e := range env {
-		envPrefix += fmt.Sprintf("%s=%s ", e.Name, e.Value)
+		envPrefix += fmt.Sprintf("%s='%s' ", e.Name, escapeForShell(e.Value))
 	}

Also applies to: 668-673

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 659 -
663, The code builds a shell command by concatenating raw env values into
envPrefix (using env, stdin, envPrefix) and passing it to sh -c, which allows
shell metacharacter injection; instead, stop composing a shell string with
unescaped values — either run the command directly via exec.Command with cmd.Env
= append(os.Environ(), fmt.Sprintf("%s=%s", e.Name, e.Value)...) and pass
arguments as separate params, or if a shell is mandatory, properly escape/quote
each value (e.g., strconv.Quote or a reliable shell-escape library) before
concatenation; apply the same fix to the other similar block around lines
668-673.

247-257: ⚠️ Potential issue | 🟠 Major

Helm values ignore public VaultConfig fields.

The generated YAML hardcodes storage/mode behavior, so Replicas, Mode, StorageClass, and StorageSize are not honored.

Also applies to: 271-285

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 247 -
257, The Helm values string in valuesContent currently hardcodes replicas, mode,
storage size and storageClass so VaultConfig fields are ignored; update the code
that builds valuesContent to interpolate VaultConfig.Replicas, VaultConfig.Mode,
VaultConfig.StorageSize and VaultConfig.StorageClass (handle empty StorageClass
-> null) instead of hardcoded "2Gi", "null", and fixed accessMode/replica
values, and mirror the same change in the other valuesContent block later in the
file so Replicas/Mode/StorageClass/StorageSize are honored when rendering Helm
values.

34-34: ⚠️ Potential issue | 🟠 Major

Verify WellKnownVaultImage resolves to a published Vault Enterprise image tag.

Line 34 hardcodes a tag that appears problematic and can break default deployments at pull time.

What published tags exist for docker.io/hashicorp/vault-enterprise, and does the tag `2.0.0-ent` exist?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` at line 34,
WellKnownVaultImage is hardcoded to a likely-nonexistent enterprise tag
("2.0.0-ent"); verify available published tags for
docker.io/hashicorp/vault-enterprise and update WellKnownVaultImage in
vault_kube_kms_deployer.go to a valid published tag (or switch to a supported
OSS image like docker.io/hashicorp/vault:X.Y.Z) and/or make it configurable via
an environment variable so tests won’t break at pull time; specifically, check
the registry for the exact tag name and replace the literal in the
WellKnownVaultImage constant or add logic to read VAULT_IMAGE from env before
falling back to the constant.

285-286: ⚠️ Potential issue | 🟠 Major

Image repo/tag extraction is brittle for registry-port and no-tag images.

Current strings.Split + index logic can derive wrong repo/tag values and produce invalid Helm values.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 285 -
286, The current repo/tag extraction using repeated strings.Split on
d.config.Image is brittle (fails for registry:port and no-tag images); replace
it with robust parsing: compute lastSlash := strings.LastIndex(d.config.Image,
"/") and lastColon := strings.LastIndex(d.config.Image, ":") and treat a tag as
present only if lastColon > lastSlash; then set tag := "" if no tag else
d.config.Image[lastColon+1:] and repo := d.config.Image[:lastColon] (or repo :=
d.config.Image if no tag); alternatively import and use
docker/distribution/reference.ParseNormalizedNamed to get name and tag—update
the expression building the Helm values (the strings.TrimSuffix/strings.Split
usage) to use these repo and tag variables instead of the current repeated
strings.Split logic.
test/library/encryption/kms/vault_kube_kms_deployer_test.go (3)

329-346: ⚠️ Potential issue | 🟠 Major

Do not silently skip Vault status verification on command/parse failure.

If oc exec fails (or JSON parsing fails), this block does nothing and the test may pass without validating initialized/unsealed state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 329
- 346, The test currently ignores failures from the oc exec call and JSON
parsing (statusOutput and json.Unmarshal into vaultStatus), which can let the
test pass without validating Vault state; update the block to explicitly fail
the test when err != nil or when json.Unmarshal returns an error (use t.Fatalf
or t.Errorf) and include the underlying error and raw statusOutput in the
message, and keep the existing initialized/unsealed checks using vaultStatus
when parsing succeeds.

359-361: ⚠️ Potential issue | 🟡 Minor

Avoid logging full AppRole credentials.

Lines 360-361 print raw secret material to test logs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 359
- 361, The test currently prints raw AppRole secrets via t.Logf using roleID and
secretID; remove or redact these secrets to avoid leaking credentials. Replace
the two t.Logf lines that output roleID and secretID with either no logging or a
safe redaction (e.g., show only first/last few chars or a hash) so that the
variables roleID and secretID are never logged in full by the test.

546-549: ⚠️ Potential issue | 🟡 Minor

Namespace deletion check should only succeed on NotFound.

Any error currently flips deleted=true; transient API/RBAC errors can be misreported as successful cleanup.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 546
- 549, The current namespace deletion loop sets deleted = true for any Get
error; change it to only mark deleted when the error is a NotFound using
k8s.io/apimachinery/pkg/api/errors.IsNotFound(err). Specifically, in the polling
that calls kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace,
metav1.GetOptions{}), set deleted = true and break only when
errors.IsNotFound(err); if err == nil keep polling (namespace still exists); for
any other error do not flip deleted (optionally log or return the error) so
transient API/RBAC errors aren't treated as successful deletion.
test/library/encryption/kms/GUIDE.md (3)

215-217: ⚠️ Potential issue | 🟠 Major

Documentation still encourages plaintext credential logging.

Examples and security notes should redact or mask ROLE_ID/SECRET_ID and explicitly discourage logging secrets.

Also applies to: 507-510, 644-645

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/GUIDE.md` around lines 215 - 217, The docs
currently show plaintext AppRole credentials (ROLE_ID and SECRET_ID in the
vault_kube_kms_deployer_test.go examples) and implicitly encourage logging them;
update the GUIDE.md examples (and the sample lines referencing
vault_kube_kms_deployer_test.go) to redact or mask those values (e.g., replace
with <REDACTED> or partial masks) and add a brief explicit note advising never
to log secrets or store them in plaintext, referencing the symbols ROLE_ID and
SECRET_ID so reviewers can find and remove any remaining cleartext occurrences
(also apply the change to the other mentioned occurrences around the same file).

36-41: ⚠️ Potential issue | 🟠 Major

Scope prerequisites to OpenShift (or remove vanilla-K8s claim).

This guide says vanilla Kubernetes is supported, but the deployer/test flow depends on oc and OpenShift-specific SCC/RBAC behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/GUIDE.md` around lines 36 - 41, The guide
currently claims vanilla Kubernetes support but uses OpenShift-specific commands
and security constructs; either restrict the prerequisites to OpenShift or
remove the vanilla-K8s claim and document differences. Update the checklist and
commands (replace or remove the "kubectl version" line and keep only "oc
version"), add a clear note in the prerequisites explaining that the deploy/test
flow relies on OpenShift SCC and RBAC behavior (mentioning SCC/RBAC) and that
vanilla Kubernetes will require alternative steps/configs, or provide those
alternative steps if you want to keep vanilla-K8s support.

78-79: ⚠️ Potential issue | 🟡 Minor

Replace user-specific absolute paths with portable placeholders.

These hardcoded local paths are not reusable across environments.

Also applies to: 170-171, 188-189, 284-285

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/GUIDE.md` around lines 78 - 79, Replace the
hardcoded user-specific absolute path
"/home/skundu/automation/library-go/test/library/encryption/kms" in the GUIDE.md
examples with a portable placeholder (e.g. <project-root> or ${REPO_ROOT}) and
update the other duplicate occurrences noted (around the other cd examples) so
instructions work across environments; ensure the placeholder is documented at
the top of the guide so readers know how to substitute their local repo path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/library/encryption/kms/GUIDE.md`:
- Around line 344-349: The troubleshooting paragraph in GUIDE.md is
contradictory: it presents an ImagePullBackOff for
"hashicorp/vault-enterprise:2.0.0-ent" then states that the same tag is correct.
Fix by making the example consistent—either show an actually incorrect failing
image (e.g., a truncated or wrong registry) and then provide the correct image
"docker.io/hashicorp/vault-enterprise:2.0.0-ent", or explicitly explain that the
error can occur even when the deployer uses the correct image and instruct
readers to verify cluster access to docker.io; update the lines referencing
"hashicorp/vault-enterprise:2.0.0-ent" and
"docker.io/hashicorp/vault-enterprise:2.0.0-ent" accordingly.

---

Duplicate comments:
In `@test/library/encryption/kms/GUIDE.md`:
- Around line 215-217: The docs currently show plaintext AppRole credentials
(ROLE_ID and SECRET_ID in the vault_kube_kms_deployer_test.go examples) and
implicitly encourage logging them; update the GUIDE.md examples (and the sample
lines referencing vault_kube_kms_deployer_test.go) to redact or mask those
values (e.g., replace with <REDACTED> or partial masks) and add a brief explicit
note advising never to log secrets or store them in plaintext, referencing the
symbols ROLE_ID and SECRET_ID so reviewers can find and remove any remaining
cleartext occurrences (also apply the change to the other mentioned occurrences
around the same file).
- Around line 36-41: The guide currently claims vanilla Kubernetes support but
uses OpenShift-specific commands and security constructs; either restrict the
prerequisites to OpenShift or remove the vanilla-K8s claim and document
differences. Update the checklist and commands (replace or remove the "kubectl
version" line and keep only "oc version"), add a clear note in the prerequisites
explaining that the deploy/test flow relies on OpenShift SCC and RBAC behavior
(mentioning SCC/RBAC) and that vanilla Kubernetes will require alternative
steps/configs, or provide those alternative steps if you want to keep
vanilla-K8s support.
- Around line 78-79: Replace the hardcoded user-specific absolute path
"/home/skundu/automation/library-go/test/library/encryption/kms" in the GUIDE.md
examples with a portable placeholder (e.g. <project-root> or ${REPO_ROOT}) and
update the other duplicate occurrences noted (around the other cd examples) so
instructions work across environments; ensure the placeholder is documented at
the top of the guide so readers know how to substitute their local repo path.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 329-346: The test currently ignores failures from the oc exec call
and JSON parsing (statusOutput and json.Unmarshal into vaultStatus), which can
let the test pass without validating Vault state; update the block to explicitly
fail the test when err != nil or when json.Unmarshal returns an error (use
t.Fatalf or t.Errorf) and include the underlying error and raw statusOutput in
the message, and keep the existing initialized/unsealed checks using vaultStatus
when parsing succeeds.
- Around line 359-361: The test currently prints raw AppRole secrets via t.Logf
using roleID and secretID; remove or redact these secrets to avoid leaking
credentials. Replace the two t.Logf lines that output roleID and secretID with
either no logging or a safe redaction (e.g., show only first/last few chars or a
hash) so that the variables roleID and secretID are never logged in full by the
test.
- Around line 546-549: The current namespace deletion loop sets deleted = true
for any Get error; change it to only mark deleted when the error is a NotFound
using k8s.io/apimachinery/pkg/api/errors.IsNotFound(err). Specifically, in the
polling that calls kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace,
metav1.GetOptions{}), set deleted = true and break only when
errors.IsNotFound(err); if err == nil keep polling (namespace still exists); for
any other error do not flip deleted (optionally log or return the error) so
transient API/RBAC errors aren't treated as successful deletion.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 108-113: The constructor NewVaultDeployer dereferences config
without guarding for nil; add a nil-check at the start of the function (e.g., if
config == nil { config = &VaultConfig{} }) before accessing config.Namespace so
calling NewVaultDeployer(..., nil) does not panic, then proceed to set defaults
and return the *VaultDeployer as before.
- Around line 301-306: The code writes Helm values to a fixed path in valuesFile
which can race in parallel tests; replace the fixed "/tmp/vault-values.yaml"
usage with a unique temp file created via os.CreateTemp (or ioutil.TempFile),
write valuesContent into that temp file (using the returned *os.File or calling
os.WriteFile on temp.Name()), close the file if necessary, and defer removal
using defer os.Remove(temp.Name()); update references to valuesFile to use the
temp file name and preserve the intended file permissions when writing.
- Around line 492-499: The unseal loop in vault_kube_kms_deployer.go indexes
unsealKeys with a hardcoded limit (for j := 0; j < 3) which can panic if fewer
keys are returned; change it to compute a threshold = min(3, len(unsealKeys))
(or read a vaultThreshold constant) and iterate for j := 0; j < threshold; j++;
if len(unsealKeys) < 3 return an explicit error (or fail the test) before the
loop so the code doesn't panic and the failure is clear; update references to
unsealKeys and unsealCmd accordingly.
- Around line 647-648: The test currently logs full command and output
(d.t.Logf("Executing in pod %s/%s: %v", d.config.Namespace, podName, command)),
which can leak Vault init/unseal outputs and secrets; change these calls to
avoid printing command or output contents—e.g. log only non-sensitive context
(d.config.Namespace and podName) or a generic message like "Executing command in
pod %s/%s" and remove or redact the 'command' and any captured output; apply the
same change to the similar logging calls around the other occurrence (lines
689-694) so no secret-bearing command context is written to CI logs.
- Around line 615-616: Remove plaintext logging of AppRole credentials: stop
calling d.t.Logf with the raw roleID and secretID values and either omit those
logs entirely or print only redacted forms (e.g., show length or last 4 chars
with the rest replaced by asterisks). Update the places that currently log
roleID and secretID (calls that pass roleID and secretID to d.t.Logf) to use
redaction or remove the log entries so credentials are not exposed in test
output.
- Around line 441-445: The current branch returns dummy credentials when
d.isVaultInitialized(ctx, podName) is true, which breaks downstream
auth/unseal/config; change the early-exit to fail fast by returning a clear
error instead of "EXISTING_ROOT_TOKEN" and fake keys: replace the return of
dummy values with an explicit error (e.g., fmt.Errorf or errors.New) signaling
"vault already initialized" from the same block that calls d.isVaultInitialized;
if real credentials can be retrieved, call the appropriate retrieval method
instead of fabricating tokens and ensure callers of this function handle the
error path.
- Around line 659-663: The code builds a shell command by concatenating raw env
values into envPrefix (using env, stdin, envPrefix) and passing it to sh -c,
which allows shell metacharacter injection; instead, stop composing a shell
string with unescaped values — either run the command directly via exec.Command
with cmd.Env = append(os.Environ(), fmt.Sprintf("%s=%s", e.Name, e.Value)...)
and pass arguments as separate params, or if a shell is mandatory, properly
escape/quote each value (e.g., strconv.Quote or a reliable shell-escape library)
before concatenation; apply the same fix to the other similar block around lines
668-673.
- Around line 247-257: The Helm values string in valuesContent currently
hardcodes replicas, mode, storage size and storageClass so VaultConfig fields
are ignored; update the code that builds valuesContent to interpolate
VaultConfig.Replicas, VaultConfig.Mode, VaultConfig.StorageSize and
VaultConfig.StorageClass (handle empty StorageClass -> null) instead of
hardcoded "2Gi", "null", and fixed accessMode/replica values, and mirror the
same change in the other valuesContent block later in the file so
Replicas/Mode/StorageClass/StorageSize are honored when rendering Helm values.
- Line 34: WellKnownVaultImage is hardcoded to a likely-nonexistent enterprise
tag ("2.0.0-ent"); verify available published tags for
docker.io/hashicorp/vault-enterprise and update WellKnownVaultImage in
vault_kube_kms_deployer.go to a valid published tag (or switch to a supported
OSS image like docker.io/hashicorp/vault:X.Y.Z) and/or make it configurable via
an environment variable so tests won’t break at pull time; specifically, check
the registry for the exact tag name and replace the literal in the
WellKnownVaultImage constant or add logic to read VAULT_IMAGE from env before
falling back to the constant.
- Around line 285-286: The current repo/tag extraction using repeated
strings.Split on d.config.Image is brittle (fails for registry:port and no-tag
images); replace it with robust parsing: compute lastSlash :=
strings.LastIndex(d.config.Image, "/") and lastColon :=
strings.LastIndex(d.config.Image, ":") and treat a tag as present only if
lastColon > lastSlash; then set tag := "" if no tag else
d.config.Image[lastColon+1:] and repo := d.config.Image[:lastColon] (or repo :=
d.config.Image if no tag); alternatively import and use
docker/distribution/reference.ParseNormalizedNamed to get name and tag—update
the expression building the Helm values (the strings.TrimSuffix/strings.Split
usage) to use these repo and tag variables instead of the current repeated
strings.Split logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f264afb8-e053-440b-b3d4-13c7242581d2

📥 Commits

Reviewing files that changed from the base of the PR and between 4d952ac and a19ea1d.

📒 Files selected for processing (3)
  • test/library/encryption/kms/GUIDE.md
  • test/library/encryption/kms/vault_kube_kms_deployer.go
  • test/library/encryption/kms/vault_kube_kms_deployer_test.go

Comment thread test/library/encryption/kms/GUIDE.md Outdated
@sandeepknd sandeepknd force-pushed the deploy_vault branch 2 times, most recently from 374fd2d to be568d8 Compare April 29, 2026 09:28
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (13)
test/library/encryption/kms/vault_kube_kms_deployer.go (9)

301-307: ⚠️ Potential issue | 🟠 Major

Use a unique temp values file to avoid cross-test collisions.

Line 302 uses a fixed path (/tmp/vault-values.yaml), which is race-prone in parallel runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 301 -
307, The temp values file uses a fixed path ("valuesFile" variable) which can
collide in parallel tests; update the code that writes values (the block
creating and writing valuesFile) to use a unique temp file (e.g., via
os.CreateTemp / ioutil.TempFile) so each test gets its own filename, write the
valuesContent to that file, ensure you close the file handle before using it,
keep the defer os.Remove on the generated filename, and preserve the permission
semantics and error wrapping in the existing function (referencing the
valuesFile variable and the surrounding write/check/error-return logic).

647-647: ⚠️ Potential issue | 🔴 Critical

Command and output logging may leak Vault secrets.

Raw command/output logging can expose root tokens, unseal keys, and secret IDs.

Also applies to: 689-694

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` at line 647, The test
currently logs raw commands and outputs (d.t.Logf("Executing in pod %s/%s: %v",
d.config.Namespace, podName, command)) which may leak Vault secrets; update the
logging in the pod execution helper (the call site using d.t.Logf with podName,
d.config.Namespace and command) to either remove raw command/output logging or
redact sensitive tokens/keys before logging (e.g., replace patterns like "vault
token", "unseal", "root", "secret_id" or known env var values with
"<REDACTED>"); apply the same change to the other occurrences around lines
referenced (the other d.t.Logf usages at the pod execution helper and the block
at 689-694) so no raw Vault secrets are emitted to test logs.

615-630: ⚠️ Potential issue | 🟠 Major

Do not log AppRole credentials in plain text.

Line 615 and Line 630 print role_id and secret_id directly to test logs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 615 -
630, The test currently logs AppRole credentials (roleID and secretID) via
d.t.Logf and stores secretID in d.config.AppRoleSecretID; remove or redact those
plain-text logs to avoid leaking secrets. Replace the d.t.Logf("AppRole Role ID:
%s", roleID) and d.t.Logf("AppRole Secret ID: %s", secretID) calls with either
no logging or a redacted/truncated form (e.g., only show last 4 chars or
"[REDACTED]") and keep storing the secret in d.config.AppRoleSecretID as needed;
ensure any reference to secret-id generated via execInPodWithEnv remains
non-printed and that error logs do not include the secret value.

108-113: ⚠️ Potential issue | 🟠 Major

Guard against nil config before applying defaults.

Line 112 dereferences config directly; NewVaultDeployer(..., nil) will panic.

Suggested fix
 func NewVaultDeployer(t testing.TB, kubeClient kubernetes.Interface, restConfig *rest.Config, config *VaultConfig) *VaultDeployer {
 	t.Helper()
+	if config == nil {
+		config = &VaultConfig{}
+	}
 
 	// Set defaults
 	if config.Namespace == "" {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 108 -
113, NewVaultDeployer currently dereferences config without nil-check; update
NewVaultDeployer to guard against a nil *VaultConfig by checking if config ==
nil and, if so, initializing it to an empty VaultConfig (e.g., config =
&VaultConfig{}) before applying defaults like WellKnownVaultNamespace and
continuing to construct and return the *VaultDeployer; reference
NewVaultDeployer, VaultConfig, WellKnownVaultNamespace, and VaultDeployer when
locating and changing the code.

247-257: ⚠️ Potential issue | 🟠 Major

Helm values ignore VaultConfig fields.

StorageSize, StorageClass, Mode, and Replicas are exposed in VaultConfig but hardcoded in values (size: 2Gi, storageClass: null, standalone.enabled: true, ha.enabled: false).

Also applies to: 271-285

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 247 -
257, The Helm values template stored in valuesContent currently hardcodes
storage size, storageClass, accessMode and HA/standalone flags; update the
template generation to inject the VaultConfig fields (StorageSize, StorageClass,
Mode, Replicas) instead of literals: use VaultConfig.StorageSize for size,
VaultConfig.StorageClass (or null when empty) for storageClass, VaultConfig.Mode
(or VaultConfig.AccessMode) for accessMode, and map VaultConfig.Replicas to
enable HA vs standalone (set standalone.enabled when Replicas==1, ha.enabled
when Replicas>1) so valuesContent reflects the VaultConfig passed into the
deployer function.

442-445: ⚠️ Potential issue | 🔴 Critical

Fail fast when Vault is already initialized.

Returning "EXISTING_ROOT_TOKEN" and dummy keys causes downstream auth/unseal failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 442 -
445, The current branch in d.isVaultInitialized(ctx, podName) returns a fake
root token and dummy keys which causes downstream auth/unseal failures; instead
fail fast by returning no token/keys and a clear error. Modify the block that
calls d.isVaultInitialized(ctx, podName) so it logs that Vault is already
initialized and then returns an error (e.g., fmt.Errorf including
podName/context) with empty token and nil keys rather than "EXISTING_ROOT_TOKEN"
and the fake ["KEY1","KEY2","KEY3"] values.

493-499: ⚠️ Potential issue | 🟠 Major

Unseal loop can panic on short key lists.

Line 498 indexes unsealKeys[j] with a hardcoded j < 3 and no length validation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 493 -
499, The unseal loop in vault_kube_kms_deployer.go indexes unsealKeys with a
hardcoded j < 3 and can panic if unsealKeys has fewer than 3 elements; change
the loop in the block that builds unsealCmd (the for j := 0; j < 3; j++ { ...
unsealKeys[j] }) to bound j by the actual slice length (e.g., for j := 0; j < 3
&& j < len(unsealKeys); j++ or iterate using for j := range unsealKeys and stop
after 3) and add a nil/length check before the loop to avoid dereferencing an
empty slice.

659-673: ⚠️ Potential issue | 🟠 Major

Quote env values before composing sh -c.

Line 662 injects raw env values into a shell command; metacharacters can alter execution.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 659 -
673, Env values are being injected raw into envPrefix before composing the sh -c
command, which allows metacharacters to alter execution; update the loop that
builds envPrefix (the envPrefix variable populated from env slice) to
shell-quote each env value (and/or use strconv.Quote or a dedicated shellEscape
helper) so you emit NAME='value' (with proper escaping of any embedded single
quotes) instead of raw values, and keep using shellJoin and the stdin heredoc
logic as-is before appending to ocArgs ("sh", "-c", fullCmd).

285-286: ⚠️ Potential issue | 🟡 Minor

Image repository/tag split is fragile.

The split/trim logic breaks for registries with ports and images without explicit tags.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 285 -
286, The image split logic using strings.Split and strings.TrimSuffix is fragile
for registries with ports and images without tags; replace it with a safe split
that finds the last '/' and the last ':' and only treats the part after ':' as a
tag if that colon occurs after the last '/', otherwise treat the whole string as
the repository and use a default tag (e.g., "latest") or no tag as appropriate.
Locate code referencing d.config.Image and the current strings.Split/TrimSuffix
usage, compute lastSlash := strings.LastIndex(image, "/") and lastColon :=
strings.LastIndex(image, ":"), then if lastColon > lastSlash set repo =
image[:lastColon] and tag = image[lastColon+1:] else set repo = image and tag =
"latest" (or nil per project conventions), and use those repo/tag values instead
of the current split logic.
test/library/encryption/kms/vault_kube_kms_deployer_test.go (3)

546-549: ⚠️ Potential issue | 🟠 Major

Namespace deletion check should only treat NotFound as success.

Any API error currently flips deleted=true, which can mask transient failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 546
- 549, The loop that checks namespace deletion currently sets deleted = true for
any error returned by kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace,
metav1.GetOptions{}); change it to treat success only when the error is a
NotFound; i.e., import k8s.io/apimachinery/pkg/api/errors (apierrors) and
replace the unconditional error branch with a conditional that sets deleted =
true only if apierrors.IsNotFound(err), otherwise propagate or record the
unexpected error (fail the test or continue retrying) so transient API errors
don't incorrectly mark the namespace as deleted; update references to
kubeClient.CoreV1().Namespaces().Get, deleted, and testNamespace accordingly.

329-346: ⚠️ Potential issue | 🟠 Major

Do not silently skip Vault status verification failures.

If oc exec or JSON parsing fails, the test currently continues and can pass falsely.

Suggested fix
 statusOutput, err := statusCmd.Output()
-if err == nil {
-	var vaultStatus struct {
-		Initialized bool `json:"initialized"`
-		Sealed      bool `json:"sealed"`
-	}
-	if err := json.Unmarshal(statusOutput, &vaultStatus); err == nil {
-		...
-	}
-}
+if err != nil {
+	t.Fatalf("Failed to run vault status: %v; output=%s", err, string(statusOutput))
+}
+var vaultStatus struct {
+	Initialized bool `json:"initialized"`
+	Sealed      bool `json:"sealed"`
+}
+if err := json.Unmarshal(statusOutput, &vaultStatus); err != nil {
+	t.Fatalf("Failed to parse vault status JSON: %v; output=%s", err, string(statusOutput))
+}
+...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 329
- 346, The current check silently ignores failures from the `oc exec` call (when
`err != nil`) and from `json.Unmarshal(statusOutput, &vaultStatus)`; update the
block that references `statusOutput` and `vaultStatus` so that any `err` from
the exec call results in a test failure (e.g., `t.Fatalf` or `t.Fatalf("...:
%v", err)`) and any JSON unmarshal error likewise fails the test and logs
`statusOutput` for debugging; keep the subsequent initialized/sealed assertions
but ensure they run only after successful unmarshal so failures are reported
rather than skipped.

359-361: ⚠️ Potential issue | 🟡 Minor

Avoid logging full AppRole credentials.

Line 360 and Line 361 expose sensitive values in CI logs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 359
- 361, The test currently logs full AppRole credentials via t.Logf for roleID
and secretID which exposes sensitive data; remove or redact these full values in
the test (vault_kube_kms_deployer_test.go) by replacing the t.Logf calls that
reference roleID and secretID with either non-sensitive messages (e.g., "AppRole
credentials saved to CI secret store") or masked output (e.g., show only
first/last N characters of roleID/secretID) and ensure any storage is done via
secure CI secrets/artifacts rather than test logs; update the t.Logf calls that
mention ROLE_ID and SECRET_ID accordingly.
test/library/encryption/kms/deploy_vault_readme.md (1)

199-200: ⚠️ Potential issue | 🟠 Major

Documentation still promotes printing sensitive credentials.

The guide says credentials are output and shows full sample ROLE_ID/SECRET_ID; this normalizes secret leakage in logs.

Also applies to: 216-217, 644-645

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/deploy_vault_readme.md` around lines 199 - 200,
The README currently prints and shows full AppRole credentials (ROLE_ID and
SECRET_ID) and instructs outputting them for inspection; remove or redact any
example literal ROLE_ID/SECRET_ID values and stop recommending printing secrets
to logs or console. Update the "Outputs AppRole credentials" and any sample
blocks that display ROLE_ID/SECRET_ID to instead instruct users to store
credentials in a secure secret store or CI/CD secret variable, show only masked
placeholders (e.g., <REDACTED_ROLE_ID>), and replace guidance to "Leaves
deployment running" with a note about cleaning up or using ephemeral,
access-controlled inspection methods. Ensure all occurrences of
ROLE_ID/SECRET_ID in the document are changed to masked placeholders and
guidance points to secure storage and safe handling.
🧹 Nitpick comments (1)
test/library/encryption/kms/deploy_vault_readme.md (1)

78-79: Avoid hardcoded machine-specific absolute paths in docs.

These paths are not portable and will fail for most contributors.

Also applies to: 170-171, 188-189, 284-285

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/deploy_vault_readme.md` around lines 78 - 79,
Replace hardcoded absolute paths like the snippet "cd
/home/skundu/automation/library-go/test/library/encryption/kms" with a portable
alternative (e.g., a relative path from the repo root such as "cd
../../..../test/library/encryption/kms", a placeholder like
"<repo-root>/test/library/encryption/kms", or an environment variable like
"$REPO_ROOT/test/library/encryption/kms"); update all other occurrences of the
same pattern in the document so examples are portable for other contributors and
add a brief note telling users how to set the placeholder or navigate from the
repository root.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/library/encryption/kms/deploy_vault_readme.md`:
- Line 70: The project tree in the README mistakenly references "GUIDE.md"
instead of this file's actual name; update the reference string "GUIDE.md" to
"deploy_vault_readme.md" (or rename the target file to match if intended) so the
project structure accurately reflects the current filename; search for the
literal "GUIDE.md" in this document and replace it with "deploy_vault_readme.md"
and verify links or TOC entries that use the old name are updated accordingly.
- Around line 65-71: The markdown contains unlabelled fenced code blocks (e.g.,
the project tree and several example blocks) that trigger markdownlint MD040;
update each triple-backtick block in the README/GUIDE markdown so it includes an
appropriate language tag (for example ```text for plain output, ```bash for
shell snippets, ```yaml for manifests, or ```log for logs) — search for the
unlabeled ``` blocks (including the project tree listing that shows
vault_kube_kms_deployer.go, vault_kube_kms_deployer_test.go, vault.hclic,
GUIDE.md) and add the correct language identifier to each fence.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 565-569: The test currently treats any error from
kubeClient.RbacV1().ClusterRoleBindings().Get(ctx, clusterRoleBindingName,
metav1.GetOptions{}) as "deleted or doesn't exist"; change the logic to
distinguish a NotFound error from other API errors by using
apierrors.IsNotFound(err) (import k8s.io/apimachinery/pkg/api/errors as
apierrors), log the "deleted/doesn't exist" message only when IsNotFound(err) is
true, and for non-nil/non-NotFound errors log the actual error and fail the test
(t.Fatalf or t.Fatalf-like) so RBAC/API failures are surfaced; keep references
to clusterRoleBindingName and ctx in the updated handling.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 389-392: The wait closure currently returns true when
pod.Status.Phase == corev1.PodRunning which is insufficient; change the
readiness check to ensure the Pod is Running and has a PodReady condition with
Status == corev1.ConditionTrue by inspecting pod.Status.Conditions (i.e., keep
corev1.PodRunning but also verify the condition of type corev1.PodReady is
ConditionTrue) before returning true so commands only run after the container is
ready.

---

Duplicate comments:
In `@test/library/encryption/kms/deploy_vault_readme.md`:
- Around line 199-200: The README currently prints and shows full AppRole
credentials (ROLE_ID and SECRET_ID) and instructs outputting them for
inspection; remove or redact any example literal ROLE_ID/SECRET_ID values and
stop recommending printing secrets to logs or console. Update the "Outputs
AppRole credentials" and any sample blocks that display ROLE_ID/SECRET_ID to
instead instruct users to store credentials in a secure secret store or CI/CD
secret variable, show only masked placeholders (e.g., <REDACTED_ROLE_ID>), and
replace guidance to "Leaves deployment running" with a note about cleaning up or
using ephemeral, access-controlled inspection methods. Ensure all occurrences of
ROLE_ID/SECRET_ID in the document are changed to masked placeholders and
guidance points to secure storage and safe handling.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 546-549: The loop that checks namespace deletion currently sets
deleted = true for any error returned by
kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace, metav1.GetOptions{});
change it to treat success only when the error is a NotFound; i.e., import
k8s.io/apimachinery/pkg/api/errors (apierrors) and replace the unconditional
error branch with a conditional that sets deleted = true only if
apierrors.IsNotFound(err), otherwise propagate or record the unexpected error
(fail the test or continue retrying) so transient API errors don't incorrectly
mark the namespace as deleted; update references to
kubeClient.CoreV1().Namespaces().Get, deleted, and testNamespace accordingly.
- Around line 329-346: The current check silently ignores failures from the `oc
exec` call (when `err != nil`) and from `json.Unmarshal(statusOutput,
&vaultStatus)`; update the block that references `statusOutput` and
`vaultStatus` so that any `err` from the exec call results in a test failure
(e.g., `t.Fatalf` or `t.Fatalf("...: %v", err)`) and any JSON unmarshal error
likewise fails the test and logs `statusOutput` for debugging; keep the
subsequent initialized/sealed assertions but ensure they run only after
successful unmarshal so failures are reported rather than skipped.
- Around line 359-361: The test currently logs full AppRole credentials via
t.Logf for roleID and secretID which exposes sensitive data; remove or redact
these full values in the test (vault_kube_kms_deployer_test.go) by replacing the
t.Logf calls that reference roleID and secretID with either non-sensitive
messages (e.g., "AppRole credentials saved to CI secret store") or masked output
(e.g., show only first/last N characters of roleID/secretID) and ensure any
storage is done via secure CI secrets/artifacts rather than test logs; update
the t.Logf calls that mention ROLE_ID and SECRET_ID accordingly.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 301-307: The temp values file uses a fixed path ("valuesFile"
variable) which can collide in parallel tests; update the code that writes
values (the block creating and writing valuesFile) to use a unique temp file
(e.g., via os.CreateTemp / ioutil.TempFile) so each test gets its own filename,
write the valuesContent to that file, ensure you close the file handle before
using it, keep the defer os.Remove on the generated filename, and preserve the
permission semantics and error wrapping in the existing function (referencing
the valuesFile variable and the surrounding write/check/error-return logic).
- Line 647: The test currently logs raw commands and outputs
(d.t.Logf("Executing in pod %s/%s: %v", d.config.Namespace, podName, command))
which may leak Vault secrets; update the logging in the pod execution helper
(the call site using d.t.Logf with podName, d.config.Namespace and command) to
either remove raw command/output logging or redact sensitive tokens/keys before
logging (e.g., replace patterns like "vault token", "unseal", "root",
"secret_id" or known env var values with "<REDACTED>"); apply the same change to
the other occurrences around lines referenced (the other d.t.Logf usages at the
pod execution helper and the block at 689-694) so no raw Vault secrets are
emitted to test logs.
- Around line 615-630: The test currently logs AppRole credentials (roleID and
secretID) via d.t.Logf and stores secretID in d.config.AppRoleSecretID; remove
or redact those plain-text logs to avoid leaking secrets. Replace the
d.t.Logf("AppRole Role ID: %s", roleID) and d.t.Logf("AppRole Secret ID: %s",
secretID) calls with either no logging or a redacted/truncated form (e.g., only
show last 4 chars or "[REDACTED]") and keep storing the secret in
d.config.AppRoleSecretID as needed; ensure any reference to secret-id generated
via execInPodWithEnv remains non-printed and that error logs do not include the
secret value.
- Around line 108-113: NewVaultDeployer currently dereferences config without
nil-check; update NewVaultDeployer to guard against a nil *VaultConfig by
checking if config == nil and, if so, initializing it to an empty VaultConfig
(e.g., config = &VaultConfig{}) before applying defaults like
WellKnownVaultNamespace and continuing to construct and return the
*VaultDeployer; reference NewVaultDeployer, VaultConfig,
WellKnownVaultNamespace, and VaultDeployer when locating and changing the code.
- Around line 247-257: The Helm values template stored in valuesContent
currently hardcodes storage size, storageClass, accessMode and HA/standalone
flags; update the template generation to inject the VaultConfig fields
(StorageSize, StorageClass, Mode, Replicas) instead of literals: use
VaultConfig.StorageSize for size, VaultConfig.StorageClass (or null when empty)
for storageClass, VaultConfig.Mode (or VaultConfig.AccessMode) for accessMode,
and map VaultConfig.Replicas to enable HA vs standalone (set standalone.enabled
when Replicas==1, ha.enabled when Replicas>1) so valuesContent reflects the
VaultConfig passed into the deployer function.
- Around line 442-445: The current branch in d.isVaultInitialized(ctx, podName)
returns a fake root token and dummy keys which causes downstream auth/unseal
failures; instead fail fast by returning no token/keys and a clear error. Modify
the block that calls d.isVaultInitialized(ctx, podName) so it logs that Vault is
already initialized and then returns an error (e.g., fmt.Errorf including
podName/context) with empty token and nil keys rather than "EXISTING_ROOT_TOKEN"
and the fake ["KEY1","KEY2","KEY3"] values.
- Around line 493-499: The unseal loop in vault_kube_kms_deployer.go indexes
unsealKeys with a hardcoded j < 3 and can panic if unsealKeys has fewer than 3
elements; change the loop in the block that builds unsealCmd (the for j := 0; j
< 3; j++ { ... unsealKeys[j] }) to bound j by the actual slice length (e.g., for
j := 0; j < 3 && j < len(unsealKeys); j++ or iterate using for j := range
unsealKeys and stop after 3) and add a nil/length check before the loop to avoid
dereferencing an empty slice.
- Around line 659-673: Env values are being injected raw into envPrefix before
composing the sh -c command, which allows metacharacters to alter execution;
update the loop that builds envPrefix (the envPrefix variable populated from env
slice) to shell-quote each env value (and/or use strconv.Quote or a dedicated
shellEscape helper) so you emit NAME='value' (with proper escaping of any
embedded single quotes) instead of raw values, and keep using shellJoin and the
stdin heredoc logic as-is before appending to ocArgs ("sh", "-c", fullCmd).
- Around line 285-286: The image split logic using strings.Split and
strings.TrimSuffix is fragile for registries with ports and images without tags;
replace it with a safe split that finds the last '/' and the last ':' and only
treats the part after ':' as a tag if that colon occurs after the last '/',
otherwise treat the whole string as the repository and use a default tag (e.g.,
"latest") or no tag as appropriate. Locate code referencing d.config.Image and
the current strings.Split/TrimSuffix usage, compute lastSlash :=
strings.LastIndex(image, "/") and lastColon := strings.LastIndex(image, ":"),
then if lastColon > lastSlash set repo = image[:lastColon] and tag =
image[lastColon+1:] else set repo = image and tag = "latest" (or nil per project
conventions), and use those repo/tag values instead of the current split logic.

---

Nitpick comments:
In `@test/library/encryption/kms/deploy_vault_readme.md`:
- Around line 78-79: Replace hardcoded absolute paths like the snippet "cd
/home/skundu/automation/library-go/test/library/encryption/kms" with a portable
alternative (e.g., a relative path from the repo root such as "cd
../../..../test/library/encryption/kms", a placeholder like
"<repo-root>/test/library/encryption/kms", or an environment variable like
"$REPO_ROOT/test/library/encryption/kms"); update all other occurrences of the
same pattern in the document so examples are portable for other contributors and
add a brief note telling users how to set the placeholder or navigate from the
repository root.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3880f5ca-6c72-4c8e-ad3a-36eb0ad7842e

📥 Commits

Reviewing files that changed from the base of the PR and between a19ea1d and be568d8.

📒 Files selected for processing (3)
  • test/library/encryption/kms/deploy_vault_readme.md
  • test/library/encryption/kms/vault_kube_kms_deployer.go
  • test/library/encryption/kms/vault_kube_kms_deployer_test.go

Comment thread test/library/encryption/kms/deploy_vault_readme.md Outdated
Comment thread test/library/encryption/kms/deploy_vault_readme.md Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
@sandeepknd
Copy link
Copy Markdown
Author

@coderabbitai resume

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 29, 2026

✅ Actions performed

Reviews resumed.

Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
@gangwgr
Copy link
Copy Markdown
Contributor

gangwgr commented Apr 29, 2026

try to use helic file content because we stored content not file in secret i

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
test/library/encryption/kms/vault_kube_kms_deployer_test.go (1)

526-529: ⚠️ Potential issue | 🟡 Minor

Differentiate NotFound from other API errors during namespace checks.

Both checks currently treat any error as “missing/deleted”, which can hide transient API/RBAC failures.

✅ Proposed fix
-		ns, err := kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace, metav1.GetOptions{})
-		if err != nil {
-			t.Logf("  ⚠ Namespace %s not found (may have been deleted already)", testNamespace)
-			continue
-		}
+		ns, err := kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace, metav1.GetOptions{})
+		if err != nil {
+			if apierrors.IsNotFound(err) {
+				t.Logf("  ⚠ Namespace %s not found (may have been deleted already)", testNamespace)
+				continue
+			}
+			t.Errorf("  ✗ Failed to query namespace %s: %v", testNamespace, err)
+			continue
+		}
...
-			_, err = kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace, metav1.GetOptions{})
-			if err != nil {
-				deleted = true
-				break
-			}
+			_, err = kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace, metav1.GetOptions{})
+			if err != nil {
+				if apierrors.IsNotFound(err) {
+					deleted = true
+					break
+				}
+				t.Logf("  ... namespace check transient error: %v", err)
+			}

Also applies to: 567-570

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` around lines 526
- 529, When calling kubeClient.CoreV1().Namespaces().Get(ctx, testNamespace,
metav1.GetOptions{}) in the test, don't treat every error as "namespace
missing"; instead use the k8s API error helper
(k8s.io/apimachinery/pkg/api/errors). Replace the unconditional continue branch
with a conditional: if apierrors.IsNotFound(err) then t.Logf(... and continue)
else fail the test or surface the error (e.g., t.Fatalf or t.Errorf) so
transient API/RBAC errors are not hidden; apply the same change to the other
occurrence that calls Namespaces().Get for testNamespace.
test/library/encryption/kms/vault_kube_kms_deployer.go (1)

283-289: ⚠️ Potential issue | 🟠 Major

DeploymentMode currently has no effect in Helm values.

Both branches set standalone.enabled=true and ha.enabled=false, so DeploymentModeCloud behaves the same as local in this path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 283 -
289, The Helm values for Vault deployment ignore d.config.Mode because
standaloneEnabled and haEnabled are set to the same values in both branches;
update the logic in vault_kube_kms_deployer.go where standaloneEnabled and
haEnabled are defined so that when d.config.Mode == DeploymentModeCloud you set
standaloneEnabled="false" and haEnabled="true" (or otherwise enable HA for
cloud), while keeping the local/default branch as standaloneEnabled="true" and
haEnabled="false"; ensure you reference the DeploymentModeCloud constant and the
variables standaloneEnabled/haEnabled used later when rendering Helm values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/library/encryption/kms/deploy_vault_readme.md`:
- Around line 521-522: The README's default replicas value is incorrect; update
the documentation to match the code which uses DefaultVaultReplicas (3) as the
default in NewVaultDeployer. Change the comment/value for the Replicas field in
the doc from "default: 1" to "default: 3" (or explicitly reference
DefaultVaultReplicas) and ensure the description for Replicas and Mode still
references the types DeploymentMode and the NewVaultDeployer behavior so docs
reflect the actual default used by NewVaultDeployer.
- Around line 216-217: The sample output in deploy_vault_readme.md exposes full
credentials for ROLE_ID and SECRET_ID; update the sample lines that show
vault_kube_kms_deployer_test.go output so these values are redacted or truncated
(e.g. ROLE_ID: f4ce0bd3-2d**-****-****-****d01246a4f or ROLE_ID: f4ce0bd3...a4f,
SECRET_ID: 58fe4d86...124c or SECRET_ID: ********-****-****-****-************),
keeping the surrounding context intact and only replacing the secret string
tokens ROLE_ID and SECRET_ID in the example output.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 681-683: Current test code logs AppRole credentials in plaintext
(roleID/secretID) via d.t.Logf while assigning to
d.config.AppRoleRoleID/d.config.AppRoleSecretID; remove or redact those logs so
sensitive values are not emitted. Locate the uses of roleID and secretID and
either delete the d.t.Logf lines or replace the logged values with a constant
mask (e.g., "[REDACTED]") or a length-only hint, ensuring assignments to
d.config.AppRoleRoleID and d.config.AppRoleSecretID remain but no plaintext
credentials are written to logs.

---

Duplicate comments:
In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 526-529: When calling kubeClient.CoreV1().Namespaces().Get(ctx,
testNamespace, metav1.GetOptions{}) in the test, don't treat every error as
"namespace missing"; instead use the k8s API error helper
(k8s.io/apimachinery/pkg/api/errors). Replace the unconditional continue branch
with a conditional: if apierrors.IsNotFound(err) then t.Logf(... and continue)
else fail the test or surface the error (e.g., t.Fatalf or t.Errorf) so
transient API/RBAC errors are not hidden; apply the same change to the other
occurrence that calls Namespaces().Get for testNamespace.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 283-289: The Helm values for Vault deployment ignore d.config.Mode
because standaloneEnabled and haEnabled are set to the same values in both
branches; update the logic in vault_kube_kms_deployer.go where standaloneEnabled
and haEnabled are defined so that when d.config.Mode == DeploymentModeCloud you
set standaloneEnabled="false" and haEnabled="true" (or otherwise enable HA for
cloud), while keeping the local/default branch as standaloneEnabled="true" and
haEnabled="false"; ensure you reference the DeploymentModeCloud constant and the
variables standaloneEnabled/haEnabled used later when rendering Helm values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2c1ec926-d63d-4cd4-90bd-3e915b53dee9

📥 Commits

Reviewing files that changed from the base of the PR and between a19ea1d and 02fd41b.

📒 Files selected for processing (3)
  • test/library/encryption/kms/deploy_vault_readme.md
  • test/library/encryption/kms/vault_kube_kms_deployer.go
  • test/library/encryption/kms/vault_kube_kms_deployer_test.go

Comment thread test/library/encryption/kms/deploy_vault_readme.md Outdated
Comment thread test/library/encryption/kms/deploy_vault_readme.md Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
@sandeepknd sandeepknd force-pushed the deploy_vault branch 2 times, most recently from 493671b to c6d30db Compare April 29, 2026 13:36
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
test/library/encryption/kms/vault_kube_kms_deployer.go (3)

170-171: Avoid arbitrary time.Sleep; prefer polling for readiness.

Using time.Sleep(5 * time.Second) introduces a race condition where Vault may not be ready to accept commands, or wastes time if it's ready sooner. Consider polling Vault's status endpoint instead.

♻️ Suggested approach
-	// Wait a bit for Vault to be ready to accept commands
-	time.Sleep(5 * time.Second)
+	// Wait for Vault to be ready to accept commands
+	if err := d.waitForVaultReady(ctx); err != nil {
+		return fmt.Errorf("Vault not ready to accept commands: %w", err)
+	}

Implement waitForVaultReady to poll vault status until it returns without connection errors.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 170 -
171, Replace the arbitrary time.Sleep(5 * time.Second) by implementing and
calling waitForVaultReady that polls Vault's status via the "vault status"
command (or HTTP status endpoint) until it reports ready or a timeout elapses;
use a context with deadline, retry loop with short backoff, treat transient
connection errors as retryable, and return an error if timeout is reached.
Remove the Sleep call and invoke waitForVaultReady from the same deployment flow
(where time.Sleep is used) so the deployer only proceeds when vault status
indicates readiness.

543-546: Differentiate "already exists" errors from other failures.

Currently, all errors from enabling transit engine are logged and ignored. This could mask permission errors, network failures, or other critical issues. Consider checking for specific "already enabled" error patterns.

♻️ Suggested approach
 	if _, _, err := d.execInPodWithEnv(ctx, podName, "vault", enableTransitCmd, env); err != nil {
-		// Ignore error if already enabled
-		d.t.Logf("Transit engine may already be enabled: %v", err)
+		// Only ignore "path is already in use" errors
+		if !strings.Contains(err.Error(), "path is already in use") {
+			return fmt.Errorf("failed to enable transit engine: %w", err)
+		}
+		d.t.Logf("Transit engine already enabled, continuing")
 	}

This same pattern applies to Lines 556-558 (transit key) and Lines 569-571 (AppRole auth).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 543 -
546, The current code calls d.execInPodWithEnv(ctx, podName, "vault",
enableTransitCmd, env) and unconditionally ignores any error by logging "Transit
engine may already be enabled"; change this to inspect the returned err and only
swallow it when it matches the Vault "already enabled"/"already exists" pattern
(e.g., contains "already enabled" or the known API error string); for any other
error, propagate or fail the test (return the err or call d.t.Fatalf/d.t.Errorf)
so permission/network/etc. failures are not masked. Apply the same fix for the
transit key and AppRole auth calls (the other execInPodWithEnv usages noted in
the comment) so each checks error content before deciding to ignore versus fail.

767-778: shellJoin doesn't escape all shell metacharacters.

The function only handles spaces and single quotes. Characters like $, `, \, ", !, etc. could still cause shell injection or unexpected behavior in edge cases.

♻️ Consider a more robust approach

For test utilities, the current implementation may be sufficient, but consider using %q formatting or always single-quoting all arguments:

 func shellJoin(args []string) string {
 	quoted := make([]string, len(args))
 	for i, arg := range args {
-		if strings.Contains(arg, " ") || strings.Contains(arg, "'") {
-			quoted[i] = fmt.Sprintf("'%s'", strings.ReplaceAll(arg, "'", "'\\''"))
-		} else {
-			quoted[i] = arg
-		}
+		// Always quote to be safe
+		quoted[i] = fmt.Sprintf("'%s'", strings.ReplaceAll(arg, "'", "'\\''"))
 	}
 	return strings.Join(quoted, " ")
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go` around lines 767 -
778, shellJoin currently only handles spaces and single quotes and can still
allow shell metacharacters to cause injection; replace its ad-hoc logic in the
shellJoin function with a robust quoting approach such as using strconv.Quote
(or fmt.Sprintf("%q")) on each argument to produce a safely quoted
representation and then strings.Join the results, ensuring all characters like
$, `, \, ", !, etc. are escaped consistently instead of only handling spaces and
single quotes.
test/library/encryption/kms/vault_kube_kms_deployer_test.go (1)

113-113: Minor: Use slice literal with capacity for efficiency.

append(vaultSharedManifestFiles, vaultDeploymentManifestFile) creates a new slice. For clarity and slight efficiency, consider pre-allocating.

♻️ Suggested improvement
-	allManifests := append(vaultSharedManifestFiles, vaultDeploymentManifestFile)
+	allManifests := make([]string, 0, len(vaultSharedManifestFiles)+1)
+	allManifests = append(allManifests, vaultSharedManifestFiles...)
+	allManifests = append(allManifests, vaultDeploymentManifestFile)

Or simply:

allManifests := append([]string{}, vaultSharedManifestFiles...)
allManifests = append(allManifests, vaultDeploymentManifestFile)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go` at line 113,
Replace the direct append call that allocates a new slice for allManifests by
preallocating capacity and copying the shared files first: create a slice with
make([]string, 0, len(vaultSharedManifestFiles)+1), append
vaultSharedManifestFiles... into it, then append vaultDeploymentManifestFile;
this targets the allManifests assignment that currently uses
append(vaultSharedManifestFiles, vaultDeploymentManifestFile) to avoid an
unnecessary allocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/library/encryption/kms/deploy_vault_readme.md`:
- Around line 77-79: Replace hardcoded, user-specific absolute paths in
deploy_vault_readme.md (e.g., the cd command instances that use
"/home/skundu/automation/library-go/...") with portable alternatives; update the
occurrences referenced in the diff (the cd lines around the current examples and
the similar lines at the other two locations) to use relative paths like "cd
test/library/encryption/kms" or a placeholder such as
"${REPO_ROOT}/test/library/encryption/kms" so the README is not tied to a
specific user's home directory.

In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Around line 516-522: The test is checking for a StatefulSet but the deployer
actually creates a Deployment; update the check in the test (where
kubeClient.AppsV1().StatefulSets(testNamespace).Get(...) is used) to query the
Deployments API instead
(kubeClient.AppsV1().Deployments(testNamespace).Get(...)) for the "vault"
resource name (testNamespace and "vault" stay the same) so the test inspects the
correct resource created by the deployer (see vault_kube_kms_deployer.go where
the Deployment is created).

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 974-979: The template references a PVC named "vault-data" in cloud
mode but no PVC manifest is provided; add a new manifest entry "vault_pvc.yaml"
to the vaultSharedManifestFiles slice and add a corresponding template entry in
getVaultManifestTemplate that emits a PersistentVolumeClaim named "vault-data"
when .Mode == "cloud" (include .Namespace, optional .StorageClass, and
.StorageSize in the spec); this ensures the claim referenced by the Deployment's
persistentVolumeClaim: claimName: vault-data exists in cloud mode.

---

Nitpick comments:
In `@test/library/encryption/kms/vault_kube_kms_deployer_test.go`:
- Line 113: Replace the direct append call that allocates a new slice for
allManifests by preallocating capacity and copying the shared files first:
create a slice with make([]string, 0, len(vaultSharedManifestFiles)+1), append
vaultSharedManifestFiles... into it, then append vaultDeploymentManifestFile;
this targets the allManifests assignment that currently uses
append(vaultSharedManifestFiles, vaultDeploymentManifestFile) to avoid an
unnecessary allocation.

In `@test/library/encryption/kms/vault_kube_kms_deployer.go`:
- Around line 170-171: Replace the arbitrary time.Sleep(5 * time.Second) by
implementing and calling waitForVaultReady that polls Vault's status via the
"vault status" command (or HTTP status endpoint) until it reports ready or a
timeout elapses; use a context with deadline, retry loop with short backoff,
treat transient connection errors as retryable, and return an error if timeout
is reached. Remove the Sleep call and invoke waitForVaultReady from the same
deployment flow (where time.Sleep is used) so the deployer only proceeds when
vault status indicates readiness.
- Around line 543-546: The current code calls d.execInPodWithEnv(ctx, podName,
"vault", enableTransitCmd, env) and unconditionally ignores any error by logging
"Transit engine may already be enabled"; change this to inspect the returned err
and only swallow it when it matches the Vault "already enabled"/"already exists"
pattern (e.g., contains "already enabled" or the known API error string); for
any other error, propagate or fail the test (return the err or call
d.t.Fatalf/d.t.Errorf) so permission/network/etc. failures are not masked. Apply
the same fix for the transit key and AppRole auth calls (the other
execInPodWithEnv usages noted in the comment) so each checks error content
before deciding to ignore versus fail.
- Around line 767-778: shellJoin currently only handles spaces and single quotes
and can still allow shell metacharacters to cause injection; replace its ad-hoc
logic in the shellJoin function with a robust quoting approach such as using
strconv.Quote (or fmt.Sprintf("%q")) on each argument to produce a safely quoted
representation and then strings.Join the results, ensuring all characters like
$, `, \, ", !, etc. are escaped consistently instead of only handling spaces and
single quotes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2b37e46d-6193-4001-8292-2703029069d1

📥 Commits

Reviewing files that changed from the base of the PR and between 02fd41b and 493671b.

📒 Files selected for processing (3)
  • test/library/encryption/kms/deploy_vault_readme.md
  • test/library/encryption/kms/vault_kube_kms_deployer.go
  • test/library/encryption/kms/vault_kube_kms_deployer_test.go

Comment thread test/library/encryption/kms/deploy_vault_readme.md Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer_test.go Outdated
Comment thread test/library/encryption/kms/vault_kube_kms_deployer.go Outdated
// Run:
//
// INTEGRATION_TEST=true VAULT_LICENSE="$(cat vault.hclic)" go test -v -timeout 15m -run TestVaultDeployerIntegration ./test/library/encryption/kms/...
func TestVaultDeployerIntegration(t *testing.T) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to have a unit test for a test deployment package?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you suggest to remove the vault_kube_kms_deployer_test.go file itself ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we don't need it

@sandeepknd sandeepknd force-pushed the deploy_vault branch 5 times, most recently from eff7879 to 953b3b7 Compare May 4, 2026 18:52
@sandeepknd sandeepknd requested a review from ardaguclu May 4, 2026 18:57
@sandeepknd sandeepknd changed the title [WIP]utility for vault deployer utility for vault deployer May 4, 2026
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 4, 2026
sleep 2
done

if vault status 2>&1 | grep -q 'Initialized.*true'; then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In line:32, we almost do the same. In line:32, we wait for the Vault to be initialized. But after executing the same command here, we decide that Vault initialized and skip setup. I'm not sure I understand the difference between two?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mainly for handling the re-deployment or restart of vault pod or trying to re-initialize an already-initialized Vault pod. The condition check appears to be redundant; actually the issue arises due to two sidecar containers (vault binary and vault-script) being launched simultaneously. In situations when the vault-setup script starts executing the vault commands (eg: check the vault status) even before the vault binary is installed, it would fail.
So, here either the vault-script needs to explicitly wait for few seconds (use sleep hardcoded time in seconds)
before executing any vault command
or
need to wait/poll and check if vault is responding - initialized field might show up but status might still be empty
or
initialized field might show up with status true/false.

@ardaguclu ,
please share your thoughts.

fi

echo "Initializing Vault..."
INIT_OUTPUT=$(vault operator init \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why just not;

Suggested change
INIT_OUTPUT=$(vault operator init \
INIT_OUTPUT=$(vault operator init -key-shares=1 -key-threshold=1 -format=json

exit 1
}

UNSEAL_KEY=$(echo "$INIT_OUTPUT" | tr -d ' \n\t' | sed -n 's/.*"unseal_keys_b64":\["\([^"]*\)".*/\1/p')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We init keys using -format=json. Why can't we use jq instead of this?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neither 'jq' nor 'oc' comes pre-installed with hasicorp vault image.

~ $ 
~ $ jq
/bin/sh: jq: not found
~ $ oc
/bin/sh: oc: not found
~ $ 

}

UNSEAL_KEY=$(echo "$INIT_OUTPUT" | tr -d ' \n\t' | sed -n 's/.*"unseal_keys_b64":\["\([^"]*\)".*/\1/p')
ROOT_TOKEN=$(echo "$INIT_OUTPUT" | tr -d ' \n\t' | sed -n 's/.*"root_token":"\([^"]*\)".*/\1/p')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

fi

echo "Unsealing Vault..."
vault operator unseal "$UNSEAL_KEY"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
vault operator unseal "$UNSEAL_KEY"
vault operator unseal "${UNSEAL_KEY}"


echo "Creating vault-credentials secret via K8s API..."
SA_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
K8S_API="https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be simpler to install oc and use it (i.e. oc create secret)?

}
EOSECRET

wget --header="Authorization: Bearer ${SA_TOKEN}" \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can install oc in vault image. That would save us multiple lines of code.

// Run:
//
// INTEGRATION_TEST=true VAULT_LICENSE="$(cat vault.hclic)" go test -v -timeout 15m -run TestVaultDeployerIntegration ./test/library/encryption/kms/...
func TestVaultDeployerIntegration(t *testing.T) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we don't need it

Comment on lines +58 to +67
- name: vault-setup
image: {{ .Image }}
imagePullPolicy: IfNotPresent
command: ["/bin/sh", "/vault/config/setup.sh"]
env:
- name: VAULT_ADDR
value: "http://127.0.0.1:8200"
volumeMounts:
- name: config
mountPath: /vault/config
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not make it an init container? avoids the sleep statements in your setup script too

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we are having two containers - vault binary (installation with enterprise licesnse) and the vault-setup.
Initconatiner will get completed even before the main container starts.
We need to keep the vault binary container running and configuring the vault (vault-setup.sh) can be done post that.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest you go checkout how the helm chart works:
https://github.com/hashicorp/vault-helm/tree/main

because that doesn't need any of that, you just point the license to it and it is deployed, I don't see any operator code necessary to setup any of this, but like I mentioned earlier, haven't been using this in a long time.

having a sidecar container that is dependent on another container is something we should avoid and also the sleep statements are an indicator to me that this isn't quite right ;-)

also from a common sense perspective: you setup something, then you run it. Why would you run something and then setup?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That ^^ would be awesome

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"also from a common sense perspective: you setup something, then you run it. Why would you run something and then setup?"
I think its a bit confusing here :)
we are
installing vault and then configuring it.

"having a sidecar container that is dependent on another container is something we should avoid and also the sleep statements are an indicator to me that this isn't quite right ;-)"
maybe @ardaguclu would like your inputs here as well.

Copy link
Copy Markdown
Author

@sandeepknd sandeepknd May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gangwgr
#2189 (comment),
so the init and unseal part can be executed under the init container.
How about the vault main binary ? How is it executed ?
Will it continue to get executed as the main container in the vault pod.
So, once again comes the chicken egg situation, the init container will try to configure the vault even before the vault binary is installed.
Is my understanding correct ?

Copy link
Copy Markdown
Contributor

@tjungblu tjungblu May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to have this in library-go btw? seems a post-install script in the release repo would be good enough. Seems there are some helm installs there already, too:
https://github.com/search?q=repo%3Aopenshift%2Frelease+%22helm+install%22&type=code

maybe the more relevant question for this would be: do we need to deploy those per-testcase, do we need multiple instances in different namespaces (likely with migrations?).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good question, Yes I prefer in release repo there it is easy to handle.
@p0lyn0mial can we move to release repo or keep it here?

Copy link
Copy Markdown
Author

@sandeepknd sandeepknd May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tjungblu ,
#2189 (comment)
your thoughts on this;
even if we deploy through helm, eventually both the vault server (binary) and the setup script needs to be executed synchronously (server at first then configuration script). How to achieve this polling/waiting without sleep ?

Even if run both in the same container, we still need to wait/poll for the vault server to be ready, to be able to configure it.

#2189 (comment)
even if we run it by a job,
vault container → starts, takes ~2-5s to be ready
Job (separate pod): vault-setup → polls until vault service ready. polling is still needed to synchronize.

Just trying to understand. :)

Copy link
Copy Markdown
Contributor

@p0lyn0mial p0lyn0mial May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good question, Yes I prefer in release repo there it is easy to handle.
@p0lyn0mial can we move to release repo or keep it here?

If we could create a reusable custom workflow that works across multiple jobs, then we could consider it.

In general, though, I would prefer to implement it in go.

  1. The test cases are much more explicit and easier to understand when reading the code.
    It is also easier to modify.

  2. Writing validation functions would be simpler as well especially since I’m pretty sure we’ll need code to verify the status of the Vault service.

  3. It would also make it easier to support Vault to Vault migration scenarios, which would require multiple Vault instances.

  4. In the future we will also have a function that will force a key rotation.

Does that make sense?

@ardaguclu
Copy link
Copy Markdown
Member

ardaguclu commented May 5, 2026

/retitle CNTRLPLANE-3364: utility for vault deployer

@openshift-ci openshift-ci Bot changed the title utility for vault deployer CNTRLPLANE-3364: utility for vault deployer May 5, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 5, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented May 5, 2026

@sandeepknd: This pull request references CNTRLPLANE-3364 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Introduces a Kubernetes test deployer for local HashiCorp Vault Enterprise deployment, init/unseal, transit AppRole KMS configuration, unit and optional integration tests.

$ INTEGRATION_TEST=true VAULT_LICENSE=$(cat vault.hclic) go test -v -run TestVaultDeployerIntegration
=== RUN   TestVaultDeployerIntegration
   vault_kube_kms_deployer_test.go:62: Deploying Vault...
   vault_kube_kms_deployer_test.go:63: Deploying Vault Enterprise in namespace "vault-kms-test-20260430-194817" (replicas: 1)
   vault_kube_kms_deployer_test.go:63: Applied vault_namespace.yaml (changed=true)
   vault_kube_kms_deployer_test.go:63: Applied vault_serviceaccount.yaml (changed=true)
   vault_kube_kms_deployer_test.go:63: Applied vault_scc_rolebinding.yaml (changed=true)
   vault_kube_kms_deployer_test.go:63: Applied vault_role.yaml (changed=true)
   vault_kube_kms_deployer_test.go:63: Applied vault_role_binding.yaml (changed=true)
   vault_kube_kms_deployer_test.go:63: Applied vault_configmap.yaml (changed=true)
   vault_kube_kms_deployer_test.go:63: Applied vault_service.yaml (changed=true)
   vault_kube_kms_deployer_test.go:63: Created vault-license secret from VAULT_LICENSE env var
   vault_kube_kms_deployer_test.go:63: Applied vault_deployment.yaml (changed=true)
   vault_kube_kms_deployer_test.go:63: Waiting for Vault deployment to be ready...
   vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
   vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
   vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
   vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
   vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
   vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
   vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=0 available=0
   vault_kube_kms_deployer.go:231: Vault deployment: desired=1 ready=1 available=1
   vault_kube_kms_deployer_test.go:63: Reading vault credentials from secret vault-kms-test-20260430-194817/vault-credentials
   vault_kube_kms_deployer_test.go:63: Vault AppRole credentials loaded from secret
   vault_kube_kms_deployer_test.go:63: Vault Enterprise deployed. Service: http://vault.vault-kms-test-20260430-194817.svc:8200
   vault_kube_kms_deployer_test.go:72: Namespace vault-kms-test-20260430-194817 created
   vault_kube_kms_deployer_test.go:79: Deployment ready: 1/1 replicas
   vault_kube_kms_deployer_test.go:92: Vault credentials secret verified
   vault_kube_kms_deployer_test.go:94: Integration test passed!
--- PASS: TestVaultDeployerIntegration (21.53s)
PASS
ok  	github.com/openshift/library-go/test/library/encryption/kms	21.553s

$ INTEGRATION_TEST=true go test -v -run TestVaultCleanup
=== RUN   TestVaultCleanup
   vault_kube_kms_deployer_test.go:139: Found namespace to cleanup: vault-kms-test-20260430-194343
   vault_kube_kms_deployer_test.go:145: Deleting namespace vault-kms-test-20260430-194343...
   vault_kube_kms_deployer_test.go:155: Verifying namespace deletion...
   vault_kube_kms_deployer_test.go:166: Waiting for namespace deletion... (1/30)
   vault_kube_kms_deployer_test.go:166: Waiting for namespace deletion... (2/30)
   vault_kube_kms_deployer_test.go:166: Waiting for namespace deletion... (3/30)
   vault_kube_kms_deployer_test.go:159: Namespace successfully deleted
   vault_kube_kms_deployer_test.go:160: Cleanup test passed!
--- PASS: TestVaultCleanup (15.22s)
PASS
ok  	github.com/openshift/library-go/test/library/encryption/kms	15.264s

Summary by CodeRabbit

  • New Features
  • Added an automated Vault-on-Kubernetes deployer for integration testing: deploys Vault, handles license input, waits for readiness, initializes and unseals Vault, configures transit KMS and AppRole auth, stores credentials, and provides cleanup and getters.
  • Tests
  • Added unit and integration tests covering configuration/defaults, manifest templating, license scenarios, deployment modes, AppRole credential handling with masked logging, deployment/validation, and cleanup.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

}

// NewVaultDeployer creates a new VaultDeployer with the given configuration.
func NewVaultDeployer(t testing.TB, kubeClient kubernetes.Interface, config *VaultConfig) *VaultDeployer {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens when two independent tests provide the same configuration ? will the service be installed only once ?

// VaultDeployer manages the deployment and lifecycle of HashiCorp Vault Enterprise
// for KMS encryption testing. All init/unseal/configure logic runs inside the pod
// as an init container; the Go deployer only applies manifests and reads results.
type VaultDeployer struct {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the purpose of this struct ? do we need it ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not then maybe we could do something similar to

func DeployUpstreamMockKMSPlugin(ctx context.Context, t testing.TB, kubeClient kubernetes.Interface, namespace, image string, count int) {
?

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 6, 2026

@sandeepknd: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


// readCredentials reads the vault-credentials Secret created by the init
// container and populates the VaultConfig with AppRole credentials.
func (d *VaultDeployer) readCredentials(ctx context.Context) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the purpose of this function ?

namespace: {{ .Namespace }}
data:
vault.hcl: |
cluster_addr = "http://POD_IP:8201"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we tried installing the vault without setting this option ?

data:
vault.hcl: |
cluster_addr = "http://POD_IP:8201"
api_addr = "http://POD_IP:8200"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we tried installing the vault without setting this option ?


disable_mlock = true

setup.sh: |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an official HashiCorp Vault go client we could use for setup ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the port-forward hack is what we already use for talking with etcd
xref:

func (e *etcdWrapper) newEtcdClientInternal() (EtcdClient, func(), error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future we could use the client to force a key rotation, client.Logical().WriteWithContext(ctx, "transit/keys/kubernetes-encryption-key/rotate", nil)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the drawback is that we pull in a new dep just for testing ....

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I raised this on slack already, because the client makes this a lot easier, but the additional test dependency sucks

@p0lyn0mial
Copy link
Copy Markdown
Contributor

The TestVaultDeployerIntegration was good for testing locally but this is not something we would like to checkin.

@p0lyn0mial
Copy link
Copy Markdown
Contributor

I am okay with splitting this PR into smaller ones if that's easier.

@sandeepknd
Copy link
Copy Markdown
Author

Raised a new PR openshift/release#78947 for the CI implementation.

@sandeepknd
Copy link
Copy Markdown
Author

closing this PR since the other one is the working one as per the most recent implementation approac.

@sandeepknd sandeepknd closed this May 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants