
Fix CSI volume grouping for multi-collection bundles#5293

Open
psalajova wants to merge 2 commits into
openshift:mainfrom
psalajova:fix-csi-volume-grouping

Conversation

@psalajova

@psalajova psalajova commented Jul 3, 2026

Contributor

When a GSM bundle references secrets from multiple collections (or groups), ResolveCredentialReferences expands them into credentials that all share the bundle's single mount_path. The previous grouping function keyed on (collection, group, mount_path), which split these into separate CSI volumes at the same path — Kubernetes rejects duplicate mount paths.

This is the same class of bug fixed in #4619 (pre-group/field era). The grouping key was (collection, mount_path) then, later widened to (collection, group, mount_path). That was safe when each credential had its own mount path, but breaks now that bundles can span collection boundaries.

Fix: group by mount_path only. A single SPC can reference secrets from any number of GSM collections — each object entry has its own full GSM path. Collection/group boundaries don't require separate volumes.

Example: multi-collection bundle

A bundle with entries from two different collections:

# gsm-config.yaml
- name: my-bundle
  gsm_secrets:
    - collection: team-a          # collection A
      group: aws-creds
      fields:
        - name: access-key
        - name: secret-key
    - collection: team-b          # collection B
      group: gcp-creds
      fields:
        - name: sa-key--dot--json

A test mounts it at a single path:

secrets:
  - bundle: my-bundle
    mount_path: /var/secrets

After resolution, 3 credentials all have mount_path: /var/secrets:

collection=team-a  group=aws-creds  field=access-key       mount=/var/secrets
collection=team-a  group=aws-creds  field=secret-key       mount=/var/secrets
collection=team-b  group=gcp-creds  field=sa-key--dot--json mount=/var/secrets

Before (broken): grouped by (collection:group:mount_path) → 2 groups → 2 CSI volumes at /var/secrets → K8s error: mount path must be unique

Group 1: "team-a:aws-creds:/var/secrets"  → SPC with [access-key, secret-key]  → CSI volume at /var/secrets
Group 2: "team-b:gcp-creds:/var/secrets"  → SPC with [sa-key.json]             → CSI volume at /var/secrets  ← DUPLICATE

After (fixed): grouped by mount_path only → 1 group → 1 CSI volume

Group 1: "/var/secrets" → SPC with [access-key, secret-key, sa-key.json] → CSI volume at /var/secrets  ← OK

The pod sees three files: /var/secrets/access-key, /var/secrets/secret-key, /var/secrets/sa-key.json
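The difference between the two grouping keys can be reproduced with a minimal sketch. The `Credential` type and both function names below are simplified stand-ins invented for illustration; they are not the actual types or helpers from `pkg/steps/csi_secrets`.

```go
package main

import "fmt"

// Credential is a hypothetical, simplified stand-in for a resolved GSM credential.
type Credential struct {
	Collection string
	Group      string
	Field      string
	MountPath  string
}

// groupByCollectionGroupAndMountPath mimics the old key:
// one group per (collection, group, mount_path) triple.
func groupByCollectionGroupAndMountPath(creds []Credential) map[string][]Credential {
	groups := map[string][]Credential{}
	for _, c := range creds {
		key := c.Collection + ":" + c.Group + ":" + c.MountPath
		groups[key] = append(groups[key], c)
	}
	return groups
}

// groupByMountPath mimics the new key: one group per mount path.
func groupByMountPath(creds []Credential) map[string][]Credential {
	groups := map[string][]Credential{}
	for _, c := range creds {
		groups[c.MountPath] = append(groups[c.MountPath], c)
	}
	return groups
}

// exampleCredentials returns the three credentials from the bundle example.
func exampleCredentials() []Credential {
	return []Credential{
		{Collection: "team-a", Group: "aws-creds", Field: "access-key", MountPath: "/var/secrets"},
		{Collection: "team-a", Group: "aws-creds", Field: "secret-key", MountPath: "/var/secrets"},
		{Collection: "team-b", Group: "gcp-creds", Field: "sa-key--dot--json", MountPath: "/var/secrets"},
	}
}

func main() {
	creds := exampleCredentials()
	// Each group would become one CSI volume at its mount path.
	fmt.Println("old key groups:", len(groupByCollectionGroupAndMountPath(creds))) // 2
	fmt.Println("new key groups:", len(groupByMountPath(creds)))                   // 1
}
```

With the old key, both groups target `/var/secrets`, which is the duplicate mount that Kubernetes rejects; with the new key there is a single group holding all three credentials.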

Example: file name collision detection

The new ValidateNoFileCollisionsOnMountPath catches actual file name collisions regardless of collection/group:

collection=col-a  group=grp-1  field=config  mount=/var/secrets
collection=col-b  group=grp-2  field=config  mount=/var/secrets
→ ERROR: file name collision at mount_path=/var/secrets: col-a/grp-1/config and col-b/grp-2/config both produce file "config"

The old check only looked within the same collection, so this cross-collection collision would have been missed.
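A rough sketch of such a check, assuming the mounted file name is `As` when set and `Field` otherwise, with `--dot--` restored to `.`. The `Credential` type, both function names, and the restriction to the single `--dot--` replacement are assumptions made for illustration; the real `ValidateNoFileCollisionsOnMountPath` may handle more symbols and report errors differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Credential is a hypothetical, simplified stand-in for a resolved GSM credential.
type Credential struct {
	Collection, Group, Field, As, MountPath string
}

// mountedFileName computes the file name the pod would see:
// As if set, otherwise Field, with "--dot--" turned back into ".".
// Only this one replacement is modeled here (assumption).
func mountedFileName(c Credential) string {
	name := c.Field
	if c.As != "" {
		name = c.As
	}
	return strings.ReplaceAll(name, "--dot--", ".")
}

// validateNoFileCollisions rejects two credentials that would produce the
// same file at the same mount path, regardless of collection or group.
func validateNoFileCollisions(creds []Credential) error {
	type target struct{ mountPath, file string }
	seen := map[target]Credential{}
	for _, c := range creds {
		t := target{c.MountPath, mountedFileName(c)}
		if prev, ok := seen[t]; ok {
			return fmt.Errorf("file name collision at mount_path=%s: %s/%s/%s and %s/%s/%s both produce file %q",
				t.mountPath, prev.Collection, prev.Group, prev.Field,
				c.Collection, c.Group, c.Field, t.file)
		}
		seen[t] = c
	}
	return nil
}

func main() {
	creds := []Credential{
		{Collection: "col-a", Group: "grp-1", Field: "config", MountPath: "/var/secrets"},
		{Collection: "col-b", Group: "grp-2", Field: "config", MountPath: "/var/secrets"},
	}
	fmt.Println(validateNoFileCollisions(creds))
}
```

Keying the lookup on (mount path, file name) rather than on collection or group is what lets the cross-collection case surface.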

Changes

  • GroupCredentialsByCollectionGroupAndMountPath → GroupCredentialsByMountPath: key on mount path only
  • GetSPCName — hash (mountPath, sorted collection:group:field tuples) instead of (collection, group, mountPath, fields)
  • GetCSIVolumeName — hash (mountPath) instead of (collection, group, mountPath)
  • ValidateNoGroupCollisionsOnMountPath → ValidateNoFileCollisionsOnMountPath: check actual file name collisions across all credentials at the same mount path, regardless of collection/group
  • Update all callers and tests

Co-Authored-By: Claude

This PR fixes how CI generates and validates CSI-mounted Google Secret Manager (GSM) secrets for multi-collection/multi-group bundles in multi-stage Prow jobs.

Practically, when a single bundle resolves to multiple GSM (collection, group, field) credentials that all use the same Kubernetes mountPath, CI now groups CSI volumes purely by mountPath (not by (collection, group, mountPath)), preventing Kubernetes rejections caused by duplicate mounts at the same path.

To keep generated Kubernetes resources deterministic and consistent under the new grouping:

  • GetSPCName now hashes mountPath plus sorted (collection:group:field) tuples.
  • GetCSIVolumeName uses a hash derived from mountPath (then combines it with namespace for the final DNS-safe name).
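The naming scheme described above might look roughly like the following. This is a sketch under stated assumptions: the function names, the name layout (`<namespace>-<hash>`), the hash truncation, and the separator are all invented here; only the inputs to the hash (mount path, sorted `collection:group:field` tuples) and the use of SHA-256 come from this PR's description and checks.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// Credential is a hypothetical, simplified stand-in for a resolved GSM credential.
type Credential struct {
	Collection, Group, Field, MountPath string
}

// shortHash returns a truncated SHA-256 hex digest of the joined parts.
// The 8-byte truncation and "\n" separator are illustrative choices.
func shortHash(parts ...string) string {
	sum := sha256.Sum256([]byte(strings.Join(parts, "\n")))
	return hex.EncodeToString(sum[:8])
}

// spcNameSketch hashes the mount path plus the sorted collection:group:field
// tuples, so the result does not depend on the input order.
func spcNameSketch(namespace, mountPath string, creds []Credential) string {
	tuples := make([]string, 0, len(creds))
	for _, c := range creds {
		tuples = append(tuples, c.Collection+":"+c.Group+":"+c.Field)
	}
	sort.Strings(tuples)
	return namespace + "-" + shortHash(append([]string{mountPath}, tuples...)...) + "-spc"
}

// csiVolumeNameSketch hashes the mount path only and combines it with the namespace.
func csiVolumeNameSketch(namespace, mountPath string) string {
	return namespace + "-" + shortHash(mountPath)
}

func main() {
	a := Credential{"team-a", "aws-creds", "access-key", "/var/secrets"}
	b := Credential{"team-b", "gcp-creds", "sa-key--dot--json", "/var/secrets"}
	sameName := spcNameSketch("ns", "/var/secrets", []Credential{a, b}) ==
		spcNameSketch("ns", "/var/secrets", []Credential{b, a})
	fmt.Println("order-independent:", sameName) // true
	fmt.Println(csiVolumeNameSketch("ns", "/var/secrets"))
}
```

Sorting the tuples before hashing is what makes the name deterministic across runs that resolve the same credentials in a different order.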

Validation was renamed and strengthened:

  • ValidateNoGroupCollisionsOnMountPath → ValidateNoFileCollisionsOnMountPath
  • It checks for actual filename collisions at each mountPath after computing the effective file name (As if set, otherwise Field) and denormalizing forbidden symbols back to the mounted filename, across all collections/groups sharing that mount path.

Implementation-wise, CSI GSM credential handling was centralized in pkg/steps/csi_secrets, and all affected generators/callers/tests (ci-operator, prowgen, and multi-stage initialization/CSI volume/SPC generation) were updated to use the new grouping, naming, resolution (ResolveCredentialReferences), and collision validation logic.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

@coderabbitai

coderabbitai Bot commented Jul 3, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: dba9c3cd-9ba3-4f11-b974-14754d2d4cbe

📥 Commits

Reviewing files that changed from the base of the PR and between 4ca54bc and 6548ba4.

📒 Files selected for processing (6)
  • pkg/steps/csi_secrets/csi_utils.go
  • pkg/steps/csi_secrets/csi_utils_test.go
  • pkg/steps/csi_secrets/gsm_bundle_resolver.go
  • pkg/steps/csi_secrets/gsm_bundle_resolver_test.go
  • pkg/steps/multi_stage/gen.go
  • pkg/steps/multi_stage/init.go
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift/release (manual)
  • openshift/ci-docs (manual)
  • openshift/release-controller (manual)
  • openshift/ci-chat-bot (manual)
🚧 Files skipped from review as they are similar to previous changes (6)
  • pkg/steps/multi_stage/gen.go
  • pkg/steps/csi_secrets/gsm_bundle_resolver.go
  • pkg/steps/csi_secrets/gsm_bundle_resolver_test.go
  • pkg/steps/csi_secrets/csi_utils.go
  • pkg/steps/csi_secrets/csi_utils_test.go
  • pkg/steps/multi_stage/init.go

📝 Walkthrough


The PR moves CSI secrets configuration and helper logic into pkg/steps/csi_secrets, changes GSM bundle validation to file-collision checks on mount paths, and updates multi-stage plus external callers to use the new types and exported helpers.

Changes

CSI secrets migration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Helpers and types | pkg/steps/csi_secrets/config.go, pkg/steps/csi_secrets/csi_utils.go, pkg/steps/csi_secrets/csi_utils_test.go | Adds GSMConfiguration and CollectionGroupKey, exports CSI/GSM helper functions, and updates grouping and hash behavior around mount paths. |
| Bundle resolution and collision checks | pkg/steps/csi_secrets/gsm_bundle_resolver.go, pkg/steps/csi_secrets/gsm_bundle_resolver_test.go | Switches discovered-field caching to CollectionGroupKey and replaces group-collision validation with filename-collision validation on mount paths. |
| Multi-stage CSI wiring | pkg/steps/multi_stage/multi_stage.go, pkg/steps/multi_stage/gen.go, pkg/steps/multi_stage/init.go, pkg/steps/multi_stage/*_test.go | Replaces local GSM and CSI helper usage with csi_secrets for resolution, validation, SPC creation, censoring, and CSI volume construction. |
| Caller type updates | cmd/ci-operator/main.go, pkg/defaults/config.go, pkg/prowgen/podspec.go | Switches GSM config and CSI volume builder call sites to the new csi_secrets types and helpers. |

Estimated code review effort: 4 (Complex) | ~60 minutes

🚥 Pre-merge checks | ✅ 16 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Coverage For New Features | ⚠️ Warning | New pure helpers BuildSecretProviderClass, GetCensorMountPath, IsK8sSecretReference, and IsGSMReference lack direct unit tests; current tests only hit them indirectly. | Add small table-driven tests for those helpers, with independent expected values so the assertions don’t reuse the same function under test. |
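A table-driven check of this kind could be sketched as follows. The `CredentialReference` struct is a local stand-in for `api.CredentialReference` carrying only the fields the two helpers read, and the helper bodies are copied from the snippets quoted later in this review; `refCase`, `refCases`, and `runRefCases` are hypothetical names. In the repository this would normally be a `testing` test with `t.Run` subtests rather than a `main` program.

```go
package main

import "fmt"

// CredentialReference is a local stand-in for api.CredentialReference,
// reduced to the fields used by the two helpers (assumption).
type CredentialReference struct {
	Namespace, Name, Collection, Group, Field string
}

// IsK8sSecretReference reports whether the reference points to a Kubernetes Secret.
func IsK8sSecretReference(c CredentialReference) bool {
	return c.Namespace != "" && c.Name != ""
}

// IsGSMReference reports whether the reference points to a Google Secret Manager field.
func IsGSMReference(c CredentialReference) bool {
	return c.Collection != "" && c.Group != "" && c.Field != ""
}

// refCase pairs an input with hand-written expectations, so the assertions
// do not reuse the functions under test to compute the expected values.
type refCase struct {
	name    string
	ref     CredentialReference
	wantK8s bool
	wantGSM bool
}

func refCases() []refCase {
	return []refCase{
		{name: "k8s secret", ref: CredentialReference{Namespace: "ns", Name: "secret"}, wantK8s: true},
		{name: "gsm field", ref: CredentialReference{Collection: "team-a", Group: "aws-creds", Field: "access-key"}, wantGSM: true},
		{name: "gsm missing field", ref: CredentialReference{Collection: "team-a", Group: "aws-creds"}},
		{name: "k8s missing name", ref: CredentialReference{Namespace: "ns"}},
		{name: "empty", ref: CredentialReference{}},
	}
}

// runRefCases evaluates every case and returns a description of each mismatch.
func runRefCases() []string {
	var failures []string
	for _, tc := range refCases() {
		if got := IsK8sSecretReference(tc.ref); got != tc.wantK8s {
			failures = append(failures, fmt.Sprintf("%s: IsK8sSecretReference=%v, want %v", tc.name, got, tc.wantK8s))
		}
		if got := IsGSMReference(tc.ref); got != tc.wantGSM {
			failures = append(failures, fmt.Sprintf("%s: IsGSMReference=%v, want %v", tc.name, got, tc.wantGSM))
		}
	}
	return failures
}

func main() {
	fmt.Println("failures:", len(runRefCases()))
}
```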
✅ Passed checks (16 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: fixing CSI volume grouping for multi-collection bundles. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Go Error Handling | ✅ Passed | PASS: The PR’s added code wraps errors with %w, performs nil checks on CSI/GSM paths, and introduces no new ignored errors or panics in the diff. |
| Stable And Deterministic Test Names | ✅ Passed | No Ginkgo titles were added; changed tests use static t.Run names and no dynamic title construction was found. |
| Test Structure And Quality | ✅ Passed | PASS: The touched tests are plain table-driven testing unit tests, not Ginkgo, and I don’t see cluster waits or cleanup hazards. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added; the touched files are plain Go unit tests in pkg/steps with no MicroShift-sensitive APIs or skip labels. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added; the touched files are production code and Go unit tests, with no It/Describe/Context/When blocks. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Only CSI secret config/volume wiring changed; no nodeSelector, affinity, topologySpread, replica, or PDB logic was introduced. |
| Ote Binary Stdout Contract | ✅ Passed | The PR only refactors CSI secret helpers/types; no new stdout writes were added in main/init/TestMain/setup code, and existing ci-operator help/logging paths are unchanged. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added; the changed tests are standard unit tests and show no IPv4-only or external connectivity assumptions. |
| No-Weak-Crypto | ✅ Passed | No MD5/SHA1/DES/RC4/3DES/Blowfish/ECB or secret comparisons were added; the new helpers use SHA-256 only for deterministic names. |
| Container-Privileges | ✅ Passed | Touched files only refactor CSI secret handling; no privileged/hostPID/hostNetwork/hostIPC/SYS_ADMIN/allowPrivilegeEscalation settings were added. |
| No-Sensitive-Data-In-Logs | ✅ Passed | Only a new debug line logs field count plus collection/group; no passwords, tokens, PII, or hostnames are logged in touched code. |

Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot requested review from hector-vido and pruan-rht July 3, 2026 15:30
@openshift-ci

openshift-ci Bot commented Jul 3, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: psalajova

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 3, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/steps/multi_stage/gen.go (1)

637-662: 🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Align SPC name generation between createSPCs and addCredentials. createSPCs hashes the merged GSM credentials across all steps, but addCredentials hashes only the current step’s subset. If two steps share a mount path but not the same GSM tuple set, the pod will reference an SPC name that was never created. Make both paths use the same credential set or derive the SPC from the same per-step grouping.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/steps/multi_stage/gen.go` around lines 637 - 662, The SPC name used in
addCredentials for CSI-mounted GSM secrets is derived from only the current
step’s grouped credentials, while createSPCs builds names from the merged GSM
credential set, so the pod can reference an SPC that was never created. Update
addCredentials and createSPCs to use the same credential grouping logic for
csi_secrets.GetSPCName and csi_secrets.GetCSIVolumeName, ideally by sharing the
same per-mount-path grouping from csi_secrets.GroupCredentialsByMountPath or
another common helper, so the generated SPC name stays consistent across both
paths.
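The mismatch described in this finding can be illustrated with a small sketch: if the name is a hash over a credential set, hashing a per-step subset and hashing the job-wide merged set give different results for the same mount path. Everything below (`Credential`, `nameFor`, `mergeByMountPath`, the hash layout) is hypothetical and only models the behavior the comment describes; it is not the code in `gen.go`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// Credential is a hypothetical, simplified stand-in for a resolved GSM credential.
type Credential struct {
	Collection, Group, Field, MountPath string
}

// nameFor derives a name from the mount path and the sorted credential tuples.
func nameFor(mountPath string, creds []Credential) string {
	tuples := make([]string, 0, len(creds))
	for _, c := range creds {
		tuples = append(tuples, c.Collection+":"+c.Group+":"+c.Field)
	}
	sort.Strings(tuples)
	sum := sha256.Sum256([]byte(mountPath + "\n" + strings.Join(tuples, "\n")))
	return hex.EncodeToString(sum[:8])
}

// mergeByMountPath collects the credentials of all steps, deduplicated,
// into one group per mount path (a shared, job-wide grouping).
func mergeByMountPath(steps [][]Credential) map[string][]Credential {
	merged := map[string][]Credential{}
	seen := map[Credential]bool{}
	for _, step := range steps {
		for _, c := range step {
			if seen[c] {
				continue
			}
			seen[c] = true
			merged[c.MountPath] = append(merged[c.MountPath], c)
		}
	}
	return merged
}

func main() {
	stepA := []Credential{{"team-a", "aws-creds", "access-key", "/var/secrets"}}
	stepB := []Credential{{"team-b", "gcp-creds", "sa-key--dot--json", "/var/secrets"}}
	merged := mergeByMountPath([][]Credential{stepA, stepB})

	created := nameFor("/var/secrets", merged["/var/secrets"]) // name built from the merged set
	perStep := nameFor("/var/secrets", stepA)                  // name built from one step's subset
	fmt.Println("per-step name matches created name:", created == perStep) // false

	// Looking the group up in the shared grouping keeps both sides consistent.
	shared := nameFor("/var/secrets", merged[stepA[0].MountPath])
	fmt.Println("shared-grouping name matches created name:", created == shared) // true
}
```

Either direction works in principle (both paths per-step, or both paths job-wide); the sketch only shows that mixing the two cannot produce matching names when the steps' tuple sets differ.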
🧹 Nitpick comments (2)
pkg/steps/csi_secrets/gsm_bundle_resolver_test.go (1)

484-484: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Test name is stale.

TestValidateNoGroupCollisionsOnMountPath now tests ValidateNoFileCollisionsOnMountPath (line 556); the name still references the old group-collision semantics.

✏️ Rename fix
-func TestValidateNoGroupCollisionsOnMountPath(t *testing.T) {
+func TestValidateNoFileCollisionsOnMountPath(t *testing.T) {
As per coding guidelines, "A good name is long enough to fully communicate what the item is or does."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/steps/csi_secrets/gsm_bundle_resolver_test.go` at line 484, The test name
is stale and still refers to the old group-collision behavior. Rename
TestValidateNoGroupCollisionsOnMountPath to match the current
ValidateNoFileCollisionsOnMountPath behavior, keeping the new name aligned with
the test’s actual purpose and the related validation function.

Source: Coding guidelines

pkg/steps/csi_secrets/csi_utils.go (1)

27-33: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add doc comments to newly exported functions.

IsK8sSecretReference, IsGSMReference, BuildGCPSecretsParameter, GetCensorMountPath, and BuildSecretProviderClass are now exported from this new package but lack doc comments, unlike GroupCredentialsByMountPath/GetSPCName/GetCSIVolumeName in the same file.

As per coding guidelines, "Go documentation on Classes/Functions/Fields should be written properly." As per path instructions, "Comment important exported functions with their purpose, parameters, and return values."

📝 Example fix
+// IsK8sSecretReference reports whether the credential reference points to a Kubernetes Secret.
 func IsK8sSecretReference(c api.CredentialReference) bool {
 	return c.Namespace != "" && c.Name != ""
 }

+// IsGSMReference reports whether the credential reference points to a Google Secret Manager field.
 func IsGSMReference(c api.CredentialReference) bool {
 	return c.Collection != "" && c.Group != "" && c.Field != ""
 }

Also applies to: 46-46, 138-142

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/steps/csi_secrets/csi_utils.go` around lines 27 - 33, Add Go doc comments
for the newly exported helpers in csi_utils.go so they follow the same style as
GroupCredentialsByMountPath, GetSPCName, and GetCSIVolumeName. Update the
declarations of IsK8sSecretReference, IsGSMReference, BuildGCPSecretsParameter,
GetCensorMountPath, and BuildSecretProviderClass with concise comments that
start with the function name and explain each function’s purpose, inputs, and
return value where applicable.

Sources: Coding guidelines, Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/steps/multi_stage/init.go`:
- Line 80: The mount-path collision check in resolveCredentials() is only
step-local, but createSPCs() and addCredentials() derive SPC names from job-wide
mount-path groupings, so two steps can still diverge on the same mount_path. Add
a job-wide validation or shared grouping in init.go that covers the resolved
credentials across the entire job before createSPCs() and addCredentials() run,
using the existing resolveCredentials(), createSPCs(), and addCredentials() flow
to keep SPC naming consistent.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 604cd25d-2c81-4ec8-a095-c154cbf8f8ae

📥 Commits

Reviewing files that changed from the base of the PR and between 7055b37 and 571f1dd.

📒 Files selected for processing (14)
  • cmd/ci-operator/main.go
  • pkg/defaults/config.go
  • pkg/prowgen/podspec.go
  • pkg/steps/csi_secrets/config.go
  • pkg/steps/csi_secrets/csi_utils.go
  • pkg/steps/csi_secrets/csi_utils_test.go
  • pkg/steps/csi_secrets/gsm_bundle_resolver.go
  • pkg/steps/csi_secrets/gsm_bundle_resolver_test.go
  • pkg/steps/multi_stage/gen.go
  • pkg/steps/multi_stage/gen_test.go
  • pkg/steps/multi_stage/init.go
  • pkg/steps/multi_stage/init_test.go
  • pkg/steps/multi_stage/multi_stage.go
  • pkg/steps/multi_stage/multi_stage_test.go

Comment thread pkg/steps/multi_stage/init.go
@psalajova psalajova force-pushed the fix-csi-volume-grouping branch from 571f1dd to 4ca54bc Compare July 3, 2026 16:49
Group credentials by mount_path only instead of (collection, group,
mount_path). When a bundle spans multiple collections, the old grouping
produced multiple CSI volumes at the same mount path, which Kubernetes
rejects. A single SPC can reference secrets from any number of
collections, so collection/group boundaries don't require separate
volumes.

Also replace ValidateNoGroupCollisionsOnMountPath with
ValidateNoFileCollisionsOnMountPath, which checks for actual file name
collisions across all credentials at the same mount path regardless of
collection or group.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@psalajova psalajova force-pushed the fix-csi-volume-grouping branch from 4ca54bc to 6548ba4 Compare July 4, 2026 07:44
@psalajova

Contributor Author

/test e2e

@openshift-ci

openshift-ci Bot commented Jul 4, 2026

Contributor

@psalajova: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/integration | 6548ba4 | link | true | /test integration |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
