
Introduce Naive Cluster Profile Sets#4983

Merged
openshift-merge-bot[bot] merged 4 commits into openshift:main from
danilo-gemoli:feat/ci-operator/cluster-profile-sets
Mar 6, 2026

Conversation

@danilo-gemoli
Contributor

@danilo-gemoli danilo-gemoli commented Mar 4, 2026

This PR introduces a simple, naive implementation of cluster profile sets that embeds the cluster profile name in the lease resource names.
This PR is much easier to explain with a real example.
As of today, in OpenShift, we have the following aws cluster profiles, and their associated lease types:

Cluster Profile | Lease Type
----------------+-------------------
aws             | aws-quota-slice
aws-2           | aws-2-quota-slice
aws-3           | aws-3-quota-slice
aws-4           | aws-4-quota-slice
aws-5           | aws-5-quota-slice

Each of them is defined in Boskos as follows (taking aws-3 as an example, see here):

- type: aws-3-quota-slice
  names:
  - us-east-1--aws-3-quota-slice-00
  - us-east-1--aws-3-quota-slice-01
  ...
  - us-west-2--aws-3-quota-slice-24

We define a new openshift-org-aws set by taking all the names from the aws-*-quota-slice lease types, prefixing each with its cluster profile name, and putting them into a single entry, openshift-org-aws-quota-slice:

- type: openshift-org-aws-quota-slice
  names:
  - aws--us-east-1--quota-slice-00
  - aws--us-east-1--quota-slice-01
  ...
  - aws-2--us-east-1--quota-slice-00
  - aws-2--us-east-1--quota-slice-01
  ...
  - aws-3--us-east-1--quota-slice-00
  - aws-3--us-east-1--quota-slice-01
  ...
  - aws-4--us-east-1--quota-slice-00
  - aws-4--us-east-1--quota-slice-01
  ...
  - aws-5--us-east-1--quota-slice-00
  - aws-5--us-east-1--quota-slice-01
  ...
  - aws-5--us-west-2--quota-slice-34

The names match the pattern ${CLUSTER_PROFILE}--${REGION}--${QUOTA_SLICE}.
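
As a rough illustration of the renaming above, the sketch below rewrites a legacy name such as us-east-1--aws-3-quota-slice-00 into the set form. The helper toSetLeaseName is hypothetical and not part of ci-tools, and it assumes that every slice suffix starts with quota-slice-, as in the AWS examples shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// toSetLeaseName rewrites a legacy resource name of the form
// ${REGION}--${CLUSTER_PROFILE}-quota-slice-NN into the set form
// ${CLUSTER_PROFILE}--${REGION}--quota-slice-NN.
// Hypothetical helper for illustration only; not part of ci-tools.
func toSetLeaseName(profile, legacy string) (string, error) {
	region, rest, ok := strings.Cut(legacy, "--")
	if !ok || region == "" {
		return "", fmt.Errorf("%q is not of the form REGION--SLICE", legacy)
	}
	if !strings.HasPrefix(rest, profile+"-") {
		return "", fmt.Errorf("%q does not belong to cluster profile %q", legacy, profile)
	}
	slice := strings.TrimPrefix(rest, profile+"-")
	// Assumption: slices are always named quota-slice-NN. This also rejects
	// "aws" being matched against an "aws-3" resource.
	if !strings.HasPrefix(slice, "quota-slice-") {
		return "", fmt.Errorf("%q does not belong to cluster profile %q", legacy, profile)
	}
	return profile + "--" + region + "--" + slice, nil
}

func main() {
	name, err := toSetLeaseName("aws-3", "us-east-1--aws-3-quota-slice-00")
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```

Passing the profile explicitly keeps the helper from guessing where the profile name ends, which matters because profile names such as aws and aws-3 share a prefix.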

The openshift-org-aws cluster profile has to be added as usual (see https://docs.ci.openshift.org/how-tos/adding-a-cluster-profile/), so that it maps to the openshift-org-aws-quota-slice lease type.

At this stage, a test that references the new cluster profile set runs:

- as: e2e-aws-ovn-proxy
  steps:
    cluster_profile: openshift-org-aws
    workflow: openshift-e2e-aws-proxy

ci-operator starts and performs the following steps:

  1. It acquires a lease of type openshift-org-aws-quota-slice at runtime, hence getting aws-4--us-east-1--quota-slice-01 as a lease (see pkg/steps/lease.go).
  2. It recognizes there is the aws-4 cluster profile name in it (see pkg/steps/lease.go). The code now handles lease names matching the pattern ${CLUSTER_PROFILE}--${REGION}--${QUOTA_SLICE}.
  3. It copies the secret cluster-secrets-aws-4 from the ci NS into ci-op-xxxx. This part was previously handled in cmd/ci-operator/main.go but it is now executed in pkg/steps/lease.go.
  4. It exposes the environment variable CLUSTER_PROFILE_SET_NAME=openshift-org-aws to any step. The variable is exported as a parameter by pkg/steps/lease.go and then consumed in pkg/steps/multi_stage/multi_stage.go.
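
Step 2 can be pictured with the minimal sketch below, which splits a lease name on the -- separator. The type leaseParts and the function parseSetLeaseName are hypothetical names chosen for this example; the PR's actual parsing lives in pkg/steps/lease.go and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// leaseParts holds the three components of a set-style lease name.
// Hypothetical type for illustration only.
type leaseParts struct {
	ClusterProfile string
	Region         string
	QuotaSlice     string
}

// parseSetLeaseName splits a name matching
// ${CLUSTER_PROFILE}--${REGION}--${QUOTA_SLICE}. The boolean is false when
// the name does not have exactly three non-empty parts, which is the case
// for legacy names such as us-east-1--aws-3-quota-slice-00.
func parseSetLeaseName(name string) (leaseParts, bool) {
	parts := strings.Split(name, "--")
	if len(parts) != 3 {
		return leaseParts{}, false
	}
	for _, p := range parts {
		if p == "" {
			return leaseParts{}, false
		}
	}
	return leaseParts{ClusterProfile: parts[0], Region: parts[1], QuotaSlice: parts[2]}, true
}

func main() {
	for _, name := range []string{
		"aws-4--us-east-1--quota-slice-01",
		"us-east-1--aws-3-quota-slice-00",
	} {
		parts, ok := parseSetLeaseName(name)
		fmt.Println(name, ok, parts.ClusterProfile)
	}
}
```

Under this scheme a legacy name yields no cluster profile, so a test that leases from a plain cluster profile would keep using the profile named in its configuration.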

Important note about (3): Since a cluster profile might not be known until a lease is acquired, ci-operator no longer copies the cluster profile's secret at the beginning of its execution.

Summary by CodeRabbit

  • New Features

    • Leases now include cluster-profile metadata and expose env vars CLUSTER_PROFILE and CLUSTER_PROFILE_SET_NAME to downstream steps.
    • Cluster-profile lookup is plumbed into lease acquisition so profiles can be resolved at runtime.
  • Improvements

    • Cluster-profile selection is propagated into step environments and test parameters.
    • Cluster-profile secrets are fetched and provisioned into target namespaces during acquisition to enable subsequent steps.
  • Tests

    • Expanded tests covering cluster-profile scenarios and secret wiring.

@openshift-ci-robot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

@coderabbitai

coderabbitai bot commented Mar 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e0d0883a-1b7b-4f6a-b79b-ef34fda9bf5b

📥 Commits

Reviewing files that changed from the base of the PR and between 926ce2d and 7d8ac45.

📒 Files selected for processing (4)
  • pkg/steps/lease.go
  • pkg/steps/lease_test.go
  • pkg/steps/multi_stage/multi_stage.go
  • pkg/steps/multi_stage/multi_stage_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/steps/multi_stage/multi_stage.go

Walkthrough

Removed initializer secret creation and getClusterProfileSecret; introduced ClusterProfileGetter wiring from resolver into defaults.Config; moved cluster-profile lookup and secret import into LeaseStep during lease acquisition; added cluster-profile constants, StepLease fields, updated signatures and tests.

Changes

Cohort | File(s) | Summary
-------+---------+--------
API Definitions | pkg/api/constant.go, pkg/api/types.go, pkg/api/leases.go, pkg/api/leases_test.go | Added ClusterProfileSetEnv and ClusterProfileParam constants; added ClusterProfile and ClusterProfileTarget fields to StepLease; changed LeasesForTest to accept *TestStepConfiguration and updated tests.
Defaults & Config | pkg/defaults/config.go, pkg/defaults/defaults.go, pkg/defaults/defaults_test.go | Added ClusterProfileGetter to defaults.Config; pass kubeClient and ClusterProfileGetter into LeaseStep; adjusted expected env vars in tests to include cluster-profile keys.
CI-Operator Wiring | cmd/ci-operator/main.go | Removed runtime creation/appending of cluster-profile secrets and deleted getClusterProfileSecret; expose resolver-provided cluster-profile getter via ToGraphConfig.
Lease Step Implementation & Tests | pkg/steps/lease.go, pkg/steps/lease_test.go | Extended LeaseStep signature to accept kubeClient and ClusterProfileGetter; added cluster-profile extraction, lookup via getter, import/upsert of immutable secret during lease acquisition; added provides for cluster-profile envs; expanded and table-driven tests.
Multi-Stage Step | pkg/steps/multi_stage/multi_stage.go, pkg/steps/multi_stage/multi_stage_test.go | Read cluster profile from step parameters in Run; include ClusterProfileSetEnv in environment output; tests updated to use cmp with sorted comparisons.
Callers & Prowgen | pkg/prowgen/prowgen.go, pkg/defaults/... | Updated callers to use TestStepConfiguration with LeasesForTest and propagated signature/parameter changes across call sites.

Sequence Diagram(s)

sequenceDiagram
    participant Config as Configuration
    participant LeaseStep as Lease Step
    participant ProfileGetter as ClusterProfileGetter
    participant KubeClient as Kubernetes Client
    participant KubeAPI as Kubernetes API

    Config->>LeaseStep: Initialize with kubeClient & ClusterProfileGetter
    LeaseStep->>LeaseStep: Acquire leases for test
    LeaseStep->>LeaseStep: Extract cluster profile name from leased resources
    alt Cluster profile present
        LeaseStep->>ProfileGetter: Request ClusterProfileDetails(name)
        ProfileGetter-->>LeaseStep: Return ClusterProfileDetails
        LeaseStep->>KubeClient: Get secret from "ci" namespace
        KubeClient->>KubeAPI: Read Secret
        KubeAPI-->>KubeClient: Secret data
        LeaseStep->>KubeClient: Create/Upsert immutable secret in target namespace
        KubeClient->>KubeAPI: Create/Upsert Secret
        KubeAPI-->>KubeClient: Acknowledge
    end
    LeaseStep->>LeaseStep: Record metrics and provide cluster-profile env vars
    LeaseStep-->>Config: Continue pipeline with cluster-profile state

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title 'Introduce Naive Cluster Profile Sets' directly and clearly describes the main feature being added across the entire changeset: cluster profile sets functionality with a simple implementation approach.
Stable And Deterministic Test Names | ✅ Passed | Test names in modified test files appear to be static and descriptive, not containing dynamic information like timestamps, UUIDs, pod names, or IP addresses that would vary between runs.
Test Structure And Quality | ✅ Passed | Test code meets quality requirements with clear table-driven design, meaningful diagnostic messages, and appropriate use of fake clients.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@danilo-gemoli
Contributor Author

/test e2e

@openshift-ci openshift-ci bot requested review from Prucek and pruan-rht March 4, 2026 10:21
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 4, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
pkg/defaults/defaults_test.go (1)

1715-1718: Prefer constants in expected param map.

Line 1715 and Line 1716 can use api.ClusterProfileParam and api.ClusterProfileSetEnv to avoid future string drift in tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/defaults/defaults_test.go` around lines 1715 - 1718, The expected params
map is using raw strings "CLUSTER_PROFILE" and "CLUSTER_PROFILE_SET_NAME";
replace those literal keys with the constants api.ClusterProfileParam and
api.ClusterProfileSetEnv respectively in the test (the map that also includes
api.DefaultLeaseEnv) so the test references stable identifiers and avoids string
drift—update the map entries where those two keys appear and run tests to verify
no import changes are needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/api/leases.go`:
- Around line 11-13: LeasesForTest currently dereferences test and
test.MultiStageTestConfigurationLiteral (e.g., reading
MultiStageTestConfigurationLiteral.ClusterProfile) and can panic if either is
nil; add defensive nil checks at the top of LeasesForTest (function
LeasesForTest, type TestStepConfiguration and its field
MultiStageTestConfigurationLiteral) that return an empty []StepLease when test
== nil or test.MultiStageTestConfigurationLiteral == nil before accessing
ClusterProfile or other fields, then proceed as before.

In `@pkg/steps/lease_test.go`:
- Around line 454-458: Fix the typo in the test error string: update the
t.Errorf call that currently says "failed to resove provides param %s: %s" (in
the block that populates gotProvides and checks err) to use "resolve" instead of
"resove" so the message reads "failed to resolve provides param %s: %s"; keep
the same formatting and variables (k, err) in the t.Errorf invocation.

In `@pkg/steps/multi_stage/multi_stage_test.go`:
- Around line 401-403: The comparator passed to cmpopts.SortSlices is wrong:
change the anonymous function signature from func(a, b string) bool to func(a, b
coreapi.EnvVar) bool and implement a strict (irreflexive) ordering using < not
<=; for example, compare primary key fields such as a.Name < b.Name and fall
back to a.Value < b.Value (or other deterministic fields) to ensure a total,
strict order for []coreapi.EnvVar before calling cmp.Diff with
cmpopts.SortSlices.
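
The strict ordering requested for the comparator can be sketched as follows. To stay dependency-free, the example uses a local envVar type as a stand-in for coreapi.EnvVar and the standard library's sort.Slice instead of cmpopts.SortSlices; the names lessEnvVar and sortedEnv are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// envVar is a local stand-in for coreapi.EnvVar so that the sketch has no
// external dependencies.
type envVar struct {
	Name  string
	Value string
}

// lessEnvVar is a strict ordering: it returns false for equal elements and
// falls back to Value when the names match, so the order is deterministic.
func lessEnvVar(a, b envVar) bool {
	if a.Name != b.Name {
		return a.Name < b.Name
	}
	return a.Value < b.Value
}

// sortedEnv returns a sorted copy and leaves the input untouched.
func sortedEnv(in []envVar) []envVar {
	out := append([]envVar(nil), in...)
	sort.Slice(out, func(i, j int) bool { return lessEnvVar(out[i], out[j]) })
	return out
}

func main() {
	env := []envVar{
		{Name: "CLUSTER_PROFILE_SET_NAME", Value: "openshift-org-aws"},
		{Name: "CLUSTER_PROFILE", Value: "aws-4"},
	}
	fmt.Println(sortedEnv(env))
}
```

A function with the shape of lessEnvVar, taking two elements of the slice's element type and using < rather than <=, is the form the review asks to pass to cmpopts.SortSlices.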


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1f7092df-c322-4032-b65d-0510e6d22a89

📥 Commits

Reviewing files that changed from the base of the PR and between ac09bd6 and 8985d58.

📒 Files selected for processing (13)
  • cmd/ci-operator/main.go
  • pkg/api/constant.go
  • pkg/api/leases.go
  • pkg/api/leases_test.go
  • pkg/api/types.go
  • pkg/defaults/config.go
  • pkg/defaults/defaults.go
  • pkg/defaults/defaults_test.go
  • pkg/prowgen/prowgen.go
  • pkg/steps/lease.go
  • pkg/steps/lease_test.go
  • pkg/steps/multi_stage/multi_stage.go
  • pkg/steps/multi_stage/multi_stage_test.go

Comment on lines +11 to +13
func LeasesForTest(test *TestStepConfiguration) (ret []StepLease) {
	multiStageTest := test.MultiStageTestConfigurationLiteral
	if p := multiStageTest.ClusterProfile; p != "" {

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# First, let's look at the actual function implementation
echo "=== Function implementation ==="
cat -n pkg/api/leases.go | head -30

echo ""
echo "=== All call sites of LeasesForTest ==="
rg -nP --type=go 'LeasesForTest\s*\('

Repository: openshift/ci-tools

Length of output: 1563


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Call site in prowgen.go ==="
sed -n '230,240p' pkg/prowgen/prowgen.go

echo ""
echo "=== Call site in defaults.go ==="
sed -n '395,405p' pkg/defaults/defaults.go

echo ""
echo "=== Test context ==="
sed -n '45,55p' pkg/api/leases_test.go

echo ""
echo "=== Full context around prowgen.go call ==="
sed -n '220,245p' pkg/prowgen/prowgen.go

Repository: openshift/ci-tools

Length of output: 2107


Add defensive nil guards in LeasesForTest.

Lines 12 and 13 can panic if test or test.MultiStageTestConfigurationLiteral is nil. The function comment states "It is assumed that they have been validated," yet the function itself has no guards. While current call sites in prowgen.go and defaults.go check MultiStageTestConfigurationLiteral before calling, the function should be defensive to prevent panics from new callers.

💡 Proposed fix
 func LeasesForTest(test *TestStepConfiguration) (ret []StepLease) {
+	if test == nil || test.MultiStageTestConfigurationLiteral == nil {
+		return nil
+	}
 	multiStageTest := test.MultiStageTestConfigurationLiteral
 	if p := multiStageTest.ClusterProfile; p != "" {

@danilo-gemoli danilo-gemoli force-pushed the feat/ci-operator/cluster-profile-sets branch from 8985d58 to e2447fd Compare March 4, 2026 13:41

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
pkg/api/leases.go (1)

11-13: ⚠️ Potential issue | 🟡 Minor

Add defensive nil guards in LeasesForTest to prevent panic.

At Line 12, test.MultiStageTestConfigurationLiteral is dereferenced without checking whether test or test.MultiStageTestConfigurationLiteral is nil.

💡 Proposed fix
 func LeasesForTest(test *TestStepConfiguration) (ret []StepLease) {
+	if test == nil || test.MultiStageTestConfigurationLiteral == nil {
+		return nil
+	}
 	multiStageTest := test.MultiStageTestConfigurationLiteral
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/leases.go` around lines 11 - 13, LeasesForTest currently dereferences
test and test.MultiStageTestConfigurationLiteral without guards; add nil checks
at the start of LeasesForTest to return empty slice if test is nil or
test.MultiStageTestConfigurationLiteral is nil/zero value, then proceed to set
multiStageTest := test.MultiStageTestConfigurationLiteral and use its
ClusterProfile field as before (referencing LeasesForTest,
TestStepConfiguration, and MultiStageTestConfigurationLiteral to locate the
change).
🧹 Nitpick comments (3)
pkg/api/leases_test.go (1)

19-29: Add coverage for ClusterProfileTarget propagation.

This case validates ClusterProfile but not ClusterProfileTarget. Since the new flow relies on target propagation for secret import, this should be asserted explicitly.

✅ Suggested test adjustment
 		name: "cluster profile, lease",
 		tests: TestStepConfiguration{
+			As: "e2e-aws",
 			MultiStageTestConfigurationLiteral: &MultiStageTestConfigurationLiteral{
 				ClusterProfile: ClusterProfileAWS,
 			},
 		},
 		expected: []StepLease{{
 			ResourceType:   "aws-quota-slice",
 			Env:            DefaultLeaseEnv,
 			Count:          1,
 			ClusterProfile: string(ClusterProfileAWS),
+			ClusterProfileTarget: "e2e-aws",
 		}},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/leases_test.go` around lines 19 - 29, The test currently sets
MultiStageTestConfigurationLiteral.ClusterProfile but does not set or assert
ClusterProfileTarget propagation; update the test case (the
TestStepConfiguration input using MultiStageTestConfigurationLiteral) to include
a ClusterProfileTarget value and update the expected StepLease entry to assert
that StepLease.ClusterProfileTarget (or the field name used in the lease struct)
equals that value, ensuring the test covers propagation of ClusterProfileTarget
from the input config to the produced StepLease.
pkg/steps/lease_test.go (2)

265-266: Minor grammar issue: "two lease" → "two leases"

✏️ Proposed fix
-		name: "Acquire two lease of different types",
+		name: "Acquire two leases of different types",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/steps/lease_test.go` around lines 265 - 266, The test case name string
"Acquire two lease of different types" has a grammar typo; update the test
case's name value (the string literal used in the test case definition, e.g., in
the test case struct/entry where name: "Acquire two lease of different types")
to "Acquire two leases of different types" so the description is grammatically
correct.

254-438: Consider adding error-path test cases.

The current test cases cover happy-path scenarios well. Consider adding test cases for error conditions such as:

  • Cluster profile getter returning an error (profile not found)
  • Source secret missing from the "ci" namespace
  • Malformed lease name that cannot be parsed for cluster profile extraction

This would improve test robustness and ensure error handling paths are exercised.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/steps/lease_test.go` around lines 254 - 438, Add negative test cases to
the existing table-driven tests: insert entries where leases (the
[]api.StepLease row) reference a non-existent cluster profile (leave
clusterProfiles map without that key to simulate "profile not found"), where the
source secret in objects ([]ctrlruntimeclient.Object) is omitted to simulate
"missing ci secret", and where resources (map[string]*common.Resource) contain a
malformed Name value (e.g., not parseable like "bad-name") to exercise the
lease-name parsing error path; for each new case assert the function under test
returns an error (check err != nil) and that no successful provides/secrets are
produced (wantProvides empty and wantSecrets.Items length 0 or wantCalls empty)
so the test verifies the error-handling branches for ClusterProfile getter
failures, missing source secret, and malformed lease names.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/steps/lease.go`:
- Around line 174-176: The comparator passed to sort.Slice is indexing s.leases
with the loop indices instead of using the permutation in sorted, so change the
comparator to compare s.leases[sorted[i]] and s.leases[sorted[j]] (e.g., use
sorted[i]/sorted[j] when accessing s.leases and compare their ResourceType) so
the sort orders the permutation slice correctly for deadlock avoidance; update
the anonymous func in the sort.Slice call that currently references
s.leases[i]/s.leases[j] to reference s.leases[sorted[i]]/s.leases[sorted[j]]
instead.
- Around line 191-195: The loop is overwriting s.clusterProfileName for every
lease even when l.ClusterProfile is empty; change the logic in the lease
iteration so s.clusterProfileName is only set when l.ClusterProfile (or the
resolved clusterProfileName from clusterProfileFromResources(names)) is
non-empty—i.e., if l.ClusterProfile != "" then set s.clusterProfileName =
l.ClusterProfile, otherwise preserve the existing s.clusterProfileName;
similarly only assign s.clusterProfileSetName when you have a non-empty
l.ClusterProfile and when clusterProfileFromResources(names) returns non-empty,
using the functions/fields clusterProfileFromResources(names), l.ClusterProfile,
s.clusterProfileName and s.clusterProfileSetName to locate the change.
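
The first finding, sorting a permutation slice rather than the leases themselves, can be illustrated with the sketch below. The lease type and the sortedIndices function are hypothetical stand-ins and not the code in pkg/steps/lease.go.

```go
package main

import (
	"fmt"
	"sort"
)

// lease is a minimal stand-in for a step lease; only the field used for
// ordering is modelled here.
type lease struct {
	ResourceType string
}

// sortedIndices returns the indices of leases ordered by ResourceType and
// leaves the leases slice itself untouched.
func sortedIndices(leases []lease) []int {
	sorted := make([]int, len(leases))
	for i := range sorted {
		sorted[i] = i
	}
	// sort.Slice swaps elements of sorted, so the comparator must look up
	// the leases through sorted[i] and sorted[j], not through i and j.
	sort.Slice(sorted, func(i, j int) bool {
		return leases[sorted[i]].ResourceType < leases[sorted[j]].ResourceType
	})
	return sorted
}

func main() {
	leases := []lease{
		{ResourceType: "openshift-org-aws-quota-slice"},
		{ResourceType: "aws-quota-slice"},
		{ResourceType: "gcp-quota-slice"},
	}
	fmt.Println(sortedIndices(leases))
}
```

Indexing leases directly with i and j would compare elements that no longer correspond to the positions being swapped once the permutation has changed, so the resulting order would not be reliable.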

In `@pkg/steps/multi_stage/multi_stage.go`:
- Around line 218-223: The code unconditionally assigns s.profile from
getClusterProfileFromParams(s.params) which can overwrite a previously
configured profile with an empty value; change the logic so that after calling
getClusterProfileFromParams (and handling err), only assign s.profile =
clusterProfile when clusterProfile is non-empty (e.g., check clusterProfile !=
""), otherwise leave the existing s.profile intact so missing/empty
CLUSTER_PROFILE does not clear the configured profile; the relevant symbols are
getClusterProfileFromParams, s.params, and s.profile.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 88b7b994-ec63-4358-b162-247fb954ca82

📥 Commits

Reviewing files that changed from the base of the PR and between 8985d58 and e2447fd.

📒 Files selected for processing (13)
  • cmd/ci-operator/main.go
  • pkg/api/constant.go
  • pkg/api/leases.go
  • pkg/api/leases_test.go
  • pkg/api/types.go
  • pkg/defaults/config.go
  • pkg/defaults/defaults.go
  • pkg/defaults/defaults_test.go
  • pkg/prowgen/prowgen.go
  • pkg/steps/lease.go
  • pkg/steps/lease_test.go
  • pkg/steps/multi_stage/multi_stage.go
  • pkg/steps/multi_stage/multi_stage_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/api/types.go
  • pkg/steps/multi_stage/multi_stage_test.go
  • pkg/defaults/config.go
  • pkg/defaults/defaults_test.go

@danilo-gemoli danilo-gemoli force-pushed the feat/ci-operator/cluster-profile-sets branch from e2447fd to 926ce2d Compare March 4, 2026 19:23

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/steps/lease_test.go (1)

241-556: Add a failure-path case for cluster-profile setup after successful acquire.

TestAcquireLeases currently validates happy paths, but it does not cover the case where acquire succeeds and handleClusterProfile fails. Adding that case will lock in correct cleanup/release behavior and prevent lease-leak regressions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/steps/lease_test.go` around lines 241 - 556, TestAcquireLeases lacks a
failure-path for cluster-profile setup after a successful lease acquire; add a
new subtest case in TestAcquireLeases that includes a lease with
ClusterProfile/ClusterProfileTarget, configure the fake clusterProfileGetter or
kubeClient to make handleClusterProfile return an error (simulate failure during
cluster-profile setup), and assert that the leaseClient was still called to
release the acquired resource (check gotCalls contains releaseone for the
acquired resource) and that no orphaned cluster-profile secret remains in
kubeClient (verify secrets list and provides behave as expected); reference the
test harness symbols TestAcquireLeases, LeaseStep, lease.NewFakeClient,
clusterProfileGetter and ensure the new test mirrors existing cluster-profile
cases but injects the failure to validate cleanup.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/steps/lease.go`:
- Around line 252-281: The code dereferences s.clusterProfileGetter and
s.kubeClient without nil checks in handleClusterProfile and
importClusterProfileSecret which can panic if wiring is missing; add guards that
validate s.clusterProfileGetter != nil before calling it in handleClusterProfile
(return a descriptive error like "clusterProfileGetter not configured for
cluster profile X") and validate s.kubeClient != nil at the start of
importClusterProfileSecret (return a descriptive error like "kubeClient not
configured when importing secret Y for test Z") so missing dependencies produce
controlled errors instead of panics.
- Around line 187-200: The lease acquisition code assigns l.resources after
calling s.handleClusterProfile, which can fail and trigger cleanup without the
acquired names; to fix, assign l.resources = names immediately after names are
computed (before invoking s.handleClusterProfile) so cleanup/rollback sees the
acquired lease names, then call s.handleClusterProfile(ctx, l, names) and handle
its error as before; update references to l.resources, names, and
s.handleClusterProfile in pkg/steps/lease.go accordingly.
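
The ordering concern in the second finding can be shown with a small, self-contained sketch. The lease type and the acquire and release functions are hypothetical simplifications; they only model why the acquired names should be recorded before a setup step that can fail.

```go
package main

import (
	"errors"
	"fmt"
)

// errSetup simulates a failure while preparing the cluster profile.
var errSetup = errors.New("secret not found")

// lease is a minimal stand-in that only tracks acquired resource names.
type lease struct {
	resources []string
}

// acquire records the acquired names on the lease before running setup, so
// that a failing setup still leaves the names available for release.
func acquire(l *lease, names []string, setup func() error) error {
	l.resources = names
	if err := setup(); err != nil {
		return fmt.Errorf("cluster profile setup: %w", err)
	}
	return nil
}

// release returns the names that cleanup would hand back and clears them.
func release(l *lease) []string {
	names := l.resources
	l.resources = nil
	return names
}

func main() {
	l := &lease{}
	err := acquire(l, []string{"aws-4--us-east-1--quota-slice-01"}, func() error {
		return errSetup
	})
	fmt.Println("error:", err)
	fmt.Println("released:", release(l))
}
```

If the assignment happened after the setup call instead, the failing path would leave the lease with no recorded names and cleanup would have nothing to release.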


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1ff93ed9-a74c-4610-9663-f1acce9a9119

📥 Commits

Reviewing files that changed from the base of the PR and between e2447fd and 926ce2d.

📒 Files selected for processing (4)
  • pkg/steps/lease.go
  • pkg/steps/lease_test.go
  • pkg/steps/multi_stage/multi_stage.go
  • pkg/steps/multi_stage/multi_stage_test.go

Comment on lines +252 to +281
func (s *leaseStep) handleClusterProfile(ctx context.Context, l *stepLease, names []string) error {
	s.clusterProfileName = l.ClusterProfile

	if clusterProfileName := clusterProfileFromResources(names); clusterProfileName != "" {
		s.clusterProfileSetName = l.ClusterProfile
		s.clusterProfileName = clusterProfileName
	}

	if s.clusterProfileName == "" {
		return nil
	}

	cpDetails, err := s.clusterProfileGetter(s.clusterProfileName)
	if err != nil {
		return fmt.Errorf("resolve cluster profile %s: %w", s.clusterProfileName, err)
	}

	if err := s.importClusterProfileSecret(ctx, cpDetails.Secret, l.ClusterProfileTarget); err != nil {
		return fmt.Errorf("import secret %s for cluster profile %s: %w", cpDetails.Secret, s.clusterProfileName, err)
	}

	return nil
}

// importClusterProfileSecret retrieves the cluster profile secret name using config resolver,
// and gets the secret from the ci namespace
func (s *leaseStep) importClusterProfileSecret(ctx context.Context, secretName, testName string) error {
	ciSecret := &coreapi.Secret{}
	err := s.kubeClient.Get(ctx, ctrlruntimeclient.ObjectKey{Namespace: "ci", Name: secretName}, ciSecret)
	if err != nil {

⚠️ Potential issue | 🟠 Major

Guard cluster-profile dependencies before dereference.

At Line 264 and Line 280, s.clusterProfileGetter / s.kubeClient are used without nil checks. If a cluster-profile lease is configured but wiring is missing, this panics instead of returning a controlled error.

💡 Proposed fix
 func (s *leaseStep) handleClusterProfile(ctx context.Context, l *stepLease, names []string) error {
+	if s.clusterProfileGetter == nil {
+		return errors.New("cluster profile getter is not configured")
+	}
+	if s.kubeClient == nil {
+		return errors.New("kube client is not configured")
+	}
+
 	s.clusterProfileName = l.ClusterProfile
📝 Committable suggestion

Suggested change
func (s *leaseStep) handleClusterProfile(ctx context.Context, l *stepLease, names []string) error {
	if s.clusterProfileGetter == nil {
		return errors.New("cluster profile getter is not configured")
	}
	if s.kubeClient == nil {
		return errors.New("kube client is not configured")
	}
	s.clusterProfileName = l.ClusterProfile
	if clusterProfileName := clusterProfileFromResources(names); clusterProfileName != "" {
		s.clusterProfileSetName = l.ClusterProfile
		s.clusterProfileName = clusterProfileName
	}
	if s.clusterProfileName == "" {
		return nil
	}
	cpDetails, err := s.clusterProfileGetter(s.clusterProfileName)
	if err != nil {
		return fmt.Errorf("resolve cluster profile %s: %w", s.clusterProfileName, err)
	}
	if err := s.importClusterProfileSecret(ctx, cpDetails.Secret, l.ClusterProfileTarget); err != nil {
		return fmt.Errorf("import secret %s for cluster profile %s: %w", cpDetails.Secret, s.clusterProfileName, err)
	}
	return nil
}
// importClusterProfileSecret retrieves the cluster profile secret name using config resolver,
// and gets the secret from the ci namespace
func (s *leaseStep) importClusterProfileSecret(ctx context.Context, secretName, testName string) error {
	ciSecret := &coreapi.Secret{}
	err := s.kubeClient.Get(ctx, ctrlruntimeclient.ObjectKey{Namespace: "ci", Name: secretName}, ciSecret)
	if err != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/steps/lease.go` around lines 252 - 281, The code dereferences
s.clusterProfileGetter and s.kubeClient without nil checks in
handleClusterProfile and importClusterProfileSecret which can panic if wiring is
missing; add guards that validate s.clusterProfileGetter != nil before calling
it in handleClusterProfile (return a descriptive error like
"clusterProfileGetter not configured for cluster profile X") and validate
s.kubeClient != nil at the start of importClusterProfileSecret (return a
descriptive error like "kubeClient not configured when importing secret Y for
test Z") so missing dependencies produce controlled errors instead of panics.
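
As a rough illustration of the guard described above, the following standalone sketch shows a nil check on a function-typed dependency. The types `profileDetails`, `profileGetter`, and `step`, the method `secretFor`, and the sentinel `errGetterNotConfigured` are all hypothetical and only loosely mirror the names in the PR. The guard is placed after the empty-profile check, which is one possible placement and matches the nit raised later in this conversation rather than the committable suggestion.

```go
package main

import (
	"errors"
	"fmt"
)

// profileDetails and profileGetter are simplified, hypothetical types.
type profileDetails struct {
	Secret string
}

type profileGetter func(name string) (*profileDetails, error)

type step struct {
	getter profileGetter // may be nil if wiring is missing
}

var errGetterNotConfigured = errors.New("cluster profile getter is not configured")

// secretFor resolves the secret name for a cluster profile. Calling a nil
// function value panics in Go, so the getter is checked before use.
func (s *step) secretFor(profile string) (string, error) {
	if profile == "" {
		return "", nil // no cluster profile requested, nothing to resolve
	}
	if s.getter == nil {
		return "", fmt.Errorf("resolve cluster profile %s: %w", profile, errGetterNotConfigured)
	}
	details, err := s.getter(profile)
	if err != nil {
		return "", fmt.Errorf("resolve cluster profile %s: %w", profile, err)
	}
	return details.Secret, nil
}

func main() {
	s := &step{} // getter deliberately left unset
	_, err := s.secretFor("aws-3")
	fmt.Println("error:", err)
}
```

With the guard in place, a step that never requests a cluster profile keeps working without a getter, while a misconfigured step fails with a wrapped, inspectable error.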

@danilo-gemoli danilo-gemoli force-pushed the feat/ci-operator/cluster-profile-sets branch from 926ce2d to 7d8ac45 Compare March 4, 2026 19:40
@danilo-gemoli
Contributor Author

/label tide/merge-method-squash

@openshift-ci openshift-ci bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Mar 4, 2026
@danilo-gemoli
Contributor Author

/test e2e

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 6, 2026
@jmguzik
Contributor

jmguzik commented Mar 6, 2026

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 6, 2026
	if s.clusterProfileName == "" {
		return nil
	}

Contributor

nit?

	if s.clusterProfileGetter == nil {
		return fmt.Errorf("cluster profile getter is not configured")
	}

It's fixable in the followup though

Contributor

@jmguzik jmguzik left a comment

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 6, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 6, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danilo-gemoli, jmguzik, liangxia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [danilo-gemoli,jmguzik,liangxia]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danilo-gemoli
Contributor Author

/override ci/prow/integration

@openshift-ci
Contributor

openshift-ci bot commented Mar 6, 2026

@danilo-gemoli: Overrode contexts on behalf of danilo-gemoli: ci/prow/integration

Details

In response to this:

/override ci/prow/integration

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci
Contributor

openshift-ci bot commented Mar 6, 2026

@danilo-gemoli: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                | Commit  | Details | Required | Rerun command
-------------------------+---------+---------+----------+-----------------------
ci/prow/breaking-changes | 7d8ac45 | link    | false    | /test breaking-changes

Full PR test history. Your PR dashboard.

@danilo-gemoli
Contributor Author

/override ci/prow/images

@openshift-ci-robot
Contributor

Scheduling required tests:
/test e2e

@openshift-ci
Contributor

openshift-ci bot commented Mar 6, 2026

@danilo-gemoli: Overrode contexts on behalf of danilo-gemoli: ci/prow/images

Details

In response to this:

/override ci/prow/images

@danilo-gemoli
Contributor Author

/override ci/prow/e2e

@openshift-ci
Contributor

openshift-ci bot commented Mar 6, 2026

@danilo-gemoli: Overrode contexts on behalf of danilo-gemoli: ci/prow/e2e

Details

In response to this:

/override ci/prow/e2e

@openshift-merge-bot openshift-merge-bot bot merged commit 6a0a2b5 into openshift:main Mar 6, 2026
13 of 14 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

4 participants