STOR-3013: Add e2e test for SELinuxMount upgrade readiness by rvagner78 · Pull Request #715 · openshift/cluster-storage-operator

rvagner78 · 2026-06-26T11:42:56Z

Summary by CodeRabbit

Tests
- Added an end-to-end readiness test covering SELinux mount conflicts for pods sharing an RWO PVC.
- Verifies upgrade readiness is blocked when two pods use the same PVC with conflicting SELinux levels.
- Confirms the system emits the expected warning metric and alert state while blocked.
- Ensures readiness returns once the conflicting pod is removed, and remains healthy when pods use compatible SELinux labels.

openshift-ci · 2026-06-26T11:43:00Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci-robot · 2026-06-26T11:43:00Z

@rvagner78: This pull request references STOR-3013 which is a valid jira issue.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai · 2026-06-26T11:43:10Z

Walkthrough

Adds a new e2e test suite that creates SELinux-labeled pods on a shared PVC, checks the upgrade readiness condition, and queries Thanos/Prometheus metrics and alerts during conflicting and non-conflicting pod scenarios.

Changes

SELinux mount readiness test

Layer / File(s)	Summary
Test setup and pod helpers `test/e2e/selinux_mount_readiness.go`	Creates the namespace and PVC, gates execution to CSI platforms, builds SELinux-labeled pause pods, waits for pod phase transitions, and polls the Storage condition.
Monitoring route and query access `test/e2e/selinux_mount_readiness.go`	Discovers the Thanos Querier URL, obtains a bearer token, sends authenticated monitoring requests, parses query totals, and reads alert states.
Conflicting pod readiness flow `test/e2e/selinux_mount_readiness.go`	Creates compatible and conflicting pods, waits for the readiness condition to turn false, checks the conflict metric and alert state, then deletes the conflicting pod and waits for recovery.
Compatible pod steady state `test/e2e/selinux_mount_readiness.go`	Verifies the readiness condition stays true when two pods share the same SELinux label on the same PVC.
Recover after pod recreation `test/e2e/selinux_mount_readiness.go`	Verifies the readiness condition returns to true after a conflicting pod is recreated with a compatible SELinux label.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 4

❌ Failed checks (4 warnings)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality	⚠️ Warning	Shared-PVC tests never co-locate pod2 with pod1, so they can hit Multi-Attach instead of SELinux conflict and violate existing e2e patterns.	Pin pod2 to pod1's node (or equivalent node affinity) in all shared-PVC cases; consider splitting the first It into narrower checks if needed.
Microshift Test Compatibility	⚠️ Warning	New e2e test lacks any MicroShift skip/tag and directly uses Storage CR plus openshift-monitoring/thanos-querier/prometheus-k8s, all unsupported on MicroShift.	Add a [Skipped:MicroShift] or relevant [apigroup:...] tag, or guard with exutil.IsMicroShiftCluster(), and avoid storage/operator and monitoring-stack dependencies.
Ipv6 And Disconnected Network Test Compatibility	⚠️ Warning	The test uses public `k8s.gcr.io/pause:3.2` via `PauseImage`, which can fail in disconnected environments; no IPv4-only code was found.	Use a mirrored/internal pause image or an image stream already available in-cluster; otherwise mark the test [Skipped:Disconnected].

✅ Passed checks (11 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: adding an e2e test for SELinuxMount upgrade readiness.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	All Ginkgo titles are fixed string literals; no dynamic values appear in Describe/It names.
Single Node Openshift (Sno) Test Compatibility	✅ Passed	No multi-node or HA assumption found: the new e2e test has no SNO skip guard, anti-affinity, node scaling, or distinct-node requirement; multiple pods on one node are fine.
Topology-Aware Scheduling Compatibility	✅ Passed	The PR only adds e2e test pods; it introduces no node selectors, anti-affinity, spread constraints, or control-plane-targeted scheduling logic.
Ote Binary Stdout Contract	✅ Passed	The new e2e file has no process-level stdout writes; it only uses fmt.Sprintf and Ginkgo logging inside test blocks/helpers.
No-Weak-Crypto	✅ Passed	The only changed file is an e2e test; it contains no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB, no custom crypto, and no secret/token comparisons.
Container-Privileges	✅ Passed	The new test pod explicitly runs non-root, drops ALL caps, and sets allowPrivilegeEscalation=false; no host namespaces or privileged flags were added.
No-Sensitive-Data-In-Logs	✅ Passed	Log calls only emit pod/namespace names and static labels; no tokens, passwords, API keys, or hostnames are logged.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.23.0)

test/e2e/selinux_mount_readiness.go

┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

[00.13][ERROR]: unable to find a config; path .coderabbit-opengrep-fallback.yml does not exist

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

openshift-ci · 2026-06-26T11:43:12Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rvagner78
Once this PR has been reviewed and has the lgtm label, please assign dobsonj for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

test/e2e/selinux_mount_readiness.go (1)
84-89: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low value

Cleanup uses the timeout-bound testContext.

All deferred cleanups (namespace, pod1, pod2) use testContext, which is derived from ctx with testTimeout. If the test body times out, testContext is already canceled/expired and the cleanup deletes will fail immediately, leaking the namespace/PVC/pods. Consider a fresh context.Background() (with its own short timeout) for teardown.

Also applies to: 119-123, 132-136
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/selinux_mount_readiness.go` around lines 84 - 89, The deferred
cleanup paths in selinux_mount_readiness.go are using the timeout-bound
testContext, so teardown can be canceled before namespace/pod deletion runs.
Update the deferred cleanup blocks for the namespace and the pod cleanup helpers
to use a fresh background context, ideally with a short dedicated timeout for
teardown instead of reusing testContext. Make the change in the cleanup closures
that call cleanupNamespace and the pod cleanup functions so they can still run
after the main test context expires.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/selinux_mount_readiness.go`:
- Around line 132-136: The deferred cleanup for pod2 in the
selinux_mount_readiness test is using pod2.Name after it has been cleared, so it
may try to delete an empty pod name. Update the cleanup logic around the
deferred deletePodWithWait closure and the later pod2 deletion path to track
“already deleted” with a separate boolean or guard instead of mutating
pod2.Name. Keep the deferred cleanup using the real pod2.Name only when the pod
still exists, and ensure the explicit delete path sets the flag rather than
blanking the name.
- Around line 65-72: The platform check in the selinux mount readiness test can
panic because infra.Status.PlatformStatus may be nil before accessing Type.
Update the infrastructure lookup/skip logic in the test function to guard for a
nil PlatformStatus before the switch, and only read PlatformStatus.Type after
confirming the pointer exists; if it is nil, skip or fail the test with a clear
message using the existing infra.Status.PlatformStatus access path.
- Around line 330-340: The value parsing in the query response aggregation
silently ignores malformed samples because the error from fmt.Sscanf is dropped.
In the helper that loops over queryResp.Data.Result, replace the Sscanf-based
conversion with explicit parsing in that block, handle the parse error
immediately, and propagate it instead of adding a default 0 to total. Keep the
fix localized to the result-processing logic so the calling path can surface bad
sample values clearly.

---

Nitpick comments:
In `@test/e2e/selinux_mount_readiness.go`:
- Around line 84-89: The deferred cleanup paths in selinux_mount_readiness.go
are using the timeout-bound testContext, so teardown can be canceled before
namespace/pod deletion runs. Update the deferred cleanup blocks for the
namespace and the pod cleanup helpers to use a fresh background context, ideally
with a short dedicated timeout for teardown instead of reusing testContext. Make
the change in the cleanup closures that call cleanupNamespace and the pod
cleanup functions so they can still run after the main test context expires.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9b3f00f5-6920-4de8-983c-302066099c87

📥 Commits

Reviewing files that changed from the base of the PR and between 1715686 and 3c0cbd4.

📒 Files selected for processing (1)

test/e2e/selinux_mount_readiness.go

coderabbitai · 2026-06-26T11:49:17Z

+			infra, err := configClient.ConfigV1().Infrastructures().Get(testContext, "cluster", metav1.GetOptions{})
+			if err != nil {
+				g.Fail(fmt.Sprintf("Failed to get infrastructure: %v", err))
+			}
+			switch infra.Status.PlatformStatus.Type {
+			case configv1.BareMetalPlatformType, configv1.NonePlatformType:
+				g.Skip(fmt.Sprintf("Test not applicable for platform %s (no default CSI storage)", infra.Status.PlatformStatus.Type))
+			}


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Possible nil-pointer dereference on PlatformStatus.

infra.Status.PlatformStatus is a pointer and can be nil. Accessing .Type on Line 69/71 would panic before any platform skip logic runs. Guard for nil first.

🛡️ Proposed guard

+ if infra.Status.PlatformStatus == nil { + g.Skip("Test not applicable: infrastructure PlatformStatus is not set") + } switch infra.Status.PlatformStatus.Type {

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

infra, err := configClient.ConfigV1().Infrastructures().Get(testContext, "cluster", metav1.GetOptions{})

if err != nil {

g.Fail(fmt.Sprintf("Failed to get infrastructure: %v", err))

}

switch infra.Status.PlatformStatus.Type {

case configv1.BareMetalPlatformType, configv1.NonePlatformType:

g.Skip(fmt.Sprintf("Test not applicable for platform %s (no default CSI storage)", infra.Status.PlatformStatus.Type))

}

infra, err := configClient.ConfigV1().Infrastructures().Get(testContext, "cluster", metav1.GetOptions{})

if err != nil {

g.Fail(fmt.Sprintf("Failed to get infrastructure: %v", err))

}

if infra.Status.PlatformStatus == nil {

g.Skip("Test not applicable: infrastructure PlatformStatus is not set")

}

switch infra.Status.PlatformStatus.Type {

case configv1.BareMetalPlatformType, configv1.NonePlatformType:

g.Skip(fmt.Sprintf("Test not applicable for platform %s (no default CSI storage)", infra.Status.PlatformStatus.Type))

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/selinux_mount_readiness.go` around lines 65 - 72, The platform check in the selinux mount readiness test can panic because infra.Status.PlatformStatus may be nil before accessing Type. Update the infrastructure lookup/skip logic in the test function to guard for a nil PlatformStatus before the switch, and only read PlatformStatus.Type after confirming the pointer exists; if it is nil, skip or fail the test with a clear message using the existing infra.Status.PlatformStatus access path.

Source: Linters/SAST tools

coderabbitai · 2026-06-26T11:49:17Z

+	var total float64
+	for _, result := range queryResp.Data.Result {
+		if len(result.Value) >= 2 {
+			if valStr, ok := result.Value[1].(string); ok {
+				var val float64
+				fmt.Sscanf(valStr, "%f", &val)
+				total += val
+			}
+		}
+	}
+	return total, nil


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Silent parse failure from unchecked fmt.Sscanf.

The fmt.Sscanf error is ignored (Line 335), so a malformed value silently contributes 0 to total rather than surfacing a parse problem. Prefer strconv.ParseFloat with explicit error handling for the sample value.

♻️ Proposed change

- if valStr, ok := result.Value[1].(string); ok { - var val float64 - fmt.Sscanf(valStr, "%f", &val) - total += val - } + if valStr, ok := result.Value[1].(string); ok { + val, err := strconv.ParseFloat(valStr, 64) + if err != nil { + return 0, fmt.Errorf("failed to parse metric value %q: %w", valStr, err) + } + total += val + }

As per path instructions: "Never ignore error returns".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

var total float64

for _, result := range queryResp.Data.Result {

if len(result.Value) >= 2 {

if valStr, ok := result.Value[1].(string); ok {

var val float64

fmt.Sscanf(valStr, "%f", &val)

total += val

}

}

}

return total, nil

var total float64

for _, result := range queryResp.Data.Result {

if len(result.Value) >= 2 {

if valStr, ok := result.Value[1].(string); ok {

val, err := strconv.ParseFloat(valStr, 64)

if err != nil {

return 0, fmt.Errorf("failed to parse metric value %q: %w", valStr, err)

}

total += val

}

}

}

return total, nil

🧰 Tools

🪛 golangci-lint (2.12.2)

[error] 335-335: Error return value of fmt.Sscanf is not checked

(errcheck)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/selinux_mount_readiness.go` around lines 330 - 340, The value parsing in the query response aggregation silently ignores malformed samples because the error from fmt.Sscanf is dropped. In the helper that loops over queryResp.Data.Result, replace the Sscanf-based conversion with explicit parsing in that block, handle the parse error immediately, and propagate it instead of adding a default 0 to total. Keep the fix localized to the result-processing logic so the calling path can surface bad sample values clearly.

Sources: Path instructions, Linters/SAST tools

rvagner78 · 2026-06-26T11:59:27Z

/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview openshift/api#2882 openshift/kubernetes#2671 #714 #715

openshift-ci · 2026-06-26T11:59:34Z

@rvagner78: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7b96e9d0-7156-11f1-9c62-ad103feffb0c-0

openshift-ci · 2026-06-26T11:59:37Z

@rvagner78: This PR was included in a payload test run from #715
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7b96e9d0-7156-11f1-9c62-ad103feffb0c-0

jsafrane · 2026-06-26T13:40:21Z

/payload-job periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview

Running the test without implementation to see how it fails. It should fail nicely.

openshift-ci · 2026-06-26T13:40:27Z

@jsafrane: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/943f7110-7164-11f1-93a6-16440f096f18-0

jsafrane · 2026-06-26T13:46:48Z

+			kubeConfig   *rest.Config
+		)
+
+		g.BeforeEach(func() {


I hink the platfom check + namespace creation should be in BeforeEach + their cleanup in AfterEach.
Maybe even PVC creation.

Ginkgo cancels the BeforeEach context when the hook returns, so derive the shared test context from context.Background() instead. Co-authored-by: Cursor <cursoragent@cursor.com>

The thanos-querier route rejects unauthenticated requests with 401. Use a bearer token from kubeconfig or the prometheus-k8s service account. Co-authored-by: Cursor <cursoragent@cursor.com>

rvagner78 · 2026-06-29T13:16:16Z

/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview openshift/kubernetes#2671 #714 #715

openshift-ci · 2026-06-29T13:16:21Z

@rvagner78: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/selinux_mount_readiness.go`:
- Around line 71-73: The standalone timeout context creation in the BeforeEach
hook is triggering the fatcontext lint warning. Move the context setup out of
the BeforeEach function literal into a small helper (for example, a dedicated
function used by selinux_mount_readiness.go) that returns the context and cancel
pair, then call that helper from BeforeEach so the lifetime stays standalone
while satisfying golangci-lint.
- Around line 109-111: The shared-PVC SELinux test currently creates pod2
without ensuring it lands on the same node as pod1, so it can fail for
Multi-Attach instead of validating the SELinux-label behavior. In the SELinux
readiness test around newSelinuxTestPod and the pod2 creation flow, wait for
pod1 to reach Running, read pod1.Spec.NodeName, and force pod2 onto that node
(or use equivalent node affinity) before calling CoreV1().Pods(ns.Name).Create
for each shared-PVC case.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5dafda6e-c1eb-4a8a-b4b7-1540707759e1

📥 Commits

Reviewing files that changed from the base of the PR and between 3c0cbd4 and 4f50231.

📒 Files selected for processing (1)

test/e2e/selinux_mount_readiness.go

coderabbitai · 2026-06-29T13:23:16Z

+			// Use a standalone context: Ginkgo cancels the context passed to
+			// BeforeEach when the hook returns, before the It block runs.
+			testContext, cancel = context.WithTimeout(context.Background(), testTimeout)


📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Resolve the fatcontext lint failure before merge.

context.WithTimeout(context.Background(), ...) inside the BeforeEach function literal is currently reported by golangci-lint. Move the context construction into a small helper to preserve the standalone lifetime while satisfying the linter.

Proposed fix

+func newSelinuxReadinessContext() (context.Context, context.CancelFunc) { + return context.WithTimeout(context.Background(), testTimeout) +} + ... - testContext, cancel = context.WithTimeout(context.Background(), testTimeout) + testContext, cancel = newSelinuxReadinessContext()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Use a standalone context: Ginkgo cancels the context passed to

// BeforeEach when the hook returns, before the It block runs.

testContext, cancel = context.WithTimeout(context.Background(), testTimeout)

func newSelinuxReadinessContext() (context.Context, context.CancelFunc) {

return context.WithTimeout(context.Background(), testTimeout)

}

...

testContext, cancel = newSelinuxReadinessContext()

🧰 Tools

🪛 golangci-lint (2.12.2)

[error] 73-73: nested context in function literal

(fatcontext)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/selinux_mount_readiness.go` around lines 71 - 73, The standalone timeout context creation in the BeforeEach hook is triggering the fatcontext lint warning. Move the context setup out of the BeforeEach function literal into a small helper (for example, a dedicated function used by selinux_mount_readiness.go) that returns the context and cancel pair, then call that helper from BeforeEach so the lifetime stays standalone while satisfying golangci-lint.

Source: Linters/SAST tools

coderabbitai · 2026-06-29T13:23:16Z

+			g.By(fmt.Sprintf("Creating pod with SELinux level %s", selinuxTestLevelConflicting))
+			pod2 := newSelinuxTestPod(ns.Name, "selinux-test-pod-2", pvc.Name, selinuxTestLevelConflicting)
+			pod2, err = kubeClient.CoreV1().Pods(ns.Name).Create(testContext, pod2, metav1.CreateOptions{})


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify the shared-PVC pod creation paths and whether pod2 is pinned to pod1's node. rg -n -C3 'ReadWriteOnce|newSelinuxTestPod|NodeName|PodAffinity|selinux-test-pod-2' test/e2e/selinux_mount_readiness.go

Repository: openshift/cluster-storage-operator

Length of output: 3834

🏁 Script executed:

#!/bin/bash set -euo pipefail # Inspect the relevant test flow and pod template helper. sed -n '80,140p' test/e2e/selinux_mount_readiness.go printf '\n---\n' sed -n '180,280p' test/e2e/selinux_mount_readiness.go printf '\n---\n' sed -n '280,420p' test/e2e/selinux_mount_readiness.go

Repository: openshift/cluster-storage-operator

Length of output: 12197

Pin pod2 to pod1’s node for the shared PVC tests.
RWO alone doesn’t guarantee co-location, so pod2 can land on another node and get stuck Pending/Multi-Attach instead of exercising the SELinux-label path. Wait for pod1 to be Running, read pod1.Spec.NodeName, and reuse it (or equivalent node affinity) in each shared-PVC case.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/selinux_mount_readiness.go` around lines 109 - 111, The shared-PVC SELinux test currently creates pod2 without ensuring it lands on the same node as pod1, so it can fail for Multi-Attach instead of validating the SELinux-label behavior. In the SELinux readiness test around newSelinuxTestPod and the pod2 creation flow, wait for pod1 to reach Running, read pod1.Spec.NodeName, and force pod2 onto that node (or use equivalent node affinity) before calling CoreV1().Pods(ns.Name).Create for each shared-PVC case.

rvagner78 · 2026-06-29T13:32:07Z

/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview openshift/kubernetes#2671 #714 #715

openshift-ci · 2026-06-29T13:32:13Z

@rvagner78: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ecea6200-73be-11f1-80a5-d953eec5d313-0

openshift-ci · 2026-06-29T13:32:15Z

@rvagner78: This PR was included in a payload test run from #715
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ecea6200-73be-11f1-80a5-d953eec5d313-0

rvagner78 · 2026-06-30T14:35:51Z

/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview openshift/kubernetes#2671 #714 #715

openshift-ci · 2026-06-30T14:36:21Z

@rvagner78: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ff343440-7490-11f1-893c-a6a79a083dc4-0

openshift-ci · 2026-06-30T14:36:25Z

@rvagner78: This PR was included in a payload test run from #715
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ff343440-7490-11f1-893c-a6a79a083dc4-0

jsafrane · 2026-07-01T08:55:55Z

/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial openshift/kubernetes#2671 #714 #715

this should succeed

openshift-ci · 2026-07-01T08:55:57Z

@jsafrane: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

jsafrane · 2026-07-01T08:56:42Z

/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial openshift/kubernetes#2671 #714

openshift-ci · 2026-07-01T08:56:46Z

@jsafrane: trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-1of3
periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-2of3
periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-3of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c7f7fb70-752a-11f1-8353-8747cb10a2fe-0

jsafrane · 2026-07-01T08:57:14Z

/payload-job periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial

this should fail, as the implementation is missing in KCM

openshift-ci · 2026-07-01T08:57:24Z

@jsafrane: trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-1of3
periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-2of3
periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-3of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/db69db60-752a-11f1-8455-70456461eb30-0

rvagner78 · 2026-07-01T11:59:46Z

/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial openshift/kubernetes#2671 #714 #715

openshift-ci · 2026-07-01T11:59:53Z

@rvagner78: trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-1of3
periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-2of3
periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-3of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5b0f6ec0-7544-11f1-86d8-78f0905bdb7b-0

openshift-ci · 2026-07-01T11:59:54Z

@rvagner78: This PR was included in a payload test run from #715
trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-1of3
periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-2of3
periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial-3of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5b0f6ec0-7544-11f1-86d8-78f0905bdb7b-0

jsafrane · 2026-07-01T13:35:30Z

+	g.By("Creating test namespace")
+	ns := &v1.Namespace{
+		ObjectMeta: metav1.ObjectMeta{
+			GenerateName: "selinux-readiness-test-",


the namespace name should start with e2e-. Not all test namespaces have it, but it's a nice habbit.

jsafrane · 2026-07-01T13:45:35Z

+
+			// Use a standalone context: Ginkgo cancels the context passed to
+			// BeforeEach when the hook returns, before the It block runs.
+			testContext, cancel = context.WithTimeout(context.Background(), testTimeout)


This makes the test steps dependent on each other. If CSO takes 9 minutes to mark a cluster un-upgradeable, then it has remaining 1 minute to check the metrics + alert + wait for cluster upgradeable.

It's better to split them to individual timeouts - ClusterOperator change timeout say 5 minutes, alert 2 minutes, prometheus 2 minutes. Both Prometheus and alert should be already available when CSO marks the cluster upradeable=false, but let's give it some wiggle space.

In most cases, just replace PollUntilContextCancel with PollUntilContextTimeout with the right timeout

jsafrane · 2026-07-01T13:45:58Z

+	return ns
+}
+
+func cleanupSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface, namespace string) {


Suggested change

func cleanupSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface, namespace string) {

func cleanupSELinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface, namespace string) {

https://google.github.io/styleguide/go/decisions.html#initialisms

jsafrane · 2026-07-01T13:46:08Z

+	}
+}
+
+func createSelinuxReadinessPVC(ctx context.Context, kubeClient kubernetes.Interface, namespace string) *v1.PersistentVolumeClaim {


Suggested change

func createSelinuxReadinessPVC(ctx context.Context, kubeClient kubernetes.Interface, namespace string) *v1.PersistentVolumeClaim {

func createSELinuxReadinessPVC(ctx context.Context, kubeClient kubernetes.Interface, namespace string) *v1.PersistentVolumeClaim {

jsafrane · 2026-07-01T13:49:12Z

+
+func waitForOperatorCondition(ctx context.Context, opClient opclient.Interface, conditionType string, expectedStatus operatorapi.ConditionStatus) error {
+	return wait.PollUntilContextCancel(ctx, waitPollInterval, true, func(ctx context.Context) (bool, error) {
+		storage, err := opClient.OperatorV1().Storages().Get(ctx, "cluster", metav1.GetOptions{})


I think we should test the overall output, i.e. that ClusterOperator storage has Upgradeable: False.
Testing Storage CR cluster is good to have too.

jsafrane · 2026-07-01T13:57:03Z

+		g.Fail(fmt.Sprintf("Failed to get infrastructure: %v", err))
+	}
+	switch infra.Status.PlatformStatus.Type {
+	case configv1.BareMetalPlatformType, configv1.NonePlatformType:


The test can run on any platform, but it must have the default StorageClass with a CSI driver that has SELinuxMount: true.

I.e. list all default storage classes (they have a distinct annotation), in each of them find provisioner: field, and for each provisioner find CSIDriver instance with the same name and its SELinuxMount fields. In case there are multiple default storage classes, all of them must match.

Skip the test if the CSIDriver does not exist or there is no default StorageClass. Throw an error on any other error.

jsafrane · 2026-07-01T14:01:07Z

+			}
+		})
+
+		g.It("should become upgradeable again when a conflicting pod is recreated with a compatible SELinux label", func() {


Add a similar test, but instead of setting the same SELinuxLabel, change SELinuxChangePolicy to Recursive in both Pods. The cluster should get upgradeable too.

And you can easily notice there is a lot of code shared among all e2e tests - always create 2 pods and wait for upgradeable=false, change one or both pods and wait for something. Make a function for this.

It might be better to create Deployments and then edit the Deployments to make the cluster upgradeable. It will reflect better what users do. They don't create Pods.

jsafrane · 2026-07-01T14:01:48Z

+	}
+}
+
+func createSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface) *v1.Namespace {


Suggested change

func createSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface) *v1.Namespace {

func createSELinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface) *v1.Namespace {

jsafrane · 2026-07-01T14:14:28Z

+	spec, ok := route.Object["spec"].(map[string]interface{})
+	if !ok {
+		return "", fmt.Errorf("thanos-querier route has no spec")
+	}
+	host, ok := spec["host"].(string)


There are some nested helpers, try unstructured.NestedString(route.Object, "spec", "host") and unstructured.NestedString(route.Object, "status", "ingress", "0", "host") (AI generated, I am not sure about "0" there)

STOR-3013: Add e2e test for SELinuxMount upgrade readiness

3c0cbd4

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 26, 2026

openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 26, 2026

coderabbitai Bot reviewed Jun 26, 2026

View reviewed changes

This was referenced Jun 26, 2026

STOR-2771: Add SELinuxMountGAReadiness FG openshift/api#2882

Merged

STOR-2963: Save SELinuxWarningController upgradeability to a ConfigMap openshift/kubernetes#2671

Open

openshift-ci Bot mentioned this pull request Jun 26, 2026

STOR-3011: Implement CSO upgrade check and Prometheus alerts for SELinuxMount readiness #714

Draft

jsafrane reviewed Jun 26, 2026

View reviewed changes

rvagner78 and others added 3 commits June 29, 2026 13:44

STOR-3011: additional e2e tests

a52cd9f

STOR-3013: Fix canceled context in SELinux readiness e2e setup

c17c09b

Ginkgo cancels the BeforeEach context when the hook returns, so derive the shared test context from context.Background() instead. Co-authored-by: Cursor <cursoragent@cursor.com>

STOR-3013: Authenticate Thanos Querier requests with bearer token

4f50231

The thanos-querier route rejects unauthenticated requests with 401. Use a bearer token from kubeconfig or the prometheus-k8s service account. Co-authored-by: Cursor <cursoragent@cursor.com>

coderabbitai Bot reviewed Jun 29, 2026

View reviewed changes

jsafrane reviewed Jul 1, 2026

View reviewed changes

	func cleanupSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface, namespace string) {
	func cleanupSELinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface, namespace string) {

	func createSelinuxReadinessPVC(ctx context.Context, kubeClient kubernetes.Interface, namespace string) *v1.PersistentVolumeClaim {
	func createSELinuxReadinessPVC(ctx context.Context, kubeClient kubernetes.Interface, namespace string) *v1.PersistentVolumeClaim {

	func createSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface) *v1.Namespace {
	func createSELinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface) *v1.Namespace {

Uh oh!

Conversation

rvagner78 commented Jun 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

openshift-ci Bot commented Jun 26, 2026

Uh oh!

openshift-ci-robot commented Jun 26, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (4 warnings)

Uh oh!

openshift-ci Bot commented Jun 26, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

rvagner78 commented Jun 26, 2026

Uh oh!

openshift-ci Bot commented Jun 26, 2026

Uh oh!

openshift-ci Bot commented Jun 26, 2026

Uh oh!

jsafrane commented Jun 26, 2026

Uh oh!

openshift-ci Bot commented Jun 26, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rvagner78 commented Jun 29, 2026

Uh oh!

openshift-ci Bot commented Jun 29, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

rvagner78 commented Jun 29, 2026

Uh oh!

openshift-ci Bot commented Jun 29, 2026

Uh oh!

openshift-ci Bot commented Jun 29, 2026

Uh oh!

rvagner78 commented Jun 30, 2026

Uh oh!

openshift-ci Bot commented Jun 30, 2026

Uh oh!

openshift-ci Bot commented Jun 30, 2026

Uh oh!

jsafrane commented Jul 1, 2026

Uh oh!

openshift-ci Bot commented Jul 1, 2026

Uh oh!

jsafrane commented Jul 1, 2026

Uh oh!

openshift-ci Bot commented Jul 1, 2026

Uh oh!

jsafrane commented Jul 1, 2026

Uh oh!

openshift-ci Bot commented Jul 1, 2026

Uh oh!

rvagner78 commented Jul 1, 2026

rvagner78 commented Jun 26, 2026 •

edited by coderabbitai Bot

Loading

openshift-ci-robot commented Jun 26, 2026 •

edited by openshift-ci Bot

Loading

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading