STOR-3013: Add e2e test for SELinuxMount upgrade readiness#715
STOR-3013: Add e2e test for SELinuxMount upgrade readiness#715rvagner78 wants to merge 4 commits into
Conversation
|
Skipping CI for Draft Pull Request. |
|
@rvagner78: This pull request references STOR-3013 which is a valid jira issue. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughAdds a new e2e test suite that creates SELinux-labeled pods on a shared PVC, checks the upgrade readiness condition, and queries Thanos/Prometheus metrics and alerts during conflicting and non-conflicting pod scenarios. ChangesSELinux mount readiness test
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 11 | ❌ 4❌ Failed checks (4 warnings)
✅ Passed checks (11 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 OpenGrep (1.23.0)test/e2e/selinux_mount_readiness.go┌──────────────┐ �[32m✔�[39m �[1mOpengrep OSS�[0m [00.13][ERROR]: unable to find a config; path Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: rvagner78 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
test/e2e/selinux_mount_readiness.go (1)
84-89: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low valueCleanup uses the timeout-bound
testContext.All deferred cleanups (namespace, pod1, pod2) use
testContext, which is derived fromctxwithtestTimeout. If the test body times out,testContextis already canceled/expired and the cleanup deletes will fail immediately, leaking the namespace/PVC/pods. Consider a freshcontext.Background()(with its own short timeout) for teardown.Also applies to: 119-123, 132-136
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/selinux_mount_readiness.go` around lines 84 - 89, The deferred cleanup paths in selinux_mount_readiness.go are using the timeout-bound testContext, so teardown can be canceled before namespace/pod deletion runs. Update the deferred cleanup blocks for the namespace and the pod cleanup helpers to use a fresh background context, ideally with a short dedicated timeout for teardown instead of reusing testContext. Make the change in the cleanup closures that call cleanupNamespace and the pod cleanup functions so they can still run after the main test context expires.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/selinux_mount_readiness.go`:
- Around line 132-136: The deferred cleanup for pod2 in the
selinux_mount_readiness test is using pod2.Name after it has been cleared, so it
may try to delete an empty pod name. Update the cleanup logic around the
deferred deletePodWithWait closure and the later pod2 deletion path to track
“already deleted” with a separate boolean or guard instead of mutating
pod2.Name. Keep the deferred cleanup using the real pod2.Name only when the pod
still exists, and ensure the explicit delete path sets the flag rather than
blanking the name.
- Around line 65-72: The platform check in the selinux mount readiness test can
panic because infra.Status.PlatformStatus may be nil before accessing Type.
Update the infrastructure lookup/skip logic in the test function to guard for a
nil PlatformStatus before the switch, and only read PlatformStatus.Type after
confirming the pointer exists; if it is nil, skip or fail the test with a clear
message using the existing infra.Status.PlatformStatus access path.
- Around line 330-340: The value parsing in the query response aggregation
silently ignores malformed samples because the error from fmt.Sscanf is dropped.
In the helper that loops over queryResp.Data.Result, replace the Sscanf-based
conversion with explicit parsing in that block, handle the parse error
immediately, and propagate it instead of adding a default 0 to total. Keep the
fix localized to the result-processing logic so the calling path can surface bad
sample values clearly.
---
Nitpick comments:
In `@test/e2e/selinux_mount_readiness.go`:
- Around line 84-89: The deferred cleanup paths in selinux_mount_readiness.go
are using the timeout-bound testContext, so teardown can be canceled before
namespace/pod deletion runs. Update the deferred cleanup blocks for the
namespace and the pod cleanup helpers to use a fresh background context, ideally
with a short dedicated timeout for teardown instead of reusing testContext. Make
the change in the cleanup closures that call cleanupNamespace and the pod
cleanup functions so they can still run after the main test context expires.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 9b3f00f5-6920-4de8-983c-302066099c87
📒 Files selected for processing (1)
test/e2e/selinux_mount_readiness.go
| infra, err := configClient.ConfigV1().Infrastructures().Get(testContext, "cluster", metav1.GetOptions{}) | ||
| if err != nil { | ||
| g.Fail(fmt.Sprintf("Failed to get infrastructure: %v", err)) | ||
| } | ||
| switch infra.Status.PlatformStatus.Type { | ||
| case configv1.BareMetalPlatformType, configv1.NonePlatformType: | ||
| g.Skip(fmt.Sprintf("Test not applicable for platform %s (no default CSI storage)", infra.Status.PlatformStatus.Type)) | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Possible nil-pointer dereference on PlatformStatus.
infra.Status.PlatformStatus is a pointer and can be nil. Accessing .Type on Line 69/71 would panic before any platform skip logic runs. Guard for nil first.
🛡️ Proposed guard
+ if infra.Status.PlatformStatus == nil {
+ g.Skip("Test not applicable: infrastructure PlatformStatus is not set")
+ }
switch infra.Status.PlatformStatus.Type {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| infra, err := configClient.ConfigV1().Infrastructures().Get(testContext, "cluster", metav1.GetOptions{}) | |
| if err != nil { | |
| g.Fail(fmt.Sprintf("Failed to get infrastructure: %v", err)) | |
| } | |
| switch infra.Status.PlatformStatus.Type { | |
| case configv1.BareMetalPlatformType, configv1.NonePlatformType: | |
| g.Skip(fmt.Sprintf("Test not applicable for platform %s (no default CSI storage)", infra.Status.PlatformStatus.Type)) | |
| } | |
| infra, err := configClient.ConfigV1().Infrastructures().Get(testContext, "cluster", metav1.GetOptions{}) | |
| if err != nil { | |
| g.Fail(fmt.Sprintf("Failed to get infrastructure: %v", err)) | |
| } | |
| if infra.Status.PlatformStatus == nil { | |
| g.Skip("Test not applicable: infrastructure PlatformStatus is not set") | |
| } | |
| switch infra.Status.PlatformStatus.Type { | |
| case configv1.BareMetalPlatformType, configv1.NonePlatformType: | |
| g.Skip(fmt.Sprintf("Test not applicable for platform %s (no default CSI storage)", infra.Status.PlatformStatus.Type)) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e/selinux_mount_readiness.go` around lines 65 - 72, The platform check
in the selinux mount readiness test can panic because
infra.Status.PlatformStatus may be nil before accessing Type. Update the
infrastructure lookup/skip logic in the test function to guard for a nil
PlatformStatus before the switch, and only read PlatformStatus.Type after
confirming the pointer exists; if it is nil, skip or fail the test with a clear
message using the existing infra.Status.PlatformStatus access path.
Source: Linters/SAST tools
| var total float64 | ||
| for _, result := range queryResp.Data.Result { | ||
| if len(result.Value) >= 2 { | ||
| if valStr, ok := result.Value[1].(string); ok { | ||
| var val float64 | ||
| fmt.Sscanf(valStr, "%f", &val) | ||
| total += val | ||
| } | ||
| } | ||
| } | ||
| return total, nil |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Silent parse failure from unchecked fmt.Sscanf.
The fmt.Sscanf error is ignored (Line 335), so a malformed value silently contributes 0 to total rather than surfacing a parse problem. Prefer strconv.ParseFloat with explicit error handling for the sample value.
♻️ Proposed change
- if valStr, ok := result.Value[1].(string); ok {
- var val float64
- fmt.Sscanf(valStr, "%f", &val)
- total += val
- }
+ if valStr, ok := result.Value[1].(string); ok {
+ val, err := strconv.ParseFloat(valStr, 64)
+ if err != nil {
+ return 0, fmt.Errorf("failed to parse metric value %q: %w", valStr, err)
+ }
+ total += val
+ }As per path instructions: "Never ignore error returns".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| var total float64 | |
| for _, result := range queryResp.Data.Result { | |
| if len(result.Value) >= 2 { | |
| if valStr, ok := result.Value[1].(string); ok { | |
| var val float64 | |
| fmt.Sscanf(valStr, "%f", &val) | |
| total += val | |
| } | |
| } | |
| } | |
| return total, nil | |
| var total float64 | |
| for _, result := range queryResp.Data.Result { | |
| if len(result.Value) >= 2 { | |
| if valStr, ok := result.Value[1].(string); ok { | |
| val, err := strconv.ParseFloat(valStr, 64) | |
| if err != nil { | |
| return 0, fmt.Errorf("failed to parse metric value %q: %w", valStr, err) | |
| } | |
| total += val | |
| } | |
| } | |
| } | |
| return total, nil |
🧰 Tools
🪛 golangci-lint (2.12.2)
[error] 335-335: Error return value of fmt.Sscanf is not checked
(errcheck)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e/selinux_mount_readiness.go` around lines 330 - 340, The value
parsing in the query response aggregation silently ignores malformed samples
because the error from fmt.Sscanf is dropped. In the helper that loops over
queryResp.Data.Result, replace the Sscanf-based conversion with explicit parsing
in that block, handle the parse error immediately, and propagate it instead of
adding a default 0 to total. Keep the fix localized to the result-processing
logic so the calling path can surface bad sample values clearly.
Sources: Path instructions, Linters/SAST tools
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview openshift/api#2882 openshift/kubernetes#2671 #714 #715 |
|
@rvagner78: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7b96e9d0-7156-11f1-9c62-ad103feffb0c-0 |
|
@rvagner78: This PR was included in a payload test run from #715
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7b96e9d0-7156-11f1-9c62-ad103feffb0c-0 |
|
/payload-job periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview Running the test without implementation to see how it fails. It should fail nicely. |
|
@jsafrane: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/943f7110-7164-11f1-93a6-16440f096f18-0 |
| kubeConfig *rest.Config | ||
| ) | ||
|
|
||
| g.BeforeEach(func() { |
There was a problem hiding this comment.
I hink the platfom check + namespace creation should be in BeforeEach + their cleanup in AfterEach.
Maybe even PVC creation.
Ginkgo cancels the BeforeEach context when the hook returns, so derive the shared test context from context.Background() instead. Co-authored-by: Cursor <cursoragent@cursor.com>
The thanos-querier route rejects unauthenticated requests with 401. Use a bearer token from kubeconfig or the prometheus-k8s service account. Co-authored-by: Cursor <cursoragent@cursor.com>
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview openshift/kubernetes#2671 #714 #715 |
|
@rvagner78: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/selinux_mount_readiness.go`:
- Around line 71-73: The standalone timeout context creation in the BeforeEach
hook is triggering the fatcontext lint warning. Move the context setup out of
the BeforeEach function literal into a small helper (for example, a dedicated
function used by selinux_mount_readiness.go) that returns the context and cancel
pair, then call that helper from BeforeEach so the lifetime stays standalone
while satisfying golangci-lint.
- Around line 109-111: The shared-PVC SELinux test currently creates pod2
without ensuring it lands on the same node as pod1, so it can fail for
Multi-Attach instead of validating the SELinux-label behavior. In the SELinux
readiness test around newSelinuxTestPod and the pod2 creation flow, wait for
pod1 to reach Running, read pod1.Spec.NodeName, and force pod2 onto that node
(or use equivalent node affinity) before calling CoreV1().Pods(ns.Name).Create
for each shared-PVC case.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 5dafda6e-c1eb-4a8a-b4b7-1540707759e1
📒 Files selected for processing (1)
test/e2e/selinux_mount_readiness.go
| // Use a standalone context: Ginkgo cancels the context passed to | ||
| // BeforeEach when the hook returns, before the It block runs. | ||
| testContext, cancel = context.WithTimeout(context.Background(), testTimeout) |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win
Resolve the fatcontext lint failure before merge.
context.WithTimeout(context.Background(), ...) inside the BeforeEach function literal is currently reported by golangci-lint. Move the context construction into a small helper to preserve the standalone lifetime while satisfying the linter.
Proposed fix
+func newSelinuxReadinessContext() (context.Context, context.CancelFunc) {
+ return context.WithTimeout(context.Background(), testTimeout)
+}
+
...
- testContext, cancel = context.WithTimeout(context.Background(), testTimeout)
+ testContext, cancel = newSelinuxReadinessContext()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Use a standalone context: Ginkgo cancels the context passed to | |
| // BeforeEach when the hook returns, before the It block runs. | |
| testContext, cancel = context.WithTimeout(context.Background(), testTimeout) | |
| func newSelinuxReadinessContext() (context.Context, context.CancelFunc) { | |
| return context.WithTimeout(context.Background(), testTimeout) | |
| } | |
| ... | |
| testContext, cancel = newSelinuxReadinessContext() |
🧰 Tools
🪛 golangci-lint (2.12.2)
[error] 73-73: nested context in function literal
(fatcontext)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e/selinux_mount_readiness.go` around lines 71 - 73, The standalone
timeout context creation in the BeforeEach hook is triggering the fatcontext
lint warning. Move the context setup out of the BeforeEach function literal into
a small helper (for example, a dedicated function used by
selinux_mount_readiness.go) that returns the context and cancel pair, then call
that helper from BeforeEach so the lifetime stays standalone while satisfying
golangci-lint.
Source: Linters/SAST tools
| g.By(fmt.Sprintf("Creating pod with SELinux level %s", selinuxTestLevelConflicting)) | ||
| pod2 := newSelinuxTestPod(ns.Name, "selinux-test-pod-2", pvc.Name, selinuxTestLevelConflicting) | ||
| pod2, err = kubeClient.CoreV1().Pods(ns.Name).Create(testContext, pod2, metav1.CreateOptions{}) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the shared-PVC pod creation paths and whether pod2 is pinned to pod1's node.
rg -n -C3 'ReadWriteOnce|newSelinuxTestPod|NodeName|PodAffinity|selinux-test-pod-2' test/e2e/selinux_mount_readiness.goRepository: openshift/cluster-storage-operator
Length of output: 3834
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the relevant test flow and pod template helper.
sed -n '80,140p' test/e2e/selinux_mount_readiness.go
printf '\n---\n'
sed -n '180,280p' test/e2e/selinux_mount_readiness.go
printf '\n---\n'
sed -n '280,420p' test/e2e/selinux_mount_readiness.goRepository: openshift/cluster-storage-operator
Length of output: 12197
Pin pod2 to pod1’s node for the shared PVC tests.
RWO alone doesn’t guarantee co-location, so pod2 can land on another node and get stuck Pending/Multi-Attach instead of exercising the SELinux-label path. Wait for pod1 to be Running, read pod1.Spec.NodeName, and reuse it (or equivalent node affinity) in each shared-PVC case.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e/selinux_mount_readiness.go` around lines 109 - 111, The shared-PVC
SELinux test currently creates pod2 without ensuring it lands on the same node
as pod1, so it can fail for Multi-Attach instead of validating the SELinux-label
behavior. In the SELinux readiness test around newSelinuxTestPod and the pod2
creation flow, wait for pod1 to reach Running, read pod1.Spec.NodeName, and
force pod2 onto that node (or use equivalent node affinity) before calling
CoreV1().Pods(ns.Name).Create for each shared-PVC case.
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview openshift/kubernetes#2671 #714 #715 |
|
@rvagner78: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ecea6200-73be-11f1-80a5-d953eec5d313-0 |
|
@rvagner78: This PR was included in a payload test run from #715
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ecea6200-73be-11f1-80a5-d953eec5d313-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn-techpreview openshift/kubernetes#2671 #714 #715 |
|
@rvagner78: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ff343440-7490-11f1-893c-a6a79a083dc4-0 |
|
@rvagner78: This PR was included in a payload test run from #715
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ff343440-7490-11f1-893c-a6a79a083dc4-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial openshift/kubernetes#2671 #714 #715 this should succeed |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial openshift/kubernetes#2671 #714 |
|
@jsafrane: trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c7f7fb70-752a-11f1-8353-8747cb10a2fe-0 |
|
/payload-job periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial this should fail, as the implementation is missing in KCM |
|
@jsafrane: trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/db69db60-752a-11f1-8455-70456461eb30-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-techpreview-serial openshift/kubernetes#2671 #714 #715 |
|
@rvagner78: trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5b0f6ec0-7544-11f1-86d8-78f0905bdb7b-0 |
|
@rvagner78: This PR was included in a payload test run from #715
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5b0f6ec0-7544-11f1-86d8-78f0905bdb7b-0 |
| g.By("Creating test namespace") | ||
| ns := &v1.Namespace{ | ||
| ObjectMeta: metav1.ObjectMeta{ | ||
| GenerateName: "selinux-readiness-test-", |
There was a problem hiding this comment.
the namespace name should start with e2e-. Not all test namespaces have it, but it's a nice habbit.
|
|
||
| // Use a standalone context: Ginkgo cancels the context passed to | ||
| // BeforeEach when the hook returns, before the It block runs. | ||
| testContext, cancel = context.WithTimeout(context.Background(), testTimeout) |
There was a problem hiding this comment.
This makes the test steps dependent on each other. If CSO takes 9 minutes to mark a cluster un-upgradeable, then it has remaining 1 minute to check the metrics + alert + wait for cluster upgradeable.
It's better to split them to individual timeouts - ClusterOperator change timeout say 5 minutes, alert 2 minutes, prometheus 2 minutes. Both Prometheus and alert should be already available when CSO marks the cluster upradeable=false, but let's give it some wiggle space.
There was a problem hiding this comment.
In most cases, just replace PollUntilContextCancel with PollUntilContextTimeout with the right timeout
| return ns | ||
| } | ||
|
|
||
| func cleanupSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface, namespace string) { |
There was a problem hiding this comment.
| func cleanupSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface, namespace string) { | |
| func cleanupSELinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface, namespace string) { |
There was a problem hiding this comment.
| } | ||
| } | ||
|
|
||
| func createSelinuxReadinessPVC(ctx context.Context, kubeClient kubernetes.Interface, namespace string) *v1.PersistentVolumeClaim { |
There was a problem hiding this comment.
| func createSelinuxReadinessPVC(ctx context.Context, kubeClient kubernetes.Interface, namespace string) *v1.PersistentVolumeClaim { | |
| func createSELinuxReadinessPVC(ctx context.Context, kubeClient kubernetes.Interface, namespace string) *v1.PersistentVolumeClaim { |
|
|
||
| func waitForOperatorCondition(ctx context.Context, opClient opclient.Interface, conditionType string, expectedStatus operatorapi.ConditionStatus) error { | ||
| return wait.PollUntilContextCancel(ctx, waitPollInterval, true, func(ctx context.Context) (bool, error) { | ||
| storage, err := opClient.OperatorV1().Storages().Get(ctx, "cluster", metav1.GetOptions{}) |
There was a problem hiding this comment.
I think we should test the overall output, i.e. that ClusterOperator storage has Upgradeable: False.
Testing Storage CR cluster is good to have too.
| g.Fail(fmt.Sprintf("Failed to get infrastructure: %v", err)) | ||
| } | ||
| switch infra.Status.PlatformStatus.Type { | ||
| case configv1.BareMetalPlatformType, configv1.NonePlatformType: |
There was a problem hiding this comment.
The test can run on any platform, but it must have the default StorageClass with a CSI driver that has SELinuxMount: true.
I.e. list all default storage classes (they have a distinct annotation), in each of them find provisioner: field, and for each provisioner find CSIDriver instance with the same name and its SELinuxMount fields. In case there are multiple default storage classes, all of them must match.
Skip the test if the CSIDriver does not exist or there is no default StorageClass. Throw an error on any other error.
| } | ||
| }) | ||
|
|
||
| g.It("should become upgradeable again when a conflicting pod is recreated with a compatible SELinux label", func() { |
There was a problem hiding this comment.
Add a similar test, but instead of setting the same SELinuxLabel, change SELinuxChangePolicy to Recursive in both Pods. The cluster should get upgradeable too.
And you can easily notice there is a lot of code shared among all e2e tests - always create 2 pods and wait for upgradeable=false, change one or both pods and wait for something. Make a function for this.
It might be better to create Deployments and then edit the Deployments to make the cluster upgradeable. It will reflect better what users do. They don't create Pods.
| } | ||
| } | ||
|
|
||
| func createSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface) *v1.Namespace { |
There was a problem hiding this comment.
| func createSelinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface) *v1.Namespace { | |
| func createSELinuxReadinessNamespace(ctx context.Context, kubeClient kubernetes.Interface) *v1.Namespace { |
| spec, ok := route.Object["spec"].(map[string]interface{}) | ||
| if !ok { | ||
| return "", fmt.Errorf("thanos-querier route has no spec") | ||
| } | ||
| host, ok := spec["host"].(string) |
There was a problem hiding this comment.
There are some nested helpers, try unstructured.NestedString(route.Object, "spec", "host") and unstructured.NestedString(route.Object, "status", "ingress", "0", "host") (AI generated, I am not sure about "0" there)
Summary by CodeRabbit