Skip to content

STOR-2868: Implement device relinking#598

Open
gnufied wants to merge 9 commits intoopenshift:mainfrom
gnufied:implement-device-relinking
Open

STOR-2868: Implement device relinking#598
gnufied wants to merge 9 commits intoopenshift:mainfrom
gnufied:implement-device-relinking

Conversation

@gnufied
Copy link
Member

@gnufied gnufied commented Mar 24, 2026

gnufied added 6 commits March 23, 2026 11:45
Fix rbac rules and bugs to enable creation of lvdl objects
refactor code to findexisting lvldl
Change UpdateStatus to ApplyStatus
Refactor the code to directly use BlockDevice
Add comments about processRejectedDevicesForDeviceLinks
@coderabbitai
Copy link

coderabbitai bot commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds LocalVolumeDeviceLink management and symlink reconciliation: new internal DeviceLinkHandler and DiskLocation types, refactors PV creation to accept context/args and use a ClientReader, moves command execution to k8s exec, updates reconcilers to process rejected devices and use uncached readers, and adds extensive unit and e2e tests plus Makefile/HACKING.md/help script tweaks.

Changes

Cohort / File(s) Summary
Device-link core & internal types
pkg/internal/device_link_handler.go, pkg/internal/device_link_handler_test.go, pkg/internal/disk_location.go
New DeviceLinkHandler (FindLVDL, ApplyStatus, RecreateSymlinkIfNeeded, HasMismatchingSymlink) and exported DiskLocation type; comprehensive tests for status, symlink recreation, and error cases.
PV creation & symlink utilities
pkg/common/provisioner_utils.go, pkg/common/symlink_utils.go
Refactored CreateLocalPVCreateLocalPV(ctx, CreateLocalPVArgs) (adds ClientReader, BlockDevice, CurrentSymlink); integrates DeviceLinkHandler and conditional symlink recreation; updated symlink utility signatures.
Disk utilities / exec abstraction
pkg/internal/diskutil.go, pkg/internal/diskutil_test.go
Replaced os/exec with k8s.io/utils/exec (CmdExecutor), changed command helpers, refactored GetPathByID/GetUncachedPathID, and updated tests to use FakeExec.
Controllers & reconcilers
cmd/diskmaker-manager/manager.go, pkg/diskmaker/controllers/lv/reconcile.go, pkg/diskmaker/controllers/lvset/reconcile.go, pkg/controllers/localvolume/localvolume_controller.go
Pass mgr.GetAPIReader() into reconcilers; added ClientReader fields; migrated to internal.DiskLocation; split device processing into valid/new/existing/rejected flows; added LocalVolumeSetReconciler and related API changes.
Discovery & tests refactor
pkg/diskmaker/discovery/discovery.go, pkg/diskmaker/discovery/discovery_test.go, pkg/internal/*_test.go, pkg/diskmaker/controllers/*_test.go, pkg/diskmaker/controllers/lvset/*_test.go
Replaced legacy exec test harness with testing.FakeExec, updated tests and signatures, use t.Run/t.Cleanup patterns, and register LVDL as status subresource in fake clients.
API / CRD / CSV / RBAC
api/v1/localvolumedevicelink_types.go, config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml, config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
Made status.preferredLinkTarget and status.validLinkTargets optional; added Scheme registration init(); updated CRD schema; added RBAC for localvolumedevicelinks and its /status.
Event reasons
pkg/diskmaker/event_reporter.go, pkg/diskmaker/controllers/lv/event_reporter.go, pkg/diskmaker/controllers/lvset/event_reporter.go
Added event reason constants: FailedLVDLProcessing, ErrorProvisioningVolume, ErrorMaxCountReached.
E2E tests & helpers
test/e2e/*, test/e2e/cleanup_helper.go, test/e2e/gomega_util.go
Added/updated e2e flows and helpers to verify LVDL creation, preferred-link reconciliation, filesystem UUID propagation, multi-step preferred target changes, and a thread-safe cleanup runner and LVDL wait helper.
Build & docs & scripts
Makefile, HACKING.md, hack/test-e2e.sh
Set .DEFAULT_GOAL := build; added HACKING.md section for overriding deployed operator/CSV images; added suite parsing and help to hack/test-e2e.sh.
Misc & small changes
pkg/diskmaker/controllers/lv/create_pv_test.go, pkg/diskmaker/controllers/lv/reconcile_test.go, pkg/diskmaker/controllers/lvset/*, test/e2e/*, test/e2e/hostudev.go
Many test updates to use new CreateLocalPVArgs, adapt to internal.DiskLocation, minor log/job name tweaks, and adjusted callsites to GetPathByID() with no args.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@gnufied gnufied force-pushed the implement-device-relinking branch from 6098724 to 0a5f5ca Compare March 24, 2026 18:05
@openshift-ci openshift-ci bot requested review from dfajmon and mpatlasov March 24, 2026 18:06
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 24, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/common/provisioner_utils.go (1)

150-156: ⚠️ Potential issue | 🟠 Major

Return non-NotFound errors from the existing-PV lookup.

A timeout/cache/API failure from client.Get currently falls through as though the PV were absent, and provisioning continues against stale state.

Suggested fix
 existingPV := &corev1.PersistentVolume{ObjectMeta: metav1.ObjectMeta{Name: pvName}}
 err = client.Get(ctx, types.NamespacedName{Name: pvName}, existingPV)
-
-if err == nil && existingPV.Status.Phase == corev1.VolumeReleased {
+if err != nil {
+	if !apierrors.IsNotFound(err) {
+		return fmt.Errorf("error getting existing PV %q: %w", pvName, err)
+	}
+} else if existingPV.Status.Phase == corev1.VolumeReleased {
 	klog.InfoS("PV is still being cleaned, not going to recreate it", "pvName", pvName, "disk", deviceName)
 	// Caller should try again soon
 	return ErrTryAgain
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` around lines 150 - 156, The client.Get call
that looks up existingPV must not treat all errors as "not found"; change the
logic so that after calling client.Get(ctx, types.NamespacedName{Name: pvName},
existingPV) you check for an error and if it's non-nil and not
apierrors.IsNotFound(err) (or the equivalent IsNotFound helper) you return that
error immediately instead of proceeding—only continue with the
existingPV.Status.Phase check when err == nil, and keep returning ErrTryAgain
when the PV exists and Phase == corev1.VolumeReleased.
pkg/diskmaker/controllers/lv/reconcile.go (1)

401-423: ⚠️ Potential issue | 🟠 Major

Don’t return before the ignored-device reconciliation runs.

When all matching LV devices are already mounted/bind-mounted, validBlockDevices is empty and this exits before ignoredDevices are re-matched below. That skips processRejectedDevicesForDeviceLinks for the exact in-use/upgrade case this PR is trying to handle.

💡 Suggested fix
-	if len(validBlockDevices) == 0 {
-		klog.V(3).Info("unable to find any new disks")
-		return ctrl.Result{}, nil
-	}
-
-	deviceMap, err := r.findMatchingDisks(diskConfig, validBlockDevices)
-	if err != nil {
-		msg := fmt.Sprintf("error finding matching disks: %v", err)
-		r.eventSync.Report(r.localVolume, newDiskEvent(ErrorFindingMatchingDisk, msg, "", corev1.EventTypeWarning))
-		klog.Error(msg)
-		return ctrl.Result{}, nil
-	}
+	deviceMap := map[string][]internal.DiskLocation{}
+	if len(validBlockDevices) == 0 {
+		klog.V(3).Info("unable to find any new disks")
+	} else {
+		deviceMap, err = r.findMatchingDisks(diskConfig, validBlockDevices)
+		if err != nil {
+			msg := fmt.Sprintf("error finding matching disks: %v", err)
+			r.eventSync.Report(r.localVolume, newDiskEvent(ErrorFindingMatchingDisk, msg, "", corev1.EventTypeWarning))
+			klog.Error(msg)
+			return ctrl.Result{}, nil
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/reconcile.go` around lines 401 - 423, The early
return when len(validBlockDevices) == 0 prevents reconciling ignoredDevices and
skips processRejectedDevicesForDeviceLinks; change the control flow in the
reconcile block so that when validBlockDevices is empty you do not return
immediately but still call r.findMatchingDisks(diskConfig, ignoredDevices) and
then r.processRejectedDevicesForDeviceLinks(ctx, ignoredButMatchingDeviceMap)
(preserving existing error logging for findMatchingDisks); only return after
those ignored-device checks complete (or after handling errors), ensuring
ignoredDevices are re-matched even when validBlockDevices has no entries.
🧹 Nitpick comments (2)
pkg/diskmaker/controllers/lv/create_pv_test.go (1)

374-377: Record and assert the requested by-id glob pattern.

internal.FilePathGlob returns fakeByIDLink for any input here, so this test still passes even if CreateLocalPV builds the by-id lookup from the wrong value. Tightening that fake would make the KName/device-path half of the regression real.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/create_pv_test.go` around lines 374 - 377, The
test currently stubs internal.FilePathGlob to unconditionally return
fakeByIDLink, masking whether CreateLocalPV builds the correct by-id glob;
update the fake for internal.FilePathGlob to capture the incoming pattern
argument (e.g., store it in a local variable like requestedByIDPattern) and
return fakeByIDLink, then after calling CreateLocalPV assert that
requestedByIDPattern equals the expected by-id glob (constructed from the test's
KName/device-path inputs) so the test fails if CreateLocalPV uses the wrong
value.
pkg/common/provisioner_utils.go (1)

121-129: Verify that CreateLocalPV is not relying on pre-populated TypeMeta.Kind.

This new guard rejects any LocalVolumeLikeObject whose GroupVersionKind().Kind is empty. The updated unit tests in this PR had to seed Kind manually, so the exported helper has effectively gained a hidden GVK precondition. Consider deriving the owner kind via scheme or passing it explicitly instead of relying on callers to prepopulate TypeMeta.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` around lines 121 - 129, The guard that
rejects objects with an empty GroupVersionKind().Kind makes CreateLocalPV (and
helpers that accept LocalVolumeLikeObject) implicitly require callers to
prepopulate TypeMeta; change CreateLocalPV and the helper that reads
obj.GetObjectKind().GroupVersionKind().Kind to derive the owner kind from the
runtime.Scheme (or accept an explicit kind parameter) instead of failing when
Kind is empty: update the logic around meta.Accessor(obj)/obj.GetObjectKind() to
attempt scheme.ObjectKinds(obj) (or accept a new explicit kind argument on
CreateLocalPV/related helper) and only error if both derivation and the explicit
value are unavailable, keeping existing metadata checks (name/namespace) intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@HACKING.md`:
- Around line 9-13: Clarify and make the “Override diskmaker/operator image”
procedure safe and repeatable: instruct readers to identify exact Subscription
and CSV names (using oc get subscription -n openshift-local-storage and oc get
csv -n openshift-local-storage | grep local-storage-operator), temporarily
delete the Subscription only as a controlled step (oc delete subscription
<subscription-name> -n openshift-local-storage), edit the installed CSV to
change images under
spec.install.spec.deployments[].spec.template.spec.containers[] (mention
DISKMAKER_IMAGE and the operator container image by name) with oc edit csv
<csv-name> -n openshift-local-storage, and then explicitly restore OLM
management by recreating or applying the original Subscription manifest (or back
out the CSV changes) as the rollback step; include explicit command examples and
note namespaces and resource names so the operation is reversible.

In `@pkg/diskmaker/discovery/discovery_test.go`:
- Around line 58-65: The loop registers defer-based cleanup that executes only
at test exit, causing globals like internal.CmdExecutor and
internal.FilePathGlob to leak between iterations; change the per-iteration
cleanup to use t.Cleanup or an immediately-invoked function capturing local
copies so each iteration restores the originals immediately—capture the current
origExec := internal.CmdExecutor and origGlob := internal.FilePathGlob into
locals inside the iteration, set internal.CmdExecutor = newFakeExecutor(...) and
internal.FilePathGlob = tc.fakeGlobfunc, then call t.Cleanup(func(){
internal.CmdExecutor = origExec; internal.FilePathGlob = origGlob }) (or use an
IIFE to defer inside the IIFE) to ensure restoration happens per test case
rather than at function exit.

In `@pkg/internal/device_link_handler.go`:
- Around line 357-365: The call to dl.setStatusSymlinks(lvdl, blockDevice) may
return an error that is currently overwritten and ignored before updating
status; change the flow in the block that sets dl.currentSymlink and calls
setStatusSymlinks so you check and preserve the returned err immediately (e.g.,
assign copyToUpdate, err := dl.setStatusSymlinks(...); if err != nil { set or
append an OperatorCondition on copyToUpdate.Status indicating the failure and
return the error after attempting a status update } ), only clear
Status.Conditions when setStatusSymlinks succeeded, and ensure the final
dl.client.Status().Update(ctx, copyToUpdate) is performed with the correct
copyToUpdate and that any error from setStatusSymlinks is propagated (or
recorded in the status) instead of being silently discarded; use the existing
symbols copyToUpdate, dl.setStatusSymlinks, dl.client.Status().Update,
dl.currentSymlink, lvdl, blockDevice and ctx to locate and implement this fix.
- Around line 130-137: Replace the cached client read in the PV existence check
with the API reader: use dl.clientReader.Get(ctx, types.NamespacedName{Name:
pvName}, existingPV) instead of dl.client.Get(...) where existingPV is a
*corev1.PersistentVolume (same spot as the current block that calls
dl.client.Get). Mirror the pattern used in FindLVDL and, in findOrCreateLVDL,
add handling for an AlreadyExists race by treating conflicts from a concurrent
create as success (e.g., re-fetch the existing LVDL via clientReader.Get or call
FindLVDL) and continue instead of returning an error so the reconcile can
proceed.

In `@test/e2e/localvolume_test.go`:
- Around line 261-279: The helper eventuallyFindLVDLsForPVs currently only waits
for LVDL objects to exist; update it to also wait until each found
LocalVolumeDeviceLink has its status fields populated (check
lvdl.Status.CurrentLinkTarget, lvdl.Status.PreferredLinkTarget, and
lvdl.Status.ValidLinkTargets) before returning. Inside the matcher.Eventually
closure (in eventuallyFindLVDLsForPVs) after filtering by PersistentVolumeName,
fetch or use the listed lvdl items and ensure for every matched lvdl that
CurrentLinkTarget and PreferredLinkTarget are non-empty (and ValidLinkTargets is
non-nil/non-empty as appropriate); only return true when the count matches and
all status fields are populated so callers can rely on status fields like
CurrentLinkTarget, PreferredLinkTarget and ValidLinkTargets being present.

---

Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Around line 150-156: The client.Get call that looks up existingPV must not
treat all errors as "not found"; change the logic so that after calling
client.Get(ctx, types.NamespacedName{Name: pvName}, existingPV) you check for an
error and if it's non-nil and not apierrors.IsNotFound(err) (or the equivalent
IsNotFound helper) you return that error immediately instead of proceeding—only
continue with the existingPV.Status.Phase check when err == nil, and keep
returning ErrTryAgain when the PV exists and Phase == corev1.VolumeReleased.

In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 401-423: The early return when len(validBlockDevices) == 0
prevents reconciling ignoredDevices and skips
processRejectedDevicesForDeviceLinks; change the control flow in the reconcile
block so that when validBlockDevices is empty you do not return immediately but
still call r.findMatchingDisks(diskConfig, ignoredDevices) and then
r.processRejectedDevicesForDeviceLinks(ctx, ignoredButMatchingDeviceMap)
(preserving existing error logging for findMatchingDisks); only return after
those ignored-device checks complete (or after handling errors), ensuring
ignoredDevices are re-matched even when validBlockDevices has no entries.

---

Nitpick comments:
In `@pkg/common/provisioner_utils.go`:
- Around line 121-129: The guard that rejects objects with an empty
GroupVersionKind().Kind makes CreateLocalPV (and helpers that accept
LocalVolumeLikeObject) implicitly require callers to prepopulate TypeMeta;
change CreateLocalPV and the helper that reads
obj.GetObjectKind().GroupVersionKind().Kind to derive the owner kind from the
runtime.Scheme (or accept an explicit kind parameter) instead of failing when
Kind is empty: update the logic around meta.Accessor(obj)/obj.GetObjectKind() to
attempt scheme.ObjectKinds(obj) (or accept a new explicit kind argument on
CreateLocalPV/related helper) and only error if both derivation and the explicit
value are unavailable, keeping existing metadata checks (name/namespace) intact.

In `@pkg/diskmaker/controllers/lv/create_pv_test.go`:
- Around line 374-377: The test currently stubs internal.FilePathGlob to
unconditionally return fakeByIDLink, masking whether CreateLocalPV builds the
correct by-id glob; update the fake for internal.FilePathGlob to capture the
incoming pattern argument (e.g., store it in a local variable like
requestedByIDPattern) and return fakeByIDLink, then after calling CreateLocalPV
assert that requestedByIDPattern equals the expected by-id glob (constructed
from the test's KName/device-path inputs) so the test fails if CreateLocalPV
uses the wrong value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d1b8c5c7-deb5-46c4-9412-fac8c74a7158

📥 Commits

Reviewing files that changed from the base of the PR and between 9d54043 and 0a5f5ca.

⛔ Files ignored due to path filters (2)
  • vendor/k8s.io/utils/exec/testing/fake_exec.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (26)
  • HACKING.md
  • Makefile
  • api/v1/localvolumedevicelink_types.go
  • cmd/diskmaker-manager/manager.go
  • config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
  • pkg/common/provisioner_utils.go
  • pkg/controllers/localvolume/localvolume_controller.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/device_age_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/diskmaker/discovery/discovery_test.go
  • pkg/diskmaker/event_reporter.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/disk_location.go
  • pkg/internal/diskutil.go
  • pkg/internal/diskutil_test.go
  • test/e2e/hostudev.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
  • test/e2e/symlink_check.go
💤 Files with no reviewable changes (1)
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml

@gnufied gnufied force-pushed the implement-device-relinking branch from 0a5f5ca to 75afc62 Compare March 24, 2026 18:41
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/common/provisioner_utils.go (1)

132-156: ⚠️ Potential issue | 🟠 Major

Don't relink a device before the released-PV retry gate.

Lines 142-145 can recreate the host symlink and update LVDL state, and then Lines 152-155 immediately return ErrTryAgain when the old PV is still Released. That mutates the path a still-cleaning PV refers to before the retry.

Possible fix
 	deviceHandler := internal.NewDeviceLinkHandler(args.CurrentSymlink, client, args.ClientReader)
 	lvdl, err := deviceHandler.FindLVDL(ctx, pvName, namespace)
 	if err != nil && !apierrors.IsNotFound(err) {
 		return fmt.Errorf("error finding localvolumedevicelink object %s: %w", pvName, err)
 	}
 
-	// Symlink recreation must happen before PV creation because it fixes the
-	// symlink that the PV will reference. RecreateSymlinkIfNeeded already
-	// updates the LVDL status, so we skip ApplyStatus later.
-	requiresSymlinkRecreation := internal.HasMismatchingSymlink(lvdl)
-	if requiresSymlinkRecreation {
-		if _, err := deviceHandler.RecreateSymlinkIfNeeded(ctx, lvdl, symLinkPath, args.BlockDevice); err != nil {
-			return fmt.Errorf("error recreating symlink: %w", err)
-		}
-	}
-
 	// Do not attempt to create or update existing PV's that have been released
 	existingPV := &corev1.PersistentVolume{ObjectMeta: metav1.ObjectMeta{Name: pvName}}
 	err = client.Get(ctx, types.NamespacedName{Name: pvName}, existingPV)
 
 	if err == nil && existingPV.Status.Phase == corev1.VolumeReleased {
 		klog.InfoS("PV is still being cleaned, not going to recreate it", "pvName", pvName, "disk", deviceName)
 		// Caller should try again soon
 		return ErrTryAgain
 	}
+
+	// Symlink recreation must happen before PV creation because it fixes the
+	// symlink that the PV will reference. RecreateSymlinkIfNeeded already
+	// updates the LVDL status, so we skip ApplyStatus later.
+	requiresSymlinkRecreation := internal.HasMismatchingSymlink(lvdl)
+	if requiresSymlinkRecreation {
+		if _, err := deviceHandler.RecreateSymlinkIfNeeded(ctx, lvdl, symLinkPath, args.BlockDevice); err != nil {
+			return fmt.Errorf("error recreating symlink: %w", err)
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` around lines 132 - 156, The code currently
recreates the host symlink (internal.HasMismatchingSymlink +
deviceHandler.RecreateSymlinkIfNeeded) before checking whether an existing PV is
in Phase Released and returning ErrTryAgain, which can mutate the symlink for a
still-cleaning PV; move the released-PV gate so the client.Get +
existingPV.Status.Phase == corev1.VolumeReleased check (and early return
ErrTryAgain) happens before calling internal.HasMismatchingSymlink /
deviceHandler.RecreateSymlinkIfNeeded, or alternatively wrap the
RecreateSymlinkIfNeeded call with a guard that skips symlink updates when
existingPV is Released (use the existingPV and ErrTryAgain symbols to implement
the guard).
♻️ Duplicate comments (1)
test/e2e/localvolume_test.go (1)

261-285: ⚠️ Potential issue | 🟡 Minor

Wait for ValidLinkTargets before returning from this helper.

Callers at Lines 169-173 assert ValidLinkTargets immediately, but this loop only blocks on CurrentLinkTarget and PreferredLinkTarget. That keeps the same status-update race and makes the e2e path flaky.

Possible fix
-			if ok := pvNameSet.Has(lvdl.Spec.PersistentVolumeName); ok {
-				// Wait until status fields are populated by the status update
-				if lvdl.Status.CurrentLinkTarget == "" || lvdl.Status.PreferredLinkTarget == "" {
+			if ok := pvNameSet.Has(lvdl.Spec.PersistentVolumeName); ok {
+				// Wait until status fields are populated by the status update
+				if lvdl.Status.CurrentLinkTarget == "" ||
+					lvdl.Status.PreferredLinkTarget == "" ||
+					len(lvdl.Status.ValidLinkTargets) == 0 {
 					return false
 				}
 				foundLVDLs = append(foundLVDLs, lvdl)
 			}
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 414-423: The early-return fast path currently exits when
validBlockDevices is empty and occurs before calling r.findMatchingDisks and
r.processRejectedDevicesForDeviceLinks, which prevents the LVDL refresh/relink
path from running on the upgrade/in-use case; change the control flow in
reconcile.go so that you call r.findMatchingDisks(...) and then
r.processRejectedDevicesForDeviceLinks(...) before returning, or only return
early when both validBlockDevices and ignoredButMatchingDeviceMap are
empty—adjust the checks around validBlockDevices and the return so
processRejectedDevicesForDeviceLinks (and the ignoredButMatchingDeviceMap
handling) always runs when applicable.
- Around line 531-540: The code is using the desired symlink target from
getSymlinkSourceAndTarget() when creating the device link handler and computing
status, which hides rejected-device cases; instead, read and use the existing
symlink's current target (the actual resolved value you store in
dl.currentSymlink / existingSymlink) wherever you currently use the "source" or
desired target (e.g., when constructing deviceHandler via
internal.NewDeviceLinkHandler and when computing
lvdlName/Status.CurrentLinkTarget and calling ApplyStatus), so replace the use
of the desired source/target returned by getSymlinkSourceAndTarget() with the
existing symlink's current target (or call os.Readlink on the existing symlink
path if needed) before creating the handler or setting status so
HasMismatchingSymlink() can detect mismatches on later reconciles.

In `@pkg/diskmaker/controllers/lvset/reconcile.go`:
- Around line 288-307: The code currently reconstructs symlinkSourcePath from
GetSymLinkSourceAndTarget but then passes the symlink filename (existingSymlink)
into calls that rely on the current link target, which can be stale after
relinks; change those calls to use the actual symlink target returned by
GetSymLinkSourceAndTarget (symlinkSourcePath) or freshly read via os.Readlink
instead of the filename. Specifically, when calling
internal.NewDeviceLinkHandler.RecreateSymlinkIfNeeded and ApplyStatus (and any
logic that sets CurrentLinkTarget), pass symlinkSourcePath (or the result of
os.Readlink on filepath.Join(symLinkDir, existingSymlink)) as the current link
target rather than existingSymlink so CurrentLinkTarget reflects the real
target.

In `@pkg/internal/device_link_handler.go`:
- Around line 407-422: In setLVDLCondition, avoid always overwriting
LastTransitionTime: when you find an existing condition (in
LocalVolumeDeviceLink.Status.Conditions) compare the meaningful fields (Status,
Reason, Message — and/or ObservedGeneration if relevant) to the incoming
condition; if they are identical, preserve existingCondition.LastTransitionTime
on the new condition instead of setting it to metav1.Now(); if they differ (or
when appending a new condition) set condition.LastTransitionTime = metav1.Now()
before writing it back. Update the logic in setLVDLCondition to perform that
conditional assignment so repeated reconciles with the same condition do not
change LastTransitionTime.

---

Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Around line 132-156: The code currently recreates the host symlink
(internal.HasMismatchingSymlink + deviceHandler.RecreateSymlinkIfNeeded) before
checking whether an existing PV is in Phase Released and returning ErrTryAgain,
which can mutate the symlink for a still-cleaning PV; move the released-PV gate
so the client.Get + existingPV.Status.Phase == corev1.VolumeReleased check (and
early return ErrTryAgain) happens before calling internal.HasMismatchingSymlink
/ deviceHandler.RecreateSymlinkIfNeeded, or alternatively wrap the
RecreateSymlinkIfNeeded call with a guard that skips symlink updates when
existingPV is Released (use the existingPV and ErrTryAgain symbols to implement
the guard).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: afa02d89-e7aa-48c3-a403-d073b55920a3

📥 Commits

Reviewing files that changed from the base of the PR and between 0a5f5ca and 75afc62.

📒 Files selected for processing (18)
  • HACKING.md
  • Makefile
  • cmd/diskmaker-manager/manager.go
  • pkg/common/provisioner_utils.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/diskmaker/discovery/discovery_test.go
  • pkg/diskmaker/event_reporter.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • test/e2e/hostudev.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
  • test/e2e/symlink_check.go
✅ Files skipped from review due to trivial changes (3)
  • pkg/diskmaker/event_reporter.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/internal/device_link_handler_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • HACKING.md
  • pkg/diskmaker/discovery/discovery_test.go
  • test/e2e/symlink_check.go
  • test/e2e/localvolumeset_test.go

@gnufied gnufied force-pushed the implement-device-relinking branch from 75afc62 to 36f5ccd Compare March 26, 2026 13:20
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/diskmaker/controllers/lv/create_pv_test.go (1)

253-255: ⚠️ Potential issue | 🟠 Major

Use continue instead of return to avoid skipping remaining test cases.

When a test case with shouldErr: true passes its assertion, the function returns early, skipping all subsequent test cases in testTable. This means only test cases up to and including the first error case are executed.

🐛 Proposed fix
 		if tc.shouldErr {
-			return
+			continue
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/create_pv_test.go` around lines 253 - 255, In
the test loop over testTable in create_pv_test.go, the branch that checks
tc.shouldErr currently uses return which exits the whole test function and skips
remaining cases; change that return to continue so the loop proceeds to the next
test case (referencing tc.shouldErr and the testTable iteration, e.g., inside
the t.Run / loop that processes each test case).
pkg/common/provisioner_utils.go (1)

234-234: ⚠️ Potential issue | 🟡 Minor

Use the passed ctx instead of context.TODO().

The function accepts a context.Context parameter but line 234 uses context.TODO(). This should use ctx for consistent context propagation and proper cancellation support.

🔧 Proposed fix
-	opRes, err := controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func() error {
+	opRes, err := controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` at line 234, Replace the use of
context.TODO() in the controllerutil.CreateOrUpdate call with the function's
received ctx parameter so context propagation and cancellation are respected;
update the call site that currently reads
controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, ...) to
controllerutil.CreateOrUpdate(ctx, client, existingPV, ...) and ensure the ctx
variable is in scope for that function.
🧹 Nitpick comments (2)
pkg/internal/diskutil.go (1)

214-230: Consider clarifying that GetUncachedPathID mutates PathByID on success.

The method name suggests it bypasses the cache, but line 225 updates b.PathByID when a match is found. This side effect should be documented or the method renamed to better reflect its behavior (e.g., RefreshPathByID).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/internal/diskutil.go` around lines 214 - 230, GetUncachedPathID currently
mutates the receiver by setting b.PathByID when a match is found (via
b.findDeviceInSortedSymlink), which contradicts the "uncached" implication;
either document this side-effect in the method comment or rename the method
(e.g., RefreshPathByID or UpdatePathByID) to reflect mutation. Update the
GetUncachedPathID docstring to state that it will update b.PathByID on success,
or rename the function and adjust all callers to use the new name, ensuring
references to GetUncachedPathID, b.PathByID, and findDeviceInSortedSymlink are
updated consistently.
pkg/diskmaker/controllers/lv/create_pv_test.go (1)

26-46: Consider documenting the single-command limitation of fakeBlkidExecutor.

The FakeExec.CommandScript slice is consumed per command invocation. If CreateLocalPV invokes multiple external commands, subsequent calls won't use this mock. The test appears designed for scenarios with a single blkid call, but this behavior should be understood by future maintainers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/create_pv_test.go` around lines 26 - 46, The
fakeBlkidExecutor helper assumes a single external command invocation because it
sets testingexec.FakeExec.CommandScript with one FakeCommandAction; add a brief
doc comment to fakeBlkidExecutor explaining that FakeExec.CommandScript entries
are consumed per call so this mock only supports a single blkid invocation (or
that tests must supply a CommandScript with multiple actions when CreateLocalPV
runs multiple external commands), and optionally mention how to extend it (add
more FakeCommandAction entries or handle multiple calls) for future maintainers;
reference fakeBlkidExecutor and testingexec.FakeExec.CommandScript in the
comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hack/test-e2e.sh`:
- Around line 40-43: The current case branch for '--suite=*' assigns
suite="${1#*=}" but allows an empty value which causes a full-suite run; change
the branch so after the assignment you validate suite is non-empty (test [ -z
"$suite" ]), and if empty print a clear error to stderr and exit non-zero,
otherwise proceed with shift; update the '--suite=*' case handling around the
suite variable and shift call accordingly.

In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 676-680: The log statement in the error-handling block after
processing a LocalVolumeDeviceLink uses the wrong variable: replace the
stale/incorrect `err` passed into `klog.ErrorS` with the actual `lvdlError` so
the real processing error is logged; specifically update the block handling `if
lvdlError != nil` (which calls `r.eventSync.Report` and `newDiskEvent`) to call
`klog.ErrorS(lvdlError, "error updating LocalVolumeDeviceLink", "device",
blockDevice.Name)` so the correct error is reported.

In `@pkg/diskmaker/controllers/lvset/reconcile.go`:
- Around line 368-373: Fix the typo in the error message created for lvdl
processing: change the fmt.Errorf call that constructs msg (currently "failed to
prcoess lvdl %w") to use the correct wording "failed to process lvdl %w"; ensure
the corrected msg is passed to r.eventReporter.Report and retains the same
parameters (newDiskEvent(diskmaker.FailedLVDLProcessing, msg.Error(),
blockDevice.KName, corev1.EventTypeWarning)) so logging and event reporting
behavior for lvdlError remains unchanged.

In `@test/e2e/symlink_check.go`:
- Around line 297-303: The cleanup closure captures the loop variable step by
reference so every appended cleanup will use the final step value; fix this in
the loop that appends to cleanups by creating a new local variable (e.g., s :=
step) and use that local (s) inside the cleanupFn.fn so each closure calls
removeUdevSymlink with the correct target; ensure the referenced symbols are
cleanups, cleanupFn, removeUdevSymlink, and step (and preserve nodeHostName and
ctx usage).

---

Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Line 234: Replace the use of context.TODO() in the
controllerutil.CreateOrUpdate call with the function's received ctx parameter so
context propagation and cancellation are respected; update the call site that
currently reads controllerutil.CreateOrUpdate(context.TODO(), client,
existingPV, ...) to controllerutil.CreateOrUpdate(ctx, client, existingPV, ...)
and ensure the ctx variable is in scope for that function.

In `@pkg/diskmaker/controllers/lv/create_pv_test.go`:
- Around line 253-255: In the test loop over testTable in create_pv_test.go, the
branch that checks tc.shouldErr currently uses return which exits the whole test
function and skips remaining cases; change that return to continue so the loop
proceeds to the next test case (referencing tc.shouldErr and the testTable
iteration, e.g., inside the t.Run / loop that processes each test case).

---

Nitpick comments:
In `@pkg/diskmaker/controllers/lv/create_pv_test.go`:
- Around line 26-46: The fakeBlkidExecutor helper assumes a single external
command invocation because it sets testingexec.FakeExec.CommandScript with one
FakeCommandAction; add a brief doc comment to fakeBlkidExecutor explaining that
FakeExec.CommandScript entries are consumed per call so this mock only supports
a single blkid invocation (or that tests must supply a CommandScript with
multiple actions when CreateLocalPV runs multiple external commands), and
optionally mention how to extend it (add more FakeCommandAction entries or
handle multiple calls) for future maintainers; reference fakeBlkidExecutor and
testingexec.FakeExec.CommandScript in the comment.

In `@pkg/internal/diskutil.go`:
- Around line 214-230: GetUncachedPathID currently mutates the receiver by
setting b.PathByID when a match is found (via b.findDeviceInSortedSymlink),
which contradicts the "uncached" implication; either document this side-effect
in the method comment or rename the method (e.g., RefreshPathByID or
UpdatePathByID) to reflect mutation. Update the GetUncachedPathID docstring to
state that it will update b.PathByID on success, or rename the function and
adjust all callers to use the new name, ensuring references to
GetUncachedPathID, b.PathByID, and findDeviceInSortedSymlink are updated
consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2886048d-e5a3-4bc2-a031-5f7deb5ed60e

📥 Commits

Reviewing files that changed from the base of the PR and between 75afc62 and 36f5ccd.

📒 Files selected for processing (30)
  • HACKING.md
  • Makefile
  • cmd/diskmaker-manager/manager.go
  • hack/test-e2e.sh
  • pkg/common/provisioner_utils.go
  • pkg/common/symlink_utils.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/event_reporter.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/event_reporter.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/diskmaker/discovery/discovery.go
  • pkg/diskmaker/discovery/discovery_test.go
  • pkg/diskmaker/event_reporter.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/disk_location.go
  • pkg/internal/diskutil.go
  • pkg/internal/diskutil_test.go
  • test/e2e/aws_disks.go
  • test/e2e/cleanup_helper.go
  • test/e2e/gomega_util.go
  • test/e2e/hostudev.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
  • test/e2e/main_test.go
  • test/e2e/symlink_check.go
💤 Files with no reviewable changes (1)
  • test/e2e/main_test.go
✅ Files skipped from review due to trivial changes (6)
  • HACKING.md
  • pkg/diskmaker/event_reporter.go
  • pkg/diskmaker/controllers/lvset/event_reporter.go
  • pkg/diskmaker/controllers/lv/event_reporter.go
  • test/e2e/hostudev.go
  • test/e2e/localvolumeset_test.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • cmd/diskmaker-manager/manager.go
  • Makefile
  • pkg/internal/disk_location.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/internal/diskutil_test.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • test/e2e/localvolume_test.go
  • pkg/internal/device_link_handler.go
  • pkg/diskmaker/discovery/discovery_test.go

Add code change default symlinks if customer asks
@gnufied gnufied force-pushed the implement-device-relinking branch from 738157b to abe7a11 Compare March 26, 2026 13:48
@gnufied gnufied changed the title Implement device relinking STOR-2868: Implement device relinking Mar 26, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 26, 2026
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 26, 2026

@gnufied: This pull request references STOR-2868 which is a valid jira issue.

Details

In response to this:

Fixes https://redhat.atlassian.net/browse/STOR-2868

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
test/e2e/symlink_check.go (1)

156-163: Resolve the LVDL by spec.persistentVolumeName, not object name.

LocalVolumeDeviceLinkSpec.PersistentVolumeName is the API field that binds an LVDL to a PV. Looking it up by metadata.name makes these helpers brittle to harmless naming changes, and eventuallyFindLVDLsForPVs(...) already uses the spec-based match.

As per coding guidelines, Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/symlink_check.go` around lines 156 - 163, The helper
eventuallyGetLVDL currently fetches an LVDL by metadata.name; change it to
resolve by LocalVolumeDeviceLink.Spec.PersistentVolumeName instead: in
eventuallyGetLVDL (and any callers like eventuallyFindLVDLsForPVs), list
LocalVolumeDeviceLink objects in the target namespace (use
localv1.LocalVolumeDeviceLinkList and f.Client.List) inside the
gomega.Eventually loop, iterate the list to find an item whose
Spec.PersistentVolumeName equals the provided PV name, and return that object
once found; keep the timeout/retry behavior the same and adjust the parameter
name to indicate it expects a persistentVolumeName if needed.
pkg/diskmaker/controllers/lv/reconcile_test.go (1)

623-729: Exercise Reconcile in this regression test.

This test bypasses Reconcile and calls processRejectedDevicesForDeviceLinks directly, so it will still pass if the no-valid-device control flow in Reconcile regresses again. Driving Reconcile with only ignored devices would lock in the branch this PR is actually protecting.

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/reconcile_test.go` around lines 623 - 729, The
test currently calls processRejectedDevicesForDeviceLinks directly; instead
exercise the Reconcile code path by invoking r.Reconcile with a
reconcile.Request for the LocalVolumeDeviceLink under test (use pvName and
testNamespace), e.g. build a reconcile.Request with types.NamespacedName{Name:
pvName, Namespace: testNamespace} and call r.Reconcile(context.TODO(), req),
assert no error, then fetch and validate the LocalVolumeDeviceLink status as
before; keep setup (ignoredDevices, diskConfig, fake client, runtimeConfig)
intact so Reconcile runs the same logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 666-675: The FindLVDL error handling should not treat transient
API/RBAC errors as "missing"—in the lv reconcile loop, check the error from
deviceHandler.FindLVDL and if err != nil and !apierrors.IsNotFound(err) return
or continue processing the next item (do not fall through), so transient lookup
failures are retried rather than letting lvdl stay nil. Also guard
internal.HasMismatchingSymlink with a nil-check on lvdl (or only call it when
lvdl != nil), and only call deviceHandler.ApplyStatus when you intentionally
want bootstrap semantics; otherwise require an explicit create path or a
bootstrap flag before delegating to ApplyStatus (which calls findOrCreateLVDL).
Update the logic around deviceHandler.FindLVDL, internal.HasMismatchingSymlink,
deviceHandler.RecreateSymlinkIfNeeded and deviceHandler.ApplyStatus (and
document that ApplyStatus invokes findOrCreateLVDL) so lookup errors, symlink
reconciliation, and implicit LVDL creation are handled and documented
explicitly.

In `@test/e2e/symlink_check.go`:
- Around line 85-92: The symlink helper registers cleanup only after assertions,
so failures leave symlinks on the node; modify addNewUdevSymlink (and the
similar helpers at the other locations) to register cleanup immediately when the
symlink is created by either calling t.Cleanup(func(){ removeUdevSymlink(t, ctx,
nodeHostName, newPreferredTarget) }) inside addNewUdevSymlink or by appending
the cleanupFn (name/removeUdevSymlink) to the shared cleanup list from within
the helper right after creation, ensuring removeUdevSymlink is guaranteed to run
even if subsequent Expect/Eventually fails.

---

Nitpick comments:
In `@pkg/diskmaker/controllers/lv/reconcile_test.go`:
- Around line 623-729: The test currently calls
processRejectedDevicesForDeviceLinks directly; instead exercise the Reconcile
code path by invoking r.Reconcile with a reconcile.Request for the
LocalVolumeDeviceLink under test (use pvName and testNamespace), e.g. build a
reconcile.Request with types.NamespacedName{Name: pvName, Namespace:
testNamespace} and call r.Reconcile(context.TODO(), req), assert no error, then
fetch and validate the LocalVolumeDeviceLink status as before; keep setup
(ignoredDevices, diskConfig, fake client, runtimeConfig) intact so Reconcile
runs the same logic.

In `@test/e2e/symlink_check.go`:
- Around line 156-163: The helper eventuallyGetLVDL currently fetches an LVDL by
metadata.name; change it to resolve by
LocalVolumeDeviceLink.Spec.PersistentVolumeName instead: in eventuallyGetLVDL
(and any callers like eventuallyFindLVDLsForPVs), list LocalVolumeDeviceLink
objects in the target namespace (use localv1.LocalVolumeDeviceLinkList and
f.Client.List) inside the gomega.Eventually loop, iterate the list to find an
item whose Spec.PersistentVolumeName equals the provided PV name, and return
that object once found; keep the timeout/retry behavior the same and adjust the
parameter name to indicate it expects a persistentVolumeName if needed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b96b00ae-c08f-4344-a43b-df79393881f4

📥 Commits

Reviewing files that changed from the base of the PR and between 36f5ccd and 738157b.

📒 Files selected for processing (4)
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • test/e2e/localvolumeset_test.go
  • test/e2e/symlink_check.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/common/provisioner_utils.go (1)

234-234: ⚠️ Potential issue | 🟡 Minor

Use the provided ctx instead of context.TODO().

The function receives a ctx parameter but uses context.TODO() for CreateOrUpdate. This inconsistency could cause issues if the caller's context carries cancellation signals or deadlines.

Proposed fix
-	opRes, err := controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func() error {
+	opRes, err := controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` at line 234, Replace the use of
context.TODO() with the provided ctx parameter in the call to
controllerutil.CreateOrUpdate within the function. This ensures the function
properly respects cancellation signals and deadlines from the caller's context
by consistently using the passed-in ctx.
♻️ Duplicate comments (2)
test/e2e/symlink_check.go (2)

85-92: ⚠️ Potential issue | 🟠 Major

Register symlink cleanup immediately after creation to avoid orphaned symlinks on test failure.

If any assertion fails between addNewUdevSymlink (line 85) and the caller appending the returned cleanupFn to the shared cleanup list, the synthetic symlink remains on the node. This can poison subsequent e2e runs.

Consider using t.Cleanup() inside addNewUdevSymlink or restructuring so cleanup is registered before assertions run.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/symlink_check.go` around lines 85 - 92, The symlink created by
addNewUdevSymlink can be left behind if assertions fail before the returned
cleanupFn is appended, so register cleanup immediately: modify addNewUdevSymlink
to call t.Cleanup(func(){ removeUdevSymlink(t, ctx, nodeHostName,
newPreferredTarget) }) right after creating the symlink (or alternatively return
the cleanup and ensure the caller registers it with t.Cleanup before performing
assertions); this ensures the removeUdevSymlink cleanup is always invoked even
on test failures and prevents orphaned symlinks.

269-277: ⚠️ Potential issue | 🟠 Major

Same cleanup timing concern applies to step 1 symlink creation.

The step 1 symlink (line 270) has the same issue - if assertions fail after addNewUdevSymlink but before the function returns cleanups to the caller, the symlink is orphaned.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/symlink_check.go` around lines 269 - 277, The step 1 symlink is
created by addNewUdevSymlink but its cleanup is only appended to cleanups
afterwards, so a test failure between creation and append will orphan the
symlink; move the cleanup registration for steps[0].target so it is appended to
the cleanups slice before calling addNewUdevSymlink (or use t.Cleanup) to
guarantee removeUdevSymlink will run even on early failures; refer to
addNewUdevSymlink, cleanups, cleanupFn, removeUdevSymlink and steps[0].target
when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Line 234: Replace the use of context.TODO() with the provided ctx parameter in
the call to controllerutil.CreateOrUpdate within the function. This ensures the
function properly respects cancellation signals and deadlines from the caller's
context by consistently using the passed-in ctx.

---

Duplicate comments:
In `@test/e2e/symlink_check.go`:
- Around line 85-92: The symlink created by addNewUdevSymlink can be left behind
if assertions fail before the returned cleanupFn is appended, so register
cleanup immediately: modify addNewUdevSymlink to call t.Cleanup(func(){
removeUdevSymlink(t, ctx, nodeHostName, newPreferredTarget) }) right after
creating the symlink (or alternatively return the cleanup and ensure the caller
registers it with t.Cleanup before performing assertions); this ensures the
removeUdevSymlink cleanup is always invoked even on test failures and prevents
orphaned symlinks.
- Around line 269-277: The step 1 symlink is created by addNewUdevSymlink but
its cleanup is only appended to cleanups afterwards, so a test failure between
creation and append will orphan the symlink; move the cleanup registration for
steps[0].target so it is appended to the cleanups slice before calling
addNewUdevSymlink (or use t.Cleanup) to guarantee removeUdevSymlink will run
even on early failures; refer to addNewUdevSymlink, cleanups, cleanupFn,
removeUdevSymlink and steps[0].target when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1bbbdaef-641f-44b8-96b4-cba10bc5fdd8

📥 Commits

Reviewing files that changed from the base of the PR and between 738157b and abe7a11.

📒 Files selected for processing (30)
  • HACKING.md
  • Makefile
  • cmd/diskmaker-manager/manager.go
  • hack/test-e2e.sh
  • pkg/common/provisioner_utils.go
  • pkg/common/symlink_utils.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/event_reporter.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/event_reporter.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/diskmaker/discovery/discovery.go
  • pkg/diskmaker/discovery/discovery_test.go
  • pkg/diskmaker/event_reporter.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/disk_location.go
  • pkg/internal/diskutil.go
  • pkg/internal/diskutil_test.go
  • test/e2e/aws_disks.go
  • test/e2e/cleanup_helper.go
  • test/e2e/gomega_util.go
  • test/e2e/hostudev.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
  • test/e2e/main_test.go
  • test/e2e/symlink_check.go
💤 Files with no reviewable changes (1)
  • test/e2e/main_test.go
✅ Files skipped from review due to trivial changes (8)
  • pkg/diskmaker/controllers/lvset/event_reporter.go
  • pkg/diskmaker/event_reporter.go
  • pkg/diskmaker/controllers/lv/event_reporter.go
  • pkg/internal/disk_location.go
  • test/e2e/aws_disks.go
  • HACKING.md
  • pkg/internal/device_link_handler_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • pkg/diskmaker/discovery/discovery.go
  • Makefile
  • test/e2e/hostudev.go
  • hack/test-e2e.sh
  • pkg/diskmaker/discovery/discovery_test.go
  • cmd/diskmaker-manager/manager.go
  • test/e2e/gomega_util.go
  • test/e2e/cleanup_helper.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/internal/device_link_handler.go

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 26, 2026

@gnufied: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants