Skip to content

STOR-2870: Implement creation of LocalVolumeDeviceLink objects#592

Open
gnufied wants to merge 6 commits intoopenshift:mainfrom
gnufied:implement-lvdl-creation
Open

STOR-2870: Implement creation of LocalVolumeDeviceLink objects#592
gnufied wants to merge 6 commits intoopenshift:mainfrom
gnufied:implement-lvdl-creation

Conversation

@gnufied
Copy link
Member

@gnufied gnufied commented Mar 13, 2026

Fixes https://issues.redhat.com/browse/STOR-2870

Created with: AI assistance for tests.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 13, 2026
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 13, 2026

@gnufied: This pull request references STOR-2870 which is a valid jira issue.

Details

In response to this:

Fixes https://issues.redhat.com/browse/STOR-2870

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from dfajmon and dobsonj March 13, 2026 19:14
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 13, 2026
// +optional
// +listType=set
// +kubebuilder:validation:MaxItems=256
ValidLinkTargets []string `json:"validLinkTargets,omitempty"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to relax these validations, because I realized that in some environments, we may not get any preferredLink targers in /dev/disk/by-id at all... So rather than not creating any objects in those environments, we still create LVDL object

@dfajmon
Copy link
Contributor

dfajmon commented Mar 16, 2026

@coderabbitai start review process

Comment on lines +82 to 84
// +optional
// +kubebuilder:validation:MaxLength=4096
PreferredLinkTarget string `json:"preferredLinkTarget,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be a pointer, if it's allowed to be empty?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to - https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md

"In custom resource based APIs specifically, we advise to avoid making fields pointers unless there is an absolute need to do so. An absolute need being the need to distinguish between the zero value and an unset value", it should be fine to keep it optional and still a string type i think.

"paths", noMatch, "directory", symLinkDir)
}

r.processRejectedDevicesForDeviceLinks(ctx, lvset, rejectedDevices, symLinkDir, storageClassName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no error handling and exponential backoff? Add a comment that this is called periodically.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error handling, I can add. we can add some events. But I am unsure about exponential backoff. This reconciler loop runs every 1 minute, if we were to backoff on creation of individual LVDL objects, we may have to add separate queue. I am not sure if complexity is worth it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably fine to retry once per minute. There should be a comment about that.

}

if len(ignoredButMatchingDeviceMap) != 0 {
r.processRejectedDevicesForDeviceLinks(ctx, ignoredButMatchingDeviceMap)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no error handling and exponential backoff? Add a comment that this is called periodically.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added error handling. same comment about exponential backoff.

Comment on lines +206 to +209
if isNilOwnerObject(ownerObj) {
klog.Warningf("missing lvdl object %s during status update, but owner is nil; skipping creation for device: %s", pvName, devicePath)
return nil, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to understand when is this function called with nil owner. Please add a comment about that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be pretty rare/impossible for owner to be nil. May be when LV or LVSet is being deleted? The code here merely guards against such a scenario

},
existing: &v1.LocalVolumeDeviceLink{
ObjectMeta: metav1.ObjectMeta{Name: "local-pv-statustest", Namespace: "default"},
Spec: v1.LocalVolumeDeviceLinkSpec{PersistentVolumeName: "local-pv-statustest"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a case where LSO actually updates existing PreferredLinkTarget from say /dev/disk/by-id/wwn-current to /dev/disk/by-id/wwn-preferred. All existing unit test have the status empty when calling UpdateStatus.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I agree, but I am not sure if device_handler is the right place for the test right? It has to be some higher level controller that should have the test, because all device_handler code does is - updates whatever was supplied to it.

@jsafrane
Copy link
Contributor

https://github.com/coderabbitai review this please

@jsafrane
Copy link
Contributor

@coderabbitai review this please

@coderabbitai
Copy link

coderabbitai bot commented Mar 17, 2026

@jsafrane Sure, I'll review this PR now!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/common/provisioner_utils.go (1)

222-222: ⚠️ Potential issue | 🟡 Minor

Use passed ctx instead of context.TODO().

The function signature now accepts ctx context.Context, but controllerutil.CreateOrUpdate still uses context.TODO(). This should be updated to use ctx for consistent context propagation and proper cancellation/timeout handling.

Proposed fix
-	opRes, err := controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func() error {
+	opRes, err := controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` at line 222, The call to
controllerutil.CreateOrUpdate is using context.TODO() instead of the function's
passed ctx; update the call site so controllerutil.CreateOrUpdate(ctx, client,
existingPV, func() error { ... }) uses the existing ctx parameter to ensure
proper cancellation and timeout propagation (look for
controllerutil.CreateOrUpdate, existingPV and opRes in this function).
🧹 Nitpick comments (1)
pkg/diskmaker/controllers/lvset/reconcile.go (1)

471-477: TODO comment indicates potential over-strict error handling.

The comment notes this may not be a fatal error, but the code currently returns an error if GetPathByID fails. For devices without stable by-id paths (common in some virtualized environments), this could unnecessarily block PV provisioning. Consider deferring this fix to a follow-up if it's not blocking the current use cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lvset/reconcile.go` around lines 471 - 477, The
current code treats a failure from dev.GetPathByID("") as fatal and returns the
error; change this to non-fatal handling so devices without by-id paths don't
block reconciliation: in the reconcile logic around dev.GetPathByID("") (the
symbol preferredSymLink and its err), catch the error, log it at warning/verbose
level with context (include disk/devLabelPath and the error) and set
preferredSymLink to an empty string or nil so processing continues; only return
an error if subsequent logic actually requires a by-id path (leave a TODO
comment referencing this change so it can be revisited), ensuring the function
does not return early on GetPathByID failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@HACKING.md`:
- Around line 9-12: Update the section title and steps to accurately describe
that deleting the Subscription only stops future upgrades (OLM will continue to
run the currently installed CSV), and instruct to edit the CSV to override
either the operator image or the diskmaker image as needed; specifically mention
the Subscription and CSV objects by name, clarify that you must edit the CSV's
container image(s) (operator image or diskmaker image) to apply the override,
and note that if you need to revert OLM entirely you must also remove the
installed CSV or recreate a replacement CSV.

In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 521-525: The code currently calls
getSymlinkSourceAndTarget(diskLocation, symLinkDirPath) and uses break on error,
which aborts processing all remaining devices in the storage class; change the
error handling to use continue instead so a failed device is skipped but the
loop keeps processing other devices (replace the break in the err block with
continue and preserve the klog.ErrorS call). Ensure this change is made in the
loop that iterates devices where diskLocation and symLinkDirPath are used.

In `@pkg/internal/device_link_handler.go`:
- Around line 56-70: The early return when existing.Spec.PersistentVolumeName ==
pvName skips reconciling metadata.ownerReferences on preexisting
LocalVolumeDeviceLink objects; instead, detect when the PV name matches and
still reconcile ownerReferences on the existing object (compare desired
ownerRefs vs existing.ObjectMeta.OwnerReferences), and if they differ create a
deep copy (existing.DeepCopy()), set the updated OwnerReferences, and call
dl.client.Update(ctx, copied) (same error handling as the create/update path).
Ensure you reference and reuse the same ownerRefs computation used for new
objects so owner-based cleanup is applied to preexisting LocalVolumeDeviceLink
instances.

---

Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Line 222: The call to controllerutil.CreateOrUpdate is using context.TODO()
instead of the function's passed ctx; update the call site so
controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error { ... })
uses the existing ctx parameter to ensure proper cancellation and timeout
propagation (look for controllerutil.CreateOrUpdate, existingPV and opRes in
this function).

---

Nitpick comments:
In `@pkg/diskmaker/controllers/lvset/reconcile.go`:
- Around line 471-477: The current code treats a failure from
dev.GetPathByID("") as fatal and returns the error; change this to non-fatal
handling so devices without by-id paths don't block reconciliation: in the
reconcile logic around dev.GetPathByID("") (the symbol preferredSymLink and its
err), catch the error, log it at warning/verbose level with context (include
disk/devLabelPath and the error) and set preferredSymLink to an empty string or
nil so processing continues; only return an error if subsequent logic actually
requires a by-id path (leave a TODO comment referencing this change so it can be
revisited), ensuring the function does not return early on GetPathByID failures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eb0d32cc-ef07-489c-95ad-bf2d493c7701

📥 Commits

Reviewing files that changed from the base of the PR and between b6ff9d4 and f46ac96.

📒 Files selected for processing (20)
  • HACKING.md
  • api/v1/localvolumedevicelink_types.go
  • config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
  • pkg/common/provisioner_utils.go
  • pkg/common/symlink_utils.go
  • pkg/controllers/localvolume/localvolume_controller.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/device_age_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/disk_location.go
  • pkg/internal/diskutil.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
💤 Files with no reviewable changes (1)
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml

Comment on lines +56 to +70
// Keep user-configured policy and only reconcile the pv name.
if existing.Spec.PersistentVolumeName == pvName {
return existing, nil
}

klog.Infof("updating lvdl object: %s", pvName)

existingCopy := existing.DeepCopy()
existingCopy.Spec.PersistentVolumeName = pvName

err = dl.client.Update(ctx, existingCopy)
if err != nil {
return nil, fmt.Errorf("error updating localvolumedevicelink object: %w", err)
}
return existingCopy, nil
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reconcile owner refs on existing LVDLs too.

The early return on Line 57 preserves user policy, but it also leaves any preexisting LocalVolumeDeviceLink without refreshed metadata.ownerReferences. That means manually precreated objects, or objects carried over from earlier revisions, won't participate in owner-based cleanup even after this PR adds owner refs for newly created ones.

🛠️ Proposed fix
-	// Keep user-configured policy and only reconcile the pv name.
-	if existing.Spec.PersistentVolumeName == pvName {
-		return existing, nil
-	}
-
-	klog.Infof("updating lvdl object: %s", pvName)
-
-	existingCopy := existing.DeepCopy()
-	existingCopy.Spec.PersistentVolumeName = pvName
+	existingCopy := existing.DeepCopy()
+	needsUpdate := false
+	if existingCopy.Spec.PersistentVolumeName != pvName {
+		existingCopy.Spec.PersistentVolumeName = pvName
+		needsUpdate = true
+	}
+	if !isNilOwnerObject(ownerObj) {
+		desiredOwnerRefs, err := buildOwnerRefs(ownerObj)
+		if err != nil {
+			return nil, err
+		}
+		if !equality.Semantic.DeepEqual(existingCopy.OwnerReferences, desiredOwnerRefs) {
+			existingCopy.OwnerReferences = desiredOwnerRefs
+			needsUpdate = true
+		}
+	}
+	if !needsUpdate {
+		return existing, nil
+	}
+
+	klog.Infof("updating lvdl object: %s", pvName)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/internal/device_link_handler.go` around lines 56 - 70, The early return
when existing.Spec.PersistentVolumeName == pvName skips reconciling
metadata.ownerReferences on preexisting LocalVolumeDeviceLink objects; instead,
detect when the PV name matches and still reconcile ownerReferences on the
existing object (compare desired ownerRefs vs
existing.ObjectMeta.OwnerReferences), and if they differ create a deep copy
(existing.DeepCopy()), set the updated OwnerReferences, and call
dl.client.Update(ctx, copied) (same error handling as the create/update path).
Ensure you reference and reuse the same ownerRefs computation used for new
objects so owner-based cleanup is applied to preexisting LocalVolumeDeviceLink
instances.

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 18, 2026

@gnufied: This pull request references STOR-2870 which is a valid jira issue.

Details

In response to this:

Fixes https://issues.redhat.com/browse/STOR-2870

Created with: AI assistance for tests.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jsafrane
Copy link
Contributor

@CodeRabbit please cross-check this PR with the enhancement: https://github.com/openshift/enhancements/blob/master/enhancements/storage/lso-recreate-symlinks.md
What's missing in the implementation? What's not mentioned in the KEP?

@coderabbitai
Copy link

coderabbitai bot commented Mar 18, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

Let me fetch the enhancement document and then compare it against the PR implementation.


🏁 Script executed:

#!/bin/bash
curl -sL "https://raw.githubusercontent.com/openshift/enhancements/master/enhancements/storage/lso-recreate-symlinks.md"

Unrecoverable error: Request timed out after 300000ms (requestId=8ceba741-de8a-4825-9ffd-b7a5f3b61db9)


Let me try fetching the enhancement doc via the GitHub CLI instead.


🏁 Script executed:

#!/bin/bash
gh api repos/openshift/enhancements/contents/enhancements/storage/lso-recreate-symlinks.md --jq '.content' | base64 -d

Length of output: 75


Let me try another approach to fetch the document.


🏁 Script executed:

#!/bin/bash
wget -q -O - --timeout=30 "https://raw.githubusercontent.com/openshift/enhancements/refs/heads/master/enhancements/storage/lso-recreate-symlinks.md" 2>&1 || echo "wget failed"
# also try with curl short timeout
curl -sL --max-time 30 "https://raw.githubusercontent.com/openshift/enhancements/refs/heads/master/enhancements/storage/lso-recreate-symlinks.md" 2>&1 || echo "curl failed"

Length of output: 16680


@jsafrane Here is a detailed cross-check between this PR and the lso-recreate-symlinks enhancement:


✅ What the PR implements (matching the KEP)

KEP Requirement Status
Create LocalVolumeDeviceLink object per PV from diskmaker ✅ Done (DeviceLinkHandler.Create)
status.currentLinkTarget, status.preferredLinkTarget, status.validLinkTargets, status.filesystemUUID populated ✅ Done (DeviceLinkHandler.UpdateStatus)
Owner reference set to LocalVolume or LocalVolumeSet ✅ Done (buildOwnerRefs)
LVDL deleted when owner object is deleted (via owner ref GC) ✅ Done
PV deleted/recreated but LVDL policy persists ✅ Done (PR commit message covers this)
Basic unit tests and e2e tests for LVDL creation/status/deletion ✅ Done

❌ What's missing (KEP requires, PR doesn't implement)

1. DefaultDeviceLinkPolicy on LocalVolume and LocalVolumeSet

The KEP specifies adding DefaultDeviceLinkPolicy LocalVolumeDeviceLinkPolicy to both LocalVolumeSpec and LocalVolumeSetSpec. Neither the AI summary nor the diff for these types shows this field being added.

2. Alert logic (two conditions)

The KEP specifies LSO should fire an alert when:

  • spec.policy == None AND status.currentLinkTarget != status.preferredLinkTarget
  • spec.policy == None AND no by-id symlink is found at all

No PrometheusRule or alerting code is introduced in this PR.

3. Symlink recreation when spec.policy == PreferredLinkTarget

This is the core feature of the enhancement. When an administrator sets spec.policy = PreferredLinkTarget, diskmaker should recreate the /mnt/local-storage/<sc>/<name> symlink to point to the new status.preferredLinkTarget. The PR only creates and populates LVDL objects — it does not act on the policy field.

4. Failure condition on symlink recreation error

The KEP states: "diskmaker will add a failure condition and retry as part of the reconcile loop" if symlink recreation fails. The status.Conditions []operatorv1.OperatorCondition field exists in the type, but no condition-setting logic is implemented.

5. Udev event triggered reconcile with rate limiting

The KEP states reconcile should be triggered by udev disk changes as an indication that by-id symlinks may have changed, with rate limiting. No udev-triggered reconcile or rate limiter is introduced in this PR.

6. E2e tests for policy-driven scenarios

The KEP's test plan calls for e2e tests covering:

  • Alert fires when policy == None
  • Symlinks fixed automatically when policy == PreferredLinkTarget
  • No alert and no change when policy == CurrentLinkTarget

The PR's e2e tests only cover LVDL creation, status field population, and deletion.


🆕 What's in the PR but not described in the KEP

PR Addition Notes
processRejectedDevicesForDeviceLinks Creates LVDL entries for devices that matched but were rejected by device filters. KEP only mentions LVDL creation for provisioned PVs.
CreateLocalPVArgs struct refactoring Implementation detail, not API-level.
GetSymlinkedForCurrentSC utility Internal helper, not described.
Exported internal.DiskLocation struct Refactoring to support the new code paths.

Summary

This PR is clearly scoped to phase 1 — LVDL object creation and status population. The following significant KEP requirements remain unimplemented:

  1. DefaultDeviceLinkPolicy on LocalVolume / LocalVolumeSet
  2. Alert rules (two conditions)
  3. Symlink recreation when policy == PreferredLinkTarget (the headline feature)
  4. Udev-triggered reconcile with rate limiting
  5. E2e tests for policy-driven scenarios

@jsafrane
Copy link
Contributor

@CodeRabbit, does the PR do anything wrong, against what was agreed in the KEP?

@coderabbitai
Copy link

coderabbitai bot commented Mar 18, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

Let me re-fetch the KEP and cross-check specific implementation details for correctness.


🏁 Script executed:

#!/bin/bash
curl -sL --max-time 30 "https://raw.githubusercontent.com/openshift/enhancements/refs/heads/master/enhancements/storage/lso-recreate-symlinks.md" 2>&1

Length of output: 16625


🏁 Script executed:

#!/bin/bash
# Check the device_link_handler to see what default policy is set
cat pkg/internal/device_link_handler.go

Length of output: 7882


🏁 Script executed:

#!/bin/bash
# Check the LVDL types to understand spec fields
cat api/v1/localvolumedevicelink_types.go

Length of output: 5914


🏁 Script executed:

#!/bin/bash
# Check how rejected devices LVDL creation works
grep -n "processRejectedDevicesForDeviceLinks\|rejectedDevices" pkg/diskmaker/controllers/lv/reconcile.go pkg/diskmaker/controllers/lvset/reconcile.go

Length of output: 1376


🏁 Script executed:

#!/bin/bash
# Check what policy is set during LVDL generation and if DefaultDeviceLinkPolicy is used
grep -n "policy\|Policy\|DefaultDevice" pkg/internal/device_link_handler.go pkg/diskmaker/controllers/lv/reconcile.go pkg/diskmaker/controllers/lvset/reconcile.go api/v1/localvolume_types.go api/v1alpha1/localvolumeset_types.go 2>/dev/null | head -80

Length of output: 191


@jsafrane Here is a focused analysis of what the PR implements incorrectly or contrary to the KEP, rather than what's merely missing:


1. 🔴 LVDLs created for rejected devices — contradicts KEP

The KEP states:

"LSO will create new LocalVolumeDeviceLink objects for each PV created by LSO's diskmaker."

The PR introduces processRejectedDevicesForDeviceLinks in both the lv and lvset reconcilers, which creates LVDL objects for devices that matched the selector but were rejected by device filters (e.g., already in use, wrong size, etc.). These devices have no PV.

This directly contradicts the KEP's stated scope. It also creates a tension with spec.persistentVolumeName being marked +required with +kubebuilder:validation:MinLength=1 — if there's no PV, what gets set there?


2. 🟡 json:"status,omitzero" is not a valid standard Go JSON tag

Status LocalVolumeDeviceLinkStatus `json:"status,omitzero"`

omitzero is only valid in Go 1.24's experimental encoding/json/v2. The standard encoding/json used by Kubernetes will silently ignore this option, meaning the Status field will never be omitted even when empty — the opposite of the intent. All other Kubernetes types use omitempty.


3. 🟡 +required kubebuilder marker contradicts omitempty on CurrentLinkTarget

// +required
// +kubebuilder:validation:MinLength=1
CurrentLinkTarget string `json:"currentLinkTarget,omitempty"`

+required tells kubebuilder to emit this field as required in CRD validation (it will reject objects without it). But omitempty tells the JSON serializer to omit the field when it's empty. This means diskmaker code can produce an LVDL object that fails its own CRD schema validation during the status update window before the field is populated. The KEP's API definition shows CurrentLinkTarget as omitempty only — it does not mark it as required.


4. 🟢 Minor: spec.policy is omitempty but KEP shows it as non-optional

The KEP defines:

Policy LocalVolumeDeviceLinkPolicy `json:"policy"`

The PR uses:

Policy DeviceLinkPolicy `json:"policy,omitempty"`

With omitempty, a None policy value (which is "") would be omitted from serialized output. Since the API has an enum validator and the enum includes None, this is a minor inconsistency but not functionally harmful since the default behavior is None.


Summary table

Issue Severity Contradicts KEP?
LVDLs created for rejected (no-PV) devices High ✅ Yes — KEP scopes LVDL to provisioned PVs only
omitzero JSON tag on Status field Medium No — implementation bug
+required + omitempty on CurrentLinkTarget Medium Yes — KEP shows it as omitempty only
policy is omitempty vs non-optional in KEP Low Minor

@openshift openshift deleted a comment from coderabbitai bot Mar 18, 2026
@coderabbitai
Copy link

coderabbitai bot commented Mar 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds LocalVolumeDeviceLink (LVDL) support: CRD/type changes making some status fields optional, RBAC updates, a new internal DeviceLinkHandler, refactors to context+args PV creation and DiskLocation usage, reconciler logic to populate LVDL status, many unit/e2e test additions, and a HACKING.md note.

Changes

Cohort / File(s) Summary
API & CRD
api/v1/localvolumedevicelink_types.go, config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml, config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
Made PreferredLinkTarget/ValidLinkTargets optional in types/CRD (removed minLength/required), registered LVDL types in scheme, and added RBAC resources localvolumedevicelinks and localvolumedevicelinks/status.
Common utilities
pkg/common/provisioner_utils.go, pkg/common/symlink_utils.go
Replaced CreateLocalPV signature with CreateLocalPV(ctx, CreateLocalPVArgs) and added CreateLocalPVArgs; integrated internal.NewDeviceLinkHandler and ApplyStatus calls; added GetSymlinkedForCurrentSC.
Internal package
pkg/internal/device_link_handler.go, pkg/internal/device_link_handler_test.go, pkg/internal/disk_location.go, pkg/internal/diskutil.go
Added DeviceLinkHandler (Create/ApplyStatus) and tests; introduced exported DiskLocation type; refactored disk ID/path resolution to use CmdExecutor and added GetUncachedPathID and other diskutil changes.
DiskMaker controllers (lv)
pkg/diskmaker/controllers/lv/reconcile.go, pkg/diskmaker/controllers/lv/create_pv_test.go, pkg/diskmaker/controllers/lv/reconcile_test.go
Replaced local DiskLocation with internal.DiskLocation; split device filtering to capture rejected devices; added processRejectedDevicesForDeviceLinks; switched PV creation to CreateLocalPVArgs; added blkid/symlink test helpers.
DiskMaker controllers (lvset)
pkg/diskmaker/controllers/lvset/reconcile.go, pkg/diskmaker/controllers/lvset/create_pv_test.go, pkg/diskmaker/controllers/lvset/device_age_test.go, pkg/diskmaker/controllers/lvset/reconcile_test.go
getValidDevices now returns rejected devices; provisionPV gained ctx and uses CreateLocalPVArgs; added processing of rejected devices for LVDLs; updated tests and fake client status-subresource handling.
LocalVolume controller
pkg/controllers/localvolume/localvolume_controller.go
Removed nil-guard and unconditionally appends DaemonSet GenerationStatus after successful Get.
Event reporter
pkg/diskmaker/event_reporter.go
Added event reason constant ErrorCreatingLVDL.
Tests — unit & discovery
pkg/diskmaker/discovery/discovery_test.go, pkg/internal/diskutil_test.go, pkg/diskmaker/.../*_test.go, pkg/internal/device_link_handler_test.go
Replaced os/exec TestHelperProcess with k8s.io/utils/exec/testing.FakeExec helpers; updated tests to override internal.CmdExecutor; added extensive unit tests for DeviceLinkHandler, LVDL status updates, and related flows.
E2E tests
test/e2e/localvolume_test.go, test/e2e/localvolumeset_test.go
Extended e2e flows to wait for and verify LVDL objects/status (current/preferred/valid targets, filesystem UUID) and to ensure LVDL deletion; added helper wait/verify functions.
Docs/HACKING
HACKING.md
Added section describing how to override diskmaker/operator image post-deployment (delete subscription, edit CSV).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
api/v1/localvolumedevicelink_types.go (1)

44-44: ⚠️ Potential issue | 🟠 Major

Resolve the +required and omitempty contradiction at line 78.

Line 78 marks CurrentLinkTarget with +required but uses json:"currentLinkTarget,omitempty", which allows the field to be omitted from JSON serialization. This conflicts with the required constraint.

Line 44 (Status field) is correctly marked +optional and uses json:"status,omitzero", which is supported by Go 1.20+.

Suggested fix
 	// currentLinkTarget is the current devicePath being used by local storage operator
 	// ...
 	// +required
 	// +kubebuilder:validation:MinLength=1
 	// +kubebuilder:validation:MaxLength=4096
-	CurrentLinkTarget string `json:"currentLinkTarget,omitempty"`
+	CurrentLinkTarget string `json:"currentLinkTarget"`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/localvolumedevicelink_types.go` at line 44, The field
CurrentLinkTarget is declared as +required but its JSON tag uses omitempty which
contradicts the required marker; update the struct field CurrentLinkTarget to
remove the omitempty from the json tag (e.g. change
json:"currentLinkTarget,omitempty" to json:"currentLinkTarget") so the +required
constraint is honored, ensuring the generated CRD/schema treats
CurrentLinkTarget as mandatory; alternatively, if the intent is optional, remove
the +required marker instead—apply the change to the CurrentLinkTarget
declaration in the localvolumedevicelink types.
♻️ Duplicate comments (1)
HACKING.md (1)

9-12: ⚠️ Potential issue | 🟡 Minor

Align the section title with the actual steps and clarify OLM behavior.

The title mentions both "diskmaker image or operator image" but step 2 only covers the diskmaker image. Additionally, deleting the Subscription stops future upgrades but does not stop OLM from using the already-installed CSV.

✏️ Suggested alignment
-1. Delete the subscription object so as OLM is no longer managing the operator.
-2. Edit the CSV object to override diskmaker image.
+1. Delete the Subscription to stop OLM from replacing the installed CSV during upgrades.
+2. Edit the installed CSV to override the operator image and/or the `DISKMAKER_IMAGE` value.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@HACKING.md` around lines 9 - 12, The section title and steps are inconsistent
and the OLM behavior is unclear: change the heading to reflect that you're
overriding the CSV-installed images (e.g., "Overriding CSV images after
deployment") and update the steps to (1) explain that deleting the Subscription
only stops future upgrades but does not revert the currently-installed CSV, (2)
instruct to edit the ClusterServiceVersion (CSV) to override both the diskmaker
image and the operator image fields as needed, and (3) note that OLM will
continue to use the installed CSV until it's replaced or removed; reference
Subscription and CSV in the updated prose so readers know which resource to
modify.
🧹 Nitpick comments (2)
pkg/internal/device_link_handler.go (1)

56-70: Consider reconciling owner references on existing LVDLs.

The current implementation preserves user policy but doesn't update OwnerReferences on pre-existing LVDLs. Objects created before this PR or manually will miss owner-based garbage collection. This could leave orphaned LVDLs when the parent LocalVolume/LocalVolumeSet is deleted.

♻️ Proposed fix to reconcile owner refs
 	// Keep user-configured policy and only reconcile the pv name.
-	if existing.Spec.PersistentVolumeName == pvName {
-		return existing, nil
-	}
-
-	klog.InfoS("updating lvdl object", "pvName", pvName)
-
 	existingCopy := existing.DeepCopy()
-	existingCopy.Spec.PersistentVolumeName = pvName
+	needsUpdate := false
+
+	if existingCopy.Spec.PersistentVolumeName != pvName {
+		existingCopy.Spec.PersistentVolumeName = pvName
+		needsUpdate = true
+	}
+
+	// Reconcile owner references for proper GC
+	if !isNilOwnerObject(ownerObj) && len(existingCopy.OwnerReferences) == 0 {
+		ownerRefs, err := buildOwnerRefs(ownerObj)
+		if err == nil {
+			existingCopy.OwnerReferences = ownerRefs
+			needsUpdate = true
+		}
+	}
+
+	if !needsUpdate {
+		return existing, nil
+	}
+
+	klog.InfoS("updating lvdl object", "pvName", pvName)
 
 	err = dl.client.Update(ctx, existingCopy)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/internal/device_link_handler.go` around lines 56 - 70, When updating an
existing LocalVolumeDeviceLink (the block that checks
existing.Spec.PersistentVolumeName and then creates existingCopy), also
reconcile the ObjectMeta.OwnerReferences on existingCopy so owner-based GC is
preserved: compute the desired owner refs (e.g., from whatever creates LVDLs in
this package — use the same source of truth you use when creating new LVDLs),
merge/replace existingCopy.ObjectMeta.OwnerReferences if they differ, and then
call dl.client.Update(ctx, existingCopy). Update the logic around existing,
existingCopy, and dl.client.Update to compare and set OwnerReferences in
addition to Spec.PersistentVolumeName before performing the update.
pkg/diskmaker/controllers/lv/reconcile.go (1)

411-420: Add a comment clarifying error handling strategy and periodic reconciliation.

Errors from findMatchingDisks and processRejectedDevicesForDeviceLinks are logged but not propagated. This appears intentional for best-effort processing, but a brief comment would clarify that:

  1. This is called periodically via reconcile loop
  2. Failures are non-fatal since the next reconcile will retry
Suggested documentation
+	// Process ignored but matching devices for LVDL status updates.
+	// This is best-effort: errors are logged and retried on the next periodic reconcile.
 	ignoredButMatchingDeviceMap, err := r.findMatchingDisks(diskConfig, ignoredDevices)
 	if err != nil {
 		klog.ErrorS(err, "error finding matching devices from ignored list")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/reconcile.go` around lines 411 - 420, Add a
concise comment above the block that calls findMatchingDisks and
processRejectedDevicesForDeviceLinks explaining the error-handling strategy:
note that this code runs as part of the periodic Reconcile loop, failures from
findMatchingDisks and processRejectedDevicesForDeviceLinks are intentionally
logged (best-effort) and not returned to allow the controller to continue, and
any transient errors will be retried on the next reconcile; place the comment
immediately above the existing ignoredButMatchingDeviceMap /
processRejectedDevicesForDeviceLinks calls and reference the functions by name
in the comment for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 529-533: Replace the malformed klog.ErrorS call that currently
passes an empty message string and pushes the message into the key/value list;
locate the call near the blockDevice.GetPathByID("") handling and change the
invocation from klog.ErrorS(err, "", "failed to get preferred device link",
"disk", blockDevice.Name) to klog.ErrorS(err, "failed to get preferred device
link", "disk", blockDevice.Name) so the human-readable message is passed as the
msg argument and disk metadata remains in the key/value pairs.
- Around line 441-445: The failure branch after calling
getSymlinkSourceAndTarget(deviceNameLocation, symLinkDirPath) currently uses
break which aborts processing the rest of the devices in the current loop;
change that break to continue so a single symlink lookup error skips only the
current device and allows the loop to continue processing other devices (update
the error handling at the source, target, idExists, err :=
getSymlinkSourceAndTarget(...) block to use continue instead of break).

In `@pkg/internal/device_link_handler_test.go`:
- Around line 87-118: Add a TestHelperProcess helper function to this test file
to satisfy the subprocess mocks used by helperCommandBlkid and
helperCommandBlkidEmpty: implement TestHelperProcess to check the
GO_WANT_HELPER_PROCESS env var, read the COMMAND env var, and for "blkid" write
the BLKIDOUT env value to stdout and exit 0; for "blkid_noop" exit 0 with empty
stdout; for any unexpected COMMAND write an error to stderr and exit non‑zero.
Ensure TestHelperProcess returns immediately when GO_WANT_HELPER_PROCESS is not
set so normal tests run.

In `@test/e2e/localvolumeset_test.go`:
- Around line 369-371: The test currently hard-fails when optional LVDL status
fields are empty; change the assertions so only CurrentLinkTarget is required
and PreferredLinkTarget and ValidLinkTargets are checked conditionally (i.e.,
only assert non-empty when the field is present/defined). Locate the assertions
using matcher.Expect and the symbols lvdl.Status.CurrentLinkTarget,
lvdl.Status.PreferredLinkTarget, and lvdl.Status.ValidLinkTargets and replace
the unconditional Expect calls for PreferredLinkTarget and ValidLinkTargets with
an if-check that only calls matcher.Expect(...).ToNot(gomega.BeEmpty(), ...)
when the field is non-nil/non-empty indicator (for strings check != "" or for
slices check len(...)>0/presence) to avoid hard failures in environments without
by-id links.

---

Outside diff comments:
In `@api/v1/localvolumedevicelink_types.go`:
- Line 44: The field CurrentLinkTarget is declared as +required but its JSON tag
uses omitempty which contradicts the required marker; update the struct field
CurrentLinkTarget to remove the omitempty from the json tag (e.g. change
json:"currentLinkTarget,omitempty" to json:"currentLinkTarget") so the +required
constraint is honored, ensuring the generated CRD/schema treats
CurrentLinkTarget as mandatory; alternatively, if the intent is optional, remove
the +required marker instead—apply the change to the CurrentLinkTarget
declaration in the localvolumedevicelink types.

---

Duplicate comments:
In `@HACKING.md`:
- Around line 9-12: The section title and steps are inconsistent and the OLM
behavior is unclear: change the heading to reflect that you're overriding the
CSV-installed images (e.g., "Overriding CSV images after deployment") and update
the steps to (1) explain that deleting the Subscription only stops future
upgrades but does not revert the currently-installed CSV, (2) instruct to edit
the ClusterServiceVersion (CSV) to override both the diskmaker image and the
operator image fields as needed, and (3) note that OLM will continue to use the
installed CSV until it's replaced or removed; reference Subscription and CSV in
the updated prose so readers know which resource to modify.

---

Nitpick comments:
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 411-420: Add a concise comment above the block that calls
findMatchingDisks and processRejectedDevicesForDeviceLinks explaining the
error-handling strategy: note that this code runs as part of the periodic
Reconcile loop, failures from findMatchingDisks and
processRejectedDevicesForDeviceLinks are intentionally logged (best-effort) and
not returned to allow the controller to continue, and any transient errors will
be retried on the next reconcile; place the comment immediately above the
existing ignoredButMatchingDeviceMap / processRejectedDevicesForDeviceLinks
calls and reference the functions by name in the comment for clarity.

In `@pkg/internal/device_link_handler.go`:
- Around line 56-70: When updating an existing LocalVolumeDeviceLink (the block
that checks existing.Spec.PersistentVolumeName and then creates existingCopy),
also reconcile the ObjectMeta.OwnerReferences on existingCopy so owner-based GC
is preserved: compute the desired owner refs (e.g., from whatever creates LVDLs
in this package — use the same source of truth you use when creating new LVDLs),
merge/replace existingCopy.ObjectMeta.OwnerReferences if they differ, and then
call dl.client.Update(ctx, existingCopy). Update the logic around existing,
existingCopy, and dl.client.Update to compare and set OwnerReferences in
addition to Spec.PersistentVolumeName before performing the update.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e3894c0b-0423-46fa-9309-767082dedec7

📥 Commits

Reviewing files that changed from the base of the PR and between b6ff9d4 and 84f8c6c.

📒 Files selected for processing (20)
  • HACKING.md
  • api/v1/localvolumedevicelink_types.go
  • config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
  • pkg/common/provisioner_utils.go
  • pkg/common/symlink_utils.go
  • pkg/controllers/localvolume/localvolume_controller.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/device_age_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/disk_location.go
  • pkg/internal/diskutil.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
💤 Files with no reviewable changes (1)
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml

Comment on lines +529 to +533
preferredSymLink, err := blockDevice.GetPathByID("")
if err != nil {
klog.ErrorS(err, "", "failed to get preferred device link", "disk", blockDevice.Name)
continue
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix malformed klog.ErrorS call - empty message string.

The error message is placed in the key-value pairs instead of the message argument. This will produce confusing log output.

Proposed fix
 			preferredSymLink, err := blockDevice.GetPathByID("")
 			if err != nil {
-				klog.ErrorS(err, "", "failed to get preferred device link", "disk", blockDevice.Name)
+				klog.ErrorS(err, "failed to get preferred device link", "disk", blockDevice.Name)
 				continue
 			}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
preferredSymLink, err := blockDevice.GetPathByID("")
if err != nil {
klog.ErrorS(err, "", "failed to get preferred device link", "disk", blockDevice.Name)
continue
}
preferredSymLink, err := blockDevice.GetPathByID("")
if err != nil {
klog.ErrorS(err, "failed to get preferred device link", "disk", blockDevice.Name)
continue
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/reconcile.go` around lines 529 - 533, Replace
the malformed klog.ErrorS call that currently passes an empty message string and
pushes the message into the key/value list; locate the call near the
blockDevice.GetPathByID("") handling and change the invocation from
klog.ErrorS(err, "", "failed to get preferred device link", "disk",
blockDevice.Name) to klog.ErrorS(err, "failed to get preferred device link",
"disk", blockDevice.Name) so the human-readable message is passed as the msg
argument and disk metadata remains in the key/value pairs.

Comment on lines +369 to +371
matcher.Expect(lvdl.Status.CurrentLinkTarget).ToNot(gomega.BeEmpty(), "expected CurrentLinkTarget for LVDL %q", lvdl.Name)
matcher.Expect(lvdl.Status.PreferredLinkTarget).ToNot(gomega.BeEmpty(), "expected PreferredLinkTarget for LVDL %q", lvdl.Name)
matcher.Expect(lvdl.Status.ValidLinkTargets).ToNot(gomega.BeEmpty(), "expected ValidLinkTargets for LVDL %q", lvdl.Name)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don’t hard-fail on optional LVDL status fields in e2e.

These assertions require PreferredLinkTarget and ValidLinkTargets to be non-empty for every PV, but the API defines both as optional. This can fail in valid environments where by-id links are unavailable.

Suggested test adjustment
 		for _, lvdl := range fsLVDLs {
 			matcher.Expect(lvdl.Status.CurrentLinkTarget).ToNot(gomega.BeEmpty(), "expected CurrentLinkTarget for LVDL %q", lvdl.Name)
-			matcher.Expect(lvdl.Status.PreferredLinkTarget).ToNot(gomega.BeEmpty(), "expected PreferredLinkTarget for LVDL %q", lvdl.Name)
-			matcher.Expect(lvdl.Status.ValidLinkTargets).ToNot(gomega.BeEmpty(), "expected ValidLinkTargets for LVDL %q", lvdl.Name)
+			if lvdl.Status.PreferredLinkTarget != "" {
+				matcher.Expect(lvdl.Status.ValidLinkTargets).To(
+					gomega.ContainElement(lvdl.Status.PreferredLinkTarget),
+					"expected PreferredLinkTarget to be listed in ValidLinkTargets for LVDL %q", lvdl.Name,
+				)
+			}
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
matcher.Expect(lvdl.Status.CurrentLinkTarget).ToNot(gomega.BeEmpty(), "expected CurrentLinkTarget for LVDL %q", lvdl.Name)
matcher.Expect(lvdl.Status.PreferredLinkTarget).ToNot(gomega.BeEmpty(), "expected PreferredLinkTarget for LVDL %q", lvdl.Name)
matcher.Expect(lvdl.Status.ValidLinkTargets).ToNot(gomega.BeEmpty(), "expected ValidLinkTargets for LVDL %q", lvdl.Name)
matcher.Expect(lvdl.Status.CurrentLinkTarget).ToNot(gomega.BeEmpty(), "expected CurrentLinkTarget for LVDL %q", lvdl.Name)
if lvdl.Status.PreferredLinkTarget != "" {
matcher.Expect(lvdl.Status.ValidLinkTargets).To(
gomega.ContainElement(lvdl.Status.PreferredLinkTarget),
"expected PreferredLinkTarget to be listed in ValidLinkTargets for LVDL %q", lvdl.Name,
)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/localvolumeset_test.go` around lines 369 - 371, The test currently
hard-fails when optional LVDL status fields are empty; change the assertions so
only CurrentLinkTarget is required and PreferredLinkTarget and ValidLinkTargets
are checked conditionally (i.e., only assert non-empty when the field is
present/defined). Locate the assertions using matcher.Expect and the symbols
lvdl.Status.CurrentLinkTarget, lvdl.Status.PreferredLinkTarget, and
lvdl.Status.ValidLinkTargets and replace the unconditional Expect calls for
PreferredLinkTarget and ValidLinkTargets with an if-check that only calls
matcher.Expect(...).ToNot(gomega.BeEmpty(), ...) when the field is
non-nil/non-empty indicator (for strings check != "" or for slices check
len(...)>0/presence) to avoid hard failures in environments without by-id links.

@gnufied gnufied force-pushed the implement-lvdl-creation branch from 84f8c6c to 54da887 Compare March 19, 2026 14:05
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/common/provisioner_utils.go (1)

215-215: ⚠️ Potential issue | 🟡 Minor

Inconsistent context usage in CreateOrUpdate call.

The function receives ctx as a parameter and uses it for other operations (lines 131, 259), but controllerutil.CreateOrUpdate on line 215 still uses context.TODO(). This should use the passed ctx for consistency and proper cancellation propagation.

🔧 Proposed fix
-	opRes, err := controllerutil.CreateOrUpdate(context.TODO(), client, existingPV, func() error {
+	opRes, err := controllerutil.CreateOrUpdate(ctx, client, existingPV, func() error {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` at line 215, The CreateOrUpdate call uses
context.TODO() instead of the function's passed ctx, causing inconsistent
context/cancellation handling; update the controllerutil.CreateOrUpdate
invocation to pass the existing ctx parameter (keep using the same client,
existingPV, and update func) so opRes and err are produced under the correct
context, matching the other calls that use ctx.
🧹 Nitpick comments (1)
pkg/common/provisioner_utils.go (1)

122-127: LVDL creation before PV ensures tracking object exists first.

Creating the LocalVolumeDeviceLink before checking for existing PVs ensures that for any new PVs, a tracking LVDL already exists. This is the correct ordering per the past review discussion. Consider adding a brief comment explaining this intentional sequencing.

📝 Suggested comment
 	deviceHandler := internal.NewDeviceLinkHandler(args.CurrentSymlink, client)
 
+	// Create LVDL before PV to ensure tracking object exists for any new PVs.
+	// For upgraded clusters, ApplyStatus below also creates LVDLs if missing.
 	_, err = deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj)
 	if err != nil {
 		return fmt.Errorf("error creating localvolumedevicelink object: %w", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` around lines 122 - 127, The
LocalVolumeDeviceLink creation is intentionally done before verifying existing
PVs to ensure a tracking LVDL exists for any new PVs; update the code around
internal.NewDeviceLinkHandler and deviceHandler.Create (which creates the
LocalVolumeDeviceLink for pvName) to include a short comment stating this
deliberate ordering and why (e.g., "create LVDL before PV check so tracking
object exists for new PVs"), leaving the Create call and error handling
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@pkg/common/provisioner_utils.go`:
- Line 215: The CreateOrUpdate call uses context.TODO() instead of the
function's passed ctx, causing inconsistent context/cancellation handling;
update the controllerutil.CreateOrUpdate invocation to pass the existing ctx
parameter (keep using the same client, existingPV, and update func) so opRes and
err are produced under the correct context, matching the other calls that use
ctx.

---

Nitpick comments:
In `@pkg/common/provisioner_utils.go`:
- Around line 122-127: The LocalVolumeDeviceLink creation is intentionally done
before verifying existing PVs to ensure a tracking LVDL exists for any new PVs;
update the code around internal.NewDeviceLinkHandler and deviceHandler.Create
(which creates the LocalVolumeDeviceLink for pvName) to include a short comment
stating this deliberate ordering and why (e.g., "create LVDL before PV check so
tracking object exists for new PVs"), leaving the Create call and error handling
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 88c95db0-7308-42ec-a5ac-6723980626b3

📥 Commits

Reviewing files that changed from the base of the PR and between 84f8c6c and 54da887.

📒 Files selected for processing (21)
  • HACKING.md
  • api/v1/localvolumedevicelink_types.go
  • config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
  • pkg/common/provisioner_utils.go
  • pkg/common/symlink_utils.go
  • pkg/controllers/localvolume/localvolume_controller.go
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/controllers/lvset/device_age_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/diskmaker/event_reporter.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/disk_location.go
  • pkg/internal/diskutil.go
  • test/e2e/localvolume_test.go
  • test/e2e/localvolumeset_test.go
💤 Files with no reviewable changes (1)
  • config/manifests/stable/local.storage.openshift.io_localvolumedevicelinks.yaml
✅ Files skipped from review due to trivial changes (3)
  • HACKING.md
  • pkg/diskmaker/event_reporter.go
  • pkg/internal/disk_location.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • pkg/diskmaker/controllers/lvset/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/device_age_test.go
  • config/manifests/stable/local-storage-operator.clusterserviceversion.yaml
  • test/e2e/localvolumeset_test.go
  • test/e2e/localvolume_test.go
  • pkg/common/symlink_utils.go
  • api/v1/localvolumedevicelink_types.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/diskutil.go
  • pkg/diskmaker/controllers/lv/reconcile.go

}

if len(ignoredButMatchingDeviceMap) != 0 {
r.processRejectedDevicesForDeviceLinks(ctx, ignoredButMatchingDeviceMap)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I have a Pod that uses a PV, its LocalVolumeDeviceLink gets updated when udev creates a new symlink. But when the PV is unused, it's not updated at all.

Should be processRejectedDevicesForDeviceLinks (with a better name) called on all devices?

(The code location is arbitrary. This code path actually works - ignored (=used) devices are updated, just the unused are not)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed now.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
pkg/internal/device_link_handler.go (1)

256-258: Consider reducing log verbosity for getFilesystemUUID.

The klog.InfoS at Line 257 runs on every status reconciliation. For frequently reconciled devices, this could generate excessive logs. Consider using klog.V(4).InfoS for this diagnostic message.

Proposed fix
 func getFilesystemUUID(devicePath string) (string, error) {
-	klog.InfoS("trying to get filesystem information", "devicePath", devicePath)
+	klog.V(4).InfoS("trying to get filesystem information", "devicePath", devicePath)
 	cmd := ExecCommand("blkid", "-s", "UUID", "-o", "value", devicePath)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/internal/device_link_handler.go` around lines 256 - 258, The
getFilesystemUUID function logs a diagnostic message at Info level on every
reconciliation which can be noisy; change the klog.InfoS call inside
getFilesystemUUID to klog.V(4).InfoS so the message is emitted only when
verbosity >=4, preserving the same message and structured key ("devicePath") to
reduce log verbosity for frequently reconciled devices.
pkg/diskmaker/controllers/lv/reconcile.go (1)

546-554: Function never returns an error.

The function signature includes an error return value, but it always returns nil. Either remove the error return or add validation that could produce an error (e.g., checking for empty paths).

Option 1: Remove error return
-func getSymlinkSourceAndTarget(devLocation internal.DiskLocation, symlinkDir string) (string, string, bool, error) {
+func getSymlinkSourceAndTarget(devLocation internal.DiskLocation, symlinkDir string) (string, string, bool) {
 	if devLocation.DiskID != "" {
 		target := path.Join(symlinkDir, filepath.Base(devLocation.DiskID))
-		return devLocation.DiskID, target, true, nil
+		return devLocation.DiskID, target, true
 	} else {
 		target := path.Join(symlinkDir, filepath.Base(devLocation.UserProvidedPath))
-		return devLocation.UserProvidedPath, target, false, nil
+		return devLocation.UserProvidedPath, target, false
 	}
 }
Option 2: Add validation
 func getSymlinkSourceAndTarget(devLocation internal.DiskLocation, symlinkDir string) (string, string, bool, error) {
 	if devLocation.DiskID != "" {
 		target := path.Join(symlinkDir, filepath.Base(devLocation.DiskID))
 		return devLocation.DiskID, target, true, nil
-	} else {
+	}
+	if devLocation.UserProvidedPath != "" {
 		target := path.Join(symlinkDir, filepath.Base(devLocation.UserProvidedPath))
 		return devLocation.UserProvidedPath, target, false, nil
 	}
+	return "", "", false, fmt.Errorf("no DiskID or UserProvidedPath for device %s", devLocation.DiskNamePath)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/reconcile.go` around lines 546 - 554, The
function getSymlinkSourceAndTarget declares an error return but never returns a
non-nil error; either remove the error from the signature or add validation and
return errors when inputs are invalid. Fix option A: remove the error return and
update all callers of getSymlinkSourceAndTarget to handle the new (string,
string, bool) signature. Fix option B: add input validation inside
getSymlinkSourceAndTarget (check that symlinkDir is not empty and that either
devLocation.DiskID or devLocation.UserProvidedPath is non-empty), and return a
descriptive error (e.g., fmt.Errorf("invalid symlink inputs: ...")) when
validation fails, keeping the existing signature; update callers to handle the
possible error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/diskmaker/controllers/lv/reconcile.go`:
- Around line 546-554: The function getSymlinkSourceAndTarget declares an error
return but never returns a non-nil error; either remove the error from the
signature or add validation and return errors when inputs are invalid. Fix
option A: remove the error return and update all callers of
getSymlinkSourceAndTarget to handle the new (string, string, bool) signature.
Fix option B: add input validation inside getSymlinkSourceAndTarget (check that
symlinkDir is not empty and that either devLocation.DiskID or
devLocation.UserProvidedPath is non-empty), and return a descriptive error
(e.g., fmt.Errorf("invalid symlink inputs: ...")) when validation fails, keeping
the existing signature; update callers to handle the possible error.

In `@pkg/internal/device_link_handler.go`:
- Around line 256-258: The getFilesystemUUID function logs a diagnostic message
at Info level on every reconciliation which can be noisy; change the klog.InfoS
call inside getFilesystemUUID to klog.V(4).InfoS so the message is emitted only
when verbosity >=4, preserving the same message and structured key
("devicePath") to reduce log verbosity for frequently reconciled devices.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 014cd4ad-3dc8-4cdf-9656-9ce3830560ac

📥 Commits

Reviewing files that changed from the base of the PR and between 54da887 and c85e33a.

📒 Files selected for processing (4)
  • pkg/diskmaker/controllers/lv/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/internal/device_link_handler.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/diskmaker/controllers/lvset/reconcile.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
pkg/diskmaker/controllers/lv/reconcile_test.go (1)

375-380: ⚠️ Potential issue | 🟠 Major

This “does not create LVDL” case is only proving the nil-owner shortcut.

r.localVolume is never populated here, so a missing LVDL is skipped because ownerObj == nil in pkg/internal/device_link_handler.go, not because rejected devices are actually exempt from creation. If the intent is to lock down real reconciler behavior, this case needs to run with a real LocalVolume owner; otherwise it should be renamed to match what it is testing.

Also applies to: 440-443

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/reconcile_test.go` around lines 375 - 380, The
test "does not create lvdl while processing rejected devices" in
reconcile_test.go is exercising the nil-owner shortcut because r.localVolume is
never set; either populate r.localVolume with a real LocalVolume object so the
handler in pkg/internal/device_link_handler.go evaluates rejected-device logic
with a non-nil ownerObj, or rename the test to state it verifies the nil-owner
shortcut. Specifically, set r.localVolume (or construct a LocalVolume and assign
as owner) before calling the reconcile flow referenced by the test, and apply
the same fix to the duplicated case around the lines corresponding to the other
occurrence (the case at 440-443). Ensure references to ownerObj in
device_link_handler.go are exercised by the non-nil owner when asserting that
rejected devices are exempt from LVDL creation.
pkg/internal/device_link_handler.go (1)

53-67: ⚠️ Potential issue | 🟠 Major

Reconcile OwnerReferences on existing LVDLs too.

Both paths return an existing object unchanged. That leaves preexisting/manual LocalVolumeDeviceLink objects without the LocalVolume/LocalVolumeSet owner, and ApplyStatus cannot fix that later because it only updates the status subresource. The GC-based deletion model added in this PR will not work for those objects.

Also applies to: 213-232

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/internal/device_link_handler.go` around lines 53 - 67, When the code
returns an existing LocalVolumeDeviceLink unchanged (the branch that checks
existing.Spec.PersistentVolumeName == pvName and the other similar return), also
reconcile and persist OwnerReferences on that existing object before returning:
compute the desired ownerRefs for the LocalVolume/LocalVolumeSet (use the same
owner-ref construction logic you use when creating a new LVDL), compare/merge
with existing.OwnerReferences, and if they differ, DeepCopy the existing object,
set the merged OwnerReferences and call dl.client.Update(ctx, updatedCopy)
(handle and wrap the error similarly). Apply the same owner-ref reconciliation
to the other return path that currently returns without updating so
preexisting/manual LVDLs get the correct owners.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/internal/diskutil.go`:
- Around line 447-448: GetPVCreationLock builds the find invocation incorrectly
by joining dirs into one pathname; change the CmdExecutor.Command call to pass
each search directory as a separate argument instead of strings.Join(dirs, " "),
i.e. construct the args slice as ["-L", <each dir from dirs...>, "-samefile",
path] and call CmdExecutor.Command with that variadic args list so
executeCmdWithCombinedOutput runs find over all roots correctly.

---

Duplicate comments:
In `@pkg/diskmaker/controllers/lv/reconcile_test.go`:
- Around line 375-380: The test "does not create lvdl while processing rejected
devices" in reconcile_test.go is exercising the nil-owner shortcut because
r.localVolume is never set; either populate r.localVolume with a real
LocalVolume object so the handler in pkg/internal/device_link_handler.go
evaluates rejected-device logic with a non-nil ownerObj, or rename the test to
state it verifies the nil-owner shortcut. Specifically, set r.localVolume (or
construct a LocalVolume and assign as owner) before calling the reconcile flow
referenced by the test, and apply the same fix to the duplicated case around the
lines corresponding to the other occurrence (the case at 440-443). Ensure
references to ownerObj in device_link_handler.go are exercised by the non-nil
owner when asserting that rejected devices are exempt from LVDL creation.

In `@pkg/internal/device_link_handler.go`:
- Around line 53-67: When the code returns an existing LocalVolumeDeviceLink
unchanged (the branch that checks existing.Spec.PersistentVolumeName == pvName
and the other similar return), also reconcile and persist OwnerReferences on
that existing object before returning: compute the desired ownerRefs for the
LocalVolume/LocalVolumeSet (use the same owner-ref construction logic you use
when creating a new LVDL), compare/merge with existing.OwnerReferences, and if
they differ, DeepCopy the existing object, set the merged OwnerReferences and
call dl.client.Update(ctx, updatedCopy) (handle and wrap the error similarly).
Apply the same owner-ref reconciliation to the other return path that currently
returns without updating so preexisting/manual LVDLs get the correct owners.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 402063dc-1b9b-45ab-93f1-96ea1b50ac10

📥 Commits

Reviewing files that changed from the base of the PR and between c85e33a and 0e101d2.

⛔ Files ignored due to path filters (2)
  • vendor/k8s.io/utils/exec/testing/fake_exec.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (8)
  • pkg/diskmaker/controllers/lv/create_pv_test.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/diskmaker/controllers/lvset/create_pv_test.go
  • pkg/diskmaker/discovery/discovery_test.go
  • pkg/internal/device_link_handler.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/diskutil.go
  • pkg/internal/diskutil_test.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/internal/device_link_handler_test.go

@gnufied gnufied force-pushed the implement-lvdl-creation branch from 0e101d2 to 49e5f17 Compare March 20, 2026 19:05
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
pkg/diskmaker/controllers/lv/reconcile_test.go (1)

440-443: ⚠️ Potential issue | 🟠 Major

Initialize the reconciler owner in this test setup.

The "does not create lvdl while processing rejected devices" case currently passes because getFakeDiskMaker leaves r.localVolume unset, so processRejectedDevicesForDeviceLinks never exercises the real owner-backed path. That can hide rejected-device LVDL creation in normal reconcile flow.

Patch sketch
+			lv := &localv1.LocalVolume{
+				ObjectMeta: metav1.ObjectMeta{
+					Name:      "test-lv",
+					Namespace: testNamespace,
+				},
+			}
+			objs = append(objs, lv)
+
 			r, tcCtx := getFakeDiskMaker(t, tmpRoot, objs...)
+			r.localVolume = lv
 			r.runtimeConfig.Node = &corev1.Node{ObjectMeta: metav1.ObjectMeta{Name: testNodeName}}
 			r.runtimeConfig.Namespace = testNamespace

Also applies to: 523-526

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/diskmaker/controllers/lv/reconcile_test.go` around lines 440 - 443, The
test's reconciler isn't setting r.localVolume so
processRejectedDevicesForDeviceLinks skips the owner-backed path; in the test
case that checks "does not create lvdl while processing rejected devices" (and
similarly at the other affected lines) initialize the reconciler owner by
assigning a LocalVolume object to r.localVolume after calling getFakeDiskMaker
(e.g., create a minimal core object with the test namespace/name) so
processRejectedDevicesForDeviceLinks executes the real owner-backed logic;
update the test setup where getFakeDiskMaker is used to set r.localVolume and
ensure its metadata (Name/Namespace/UID) matches expectations used by
processRejectedDevicesForDeviceLinks.
pkg/internal/diskutil.go (1)

447-450: ⚠️ Potential issue | 🔴 Critical

Build the find argv slice instead of joining the directories.

Line 447 turns all search roots into one pathname. utilexec.Command does not shell-split that string, so multi-directory checks can miss existing symlinks and let the same device be claimed twice.

Patch sketch
 func GetMatchingSymlinksInDirs(path string, dirs ...string) ([]string, error) {
-	cmd := CmdExecutor.Command("find", "-L", strings.Join(dirs, " "), "-samefile", path)
+	args := append([]string{"-L"}, dirs...)
+	args = append(args, "-samefile", path)
+	cmd := CmdExecutor.Command("find", args...)
 	output, err := executeCmdWithCombinedOutput(cmd)

Run this to verify the current argv construction versus the fixed one. Expected result: joined fails with “No such file or directory”, while separate finds both symlinks.

#!/bin/bash
set -euo pipefail

sed -n '444,450p' pkg/internal/diskutil.go

tmpdir="$(mktemp -d)"
mkdir -p "$tmpdir/foo" "$tmpdir/bar"
touch "$tmpdir/dev"
ln -s "$tmpdir/dev" "$tmpdir/foo/link1"
ln -s "$tmpdir/dev" "$tmpdir/bar/link2"

python - "$tmpdir" <<'PY'
import os
import subprocess
import sys

root = sys.argv[1]
dev = os.path.join(root, "dev")
joined = os.path.join(root, "foo") + " " + os.path.join(root, "bar")

cases = {
    "joined":   ["find", "-L", joined, "-samefile", dev],
    "separate": ["find", "-L", os.path.join(root, "foo"), os.path.join(root, "bar"), "-samefile", dev],
}

for name, argv in cases.items():
    proc = subprocess.run(argv, capture_output=True, text=True)
    print(f"== {name} ==")
    print("argv:", argv)
    print("rc:", proc.returncode)
    print("stdout:", proc.stdout.strip())
    print("stderr:", proc.stderr.strip())
PY
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/internal/diskutil.go` around lines 447 - 450, The code incorrectly joins
dirs into one argument when calling CmdExecutor.Command, so build the argv slice
instead: create an args []string starting with "-L", append each directory from
dirs separately, then append "-samefile" and path, and pass those as separate
arguments to CmdExecutor.Command (used with executeCmdWithCombinedOutput).
Update the call around CmdExecutor.Command and ensure
executeCmdWithCombinedOutput(cmd) receives the command constructed from the args
slice rather than a single joined-directory string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/common/provisioner_utils.go`:
- Around line 122-127: The code currently calls deviceHandler.Create(...) to
persist a LocalVolumeDeviceLink (LVDL) too early (deviceHandler.Create, variable
deviceHandler), before performing final mountpoint/capacity validations and
before writing the PersistentVolume (PV); move the Create call so it executes
only after all remaining PV validation checks succeed and after the PV write
completes to avoid leaving stale LVDL objects referencing non-existent PVs.
Locate every early Create usage (the shown deviceHandler.Create call and the
other similar Create calls later in the file around the referenced blocks) and
refactor so that: perform mountpoint and capacity checks, write the PV, and only
then call deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj);
ensure error handling remains intact and that rollback/cleanup is considered if
PV write fails.

---

Duplicate comments:
In `@pkg/diskmaker/controllers/lv/reconcile_test.go`:
- Around line 440-443: The test's reconciler isn't setting r.localVolume so
processRejectedDevicesForDeviceLinks skips the owner-backed path; in the test
case that checks "does not create lvdl while processing rejected devices" (and
similarly at the other affected lines) initialize the reconciler owner by
assigning a LocalVolume object to r.localVolume after calling getFakeDiskMaker
(e.g., create a minimal core object with the test namespace/name) so
processRejectedDevicesForDeviceLinks executes the real owner-backed logic;
update the test setup where getFakeDiskMaker is used to set r.localVolume and
ensure its metadata (Name/Namespace/UID) matches expectations used by
processRejectedDevicesForDeviceLinks.

In `@pkg/internal/diskutil.go`:
- Around line 447-450: The code incorrectly joins dirs into one argument when
calling CmdExecutor.Command, so build the argv slice instead: create an args
[]string starting with "-L", append each directory from dirs separately, then
append "-samefile" and path, and pass those as separate arguments to
CmdExecutor.Command (used with executeCmdWithCombinedOutput). Update the call
around CmdExecutor.Command and ensure executeCmdWithCombinedOutput(cmd) receives
the command constructed from the args slice rather than a single
joined-directory string.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b3a2a073-d6a2-412a-8701-8b16cdd63aab

📥 Commits

Reviewing files that changed from the base of the PR and between 0e101d2 and 49e5f17.

⛔ Files ignored due to path filters (1)
  • vendor/k8s.io/utils/exec/testing/fake_exec.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (4)
  • pkg/common/provisioner_utils.go
  • pkg/diskmaker/controllers/lv/reconcile_test.go
  • pkg/internal/device_link_handler_test.go
  • pkg/internal/diskutil.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/internal/device_link_handler_test.go

Comment on lines +122 to +127
deviceHandler := internal.NewDeviceLinkHandler(args.CurrentSymlink, client)

_, err = deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj)
if err != nil {
return fmt.Errorf("error creating localvolumedevicelink object: %w", err)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Create the LVDL only after the remaining PV validations pass.

Lines 124-127 persist the LocalVolumeDeviceLink before the last mountpoint/capacity checks and before the PV write. Any later return leaves an LVDL whose persistentVolumeName never exists as a PV, so retries can accumulate stale device-link objects.

Patch sketch
 	deviceHandler := internal.NewDeviceLinkHandler(args.CurrentSymlink, client)
-
-	_, err = deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj)
-	if err != nil {
-		return fmt.Errorf("error creating localvolumedevicelink object: %w", err)
-	}
 
 	// Do not attempt to create or update existing PV's that have been released
 	existingPV := &corev1.PersistentVolume{ObjectMeta: metav1.ObjectMeta{Name: pvName}}
 	err = client.Get(ctx, types.NamespacedName{Name: pvName}, existingPV)
@@
 	newPV := provCommon.CreateLocalPVSpec(localPVConfig)
 
 	// Add finalizer for diskmaker symlink cleanup
 	controllerutil.AddFinalizer(newPV, LSOSymlinkDeleterFinalizer)
+
+	_, err = deviceHandler.Create(ctx, pvName, runtimeConfig.Namespace, obj)
+	if err != nil {
+		return fmt.Errorf("error creating localvolumedevicelink object: %w", err)
+	}
 
 	klog.InfoS("creating PV", "pvName", pvName)

Also applies to: 129-165, 255-257

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/provisioner_utils.go` around lines 122 - 127, The code currently
calls deviceHandler.Create(...) to persist a LocalVolumeDeviceLink (LVDL) too
early (deviceHandler.Create, variable deviceHandler), before performing final
mountpoint/capacity validations and before writing the PersistentVolume (PV);
move the Create call so it executes only after all remaining PV validation
checks succeed and after the PV write completes to avoid leaving stale LVDL
objects referencing non-existent PVs. Locate every early Create usage (the shown
deviceHandler.Create call and the other similar Create calls later in the file
around the referenced blocks) and refactor so that: perform mountpoint and
capacity checks, write the PV, and only then call deviceHandler.Create(ctx,
pvName, runtimeConfig.Namespace, obj); ensure error handling remains intact and
that rollback/cleanup is considered if PV write fails.

@gnufied gnufied force-pushed the implement-lvdl-creation branch from 49e5f17 to a36f021 Compare March 20, 2026 19:24
@gnufied gnufied force-pushed the implement-lvdl-creation branch from a36f021 to 0c70945 Compare March 23, 2026 11:39
gnufied added 5 commits March 23, 2026 11:45
Fix rbac rules and bugs to enable creation of lvdl objects
refactor code to findexisting lvldl
Change UpdateStatus to ApplyStatus
Refactor the code to directly use BlockDevice
Add comments about processRejectedDevicesForDeviceLinks
@gnufied gnufied force-pushed the implement-lvdl-creation branch from 0c70945 to 88063ca Compare March 23, 2026 15:55
@gnufied gnufied force-pushed the implement-lvdl-creation branch from 88063ca to 2b33356 Compare March 23, 2026 16:40
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 23, 2026

@gnufied: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

// DevicePath is the full path to the device (e.g. /dev/sda).
// filepath.Base is applied internally where only the kernel name is needed.
DevicePath string
SymLinkPath string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation to be corrected:

$ gofmt -d pkg/common/provisioner_utils.go
diff pkg/common/provisioner_utils.go.orig pkg/common/provisioner_utils.go
--- pkg/common/provisioner_utils.go.orig
+++ pkg/common/provisioner_utils.go
@@ -50,9 +50,9 @@
 	StorageClass          storagev1.StorageClass
 	MountPointMap         sets.String
 	Client                client.Client
-	SymLinkPath      string
-	IDExists         bool
-	ExtraLabelsForPV map[string]string
+	SymLinkPath           string
+	IDExists              bool
+	ExtraLabelsForPV      map[string]string
 
 	// CurrentSymlink points to source to which SymLinkPath points to.
 	CurrentSymlink string

I see you do it in the next commit, but it's better to fix it here (because the next commit has nothing to do with such fix)

type DiskLocation struct {
// DiskNamePath stores full device name path - "/dev/sda"
DiskNamePath string
// UserProvidedPath is the path supplied by the user in the LocalVolume CR. Empty in LocalVolumeSet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struct (DiskLocation) is essentially copy/pasta of what we had in pkg/diskmaker/controllers/lv/reconcile.go before this PR. This makes me think that this struct must be used somewhere beyond LV Reconcile. And this comment hints to LVS. But in fact, with all commits of this PR applied, this struct is still used only in LV reconcile.go. Please clarify what actually justifies this movement of this struct to this new file and give some details about the comment (Empty in LocalVolumeSet). I attempted to find answers myself, but couldn't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. So a version of this PR used DiskLocation from other places and hence I had to make it public outside its own package, but then later on I think I scrapped that change but type was still exported publicly.

I can undo that if it helps.


_, err = deviceHandler.ApplyStatus(ctx, pvName, runtimeConfig.Namespace, args.BlockDevice, obj)
if err != nil {
return fmt.Errorf("error updating localvolumedevicelink object: %w", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be logical to do r.eventSync.Report(r.localVolume, newDiskEvent(diskmaker.ErrorCreatingLVDL, ...) before returning here. Is this a bug or you omitted such Report() intentionally here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The event already gets reported up the chain from place where CreateLocalPV gets called. reporting another event here will simply cause duplicated events.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnufied , sorry, I cannot find where it gets reported up the chain from place where CreateLocalPV gets called. Please give me file name and line of code.

I also made some debugging adding artificial error in the end of CreateLocalPV():

	// _, err = deviceHandler.ApplyStatus(ctx, pvName, runtimeConfig.Namespace, args.BlockDevice, obj)
	err = fmt.Errorf("ZZZZZ: MY ARTIFICIAL ERRROR: error updating localvolumedevicelink ZZZZZ")
	if err != nil {
		return fmt.Errorf("error updating localvolumedevicelink object: %w", err)
	}

The result is expected:

E0326 05:36:58.837106  469396 reconcile.go:486] "could not create local PV" err="error updating localvolumedevicelink object: ZZZZZ: MY ARTIFICIAL ERRROR: error updating localvolumedevicelink ZZZZZ" deviceNameLocation={"DiskNamePath":"/dev/loop1","UserProvidedPath":"/dev/loop1","DiskID":"","BlockDevice":{"name":"loop1","kname":"loop1","type":"loop","fsType":"","size":"1073741824","rota":"0","ro":"0","rm":"0"},"ForceWipe":false}

And LVDL object is crippled (no Status part at all):

$ oc get -n openshift-local-storage LocalVolumeDeviceLink local-pv-ff140cb4 -o yaml
apiVersion: local.storage.openshift.io/v1
kind: LocalVolumeDeviceLink
metadata:
  creationTimestamp: "2026-03-26T05:36:58Z"
  generation: 1
  name: local-pv-ff140cb4
  namespace: openshift-local-storage
  ownerReferences:
  - apiVersion: local.storage.openshift.io/v1
    controller: true
    kind: LocalVolume
    name: local-disks1
    uid: d0d4ce6a-e8b2-4f86-b3f3-f367fac84e1e
  resourceVersion: "360213"
  uid: 574c3ed3-302d-4ffa-a9f1-098d005a063d
spec:
  persistentVolumeName: local-pv-ff140cb4
  policy: None

But processRejectedDevicesForDeviceLinks() was never called and I cannot see any other places where we report an event for diskmaker.ErrorCreatingLVDL from LV (not LVS) code-paths. Please give me some details backing up your words about "reporting another event here will simply cause duplicated events".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

processRejectedDevicesForDeviceLinks is only called for devices which were already mounted and consumed by previously created LSO PV. It won't be called for devices which has a backing PV, but it isn't used.

In - Reconcile function of lvset reconciler, where we call provisionPV:

		err = r.provisionPV(ctx, lvset, blockDevice, *storageClass, mountPointMap, symlinkSourcePath, symlinkPath, idExists)
		if err == common.ErrTryAgain {
			requeueTime = fastRequeueTime
		} else if err != nil {
			msg := fmt.Sprintf("provisioning failed for %s: %v", blockDevice.Name, err)
			r.eventReporter.Report(lvset, newDiskEvent(diskmaker.ErrorProvisioningDisk, msg, blockDevice.KName, corev1.EventTypeWarning))
			klog.Error(msg)
			continue
		}

This will log any errors coming from provisionPV which in turn calls CreateLocalPV to be reported as event.

It looks like for lv/reconciler.go though, we are simply logging the error and not regenerating an event. I could fix it, but a lot of those code has changed in #598 which actually implements device relinking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Namespace: diskMakerDS.Namespace,
Name: diskMakerDS.Name,
LastGeneration: diskMakerDS.Generation,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this change relevant to the goal of this PR? Assuming we get rid of this change, would anything become broken?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not. I will fix

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this in #598

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants