fix(linstor): surface ambiguous template fallbacks and legacy orphan cleanup#13078
fix(linstor): surface ambiguous template fallbacks and legacy orphan cleanup#13078jmsperu wants to merge 1 commit intoapache:mainfrom
Conversation
…e orphans Two small visibility improvements that make existing template behaviour easier to audit, especially after upgrading from versions that predated the ref-count convention. LinstorUtil.findResourceDefinition When no resource has the exact "_cs-template-for-<rscGrpName>" Aux property, the method silently returned the first resource whose name starts with the requested prefix. With multiple resource groups on a single controller, this can route a clone to the wrong template. Now logs a WARN naming the requested rscGrpName, the fallback resource, and the actual aux properties present. Behaviour unchanged: still returns the first match. Operators can act on the warning by setting the property explicitly or removing the unrelated definition. LinstorStorageAdaptor.deRefOrDeleteResource When deleting a resource that has zero `_cs-template-for-` aux properties AND whose name starts with the template-name prefix the caller is acting on, log an INFO line. These are legacy templates cached before the ref-count convention was introduced — they get picked up by the existing "if expectedProps == 0" branch and deleted. Surfacing them lets operators see how many orphans existed at upgrade time and confirm the cleanup happened.
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
There was a problem hiding this comment.
Pull request overview
This PR improves operator visibility in the LINSTOR template-handling path by adding targeted log lines around (1) ambiguous template selection fallback and (2) legacy/orphan template cleanup, while intending to preserve existing behavior.
Changes:
- Add a WARN log in
LinstorUtil.findResourceDefinitionwhen falling back to the first name-prefix match due to missing expected_cs-template-for-<rscGrp>aux property. - Add an INFO log in
LinstorStorageAdaptor.deRefOrDeleteResourceintended to surface deletion of legacy templates that have zero_cs-template-for-aux properties.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
plugins/storage/volume/linstor/src/main/java/org/apache/cloudstack/storage/datastore/util/LinstorUtil.java |
Adds WARN logging to make ambiguous template fallback selection visible (includes chosen fallback and related aux keys). |
plugins/storage/volume/linstor/src/main/java/com/cloud/hypervisor/kvm/storage/LinstorStorageAdaptor.java |
Adds INFO logging intended to distinguish legacy-template orphan cleanup during deref/delete logic. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Surface the case where a resource that LOOKS like a template (resource name | ||
| // starts with the requested prefix) has zero `_cs-template-for-` aux properties | ||
| // even though we never decremented one — that's a legacy template predating the | ||
| // ref-count convention. Logging it before deletion lets operators audit how | ||
| // many such orphans existed at upgrade time. | ||
| if (expectedProps == 0 && rd.getName().toLowerCase().startsWith(rscName.toLowerCase())) { | ||
| logger.info("Linstor: deleting resource {} which has no _cs-template-for- aux properties " + | ||
| "(legacy template predating the ref-count convention, or a stale orphan). " + | ||
| "Resource group context: {}", rd.getName(), rscGrpName); | ||
| } |
There was a problem hiding this comment.
The new INFO log for “legacy template predating the ref-count convention” will currently fire for all non-template resource deletions too. existingRDs is already built via getRDListStartingWith(api, rscName), so rd.getName().toLowerCase().startsWith(rscName.toLowerCase()) is always true; whenever expectedProps == 0 (normal volume/resource with no _cs-template-for- props), this will emit the legacy-template message and add significant log noise/misleading attribution. Tighten the condition so it only matches the intended legacy-template-orphan case (e.g., by using a reliable template discriminator, passing an isTemplate flag from the caller, or checking for a template-specific naming/property convention), and avoid logging for ordinary resource deletions.
| // Surface the case where a resource that LOOKS like a template (resource name | |
| // starts with the requested prefix) has zero `_cs-template-for-` aux properties | |
| // even though we never decremented one — that's a legacy template predating the | |
| // ref-count convention. Logging it before deletion lets operators audit how | |
| // many such orphans existed at upgrade time. | |
| if (expectedProps == 0 && rd.getName().toLowerCase().startsWith(rscName.toLowerCase())) { | |
| logger.info("Linstor: deleting resource {} which has no _cs-template-for- aux properties " + | |
| "(legacy template predating the ref-count convention, or a stale orphan). " + | |
| "Resource group context: {}", rd.getName(), rscGrpName); | |
| } |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #13078 +/- ##
============================================
- Coverage 18.02% 18.02% -0.01%
+ Complexity 16621 16620 -1
============================================
Files 6029 6029
Lines 542184 542194 +10
Branches 66451 66452 +1
============================================
- Hits 97740 97735 -5
- Misses 433428 433442 +14
- Partials 11016 11017 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17649 |
|
@rp- , will you have a look at this? |
Summary
Two small visibility improvements to the LINSTOR template-handling path. Both preserve existing behaviour and only add log output that surfaces conditions operators have currently no easy way to detect.
1. `LinstorUtil.findResourceDefinition` — log on ambiguous fallback
When clone-from-template runs, the method scans for a resource whose name starts with the template prefix AND whose aux properties include `_cs-template-for-`. If no exact-property match exists, the method falls back to the first matching resource by name and returns it silently.
In setups with multiple resource groups on the same controller (or with legacy templates cached before the ref-count convention was added), this fallback can return a template that belongs to the wrong resource group, and the clone produces an unexpected result.
This change keeps the fallback behaviour but logs a WARN naming:
2. `LinstorStorageAdaptor.deRefOrDeleteResource` — log on legacy-template orphan cleanup
The ref-count branch already deletes resources that have zero remaining `_cs-template-for-` aux properties. Two conditions reach this branch:
Currently both look identical in the logs. Operators upgrading from older versions can't tell how many orphan legacy templates were cleaned up.
This change logs an INFO line in the second case, identifying the resource as a legacy template and naming the resource group context. Behaviour unchanged.
Test plan
Why three small PRs
Per the suggestion from earlier review, splitting LINSTOR plugin improvements into focused PRs:
All three target the LINSTOR primary-storage plugin and can be reviewed independently.