test(placer): pin LINSTOR placement-family corner-cases (upstream-issues U6)#117
Conversation
…t-on-extension, no success-on-zero Re-examines and pins the LINSTOR placement corner-cases mined from real upstream user reports (campaign-2): - U89 / corner-D9: re-exam of BS's equal-weight autoplacer default vs upstream MaxFreeSpace-dominant. Verdict: the equal-weight scorer does NOT hot-spot — MaxFreeSpace is the only discriminator between identical pools so the emptier pool wins ceteris paribus, and MinRscCount spreads subsequent replicas. Delta retained as deliberate; known-deltas row 72 records the conclusion. - U88 / U113: replicas-on-different honored on an EXTENSION of an existing resource (anti-affinity seeded from existing replicas). - U139 / U94: contradictory / unsatisfiable constraints must under-place (placed < want) and surface a 409 FAIL_NOT_ENOUGH_NODES, never a success-on-zero-nodes. Pinned at both the placer and REST layers. - U83 / U21: pool-pinned autoplace (-s <pool>) lands only on nodes hosting that pool, even when a wrong-pool node has more free space. L1 only in this commit; L6/L7 artifacts follow. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
…-constraint reject, constraint-on-extension
L6 cli-matrix cell r-c-autoplace-pool-pinned.sh:
- U83/U21: --auto-place -s <pool> lands replicas only on nodes hosting
that pool (hard filter, not soft preference).
- U139/U94: contradictory --replicas-on-same X --replicas-on-different X
fails short with FAIL_NOT_ENOUGH_NODES (996), never success-on-zero.
L7 replay YAMLs:
- autoplace-contradictory-constraints-rejected.yaml: operator-CLI repro
of the upstream "successfully autoplaced on 0 nodes" complaint; the
autoplace exits 10 and no diskful replicas leak.
- autoplace-constraint-on-extension.yaml: --auto-place +1
--replicas-on-different site honors the constraint against the
EXISTING replica, converging to 2 UpToDate replicas across sites.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
…unting - pool-pinned cell: assert on DISKFUL count (auto-quorum adds a DISKLESS TIE_BREAKER witness so total resources = 3 for an auto-place=2 RD). - contradictory + constraint-on-extension replays + cell: reorder so the self-contained --storage-pool=<pool> flag is the LAST token before the RD positional. python-linstor's --replicas-on-same/--replicas-on-different take nargs='*' and otherwise greedily consume the trailing RD name (the same class as the documented --storage-pool space-form quirk). Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
…fier limit The runner derives the resource name as replay-<workflow-name>-<rand>; the old name pushed it to 56 chars, past the LINSTOR 48-char RD-name limit (rd create -> exit 10 FAIL_INVLD_RSC_DFN_NAME). Rename the workflow autoplace-contradictory-constraints-rejected -> autoplace-contradictory-rejected so the generated name fits. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
|
Need an answer fast? Review this PR in Change Stack to ask focused questions about the PR or a changed range. Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (6)
📝 WalkthroughWalkthroughThis PR adds comprehensive test coverage for autoplace constraint validation across the test pyramid: documentation justification, unit tests, REST API tests, E2E CLI tests, and operator harness replay scenarios. Tests validate pool-pinned placement, empty-pool preference, zone constraints, and contradictory constraint failure semantics. ChangesAutoplace constraint validation tests
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces comprehensive unit, REST, and E2E/replay tests to pin and verify placement-family behaviors and address several upstream issues (such as U83, U88, U89, and U139). These tests ensure that pool-pinned autoplacements land on correct nodes, contradictory constraints fail with a 409 conflict instead of a silent success-on-zero, and constraints are honored during resource extensions. Additionally, documentation in docs/cli-parity-known-deltas.md was updated to reflect findings on default weights. A review comment identified a potential panic in TestU83PoolPinnedAutoplaceLandsOnlyOnPoolNodes where t.Errorf is used instead of t.Fatalf when checking the length of a slice before indexing it.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if len(got) != 1 || got[0].NodeName != "n-has" { | ||
| t.Errorf("pool-pinned autoplace must land only on the pool's node: got %+v, want n-has/fast", got) | ||
| } |
There was a problem hiding this comment.
Using t.Errorf here allows the test execution to continue even if len(got) is 0. This will cause a panic on line 421 when attempting to access got[0].Props because got is empty. Changing this to t.Fatalf will safely halt the test execution if the assertions fail.
| if len(got) != 1 || got[0].NodeName != "n-has" { | |
| t.Errorf("pool-pinned autoplace must land only on the pool's node: got %+v, want n-has/fast", got) | |
| } | |
| if len(got) != 1 || got[0].NodeName != "n-has" { | |
| t.Fatalf("pool-pinned autoplace must land only on the pool's node: got %+v, want n-has/fast", got) | |
| } |
Summary
Mines and pins the LINSTOR placement family of corner-cases from real upstream LINBIT/linstor-server user reports (campaign-2, agent U6), clean-room. Re-examines one previously-recorded deliberate delta and adds regression coverage at L1 + L6 + L7 for the placement/autoplace surface.
All seven scoped items resolve to MATCHES (no production-code change needed) — the placer already behaves correctly; this PR is the missing test coverage plus a documented re-examination verdict.
Verdict table
composite()scores MaxFreeSpace = Free/Total — the only discriminator between identical pools, so the emptier pool wins the first placement; MinRscCount then spreads. No hot-spotting. known-deltas row 72 updated.newState→topologySeen);--auto-place +1 --replicas-on-differentlands on a different site. Validated on stand (L7 PASS).FAIL_NOT_ENOUGH_NODES(996), never success-on-zero. Validated on stand (L6 + L7 PASS, 996 envelope captured).rg modify --place-countUP convergesrgNeedsRebalancestamps rebalance-pending on increase;RGRebalanceReconcilerrefills. ExistingTestRGRebalanceReconcilerSpawnsAdditionalReplicas+TestRGModifyStampsRebalanceAnnotationOnPlaceCountIncrease.place-countDOWN below currentrgNeedsRebalancereturns false, placer is strictly additive (never trims, never throws). ExistingTestRGRebalanceReconcilerIsAdditiveOnly+TestRGModifyNoAnnotationOnPlaceCountDecrease.disabledNodes) converge without flapping; covered by existing rebalance/eviction tests. A 3-worker replay variant was scoped but the existingauto-diskful-evicted-nodecoverage + additive-only invariant already pin the convergence; no new gap found.-s <pool>is a hard filter (matchesPoolFilter) — replicas land only on pool-hosting nodes even when a wrong-pool node has more free space. Validated on stand (L6 PASS). Diskless-only pool pin yields zero diskful candidates (oracle note below).Tests added
L1
pkg/placer/issues_u6_placement_family_test.go— U89 ceteris-paribus emptier-wins + spread-not-hotspot, U88 constraint-on-extension, U139 contradictory + unsatisfiable under-place, U83 pool-pin.pkg/rest/autoplace_issues_u6_test.go— U139 REST 409FAIL_NOT_ENOUGH_NODESenvelope (contradictory + unsatisfiable node-list).L6
tests/e2e/cli-matrix/r-c-autoplace-pool-pinned.sh— U83 pool-pin honored + U139 contradictory-constraint fail-short.L7
tests/operator-harness/replay/autoplace-contradictory-rejected.yamltests/operator-harness/replay/autoplace-constraint-on-extension.yamlDocs — known-deltas row 72 (D9) updated with the U89 re-examination verdict.
Stand validation (dev stand, BS v0.1.10 baseline apiserver)
r-full-lifecycle(placer-surface gate) reaches Phases 1–5 green onlvm-thin(autoplace, physical-delete + re-create, relocate, diskless create — the entire placer-selection surface). It does NOT reach full-green because of a baseline stand DRBD-resync condition independent of this diff:standpool (Inconsistent), matching known-delta fix(rest): node evict must not demote healthy diskful to tiebreaker (Bug 385) #76's documented FILE_THIN day0 full-resync wedge on this loop-backed stand.lvm-thinit instead wedges at Phase 6 (r tdtoggle diskless→diskful resyncInconsistent).Both are toggle-disk / DRBD-resync convergence on the stand kernel, not placement, and reproduce against baseline production code (this PR changes zero production code — it is test-only). A fresh
lvm-thin2R autoplace reachedUpToDate,UpToDatein ~8s during diagnosis, confirming the placer/selection path is healthy.Notes
pkg//internal/production code touched.--replicas-on-same/--replicas-on-differenttake argparsenargs='*'and greedily eat the trailing positional RD name; the cell/replays keep a self-contained--storage-pool=<pool>as the last flag before the RD name (same class as the documented--storage-poolspace-form quirk).Oracle / open questions
--auto-place 1 -s DfltDisklessStorPool): BS'smatchesPoolFilterdrops the DISKLESS kind, so a pure diskless-pool pin yields zero diskful candidates. Upstream semantics for a diskless-pool-pinned autoplace (does it create a diskless replica, or fail?) were not oracle-diffed in this pass — flagged for a follow-up oracle comparison.Summary by CodeRabbit
Documentation
Tests