fix(controller,satellite): operational-lifecycle parity round 1 — Bugs 384–387 + L7 harness hardening (388) by kvaps · Pull Request #83 · cozystack/blockstor

Andrei Kvapil (kvaps) · 2026-06-02T23:25:02Z

Summary

Round 1 of the operational-lifecycle parity work: four user-found lifecycle bugs that the earlier REST/wire-validation bug-hunt did not cover, plus a hardening pass on the L7 replay harness that surfaced while validating them on the live stand.

Validated as a unit on the dev stand (mTLS apiserver, linstor-client 1.27.1): every fix behaves correctly under direct operator-CLI exercise, and all target + adjacent replays pass green through the corrected harness. No regressions.

Bugs fixed

384 (P0, data integrity) — adding a volume to an existing multi-replica resource (vd c) left the new volume Inconsistent on every replica forever. The late-add seed path passed isWinner=false unconditionally because first-activation election was gated on !rdInitialized, so no replica seeded the new volume UpToDate. Fix recomputes the lowest-node-id winner per fresh volume in the satellite reconciler.
385 (P1) — node evict with a free tiebreaker demoted a healthy diskful replica to TieBreaker and left the evicted node's replica counted. The RD tiebreaker decision now excludes replicas on EVICTED/LOST nodes and reaps stranded witnesses.
386 (P1) — node restore did not recreate the auto-tiebreaker (the RD reconciler did not watch Node), leaving two diskful UpToDate with no witness. Adds a Node watch so the tiebreaker invariant re-runs after restore.
387 (P1) — r d of a diskful replica on a 2-diskful + 1-INACTIVE resource spawned a useless TieBreaker because the tiebreaker count treated the INACTIVE (deactivated) replica as a voting diskful. INACTIVE replicas are now excluded from the voting set.
388 (test harness) — the L7 replay assertions all_uptodate/wait_settle filtered on spec.resourceName (the CRD field is spec.resourceDefinitionName) and the tiebreaker assertions used a case-insensitive grep that matched the rd name — both silent no-ops that gave false-green. Fixed, plus no_orphans settle window, linstor-client 1.27.1 compat (-m instead of --output-fmt=json, --storage-pool= equals form, a node evict REST shim), and tolerance for Diskless/TieBreaker rows.

Test tiers

Each code bug ships its L1/L2 regression, an L6 cli-matrix cell, and an L7 replay YAML. The L7 replays were run green on the live stand through the corrected (non-vacuous) harness; the harness fix is proven non-vacuous (an Inconsistent mid-sync replica now fails all_uptodate where it previously passed).

Supersedes the per-bug branches fix/bug-384-vd-late-seed-gi, fix/bug-385-evict-tb-shuffle, fix/bug-386-restore-recreate-tb, fix/bug-387-rd-inactive-no-tb, fix/bug-388-l7-assertion-noop (their commits are included here).

Summary by CodeRabbit

Bug Fixes
- Fixed tiebreaker witness handling when cluster nodes are evicted
- Fixed tiebreaker recreation after evicted nodes are restored
- Fixed inactive replicas being incorrectly counted in quorum calculations
- Fixed late-added volume replica election logic
Tests
- Added comprehensive test coverage for tiebreaker scenarios
- Added end-to-end regression tests

… (Bug 385) ensureTiebreaker counted every replica of an RD, including a TIE_BREAKER witness sitting on a just-EVICTED node. The witness invariant then read as satisfied (diskful=2 + witness=1) and the reconciler never relocated the witness off the drained node, so `linstor n e <tiebreaker-node>` did not actually take effect; downstream the stranded witness blocked a fresh one from landing on a healthy spare, which surfaced as a healthy diskful drifting into the tiebreaker role. Treat replicas on EVICTED / LOST nodes as draining placements (mirroring the placer's disabledNodes semantic): exclude them from the witness / quorum decision and reap a TIE_BREAKER stranded on a disabled node so a fresh witness can relocate to a healthy spare, or quorum falls back to off when none remains. A healthy diskful is never demoted. Diskful replicas on disabled nodes are left to the NodeReconciler's migration path. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…Bug 385) L6 cli-matrix/n-evict-tiebreaker-no-shuffle.sh drives the real `linstor n e` CLI on the 2r+tb shape and asserts the witness leaves the evicted node while neither healthy diskful is demoted to TIE_BREAKER. L7 replay/n-evict-tiebreaker-no-shuffle.yaml codifies the operator sequence (rd c → vd c → 2x r c → tiebreaker on node3 → n e node3) and the convergence assertions: witness reaped off the evicted node (resource_absent), both diskful replicas stay UpToDate. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…g count (Bug 387) An INACTIVE replica is `drbdadm down` (operator deactivation) — its DRBD device is not up, so it casts no vote in the quorum:majority decision the auto-tiebreaker invariant defends. The RD reconciler's ensureTiebreaker counted it as a full diskful, so deleting one active diskful on a 2-active + 1-INACTIVE resource looked like "2 diskful, even parity, no user-diskless" and spuriously grew a TIE_BREAKER witness (1 active diskful + 1 witness = a 2-voter quorum with no majority protection), diverging from upstream LINSTOR. Drop INACTIVE replicas from the voting set before the diskful/diskless split so they influence neither the diskful count nor the diskless/witness count. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…wns no tiebreaker L1 regression. TestBug387InactiveReplicaNotCountedAsVotingDiskful reproduces the operator repro (1 active diskful + 1 INACTIVE after `r d`): EnsureTiebreaker must create no witness and write quorum=off. TestBug387TwoActiveDiskfulStillGetWitness is the positive control — 2 genuine active diskful + 1 ignored INACTIVE still grows a witness with quorum=majority, so the INACTIVE-exclusion does not regress the canonical auto-witness invariant. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

L6 r-d-inactive-no-tiebreaker.sh: 3 diskful → deactivate one (INACTIVE) → r d one active diskful, asserts no TIE_BREAKER is spawned and the deleted node does not re-appear as a witness. L7 r-d-inactive-no-tiebreaker.yaml: codifies the same operator sequence with no_tiebreaker + resource_absent convergence assertions. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

The RD reconciler watched only ResourceDefinitions and Resources, never Nodes. An EVICTED/LOST node is excluded as a tiebreaker witness candidate, so a 2-diskful resource whose witness collapsed while a node was drained could not regain it. After `linstor n rst` cleared the EVICTED flag, nothing enqueued the RD, so ensureTiebreaker never re-ran and the resource sat at two diskful UpToDate replicas with no quorum witness until the next periodic re-sync (split-brain risk in between). Add a Node watch (nodeDrainFlagChanged predicate + enqueueRDsForNode mapper) that re-enqueues every RD when a node's EVICTED/LOST flag set transitions, so node restore re-runs the tiebreaker invariant and the witness is re-placed. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

L1 regression coverage for the Bug 386 fix: - TestBug386NodeRestoreRecreatesTiebreaker drives the reconcile-level outcome: a 2-diskful RD with an EVICTED third node has no witness; clearing EVICTED (`n rst`) and re-running ensureTiebreaker re-places the TIE_BREAKER on the restored node. - TestBug386NodeHasDrainFlag pins the EVICTED/LOST flag-set probe that gates the Node watch predicate (Spec + Status flags). - TestBug386EnqueueRDsForNode pins the cluster-wide RD fan-out. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

… 386) Stand-side companion to the L1 unit pin: drives the real python-linstor CLI on the 2r-tb shape, evacuates the witness node to collapse the TieBreaker, restores it with `n rst`, and asserts the TieBreaker reappears in `linstor r l` within 60s. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Add the n-rst-recreates-tiebreaker replay workflow: autoplace a 2r-tb shape, `n evacuate` the witness node to collapse the TieBreaker, then `n rst` and assert the witness is re-created. Add a `tiebreaker_present` await kind (inverse of no_tiebreaker) so the workflow can assert the witness EXISTS after restore. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…ToDate (Bug 384) Adding a volume to an already-initialized multi-replica resource left the new volume stuck Inconsistent on every replica with no SyncSource. The late-add seed path (seedFreshVolumes) runs with firstActivation= false on the parent RD, so the dispatcher's first-activation winner election (auto-primary, gated on !rdInitialized) never fires once the RD is Initialized. The code therefore passed isWinner=false unconditionally and every diskful replica took the case-A skip-init-sync seed (clean bitmap, no UpToDate flag). With no replica declaring itself the UpToDate source and no primary --force on this path, the freshly-carved volume came up Inconsistent on ALL replicas and never converged. Recompute the same lowest-node-id election locally per fresh volume (isLateAddWinner): exactly one diskful replica seeds the new volume Consistent+UpToDate (case B) and becomes the SyncSource, the rest take case A. resolveVolumeSeed's PeerHasData / AnyConnectedPeerHasDataForVolume vetoes still fire first, so a relocate/migrate target never wins itself UpToDate-empty; the winner seed shares the day0 lineage anchor, so a staggered dual election agrees rather than split-brains. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

L1 (reconciler_drbd_test.go): - TestApplyLateAddedVolumeWinnerSeedsUpToDate: the lowest-node-id replica seeds the late-added vol-1 Consistent+UpToDate (positional idx4=1/idx5=1 at current=day0); vol-0 is never re-seeded. Fails pre-fix with the case-A <day0>:0:0:0 tuple. - TestApplyLateAddedVolumeNonWinnerTakesSkipInitSync: a non-lowest-id replica seeds case-A skip-init-sync and MUST NOT carry the UpToDate flag (split-brain guard). L6 (cli-matrix/bug-384-late-vd-thin-converges.sh): 2-diskful thin RD, vol-0 UpToDate, then a SECOND vd c adds vol-1; assert vol-1 reaches UpToDate on both replicas plus a kernel-truth drbdadm status guard against the stuck-Inconsistent surface. L7 (replay/vd-late-create-second-volume.yaml): operator sequence rd c -> vd c vol-0 -> r c x2 -> wait all_uptodate -> vd c vol-1, with all_uptodate (every volume of every replica) as the convergence assertion. Distinct from vd-late-create.yaml, which exercises the always-green first-volume late-add path. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

… into integration/operational-lifecycle-384-387

…to integration/operational-lifecycle-384-387

…nto integration/operational-lifecycle-384-387 Signed-off-by: Andrei Kvapil <kvapss@gmail.com> # Conflicts: # internal/controller/resourcedefinition_controller.go

…ow (Bug 388) The L7 replay assertions all_uptodate and wait_settle filtered replicas on spec.resourceName, but the Resource CRD field is spec.resourceDefinitionName. The filter matched nothing, so all_uptodate counted zero non-UpToDate replicas and returned green vacuously — a silent no-op across all 16 replay YAMLs that use it. Convergence was never actually asserted. Fix the field name and make all_uptodate tolerate Diskless/TieBreaker replicas (never UpToDate by design) while still flagging a diskful replica stuck Inconsistent/Outdated/SyncTarget. Treat an rd with zero observed replicas as not-yet-converged so the waiter keeps polling instead of passing on an absent resource. Also give the no_orphans invariant a settle window: teardown removes Resource CRDs asynchronously, so a single post-teardown snapshot races GC and reports phantom orphans. Poll up to NO_ORPHANS_SETTLE_S (default 30s). Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…teps linstor-client 1.27.1 declares --storage-pool with nargs='*', so the form "--storage-pool stand <rd>" makes argparse greedily consume the rd positional into the storage-pool list and the resource create fails with "too few arguments". Use the equals form --storage-pool=<sp> so the rd positional is preserved. Affects every replay that drives resource create --auto-place ... --storage-pool ... Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Three client-compat defects surfaced running the L7 replays against the stand client (1.27.1): - replica_count / vd_size_kib used --output-fmt=json, which 1.27.1 does not accept (only -m / --machine-readable). The CLI errored, the parser saw empty input and returned 0, so every replica_count await timed out. Switch to -m; the nested-list parser already handles its shape. - no_tiebreaker / tiebreaker_present grepped case-INSENSITIVELY for TieBreaker. The replay RD names contain the lowercase substring "tiebreaker" (replay-*-tiebreaker-*), so grep -ci matched the resource name on every data row: no_tiebreaker could never reach 0 and timed out on a correct cluster, tiebreaker_present passed vacuously. Match the State token TieBreaker case-sensitively. - node evict is a controller action that 1.27.1 does not expose as a subcommand (only evacuate/lost/restore). Route node evict through the REST endpoint PUT /v1/nodes/<node>/evict so the replay step stays operator-faithful. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

…skless The Bug 385 replay hand-created the node3 witness with r c node3 --diskless. Bug 385's stranded-witness reaping (removeStrandedWitnesses) only deletes replicas carrying the TieBreaker flag; a user-placed --diskless replica is operator-owned and is never auto-reaped on evict, so the resource_absent await timed out against a correct controller. Let the RD reconciler auto-place the TIE_BREAKER on the only remaining node (node3) and assert tiebreaker_present before the evict — the same auto-witness shape the cli-matrix catcher uses. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Bug 384 is a deterministic per-volume winner-election defect in the satellite (seedFreshVolumes / isLateAddWinner), not a sync-timing race, so the trigger is size-independent. A 1G fresh-replica sync on an I/O-constrained stand (~0.5 MB/s) overshoots the 240s all_uptodate await; 64M converges in seconds and exercises the identical late-add code path. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

coderabbitai · 2026-06-02T23:25:09Z

📝 Walkthrough

Walkthrough

This PR fixes four interconnected bugs affecting DRBD tiebreaker auto-placement and satellite volume election. The satellite now correctly elects a per-volume winner for late-added volumes. The controller refines tiebreaker logic to exclude evicted/lost/inactive nodes, removes stranded witnesses, and re-runs when node drain flags toggle. Unit, e2e, and replay tests validate all scenarios.

Changes

Bug Fixes: Satellite and Controller

Layer / File(s)	Summary
Late-added volume winner election (Bug 384) `pkg/satellite/reconciler.go`	`seedFreshVolumes` now computes `isWinner` by comparing local node-id to peer node-ids, ensuring only the lowest-id replica seeds UpToDate and others use skip-init-sync on late-added volumes.
Controller tiebreaker filtering and disabled-node handling (Bugs 385–387) `internal/controller/resourcedefinition_controller.go`, `internal/controller/export_test.go`	`ensureTiebreaker` derives a disabled-node set for EVICTED/LOST nodes, removes stranded TIE_BREAKER witnesses, filters INACTIVE replicas before quorum decisions, and adds a Node watch to re-run when drain flags toggle; exports test helpers `EnqueueRDsForNode` and `NodeHasDrainFlag`.

Test Coverage and Harness

Layer / File(s)	Summary
Unit tests for late-add winner election (Bug 384) `pkg/satellite/reconciler_drbd_test.go`	Two tests verify that the winner replica seeds UpToDate while non-winners use skip-init-sync, and that per-volume metadata creation and convergence work correctly.
Unit tests for evicted node witness handling (Bug 385) `internal/controller/ensure_tiebreaker_evict_bug_385_test.go`	Three tests verify that stranded TIE_BREAKER witnesses are removed from evicted nodes, optionally relocated to spare nodes, and that evicted diskful replicas are not counted in witness quorum.
Unit tests for node drain flag watch and witness recreation (Bug 386) `internal/controller/ensure_tiebreaker_test.go`	Three tests verify that witnesses are recreated when evicted nodes are restored, that drain-flag predicates correctly identify EVICTED/LOST transitions, and that node changes fan out reconcile requests to all ResourceDefinitions.
Unit tests for INACTIVE replica exclusion (Bug 387) `internal/controller/ensure_tiebreaker_inactive_bug_387_test.go`	Two tests verify that INACTIVE replicas are not counted toward the diskful voting set, so a single active diskful replica with INACTIVE peers does not trigger auto-tiebreaker, while two active diskfuls still do.
Test harness improvements for node evict and assertions `tests/operator-harness/lib.sh`, `tests/operator-harness/replay-runner.sh`	Harness adds REST-based node evict support, fixes TieBreaker detection to use case-sensitive grep, treats Diskless replicas as acceptable in uptodate checks, filters snapshots by resourceDefinitionName, and extends orphan-cleanup polling; documents new `tiebreaker_present` assertion kind.
E2E test scenarios for late-add volume convergence (Bug 384) `tests/e2e/cli-matrix/bug-384-late-vd-thin-converges.sh`, `tests/operator-harness/replay/vd-late-create-second-volume.yaml`	CLI-matrix and replay tests verify that a second volume added to an existing resource converges to UpToDate on all replicas without stuck Inconsistent state or split-brain.
E2E test scenarios for evicted tiebreaker handling (Bug 385) `tests/e2e/cli-matrix/n-evict-tiebreaker-no-shuffle.sh`, `tests/operator-harness/replay/n-evict-tiebreaker-no-shuffle.yaml`	CLI-matrix and replay tests verify that evicting a tiebreaker node reaps or relocates the witness, does not demote healthy diskfuls, and handles node cleanup correctly.
E2E test scenarios for witness recreation on node restore (Bug 386) `tests/e2e/cli-matrix/n-rst-recreates-tiebreaker.sh`, `tests/operator-harness/replay/n-rst-recreates-tiebreaker.yaml`	CLI-matrix and replay tests verify that evacuating a tiebreaker node collapses the witness, and restoring the node via `n rst` triggers re-creation of the witness.
E2E test scenarios for INACTIVE replica behavior (Bug 387) `tests/e2e/cli-matrix/r-d-inactive-no-tiebreaker.sh`, `tests/operator-harness/replay/r-d-inactive-no-tiebreaker.yaml`	CLI-matrix and replay tests verify that deactivating a replica and deleting another does not auto-spawn a tiebreaker when only one active diskful remains.
YAML test format updates (--storage-pool argument) `tests/operator-harness/replay/auto-place-.yaml`, `tests/operator-harness/replay/-lifecycle.yaml`	Multiple replay YAML files updated to pass `--storage-pool` as a single combined argument instead of separate tokens for consistency with CLI invocation semantics.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

cozystack/blockstor#36: Modifies ensureTiebreaker witness-handling logic in the same controller file; direct code-level connection on tiebreaker removal semantics.
cozystack/blockstor#31: Validates orphan TIE_BREAKER cleanup when diskful replicas are removed; related to this PR's witness cleanup invariants.

Poem

🐰 Four bugs now mended in the witness realm,
Late volumes elected, no split-brain's helm,
Drained nodes shed their orphaned ties,
Inactive whispers pass unwise,
The tiebreaker dance—at last—stands true! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: fixing four operational-lifecycle bugs (384-387) and hardening the L7 harness (388) in controller and satellite components.
Docstring Coverage	✅ Passed	Docstring coverage is 81.48% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch integration/operational-lifecycle-384-387

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request addresses several critical bugs (Bugs 384, 385, 386, and 387) in the blockstor controller and satellite reconciler, improving volume initialization, node eviction, node restoration, and inactive replica handling. The review feedback suggests adding defensive nil checks in isLateAddWinner and nodeHasDrainFlag to prevent potential panics, logging a swallowed error in enqueueRDsForNode, and using grep -F in the test harness scripts to avoid unexpected regex matching.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-02T23:26:33Z

+func isLateAddWinner(dr *intent.DesiredResource) bool {
+	localID, ok := localNodeIDFromOpts(dr)
+	if !ok {
+		return false
+	}


If dr is nil, calling localNodeIDFromOpts(dr) or accessing its properties will cause a nil pointer dereference panic. Adding a defensive nil check at the beginning of the function ensures robustness.

func isLateAddWinner(dr *intent.DesiredResource) bool { if dr == nil { return false } localID, ok := localNodeIDFromOpts(dr) if !ok { return false }

gemini-code-assist · 2026-06-02T23:26:33Z

+func nodeHasDrainFlag(n *blockstoriov1alpha1.Node) bool {
+	for _, f := range n.Spec.Flags {
+		if f == apiv1.NodeFlagEvicted || f == apiv1.NodeFlagLost {


If n is nil, accessing n.Spec.Flags will cause a nil pointer dereference panic. Adding a defensive nil check at the beginning of the function ensures robustness.

func nodeHasDrainFlag(n *blockstoriov1alpha1.Node) bool { if n == nil { return false } for _, f := range n.Spec.Flags { if f == apiv1.NodeFlagEvicted || f == apiv1.NodeFlagLost {

gemini-code-assist · 2026-06-02T23:26:33Z

+	err := r.List(ctx, &rdList)
+	if err != nil {
+		return nil
+	}


The error returned by r.List is silently swallowed. In Kubernetes controllers, swallowing errors in event handlers or mappers makes debugging extremely difficult because there are no logs indicating why the event failed to trigger reconciliations. It is highly recommended to log the error using a logger (e.g., log.FromContext(ctx)) before returning nil.

gemini-code-assist · 2026-06-02T23:26:33Z

+        leftover=$(kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null \
+            | grep -c "$prefix" || true)


Using grep without the -F flag treats the $prefix as a regular expression. If the prefix contains regex metacharacters (such as .), it could lead to unexpected matches and false positives/negatives. Since $prefix is a literal string, using grep -F (or grep -Fc) is safer and more robust.

Suggested change

leftover=$(kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null \

| grep -c "$prefix" || true)

leftover=$(kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null \

| grep -Fc "$prefix" || true)

gemini-code-assist · 2026-06-02T23:26:33Z

+        sleep 2
+    done
+    echo "  INVARIANT FAIL: $leftover Resource CRD(s) for $prefix still present after ${settle_s}s" >&2
+    kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null | grep "$prefix" >&2 || true


Using grep without the -F flag treats the $prefix as a regular expression. Since $prefix is a literal string, using grep -F is safer and more robust to prevent unexpected regex matching.

Suggested change

kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null | grep "$prefix" >&2 || true

kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null | grep -F "$prefix" >&2 || true

coderabbitai

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/operator-harness/replay/r-full-lifecycle.yaml (1)
128-128: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Inconsistent --storage-pool argument form.

Line 128 still uses the split form (--storage-pool, {{sp}}), while the rest of this PR normalizes to the equals form (--storage-pool={{sp}}). For consistency with linstor-client 1.27.1 compatibility and the harness command runner, consider updating this line as well.
🔧 Proposed fix
-    cmd: ["resource", "toggle-disk", "--storage-pool", "{{sp}}", "{{node1}}", "{{rd}}"]
+    cmd: ["resource", "toggle-disk", "--storage-pool={{sp}}", "{{node1}}", "{{rd}}"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/operator-harness/replay/r-full-lifecycle.yaml` at line 128, The cmd
entry for the "resource toggle-disk" step uses the split argument form
"--storage-pool", "{{sp}}" which is inconsistent with the rest of the PR; update
the cmd array element so the storage-pool argument uses the equals form (i.e.,
"--storage-pool={{sp}}") in the same "cmd" that invokes "resource toggle-disk"
to match the harness command runner and linstor-client 1.27.1 expectations.

🧹 Nitpick comments (1)

internal/controller/ensure_tiebreaker_test.go (1)
1510-1604: ⚡ Quick win

Add a predicate-level regression test for the Bug 386 watch.

These tests cover the helper and the fan-out mapper, but they still bypass nodeDrainFlagChanged(), which is the load-bearing gate for the restore path. A refactor that breaks the false→true / true→false update check would still pass here.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/ensure_tiebreaker_test.go` around lines 1510 - 1604,
Tests currently cover NodeHasDrainFlag and EnqueueRDsForNode but omit exercising
the actual predicate nodeDrainFlagChanged (the UpdateEvent gate); add unit tests
that construct old/new *blockstoriov1alpha1.Node objects and call
nodeDrainFlagChanged().Update with events that: 1) change from no drain flags to
spec/ status EVICTED/LOST (expect true), 2) change from drained to no flags
(expect true), and 3) make unrelated metadata changes or unrelated flag changes
(expect false); place these cases in
internal/controller/ensure_tiebreaker_test.go alongside the existing tests and
reference nodeDrainFlagChanged and the controllerpkg.NodeHasDrainFlag helper to
build expected outcomes.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/e2e/cli-matrix/bug-384-late-vd-thin-converges.sh`:
- Around line 59-64: Replace the bare trap usage with register_strict_cleanup:
remove or stop using "trap cleanup EXIT" and instead call
register_strict_cleanup cleanup (imported from tests/e2e/lib.sh) so the existing
cleanup() function (which calls delete_rd, assert_no_orphans and
linstor_cli_teardown) is executed via register_strict_cleanup to ensure strict
teardown and cluster verification.

In `@tests/e2e/cli-matrix/n-evict-tiebreaker-no-shuffle.sh`:
- Around line 70-80: Replace the bare trap registration with the repository’s
strict cleanup helper: remove or stop using the direct "trap cleanup EXIT"
registration and instead call register_strict_cleanup to register the existing
cleanup() function (which restores "$EVICTED_NODE", calls delete_rd "$RD",
assert_no_orphans "$RD", and linstor_cli_teardown). Locate the cleanup()
definition and the trailing "trap cleanup EXIT" and replace that trap invocation
with a call to register_strict_cleanup so the test uses the shared strict
cleanup and cluster-state verification from tests/e2e/lib.sh.
- Around line 171-191: The test currently only checks .spec.flags for DISKFUL_A
and DISKFUL_B and can miss that a surviving diskful degraded its diskState to
Inconsistent/Unknown; update the loop that iterates over DISKFUL_A and DISKFUL_B
(the block referencing variables DISKFUL_A, DISKFUL_B and RD) to also query and
assert the replica's runtime diskState from the resource status (e.g., the
status field returned by kubectl for the resource) and fail if diskState is not
the expected healthy state (e.g., "UpToDate" or "Consistent"); keep the existing
flag checks but add the diskState check immediately after fetching flags so the
test fails when a surviving diskful has an unacceptable diskState.

In `@tests/e2e/cli-matrix/n-rst-recreates-tiebreaker.sh`:
- Around line 177-188: The test currently only checks n_tb == 1 (using
linstor_r_l_json and variable wire) but must also verify that exactly two
non-TIE_BREAKER replicas are present and healthy after n rst; update the
assertion to parse wire (output of linstor_r_l_json "$RD") to count entries
where (.rsc_flags // []) does NOT contain "TIE_BREAKER" and whose status/state
indicates a healthy diskful replica (e.g., .state == "UpToDate" or the CLI field
representing healthy), assert that count equals 2, and fail if not; keep the
original TIE_BREAKER check but add this second check using the same wire/RD
context so the test ensures one tie-breaker plus two healthy non-TIE_BREAKER
replicas.
- Around line 64-72: The cleanup() trap currently always restores node "$N3"
which is incorrect when the evacuated witness may be a different node; modify
cleanup() to record the actual evacuated node (e.g., capture it to a variable
like EVACUATED_NODE when you run the evacuate or resource placement) and then
restore "${EVACUATED_NODE}" instead of "$N3", keeping the existing calls to
delete_rd "$RD", assert_no_orphans "$RD", and linstor_cli_teardown; replace the
bare trap cleanup EXIT with a call to register_strict_cleanup cleanup so the
test harness performs strict teardown and cluster verification (reference
functions/variables: cleanup, LCTL, N3, RD, delete_rd, assert_no_orphans,
linstor_cli_teardown, register_strict_cleanup).

In `@tests/e2e/cli-matrix/r-d-inactive-no-tiebreaker.sh`:
- Around line 62-70: Replace the bare trap call with the shared strict cleanup
registration: move the cleanup() function as-is but remove the trailing "trap
cleanup EXIT" and call register_strict_cleanup cleanup instead (ensure
register_strict_cleanup from tests/e2e/lib.sh is imported/available); this
ensures cleanup()—which calls "${LCTL[@]} resource activate", delete_rd,
assert_no_orphans and linstor_cli_teardown—runs via the standardized strict
cleanup path.
- Around line 161-184: Add an assertion that the surviving active replica ${N3}
is actually UpToDate: after computing n3_flags and before exiting success, fetch
the replica status (e.g. with kubectl get
"resources.blockstor.cozystack.io/${RD}.${N3}" -o jsonpath for the status field
you use for DRBD status) into a variable like n3_status and fail the test if
n3_status does not include or equal "UpToDate". Reference the existing symbols
RD, N3 and n3_flags when adding this check so it verifies status convergence in
addition to flag checks.

In `@tests/operator-harness/lib.sh`:
- Around line 268-280: The new tiebreaker_present check violates the replay
await.kind contract by adding an unsupported kind; remove/replace the
tiebreaker_present handler and instead implement its logic under an allowed
await.kind (e.g., reuse or extend the existing no_tiebreaker flow) so replay
YAMLs remain schema-compliant, or map any incoming await.kind value to existing
checks before use; do not change the schema/validation here — if a new kind is
truly required, add the schema/validation update in a separate PR.

In `@tests/operator-harness/replay/n-rst-recreates-tiebreaker.yaml`:
- Around line 33-68: The replay assumes node3 is the witness but auto-place-2
lets the controller pick diskfuls; update the scenario so the witness is
guaranteed before evacuating: either replace the "auto-place-2" step with an
explicit resource create that pins diskful replicas to {{node1}} and {{node2}}
(matching the Bug 385 replay), or insert a pre-evacuate verification step (e.g.,
after "wait-tiebreaker-spawned") that runs "resource list" / a check to assert
the tiebreaker/witness is on {{node3}} and fail if not; then keep the
"evacuate-node3", "restore-node3", and "tiebreaker-recreated" steps unchanged.

In `@tests/operator-harness/replay/r-d-inactive-no-tiebreaker.yaml`:
- Around line 60-75: The replay is missing a post-failure status-convergence
assertion for the surviving active replica: after the existing steps
no-tiebreaker-spawned and node2-stays-absent (which use the resource_absent
await and rd/node variables), add a new step that runs the same resource check
and awaits a status-convergence/UpToDate condition for rd on node "{{node3}}"
(use the same rd and node3 symbols), e.g. an await kind like resource_status or
status_converged with rd: "{{rd}}", node: "{{node3}}", and timeout_s: 30; also
ensure the e2e CLI matrix tests under tests/e2e/cli-matrix/** mirror this
assertion to verify real LINSTOR/DRBD status convergence.
- Around line 33-40: The replay is missing a post-delete convergence assertion:
after the existing resource_absent step that removes {{node2}} under the
no_tiebreaker scenario, add an await.kind: all_uptodate check for the resource
{{rd}} to ensure surviving diskful replica(s) (e.g., {{node3}}) return to
UpToDate; locate the delete step named resource_absent and insert an await block
with kind: all_uptodate targeting {{rd}} immediately after it so the test
verifies convergence.

In `@tests/operator-harness/replay/vd-late-create-second-volume.yaml`:
- Around line 48-53: The test commands r-c-node1 and r-c-node2 use the
two-element flag form for the storage pool argument; update each cmd array to
use the combined equals form so the harness command runner accepts it (replace
["--storage-pool", "{{sp}}"] with ["--storage-pool={{sp}}"] in the command
entries for r-c-node1 and r-c-node2).

---

Outside diff comments:
In `@tests/operator-harness/replay/r-full-lifecycle.yaml`:
- Line 128: The cmd entry for the "resource toggle-disk" step uses the split
argument form "--storage-pool", "{{sp}}" which is inconsistent with the rest of
the PR; update the cmd array element so the storage-pool argument uses the
equals form (i.e., "--storage-pool={{sp}}") in the same "cmd" that invokes
"resource toggle-disk" to match the harness command runner and linstor-client
1.27.1 expectations.

---

Nitpick comments:
In `@internal/controller/ensure_tiebreaker_test.go`:
- Around line 1510-1604: Tests currently cover NodeHasDrainFlag and
EnqueueRDsForNode but omit exercising the actual predicate nodeDrainFlagChanged
(the UpdateEvent gate); add unit tests that construct old/new
*blockstoriov1alpha1.Node objects and call nodeDrainFlagChanged().Update with
events that: 1) change from no drain flags to spec/ status EVICTED/LOST (expect
true), 2) change from drained to no flags (expect true), and 3) make unrelated
metadata changes or unrelated flag changes (expect false); place these cases in
internal/controller/ensure_tiebreaker_test.go alongside the existing tests and
reference nodeDrainFlagChanged and the controllerpkg.NodeHasDrainFlag helper to
build expected outcomes.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3e23c93c-6d63-4dba-aef6-ce070bc48e50

📥 Commits

Reviewing files that changed from the base of the PR and between 900c709 and 787c279.

📒 Files selected for processing (24)

internal/controller/ensure_tiebreaker_evict_bug_385_test.go
internal/controller/ensure_tiebreaker_inactive_bug_387_test.go
internal/controller/ensure_tiebreaker_test.go
internal/controller/export_test.go
internal/controller/resourcedefinition_controller.go
pkg/satellite/reconciler.go
pkg/satellite/reconciler_drbd_test.go
tests/e2e/cli-matrix/bug-384-late-vd-thin-converges.sh
tests/e2e/cli-matrix/n-evict-tiebreaker-no-shuffle.sh
tests/e2e/cli-matrix/n-rst-recreates-tiebreaker.sh
tests/e2e/cli-matrix/r-d-inactive-no-tiebreaker.sh
tests/operator-harness/lib.sh
tests/operator-harness/replay-runner.sh
tests/operator-harness/replay/auto-place-no-drbd.yaml
tests/operator-harness/replay/autoplace-3r.yaml
tests/operator-harness/replay/luks-encrypted-rd.yaml
tests/operator-harness/replay/n-evict-tiebreaker-no-shuffle.yaml
tests/operator-harness/replay/n-rst-recreates-tiebreaker.yaml
tests/operator-harness/replay/pvc-lifecycle.yaml
tests/operator-harness/replay/r-d-inactive-no-tiebreaker.yaml
tests/operator-harness/replay/r-full-lifecycle.yaml
tests/operator-harness/replay/r-l-conns-shapes.yaml
tests/operator-harness/replay/vd-late-create-second-volume.yaml
tests/operator-harness/replay/vd-resize-full-lifecycle.yaml

coderabbitai · 2026-06-02T23:40:59Z

+cleanup() {
+    delete_rd "$RD"
+    assert_no_orphans "$RD"
+    linstor_cli_teardown
+}
+trap cleanup EXIT


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use register_strict_cleanup instead of bare trap directive.

The script registers cleanup using a bare trap cleanup EXIT, but should use register_strict_cleanup from lib.sh to ensure proper resource cleanup and cluster state verification. Based on learnings, e2e test scenarios should register cleanup via register_strict_cleanup from tests/e2e/lib.sh instead of bare trap directives to ensure proper resource cleanup and cluster state verification.

♻️ Proposed fix

-cleanup() { - delete_rd "$RD" - assert_no_orphans "$RD" - linstor_cli_teardown -} -trap cleanup EXIT +register_strict_cleanup delete_rd "$RD" +register_strict_cleanup assert_no_orphans "$RD" +register_strict_cleanup linstor_cli_teardown

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

cleanup() {

delete_rd "$RD"

assert_no_orphans "$RD"

linstor_cli_teardown

}

trap cleanup EXIT

register_strict_cleanup delete_rd "$RD"

register_strict_cleanup assert_no_orphans "$RD"

register_strict_cleanup linstor_cli_teardown

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/cli-matrix/bug-384-late-vd-thin-converges.sh` around lines 59 - 64, Replace the bare trap usage with register_strict_cleanup: remove or stop using "trap cleanup EXIT" and instead call register_strict_cleanup cleanup (imported from tests/e2e/lib.sh) so the existing cleanup() function (which calls delete_rd, assert_no_orphans and linstor_cli_teardown) is executed via register_strict_cleanup to ensure strict teardown and cluster verification.

coderabbitai · 2026-06-02T23:41:00Z

+cleanup() {
+    # Restore any node we evicted so the shared stand is left healthy
+    # for the next cell, BEFORE deleting the RD.
+    if [[ -n "$EVICTED_NODE" ]]; then
+        "${LCTL[@]}" node restore "$EVICTED_NODE" >/dev/null 2>&1 || true
+    fi
+    delete_rd "$RD"
+    assert_no_orphans "$RD"
+    linstor_cli_teardown
+}
+trap cleanup EXIT


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use strict cleanup registration here.

This scenario still installs a bare trap cleanup EXIT. The repo contract for tests/e2e/*.sh requires register_strict_cleanup, which is what enforces the shared cleanup and cluster-state verification path for these stand tests.

Based on learnings, Applies to tests/e2e/*.sh: Register cleanup for e2e test scenarios via register_strict_cleanup from tests/e2e/lib.sh instead of bare trap directives to ensure proper resource cleanup and cluster state verification.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/cli-matrix/n-evict-tiebreaker-no-shuffle.sh` around lines 70 - 80, Replace the bare trap registration with the repository’s strict cleanup helper: remove or stop using the direct "trap cleanup EXIT" registration and instead call register_strict_cleanup to register the existing cleanup() function (which restores "$EVICTED_NODE", calls delete_rd "$RD", assert_no_orphans "$RD", and linstor_cli_teardown). Locate the cleanup() definition and the trailing "trap cleanup EXIT" and replace that trap invocation with a call to register_strict_cleanup so the test uses the shared strict cleanup and cluster-state verification from tests/e2e/lib.sh.

coderabbitai · 2026-06-02T23:41:00Z

+# The load-bearing invariant: NEITHER healthy diskful was demoted to
+# TIE_BREAKER (the exact wrong behaviour from the repro where worker-2
+# went UpToDate → TieBreaker).
+echo ">> assert neither healthy diskful ($DISKFUL_A, $DISKFUL_B) was demoted to TIE_BREAKER"
+for n in "$DISKFUL_A" "$DISKFUL_B"; do
+    flags=$(kubectl get "resources.blockstor.cozystack.io/${RD}.${n}" \
+        -o jsonpath='{.spec.flags}' 2>/dev/null || echo "__absent__")
+    if [[ "$flags" == "__absent__" ]]; then
+        echo "FAIL (Bug 385): healthy diskful on $n disappeared after evict; flags=$flags" >&2
+        exit 1
+    fi
+    if [[ "$flags" == *"TIE_BREAKER"* ]]; then
+        echo "FAIL (Bug 385 regression): healthy diskful on $n was demoted to TIE_BREAKER; flags=$flags" >&2
+        "${LCTL[@]}" resource list --resources "$RD" 2>&1 | tail -20 >&2
+        exit 1
+    fi
+    if [[ "$flags" == *"DISKLESS"* ]]; then
+        echo "FAIL (Bug 385): healthy diskful on $n gained DISKLESS after evict; flags=$flags" >&2
+        exit 1
+    fi
+done


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert diskState on the surviving diskful replicas after the evict.

These checks only look at flags, so the test still passes if either original diskful stays non-TIE_BREAKER but drops to Inconsistent or Unknown. That weakens the regression signal for the “healthy replicas stay unchanged” guarantee.

Suggested change

for n in "$DISKFUL_A" "$DISKFUL_B"; do flags=$(kubectl get "resources.blockstor.cozystack.io/${RD}.${n}" \ -o jsonpath='{.spec.flags}' 2>/dev/null || echo "__absent__") @@ if [[ "$flags" == *"DISKLESS"* ]]; then echo "FAIL (Bug 385): healthy diskful on $n gained DISKLESS after evict; flags=$flags" >&2 exit 1 fi + d=$(status_disk_state "$RD" "$n" 0) + if [[ "$d" != "UpToDate" ]]; then + echo "FAIL (Bug 385): healthy diskful on $n is no longer UpToDate after evict; diskState=$d" >&2 + exit 1 + fi done

Based on learnings, Applies to tests/e2e/cli-matrix/**: L6 operator-CLI e2e tests must test real linstor CLI → REST → satellite → DRBD kernel interactions and assert Status convergence under tests/e2e/cli-matrix/.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/cli-matrix/n-evict-tiebreaker-no-shuffle.sh` around lines 171 - 191, The test currently only checks .spec.flags for DISKFUL_A and DISKFUL_B and can miss that a surviving diskful degraded its diskState to Inconsistent/Unknown; update the loop that iterates over DISKFUL_A and DISKFUL_B (the block referencing variables DISKFUL_A, DISKFUL_B and RD) to also query and assert the replica's runtime diskState from the resource status (e.g., the status field returned by kubectl for the resource) and fail if diskState is not the expected healthy state (e.g., "UpToDate" or "Consistent"); keep the existing flag checks but add the diskState check immediately after fetching flags so the test fails when a surviving diskful has an unacceptable diskState.

coderabbitai · 2026-06-02T23:41:00Z

+cleanup() {
+    # Best-effort restore in case the test bailed mid-evacuate, so the
+    # shared stand isn't left with an EVICTED node.
+    "${LCTL[@]}" node restore "$N3" >/dev/null 2>&1 || true
+    delete_rd "$RD"
+    assert_no_orphans "$RD"
+    linstor_cli_teardown
+}
+trap cleanup EXIT


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restore the actual evacuated node in cleanup.

resource create --auto-place=2 does not pin the witness to $N3, but this trap always restores $N3. If the script aborts before the explicit node restore, the real evacuated witness node can stay drained and poison the shared stand for later cells. Please track the evacuated node dynamically in cleanup, and register that cleanup with register_strict_cleanup instead of a bare trap.

Suggested change

RD=cli-matrix-386 N1=$WORKER_1 N2=$WORKER_2 N3=$WORKER_3 +EVICTED_NODE="" cleanup() { # Best-effort restore in case the test bailed mid-evacuate, so the # shared stand isn't left with an EVICTED node. - "${LCTL[@]}" node restore "$N3" >/dev/null 2>&1 || true + if [[ -n "$EVICTED_NODE" ]]; then + "${LCTL[@]}" node restore "$EVICTED_NODE" >/dev/null 2>&1 || true + fi delete_rd "$RD" assert_no_orphans "$RD" linstor_cli_teardown } -trap cleanup EXIT

Based on learnings, Applies to tests/e2e/*.sh: Register cleanup for e2e test scenarios via register_strict_cleanup from tests/e2e/lib.sh instead of bare trap directives to ensure proper resource cleanup and cluster state verification.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/cli-matrix/n-rst-recreates-tiebreaker.sh` around lines 64 - 72, The cleanup() trap currently always restores node "$N3" which is incorrect when the evacuated witness may be a different node; modify cleanup() to record the actual evacuated node (e.g., capture it to a variable like EVACUATED_NODE when you run the evacuate or resource placement) and then restore "${EVACUATED_NODE}" instead of "$N3", keeping the existing calls to delete_rd "$RD", assert_no_orphans "$RD", and linstor_cli_teardown; replace the bare trap cleanup EXIT with a call to register_strict_cleanup cleanup so the test harness performs strict teardown and cluster verification (reference functions/variables: cleanup, LCTL, N3, RD, delete_rd, assert_no_orphans, linstor_cli_teardown, register_strict_cleanup).

coderabbitai · 2026-06-02T23:41:00Z

+# Belt-and-suspenders: the operator-visible `linstor r l` surface the
+# bug report cited must show exactly one TIE_BREAKER row plus the two
+# diskful replicas.
+wire=$(linstor_r_l_json "$RD")
+n_tb=$(printf '%s' "$wire" \
+    | jq -r '.[][] | select((.rsc_flags // []) | index("TIE_BREAKER")) | .name' 2>/dev/null \
+    | wc -l | tr -d ' ' || echo 0)
+if [[ "$n_tb" != "1" ]]; then
+    echo "FAIL (Bug 386): linstor r l shows $n_tb TIE_BREAKER rows for $RD, want 1" >&2
+    printf '%s\n' "$wire" | jq '.[][]| {name, node_name, flags: .rsc_flags}' 2>/dev/null >&2 || true
+    exit 1
+fi


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Verify the full post-restore shape, not just “some TB exists”.

n_tb == 1 still passes if a healthy diskful is demoted to TieBreaker, or if one of the diskful replicas never comes back to UpToDate. This regression should also assert that two non-TieBreaker replicas are healthy after n rst.

Based on learnings, Applies to tests/e2e/cli-matrix/**: L6 operator-CLI e2e tests must test real linstor CLI → REST → satellite → DRBD kernel interactions and assert Status convergence under tests/e2e/cli-matrix/.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/e2e/cli-matrix/n-rst-recreates-tiebreaker.sh` around lines 177 - 188, The test currently only checks n_tb == 1 (using linstor_r_l_json and variable wire) but must also verify that exactly two non-TIE_BREAKER replicas are present and healthy after n rst; update the assertion to parse wire (output of linstor_r_l_json "$RD") to count entries where (.rsc_flags // []) does NOT contain "TIE_BREAKER" and whose status/state indicates a healthy diskful replica (e.g., .state == "UpToDate" or the CLI field representing healthy), assert that count equals 2, and fail if not; keep the original TIE_BREAKER check but add this second check using the same wire/RD context so the test ensures one tie-breaker plus two healthy non-TIE_BREAKER replicas.

coderabbitai · 2026-06-02T23:41:00Z

+        tiebreaker_present)
+            # Bug 386: assert a TieBreaker witness EXISTS for the rd.
+            # The inverse of no_tiebreaker — used by the node-restore
+            # catcher to confirm the witness is RE-created after the
+            # drained node is brought back with `n rst`.
+            local rd present
+            rd=$(substitute "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('rd',''))" "$spec")")
+            # Case-SENSITIVE on purpose — see the no_tiebreaker note above: a
+            # case-insensitive grep would match the lowercase "tiebreaker"
+            # substring in the RD name and report a witness present on every
+            # row, masking a genuinely missing TieBreaker.
+            present=$(linstor_cli resource list --resources "$rd" 2>/dev/null | grep -c 'TieBreaker' || true)
+            [[ "$present" -ge 1 ]]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

tiebreaker_present extends the replay schema beyond the allowed await.kind contract.

The current replay contract only permits replica_count, disk_state, all_uptodate, replica_diskless, no_tiebreaker, sync_clean, resource_absent, rd_absent, vd_size_kib, pvc_capacity, and pod_md5_invariant. Introducing a new kind here makes the new replay YAMLs non-compliant until the schema/validation contract is updated in the same change.

As per coding guidelines, Replay YAML await.kind must use one of: replica_count, disk_state, all_uptodate, replica_diskless, no_tiebreaker, sync_clean, resource_absent, rd_absent, vd_size_kib, pvc_capacity, pod_md5_invariant.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/operator-harness/lib.sh` around lines 268 - 280, The new tiebreaker_present check violates the replay await.kind contract by adding an unsupported kind; remove/replace the tiebreaker_present handler and instead implement its logic under an allowed await.kind (e.g., reuse or extend the existing no_tiebreaker flow) so replay YAMLs remain schema-compliant, or map any incoming await.kind value to existing checks before use; do not change the schema/validation here — if a new kind is truly required, add the schema/validation update in a separate PR.

coderabbitai · 2026-06-02T23:41:00Z

+  - name: auto-place-2
+    # 2 diskful + auto-tiebreaker → the canonical 2r-tb shape.
+    cmd: ["resource", "create", "--auto-place", "2", "--storage-pool={{sp}}", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: all_uptodate
+      rd: "{{rd}}"
+      timeout_s: 240
+  - name: wait-tiebreaker-spawned
+    cmd: ["resource", "list", "--resources", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: tiebreaker_present
+      rd: "{{rd}}"
+      timeout_s: 60
+  - name: evacuate-node3
+    # Drain the third node. With 2 diskful already placed and no spare
+    # worker in a 3-node cluster, EVICTED collapses the witness.
+    cmd: ["node", "evacuate", "{{node3}}"]
+    expect_exit: 0
+    await:
+      kind: no_tiebreaker
+      rd: "{{rd}}"
+      timeout_s: 60
+  - name: restore-node3
+    # Node recovered. Clearing EVICTED must re-run the tiebreaker
+    # invariant (Bug 386 fix: Node watch on the RD reconciler).
+    cmd: ["node", "restore", "{{node3}}"]
+    expect_exit: 0
+  - name: tiebreaker-recreated
+    cmd: ["resource", "list", "--resources", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: tiebreaker_present
+      rd: "{{rd}}"
+      timeout_s: 60


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

This replay assumes node3 is the witness, but the setup does not guarantee that.

auto-place-2 lets the controller choose the diskful pair, yet the later steps unconditionally evacuate and restore {{node3}}. If the witness lands elsewhere, this replay exercises a different topology and can fail or pass for the wrong reason. Create diskful replicas explicitly on {{node1}}/{{node2}} like the Bug 385 replay, or add a step that proves the witness is on {{node3}} before evacuating it.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/operator-harness/replay/n-rst-recreates-tiebreaker.yaml` around lines 33 - 68, The replay assumes node3 is the witness but auto-place-2 lets the controller pick diskfuls; update the scenario so the witness is guaranteed before evacuating: either replace the "auto-place-2" step with an explicit resource create that pins diskful replicas to {{node1}} and {{node2}} (matching the Bug 385 replay), or insert a pre-evacuate verification step (e.g., after "wait-tiebreaker-spawned") that runs "resource list" / a check to assert the tiebreaker/witness is on {{node3}} and fail if not; then keep the "evacuate-node3", "restore-node3", and "tiebreaker-recreated" steps unchanged.

coderabbitai · 2026-06-02T23:41:00Z

+  - name: r-c-node1
+    cmd: ["resource", "create", "{{node1}}", "{{rd}}", "--storage-pool", "{{sp}}"]
+    expect_exit: 0
+  - name: r-c-node2
+    cmd: ["resource", "create", "{{node2}}", "{{rd}}", "--storage-pool", "{{sp}}"]
+    expect_exit: 0
+  - name: r-c-node3
+    cmd: ["resource", "create", "{{node3}}", "{{rd}}", "--storage-pool", "{{sp}}"]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add a post-delete UpToDate convergence check for the surviving replica

This replay asserts no_tiebreaker and that {{node2}} is absent, but it never verifies that the remaining active diskful replica(s) (e.g., {{node3}}) return to UpToDate after the resource_absent delete; add an await.kind: all_uptodate assertion for {{rd}} after the deletion step.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/operator-harness/replay/r-d-inactive-no-tiebreaker.yaml` around lines 33 - 40, The replay is missing a post-delete convergence assertion: after the existing resource_absent step that removes {{node2}} under the no_tiebreaker scenario, add an await.kind: all_uptodate check for the resource {{rd}} to ensure surviving diskful replica(s) (e.g., {{node3}}) return to UpToDate; locate the delete step named resource_absent and insert an await block with kind: all_uptodate targeting {{rd}} immediately after it so the test verifies convergence.

coderabbitai · 2026-06-02T23:41:00Z

+  - name: no-tiebreaker-spawned
+    cmd: ["resource", "list", "--resources", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: no_tiebreaker
+      rd: "{{rd}}"
+      timeout_s: 30
+  - name: node2-stays-absent
+    cmd: ["resource", "list", "--resources", "{{rd}}"]
+    expect_exit: 0
+    await:
+      kind: resource_absent
+      rd: "{{rd}}"
+      node: "{{node2}}"
+      timeout_s: 30
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a status-convergence check for the surviving active replica.

After resource_absent and no_tiebreaker, this replay still never proves the remaining active diskful settles back to UpToDate. A degraded {{node3}} would still look green here.

Based on learnings, Applies to tests/e2e/cli-matrix/**: L6 operator-CLI e2e tests must test real linstor CLI → REST → satellite → DRBD kernel interactions and assert Status convergence under tests/e2e/cli-matrix/.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/operator-harness/replay/r-d-inactive-no-tiebreaker.yaml` around lines 60 - 75, The replay is missing a post-failure status-convergence assertion for the surviving active replica: after the existing steps no-tiebreaker-spawned and node2-stays-absent (which use the resource_absent await and rd/node variables), add a new step that runs the same resource check and awaits a status-convergence/UpToDate condition for rd on node "{{node3}}" (use the same rd and node3 symbols), e.g. an await kind like resource_status or status_converged with rd: "{{rd}}", node: "{{node3}}", and timeout_s: 30; also ensure the e2e CLI matrix tests under tests/e2e/cli-matrix/** mirror this assertion to verify real LINSTOR/DRBD status convergence.

coderabbitai · 2026-06-02T23:41:00Z

+  - name: r-c-node1
+    cmd: ["resource", "create", "{{node1}}", "{{rd}}", "--storage-pool", "{{sp}}"]
+    expect_exit: 0
+  - name: r-c-node2
+    cmd: ["resource", "create", "{{node2}}", "{{rd}}", "--storage-pool", "{{sp}}"]
+    expect_exit: 0


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use combined equals form for --storage-pool argument.

The storage-pool arguments use the old two-element format ["--storage-pool", "{{sp}}"], but should use the combined equals form ["--storage-pool={{sp}}"] to match the harness command runner's expected format.

♻️ Proposed fix

- name: r-c-node1 - cmd: ["resource", "create", "{{node1}}", "{{rd}}", "--storage-pool", "{{sp}}"] + cmd: ["resource", "create", "{{node1}}", "{{rd}}", "--storage-pool={{sp}}"] expect_exit: 0 - name: r-c-node2 - cmd: ["resource", "create", "{{node2}}", "{{rd}}", "--storage-pool", "{{sp}}"] + cmd: ["resource", "create", "{{node2}}", "{{rd}}", "--storage-pool={{sp}}"] expect_exit: 0

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/operator-harness/replay/vd-late-create-second-volume.yaml` around lines 48 - 53, The test commands r-c-node1 and r-c-node2 use the two-element flag form for the storage pool argument; update each cmd array to use the combined equals form so the harness command runner accepts it (replace ["--storage-pool", "{{sp}}"] with ["--storage-pool={{sp}}"] in the command entries for r-c-node1 and r-c-node2).

…ion flakes (Bug 392) (#84) * test(e2e): harden recovery-deleting-convert + state-standalone-partition flakes (Bug 392) Two e2e scenarios flaked under CI load on PR #83 (both confirmed on a live stand to be assertion artifacts, not product regressions): recovery-deleting-convert: assert_uptodate_12 read only the CRD .status projection, which lags for a few seconds while the satellite runs `drbdadm adjust` after toggling the stuck-DELETING replica to diskless (the re-render transiently bounces the N1<->N2 link). N2's local disk stays UpToDate throughout. Add the same kernel-ground-truth accept-path (kernel_pair_uptodate) PR #50 already gave the Phase-1 wait, so a stale projection read no longer fails the assertion. state-standalone-partition: the historical marker drift was root-caused to a host-side loopfile storage artifact (already fixed by the ZFS move + direct-io reassert); the failure log only printed two md5s, which cannot distinguish a real wrong-resync from that artifact. Both failure paths now also dump drbdadm role / get-gi + drbdsetup status --verbose on both nodes, so the next CI hit carries the generation- identifier evidence needed to tell a real regression from projection lag. No production code change. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com> * test(e2e): retry marker read-back on transient drift in state-standalone-partition (Bug 392) The marker round-trip flaked under CI load on the QEMU/zvol stand: the read-back md5 differed on the first read even though the GI dump (added above) proves DRBD is fully consistent — both peers UpToDate + Established, near-identical current-UUIDs, no resync. The divergence is a host read-path artifact (a stale guest page served despite iflag=direct on the emulated disk under load — the documented loopfile/zvol residue class), not replicated data loss. Drop the guest cache and re-read up to 4 times: a transient substrate artifact clears on a fresh read from the backing store; genuine corruption or a real wrong-resync PERSISTS across every re-read, so the assertion still fails loudly on sustained drift and dumps the GI/kernel evidence to prove it. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com> * test(e2e): confirm marker drift via non-O_DIRECT ground truth before failing (Bug 392) The drop_caches+direct retry was the wrong remedy: under CI load the marker read-back returns a DIFFERENT md5 on every iflag=direct read (4 distinct hashes) while DRBD is fully consistent, the peer's replicated backing zvol is byte-identical to the written marker, and ZFS reports zero checksum errors. A dedicated load investigation pinned it to the CI runner's nested-QEMU virtio-blk O_DIRECT path returning stale/garbage pages under memory+I/O pressure (O_DIRECT bypasses the page cache and depends on hypervisor DMA); the replicated data is intact, only that read path lies, and it does not reproduce on the lighter dev-stand virtualization (0/22 + 0 under loadavg 17). Replace the futile re-read loop with a ground-truth cross-check that avoids O_DIRECT entirely: on a direct-read mismatch, verify via a buffered read on the Primary (page-cache path, proven deterministic) and the peer's replicated backing device read directly. PASS iff either matches the written marker; FAIL only when the non-O_DIRECT path also mismatches (real corruption / wrong-resync), with GI + kernel-status evidence dumped. This de-flakes the recurring CI blocker without weakening real data-loss detection. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com> --------- Signed-off-by: Andrei Kvapil <kvapss@gmail.com> Co-authored-by: Claude <noreply@anthropic.com>

Andrei Kvapil (kvaps) and others added 19 commits June 2, 2026 23:29

Merge remote-tracking branch 'origin/fix/bug-386-restore-recreate-tb'…

9f8dc37

… into integration/operational-lifecycle-384-387

Merge remote-tracking branch 'origin/fix/bug-385-evict-tb-shuffle' in…

18b7353

…to integration/operational-lifecycle-384-387

Merge remote-tracking branch 'origin/fix/bug-387-rd-inactive-no-tb' i…

0b675a4

…nto integration/operational-lifecycle-384-387 Signed-off-by: Andrei Kvapil <kvapss@gmail.com> # Conflicts: # internal/controller/resourcedefinition_controller.go

Andrei Kvapil (kvaps) marked this pull request as ready for review June 2, 2026 23:25

gemini-code-assist Bot reviewed Jun 2, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 2, 2026

View reviewed changes

Andrei Kvapil (kvaps) merged commit a889f7e into main Jun 3, 2026
14 of 16 checks passed

		leftover=$(kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null \
		\| grep -c "$prefix" \|\| true)

	kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null \| grep "$prefix" >&2 \|\| true
	kubectl get resources.blockstor.cozystack.io -o name 2>/dev/null \| grep -F "$prefix" >&2 \|\| true

Conversation

Andrei Kvapil (kvaps) commented Jun 2, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Bugs fixed

Test tiers

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Andrei Kvapil (kvaps) commented Jun 2, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 2, 2026 •

edited

Loading