v0.1.5
Large bug-fix release. Spans the REST API wire-validation surface (Bugs 356–383: input validation, typed FAIL_* envelopes, idempotency), the satellite DRBD / LUKS / metadata paths, and a multi-round operator-lifecycle bug-hunt that closed the four operator-reported DRBD lifecycle bugs the REST sweep missed plus their adjacent classes (Bugs 384–397). Every operator-facing fix lands with an L1/L2 unit/contract test, an L6 cli-matrix cell, and an L7 operator-replay workflow, validated on the live Talos+DRBD stand.
Fixed
- Late `vd c` leaves the new volume `Inconsistent` on every replica (Bug 384, data integrity, #83): adding a volume to an already-initialized multi-replica resource ran the seed path with `isWinner=false` unconditionally (first-activation election is gated on `!rdInitialized`), so no replica seeded the new volume `UpToDate` and it latched `Inconsistent` forever. The satellite now re-runs the lowest-node-id winner election per freshly-added volume, so exactly one replica becomes the SyncSource. Class regression of the Bug 79/332 family.
- `node evict` demotes a healthy diskful replica to TieBreaker (Bug 385, #83): `ensureTiebreaker` counted a witness stranded on a just-EVICTED node as live, so the witness was never relocated and a healthy diskful drifted into the tiebreaker role. Replicas on EVICTED/LOST nodes are now excluded from the witness/quorum decision and stranded witnesses are reaped.
- `node restore` does not recreate the auto-TieBreaker (Bug 386, #83): the RD reconciler did not watch `Node`, so clearing the EVICTED flag never re-ran the tiebreaker invariant, leaving two diskful UpToDate with no witness (split-brain risk). Adds a Node watch.
- `r d` of a diskful on a 2-diskful + 1-INACTIVE resource grows a useless TieBreaker (Bug 387, #83): an INACTIVE (`drbdadm down`) replica is not a voting peer but was counted as a diskful, so the delete spuriously converted to a witness. INACTIVE replicas are excluded from the voting set.
- `node evacuate` never prunes the source replica (Bug 389, #81): evacuate gap-filled a replacement but never deleted the source on the drained node, leaving the resource permanently at `place_count+1`. Now does strict add-before-drop (prune only after the replacement reaches UpToDate) and derives the redundancy target from the current diskful count, so it works on RDs that inherit `place_count=0` from `DfltRscGrp`.
- auto-diskful ignores EVICTED/LOST nodes and INACTIVE replicas (Bug 390, #82): the deficit count and promotion-candidate set treated drained-node and deactivated replicas as healthy diskful, masking deficits and promoting onto draining nodes. Both are now filtered.
- Autoplace under-places when an INACTIVE replica is present (Bug 393, #85): `placer.countDiskfulReplicas` counted INACTIVE replicas toward `place_count`, so a replacement active replica was never placed. INACTIVE is now excluded, mirroring the auto-diskful and tiebreaker invariants.
- `snapshot create` fails on any resource with an INACTIVE replica (Bug 394, #86): snapshot node-selection and the success denominator included the INACTIVE node, whose down DRBD device cannot ack the suspend-io barrier, aborting the whole group. INACTIVE replicas are excluded from snapshot targets.
- Thick-LVM volume resize silently diverges the replicas (Bug 395, data integrity, #87): `drbdadm resize --assume-clean` ran unconditionally; on a thick `LVM` pool the grown extents hold node-distinct stale content, so replicas disagreed on the grown region with no resync (out-of-sync 0) and a failover changed the bytes an application read. Resize is now provider-aware: zero-on-allocate providers (ZFS, thin, file) keep the `--assume-clean` fast path; thick LVM omits it so DRBD resyncs the grown region. Cozystack's default (ZFS) was unaffected.
- Snapshot-restore onto a snapshot-less node (Bug 397, #89): the explicit `--node-name` restore path did not constrain targets to the nodes that hold the snapshot (unlike the auto-place path), so a replica could be placed on a node lacking the data. The restore handler now rejects a snapshot-less target with a typed error, and the seed path refuses the skip-init-sync fast path for a blank-fallback replica so it SyncTargets the real copy; a legitimate all-clone restore keeps the fast path.
- Tiebreaker / toggle-disk / LUKS / metadata satellite fixes: `r toggle-disk --diskful` no longer leaves a stale `TIE_BREAKER` flag on the promoted replica (#54); `r d --keep-tiebreaker` keeps the auto-witness instead of collapsing it (#57); `r c` retries through the tiebreaker-collapse race instead of failing (Bug 359, #61); a TB-relocate that wedged `StandAlone` on a both-disks-bitmap state now recovers (#53); `r td --diskless` closes the LUKS mapper so the backing zvol can be reclaimed (#55); and per-volume `drbdadm create-md` stops `vd c` on an existing RD from EBUSY-looping against vol-0's attached minor (Bug 332, #58).
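
Several of the fixes in this list (Bugs 385, 387, 390, 393, 394) apply one rule: a replica that is INACTIVE, or that sits on an EVICTED/LOST node, does not count as a healthy diskful peer. The Python sketch below illustrates that filter only. The `Replica` record, the `voting_diskful` helper, and the node-state map are hypothetical names chosen for this example, not blockstor's actual types.

```python
from dataclasses import dataclass

# Assumption: node states are plain strings; only these two are excluded.
EXCLUDED_NODE_STATES = {"EVICTED", "LOST"}


@dataclass(frozen=True)
class Replica:
    """Hypothetical replica record used only for this illustration."""
    node: str
    diskful: bool
    inactive: bool = False  # True once the replica has been deactivated


def voting_diskful(replicas, node_state):
    """Return the diskful replicas that still count as healthy peers.

    node_state maps node name -> state; nodes missing from the map are
    treated as ONLINE (an assumption of this sketch).
    """
    return [
        r for r in replicas
        if r.diskful
        and not r.inactive
        and node_state.get(r.node, "ONLINE") not in EXCLUDED_NODE_STATES
    ]


replicas = [
    Replica("n1", diskful=True),
    Replica("n2", diskful=True, inactive=True),  # deactivated replica
    Replica("n3", diskful=True),                 # lives on a drained node
]
node_state = {"n3": "EVICTED"}

live = voting_diskful(replicas, node_state)
# Deficit against a hypothetical redundancy target of 2.
deficit = 2 - len(live)
```

Counting all three replicas would report no deficit here; with the filter applied, only `n1` remains and one replacement is still needed.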
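
The provider-aware resize decision described for Bug 395 can be pictured as a small argument builder. This is a sketch under stated assumptions: the provider labels (`zfs`, `lvm_thin`, `file`, `lvm`) and the `resize_argv` helper are hypothetical, and the argument order simply mirrors the command as it is written in the note above rather than a verified invocation.

```python
# Assumption: these labels stand in for the "ZFS, thin, file" providers,
# whose newly grown extents read back as zeros on every node.
ZERO_ON_ALLOCATE = {"zfs", "lvm_thin", "file"}


def resize_argv(resource, provider_kind):
    """Build the resize command line for one resource (illustrative only)."""
    argv = ["drbdadm", "resize"]
    if provider_kind in ZERO_ON_ALLOCATE:
        # Grown region is identical everywhere, so the resync can be skipped.
        argv.append("--assume-clean")
    # Any other provider (including thick LVM) gets a full resync of the
    # grown region, because stale extents may differ between nodes.
    argv.append(resource)
    return argv


thick = resize_argv("r0", "lvm")
zfs = resize_argv("r0", "zfs")
```

Defaulting unknown providers to the resync path is a deliberate choice in this sketch: a needless resync costs time, while a wrongly skipped one leaves replicas silently divergent.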
Fixed — REST API wire validation & idempotency
Closes Bugs 356–383: the REST surface now validates operator input at the wire boundary (before any partial state lands) and returns upstream-matching FAIL_* ApiCallRc envelopes instead of bare 200s or generic 500s.
- Name & volume-number validation: RD/RG/Node names are capped at the 48-char k8s-label limit (Bug 360, #59) and invalid names are rejected on `s r rst` / `rg spawn` before partial state lands (#56); `volume_number` is validated in `[0, 65535]` at create (#60) and on `vd d` / `vd l` / `vd m` (Bug 365, #62); a non-numeric volume-number in the URL path returns an operator-grade envelope (Bug 380, #73).
- Size, type & placement validation: non-positive `volume_sizes` in spawn (Bug 381, #74) and non-positive `size_kib` on a `vd` `PUT` regardless of `--force` (Bug 383, #75) are rejected; `select_filter.place_count` is validated at RG create + modify (Bug 367 / 361, #64); node `Type` is validated, defaulting empty to `SATELLITE`, at `POST /v1/nodes` (Bug 370, #65); a `PUT` resource-definition validates its `resource_group` (Bug 372, #66); net-interface `PUT` validates address + port at the wire (Bug 371 / 368 / 369, #63); the fresh-create pool resolver walks the RG `StoragePoolList` (Bug 364, #67).
- Immutability & idempotency: `StorDriver/*` mutation is rejected on `PUT` storage-pools (Bug 373, #68) and storage-pool-definitions (Bug 375, #70); five bare-200 write endpoints now emit an ApiCallRc envelope (Bug 374, #69); `drop-property` (Bug 378, #71) and net-interface `DELETE` (Bug 379, #72) are idempotent on a missing parent.
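
To illustrate what validating at the wire boundary means, the sketch below checks a name and a URL-path volume number before anything is persisted, and collects typed errors instead of raising. The two limits come from the notes above; the envelope shape, the `FAIL_EXAMPLE_*` codes, and every function name are hypothetical placeholders, not the real ApiCallRc schema or blockstor's handlers.

```python
MAX_NAME_LEN = 48          # k8s-label cap quoted in the notes above
MAX_VOLUME_NUMBER = 65535  # upper bound of the [0, 65535] range


def fail(code, message):
    """Build a hypothetical typed error envelope."""
    return {"code": code, "message": message}


def validate_name(name):
    if not name:
        return fail("FAIL_EXAMPLE_INVALID_NAME", "name must not be empty")
    if len(name) > MAX_NAME_LEN:
        return fail(
            "FAIL_EXAMPLE_INVALID_NAME",
            f"name exceeds {MAX_NAME_LEN} characters",
        )
    return None


def validate_volume_number(raw):
    """Validate the raw path segment before converting it to an int."""
    # isascii() + isdigit() rejects signs, spaces, and non-ASCII digits.
    if not (raw.isascii() and raw.isdigit()):
        return fail(
            "FAIL_EXAMPLE_INVALID_VOLUME_NUMBER",
            f"volume number {raw!r} is not a non-negative integer",
        )
    if int(raw) > MAX_VOLUME_NUMBER:
        return fail(
            "FAIL_EXAMPLE_INVALID_VOLUME_NUMBER",
            f"volume number must be in [0, {MAX_VOLUME_NUMBER}]",
        )
    return None


def validate_request(name, raw_volume_number):
    """Return all validation errors; an empty list means the request is valid."""
    checks = (validate_name(name), validate_volume_number(raw_volume_number))
    return [err for err in checks if err is not None]


ok = validate_request("pvc-data", "0")
bad = validate_request("x" * 49, "abc")
```

Running every check before touching state is what prevents the partial-state outcomes mentioned above: a request either passes all checks or is rejected with nothing written.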
Test infrastructure
- L7 replay convergence assertions were silent no-ops (Bug 388, #83 / #84): `all_uptodate` / `wait_settle` filtered replicas on `spec.resourceName`, but the CRD field is `spec.resourceDefinitionName`, so the most-used "did the cluster converge" check matched no replicas and passed vacuously across every replay. The fix corrects the field, tolerates Diskless/TieBreaker rows, and gives `no_orphans` a settle window. (This immediately caught a real drop-without-add defect in the first Bug-389 fix.)
- e2e flake hardening (Bugs 392 / 396 / 398, #84 / #88 / #90): `state-standalone-partition` and siblings flaked under CI on three substrate-level read/scan races, none of them blockstor data bugs (DRBD partition recovery is forensically correct: the writer stays SyncSource and ZFS checksums are clean). The connection-state waits now read kernel ground truth instead of the lagging CRD projection; the marker round-trip distinguishes a real (stable) on-disk corruption from a non-deterministic nested-QEMU read-path glitch; and the stand's Talos config narrows the LVM `global_filter` so the node-side `pvscan` no longer races the satellite for DRBD/dm/zvol/loop backing devices (`open(/dev/loopN): Device or resource busy`).
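
The Bug 388 failure mode is easy to reproduce in miniature: an "all replicas are UpToDate" check that filters on a field the objects do not carry matches nothing, and a universal check over nothing is true. The Python sketch below shows that shape and one way to guard against it. The two `spec` field names come from the note above; the `status.diskState` field, the dictionary layout, and both function names are assumptions made for this example.

```python
def _is_uptodate(resource):
    # Assumption: disk state lives at status.diskState in this toy model.
    return resource["status"]["diskState"] == "UpToDate"


def all_uptodate_buggy(resources, rd_name):
    """Filters on the wrong field, so no row ever matches."""
    rows = [r for r in resources if r["spec"].get("resourceName") == rd_name]
    # all() over an empty list is True: the check passes vacuously.
    return all(_is_uptodate(r) for r in rows)


def all_uptodate_fixed(resources, rd_name):
    """Filters on the real field and refuses to pass on zero matches."""
    rows = [
        r for r in resources
        if r["spec"].get("resourceDefinitionName") == rd_name
    ]
    if not rows:
        raise AssertionError(f"no replicas matched {rd_name!r}")
    return all(_is_uptodate(r) for r in rows)


resources = [
    {
        "spec": {"resourceDefinitionName": "rd1"},
        "status": {"diskState": "Inconsistent"},
    },
]

buggy_result = all_uptodate_buggy(resources, "rd1")
fixed_result = all_uptodate_fixed(resources, "rd1")
```

Here the buggy variant reports convergence even though the only replica is `Inconsistent`, while the fixed variant reports failure. Raising on an empty match set is an extra guard added in this sketch, not something the note above claims the actual fix does.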