Skip to content

fix(satellite): decouple rs-discard-granularity from discard-zeroes gating#112

Merged
Andrei Kvapil (kvaps) merged 5 commits into
mainfrom
fix/q3-rs-discard-granularity
Jun 6, 2026
Merged

fix(satellite): decouple rs-discard-granularity from discard-zeroes gating#112
Andrei Kvapil (kvaps) merged 5 commits into
mainfrom
fix/q3-rs-discard-granularity

Conversation

@kvaps
Copy link
Copy Markdown
Member

@kvaps Andrei Kvapil (kvaps) commented Jun 5, 2026

What & why

The thin-aware-resync feature renders DRBD's discard-zeroes-if-aligned + rs-discard-granularity into the resource disk { } block so a partially-written thin volume resyncs only the written bytes instead of copying the unallocated zero ranges.

An earlier revision of this PR decoupled the two options and emitted rs-discard-granularity for FILE_THIN as well (the loop backing reports a non-zero lsblk DISC-GRAN of 4096, and upstream LINSTOR 1.33.2 renders it on the same backing). That decoupling introduced a fresh-create convergence regression on loop-backed FILE_THIN.

The regression (CI-proven)

On a fresh multi-replica create whose RD carries FileSystem/Type (the elected day0 winner force-primaries to run mkfs), mkfs issues a full-device discard. On the loop backing — which has a documented DRBD kernel write-path interaction — that discard storm, with rs-discard-granularity active, dirties the bitmap relative to the day0-seeded peers and forces a FULL initial SyncTarget that then wedges (PausedSyncT, resync-suspended:peer,dependency between the two targets), so the replicas never reach UpToDate.

Evidence:

  • e2e respawn-standalone-wedge (FILE_THIN + FileSystem/Type=ext4) FAILED on the decoupled image: worker-1 SyncSource, peers Inconsistent done:37% on a brand-new 512M resource, timing out at "wait all 3 replicas UpToDate". The same scenario PASSES on the baseline (no FILE_THIN disk block).
  • e2e replica-add-no-resync (FILE_THIN, no mkfs) PASSED on the decoupled image — so the trigger is the mkfs-discard × loop × rs-discard-granularity interaction, not the option in isolation.
  • Dev-stand isolation: an identical 3-replica create + FileSystem/Type=ext4 converges instantly on LVM_THIN (real block device, same full disk block incl. rs-discard-granularity 65536) but full-resyncs on FILE_THIN (loop). The disk block is inert/safe on a real block device; the wedge is loop-specific.

Fix

Gate rs-discard-granularity on the discard-zero-safe block-device provider set (LVM_THIN / ZFS / ZFS_THIN) — the same set as discard-zeroes-if-aligned. FILE_THIN now renders only the inert discard-zeroes-if-aligned no; (matching pre-feature behaviour) and omits the granularity. The thin-aware-resync win is retained where it is both safe and effective: real block-device thin/ZFS pools.

Accepted delta (known-delta #76)

For FILE_THIN, BS deliberately omits rs-discard-granularity even though upstream 1.33.2 emits it on the same loop backing — the loop-safety divergence above. BS additionally omits upstream's block-size 512; and renders the disk block at resource (not volume) scope; both functionally identical for single-volume resources.

Test protocol

  • L1 (pkg/satellite/discardgran_test.go): FILE_THIN and thick LVM omit rs-discard-granularity; LVM_THIN/ZFS keep it; end-to-end render pins TestBuildResFile_FileThinDay0SkipRender (no rs-discard-granularity through the production autoDiskOptionsForResource path) and TestBuildResFile_LvmThinDay0SkipRender (keeps it). go test ./... + golangci-lint run ./pkg/satellite/ green.
  • L6 (tests/e2e/cli-matrix/file-thin-rs-discard-granularity.sh): 512M FILE_THIN with FileSystem/Type=ext4 (drives the mkfs/force-primary path); asserts the rendered .res carries discard-zeroes-if-aligned no but NO rs-discard-granularity, and that a 2nd diskful replica day0-skips (sync-target received ≪ device size).
  • L7 (tests/operator-harness/replay/file-thin-rs-discard-granularity.yaml): operator-CLI sequence asserting discard-zeroes-if-aligned=no, rs-discard-granularity absent (drbdsetup show), and a clean 2nd-replica sync_clean.

Validation note

The shared dev stand runs an older deployed image and is used by parallel campaign agents, so the fix is validated via (a) L1 render pins, (b) the LVM_THIN-vs-FILE_THIN mkfs isolation on the stand (loop-specificity confirmed), and (c) the CI re-run on the fixed branch image — the authoritative gate for the e2e respawn-standalone-wedge / disk-replace-internal-metadata convergence scenarios.

Summary by CodeRabbit

  • New Features

    • Improved discard granularity handling and consistency for thin storage configurations.
  • Bug Fixes

    • Fixed conservative merging of disk options across multi-volume resources to handle unsafe volumes correctly.
  • Tests

    • Enhanced test coverage for FILE_THIN storage scenarios and discard behavior validation across storage types.

Andrei Kvapil (kvaps) and others added 3 commits June 5, 2026 22:51
…ating

A partially-written FILE_THIN volume resynced ~2x the bytes upstream
LINSTOR did because the rendered .res lacked rs-discard-granularity.
autoDiskOptions() early-returned nil for any provider that was not
discard-zeroes-safe, coupling rs-discard-granularity (which upstream
gates ONLY on the backing device's reported discard granularity) to
discard-zeroes-if-aligned (correctly provider-gated).

Decouple the two gates to mirror upstream:

  - discard-zeroes-if-aligned: provider-gated as before, but now
    rendered explicitly as `no` for non-safe kinds (FILE_THIN, thick
    LVM, FILE) instead of omitting the whole block — matching upstream
    1.33.2's render.
  - rs-discard-granularity: emitted whenever the backing device reports
    a non-zero discard granularity (lsblk DISC-GRAN > 0), independent of
    provider kind, so an aligned all-zero region is UNMAPped during
    resync instead of transferred.

Multi-volume collapse stays conservative: discard-zeroes-if-aligned is
`yes` only when every volume is safe (one `no` pins the resource), and
rs-discard-granularity is the smallest across volumes.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
L6 cli-matrix cell file-thin-rs-discard-granularity.sh: 512M FILE_THIN
single replica, write 320M, add 2nd diskful replica, assert the new
replica's resync `received` byte counter is close to the written bytes
(not the full device) and the rendered .res carries the discard disk
block.

L7 replay file-thin-rs-discard-granularity.yaml: operator-CLI sequence
asserting the rendered DRBD config carries rs-discard-granularity=4096
and discard-zeroes-if-aligned=no for a FILE_THIN resource, plus a clean
3rd-replica sync (drbd_option + sync_clean awaits).

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
After the rs-discard-granularity decoupling fix, BS renders the same
thin-aware-resync disk options as upstream for a discard-capable
FILE_THIN volume. Two render-shape divergences remain and are accepted:
upstream additionally emits `block-size 512;` (BS omits it), and upstream
renders the disk block at volume scope vs BS's resource scope —
functionally identical for single-volume resources.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 5, 2026

Too many files changed? Review this PR in Change Stack to see how the pieces fit before you dive in.

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e90f4a8b-2c29-49e3-a953-cf7afa31a98c

📥 Commits

Reviewing files that changed from the base of the PR and between a939c9b and 640b0fa.

📒 Files selected for processing (5)
  • docs/cli-parity-known-deltas.md
  • pkg/satellite/discardgran.go
  • pkg/satellite/discardgran_test.go
  • tests/e2e/cli-matrix/file-thin-rs-discard-granularity.sh
  • tests/operator-harness/replay/file-thin-rs-discard-granularity.yaml

📝 Walkthrough

Walkthrough

This PR updates DRBD discard option generation to correctly handle FILE_THIN loop-backed storage by ensuring discard-zeroes-if-aligned is always rendered (as yes or no based on provider kind), while rs-discard-granularity is emitted only for discard-zero-safe providers with non-zero granularity. Multi-volume resource merging is made conservative with "no wins" semantics.

Changes

FILE_THIN discard-zeroes-if-aligned and rs-discard-granularity

Layer / File(s) Summary
Discard option constants and conservative merging
pkg/satellite/discardgran.go
Explicit drbdNo constant added; mergeResourceDiscardOptions rewritten to use smallest-value logic for rs-discard-granularity and "no wins" rule for discard-zeroes-if-aligned across volumes.
Per-volume and resource-level disk option functions
pkg/satellite/discardgran.go
autoDiskOptions now always returns discard-zeroes-if-aligned (yes or no by provider kind); autoDiskOptionsForResource removes early abort to allow resource-level options even when some volumes lack options; comments updated for conservative collapsing behavior.
Unit tests for provider-kind behavior
pkg/satellite/discardgran_test.go
New tests assert thick LVM and FILE_THIN are gated to discard-zeroes-if-aligned=no with no rs-discard-granularity; FILE_THIN on 4KiB loop uses provider-kind short-circuit preventing lsblk probe; resource-level tests validate the same at orchestration scope.
Rendered .res file validation tests
pkg/satellite/discardgran_test.go
Integration tests render complete DRBD resource files: FILE_THIN-on-4K-loop includes discard-zeroes-if-aligned no; without rs-discard-granularity; LVM_THIN includes both discard-zeroes-if-aligned yes; and the expected rs-discard-granularity value.
End-to-end and operator harness tests
tests/e2e/cli-matrix/file-thin-rs-discard-granularity.sh, tests/operator-harness/replay/file-thin-rs-discard-granularity.yaml
Bash e2e script and YAML operator workflow validate FILE_THIN loop-backed regression: both confirm rendered .res includes discard-zeroes-if-aligned no without rs-discard-granularity, and both verify "day0 skip" behavior (fresh replica converges without full resync) via DRBD statistics.
CLI parity documentation
docs/cli-parity-known-deltas.md
Documents intentional REST/.res wire-shape divergence for FILE_THIN disk blocks: rs-discard-granularity and block-size are omitted, with pins to unit tests and harness scripts.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A loop-device fix for discard granularity,
When FILE_THIN seeks safety with clarity—
Always say "no" when unsafe, "yes" when all's clear,
Merge with a gentle and conservative cheer!
Fresh replicas sync without wasteful rescan,
Day zero skips bloom according to plan. 🌱

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/q3-rs-discard-granularity

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request decouples the rs-discard-granularity and discard-zeroes-if-aligned DRBD disk options, allowing them to be gated independently. Previously, if a volume provider (such as FILE_THIN or thick LVM) was not discard-zero-safe, the entire disk options block was omitted. With this change, such volumes will explicitly set discard-zeroes-if-aligned to no while still retaining rs-discard-granularity if supported by the backing device. This optimization ensures that resync operations only transfer written bytes rather than copying the entire device. Extensive unit, E2E, and integration tests have been added to verify this behavior. There are no review comments to address, and I have no additional feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@kvaps
Copy link
Copy Markdown
Member Author

Regression found — do not merge. Both CI lane failures share one root: on this branch's image a FRESH multi-replica create performs a FULL initial resync instead of the day0-GI skip (lane-4 respawn-standalone-wedge setup: worker-1 SyncSource, peers Inconsistent done:37% on a brand-new resource; lane-2 disk-replace-internal-metadata same nature). On main the same scenarios pass with instant UpToDate via drbdmeta set-gi seeding. Suspect: the new backing-device discard-granularity probe interacts with the create-md/set-gi/up ordering (or the added disk{} section triggers an adjust path that re-handshakes before seeding). Needs a fix + an L1 pin that the day0 skip survives the render change.

Andrei Kvapil (kvaps) and others added 2 commits June 6, 2026 02:22
…s (FILE_THIN loop wedge)

Rendering rs-discard-granularity into a loop-backed FILE_THIN volume's
disk{} block regressed fresh-create convergence. When the elected day0
winner force-primaries to run mkfs (the FileSystem/Type path), mkfs
issues a full-device discard; on the loop backing that discard storm,
with rs-discard-granularity active, dirties the bitmap relative to the
day0-seeded peers and forces a FULL initial SyncTarget that wedges
(PausedSyncT dependency between the two targets). Proven on stand + CI:
an identical 3-replica create + mkfs converges instantly on LVM_THIN
(real block device, same disk block) but full-resyncs and wedges on
FILE_THIN (loop) — the e2e respawn-standalone-wedge failure.

Gate rs-discard-granularity on the discard-zero-safe block-device
provider set (LVM_THIN / ZFS / ZFS_THIN), the same set as
discard-zeroes-if-aligned. FILE_THIN now renders only the inert
discard-zeroes-if-aligned no (matching pre-feature behaviour) and omits
the granularity. The thin-aware-resync win is retained where it is both
safe and effective: real block-device thin/ZFS pools.

L1 pins (discardgran_test.go): FILE_THIN + thick LVM omit the
granularity; LVM_THIN/ZFS keep it; end-to-end render pins
TestBuildResFile_FileThinDay0SkipRender (no rs-discard-granularity) and
TestBuildResFile_LvmThinDay0SkipRender (keeps it). L6/L7 repurposed to
guard the FILE_THIN fresh-create + mkfs day0-skip convergence and assert
rs-discard-granularity is absent. Known-delta #76 updated.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
…IN gate

Comment-only: the rs-discard-granularity gate is no longer INDEPENDENT
of provider kind — it follows the discard-zero-safe block-device set and
excludes loop-backed FILE_THIN. Align the godoc with the implementation.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@kvaps Andrei Kvapil (kvaps) marked this pull request as ready for review June 6, 2026 01:40
@kvaps Andrei Kvapil (kvaps) merged commit 6291a2d into main Jun 6, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant