rollback: support named-subvolume btrfs layouts (ROLLBACK_METHOD) #1133
mxsb wants to merge 8 commits into
Adds RollbackMethod enum (SET_DEFAULT, SUBVOL_RENAME) and two detection functions to AppUtil:
- detect_rollback_method_from_options(): pure function over already-parsed mount options, testable without /proc/mounts access
- detect_rollback_method(): reads /proc/mounts via getMtabData() and delegates to the above

Returns SUBVOL_RENAME when a "subvol=<name>" mount option is present and the name is not "/", SET_DEFAULT otherwise. This distinguishes systems using named subvolumes (subvol=@root) from those relying on the btrfs default subvolume id.

Also adds KEY_ROLLBACK_METHOD config key and ROLLBACK_METHOD="auto" to the default config template, and a Boost unit test covering all detection cases including the subvolid= vs subvol= prefix disambiguation.

config.h is included in AppUtil.cc as it now contains the first conditional compilation block in that translation unit.
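The detection rule described above can be sketched as a small pure function. This is only an illustration under stated assumptions: the signature (a vector of already-split option strings) and the handling of an empty name are guesses, not the PR's actual code.

```cpp
#include <cassert>   // only for the checks in the usage note
#include <string>
#include <vector>

// Mirrors the enum named in the commit message.
enum class RollbackMethod { SET_DEFAULT, SUBVOL_RENAME };

// Assumed signature: the options are already split on ',' by the caller.
RollbackMethod
detect_rollback_method_from_options(const std::vector<std::string>& options)
{
    const std::string prefix = "subvol=";

    for (const std::string& option : options)
    {
        // "subvolid=256" does not start with "subvol=", so it is skipped here.
        if (option.compare(0, prefix.size(), prefix) != 0)
            continue;

        const std::string name = option.substr(prefix.size());

        // Assumption: an empty name is treated like "/" (no named subvolume).
        if (!name.empty() && name != "/")
            return RollbackMethod::SUBVOL_RENAME;
    }

    return RollbackMethod::SET_DEFAULT;
}
```

With this sketch, the options `rw,subvol=@root` resolve to SUBVOL_RENAME, while `rw,subvolid=256` and `subvol=/` resolve to SET_DEFAULT.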
Reads ROLLBACK_METHOD from the snapper config and resolves the effective rollback method before the ambit switch:
- "auto": detect from /proc/mounts via detect_rollback_method()
- "set-default": force the existing btrfs default subvolume ioctl
- "subvol-rename": use named subvolume swap; errors if root is not mounted with a named subvolume
- unknown value: error and exit

Adds get_subvol_name() as a thin wrapper around detect_rollback_method() for the explicit "subvol-rename" path, making intent clear without discarding the returned RollbackMethod.

Both CLASSIC and TRANSACTIONAL ambit paths switch on the resolved method. The subvol-rename case exits with "not yet implemented" until the rename logic is added in the next commit.
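A minimal sketch of how the four cases above could be resolved. The function name resolve_rollback_method and the use of standard exceptions are assumptions for illustration; the commit itself reports an error and exits instead.

```cpp
#include <cassert>   // only for the checks in the usage note
#include <stdexcept>
#include <string>

enum class RollbackMethod { SET_DEFAULT, SUBVOL_RENAME };

// Hypothetical helper: `value` is the ROLLBACK_METHOD string from the config,
// `detected` is what auto-detection from /proc/mounts returned.
RollbackMethod
resolve_rollback_method(const std::string& value, RollbackMethod detected)
{
    if (value == "auto")
        return detected;

    if (value == "set-default")
        return RollbackMethod::SET_DEFAULT;

    if (value == "subvol-rename")
    {
        // Forcing the rename method only makes sense with a named root subvolume.
        if (detected != RollbackMethod::SUBVOL_RENAME)
            throw std::runtime_error("subvol-rename requires root mounted with a named subvolume");

        return RollbackMethod::SUBVOL_RENAME;
    }

    // Stand-in for the "error and exit" path in the commit.
    throw std::invalid_argument("unknown ROLLBACK_METHOD value: " + value);
}
```

Keeping the resolution separate from the detection means the explicit values can be checked without touching /proc/mounts.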
Adds Btrfs::rollbackSubvolRename() which performs rollback on systems using named btrfs subvolumes (subvol=@root in fstab) by atomically swapping subvolumes rather than changing the btrfs default subvolume id.

The operation:
1. Mounts the btrfs top-level (subvolid=5) via TmpMount
2. Cleans up any stale <subvol>.incoming from a previous failed attempt
3. Creates a rw snapshot of the rollback target as <subvol>.incoming
4. Atomically swaps <subvol> and <subvol>.incoming via renameat2(RENAME_EXCHANGE)
5. Renames old root to <subvol>.rollback.<num> for preservation (non-fatal if fails)
6. Fires set_default_snapshot pre/post plugin hooks as setDefault() does

Adds SDir::rename_exchange() wrapping renameat2(RENAME_EXCHANGE), with syscall(SYS_renameat2) fallback for older glibc. Btrfs supports RENAME_EXCHANGE since Linux 3.17. configure.ac detects renameat2 via AC_CHECK_FUNCS.

Both CLASSIC and TRANSACTIONAL ambit paths in cmd-rollback.cc now dispatch to rollbackSubvolRename() when rollback_method is SUBVOL_RENAME.

Tests:
- testsuite/rename-exchange.cc: Boost unit test for SDir::rename_exchange covering successful exchange and the ENOENT case
- testsuite-real/rollback-subvol-rename.cc: integration test exercising the full subvolume swap sequence on a real btrfs filesystem
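The atomic swap in step 4 can be illustrated with a free function around renameat2(). This is a sketch, not the SDir member from the PR: it assumes Linux with glibc 2.28 or newer (where renameat2 and RENAME_EXCHANGE are declared in stdio.h under _GNU_SOURCE) and omits the syscall fallback.

```cpp
#include <cassert>
#include <cstdio>    // renameat2(), RENAME_EXCHANGE (glibc >= 2.28, _GNU_SOURCE)
#include <fcntl.h>   // AT_FDCWD, for callers without a directory fd
#include <string>

// Illustrative free function; the PR adds this as SDir::rename_exchange().
// Returns 0 on success, -1 with errno set on failure (as renameat2 does).
int
rename_exchange(int dirfd, const std::string& name1, const std::string& name2)
{
    // Both names must be single path components relative to dirfd.
    assert(name1.find('/') == std::string::npos);
    assert(name2.find('/') == std::string::npos);

    // Atomically exchanges the two entries; both must already exist.
    return ::renameat2(dirfd, name1.c_str(), dirfd, name2.c_str(), RENAME_EXCHANGE);
}
```

Because both entries must exist for RENAME_EXCHANGE, calling this with a missing name fails and returns -1, which matches the error case the unit test is described as covering.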
Add ROLLBACK_METHOD to snapper-configs(5) with descriptions of all three values (auto, set-default, subvol-rename). Update the rollback section of snapper(8) to mention auto-detection from /proc/mounts. Add doc/rollback.txt explaining the design rationale, the two rollback methods, and why subvol-rename is needed on systems using named subvolumes (subvol=@root).
SDir operations assert that names contain no '/'. A nested subvol= path (e.g. root/@root) would hit that assert and abort with no useful message. Add an explicit check at the top of rollbackSubvolRename that throws IOErrorException with a clear message before any disk operations.
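A sketch of such an early check, assuming the subvolume name is passed as a plain string. The helper name is hypothetical, and std::runtime_error stands in for the IOErrorException that the commit throws.

```cpp
#include <cassert>   // only for the checks in the usage note
#include <stdexcept>
#include <string>

// Hypothetical helper: rejects names that are not a single path component,
// so that the failure is a readable error rather than an assert deeper down.
void
require_single_component(const std::string& subvol_name)
{
    if (subvol_name.empty() || subvol_name.find('/') != std::string::npos)
    {
        throw std::runtime_error("subvol-rename rollback needs a single-component "
                                 "subvolume name, got '" + subvol_name + "'");
    }
}
```

A name such as `@root` passes, while `root/@root` is rejected before any mount or snapshot operation would start.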
If a previous subvol-rename rollback completed the rename_exchange but failed to move .incoming to .rollback.N, that .incoming subvolume is still the old running root (mounted by subvolume ID). Deleting it risks corruption during the unmount sequence on reboot (kernel < 5.x deferred deletion is unreliable under this scenario). Rename it to .rollback.<timestamp> instead, which is always safe and preserves the subvolume for recovery.
… name
The previous timestamp-based suffix had a 1-second granularity collision window. Use get_id() on the stale .incoming subvolume instead: btrfs subvolume IDs are unique across the filesystem for the lifetime of the subvolume, so the rescued name is guaranteed not to collide.

Also removes the <ctime> include that is no longer needed.
If the target .rollback.N name is already occupied (e.g. rolling back to the same snapshot number twice), the rename fails with EEXIST. Instead of leaving the old root stranded as .incoming, fall back to a subvolume-ID based name (.rollback.svid.<id>) which is guaranteed unique across the btrfs filesystem. Extends the integration test with a second test case that verifies the collision scenario and the fallback rename.
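The naming fallback can be sketched as follows. Note the simplification: the commit reacts to an EEXIST from the rename itself, whereas this illustration asks a caller-supplied predicate whether the preferred name is taken. The function name and parameters are hypothetical.

```cpp
#include <cassert>   // only for the checks in the usage note
#include <functional>
#include <string>

// Hypothetical helper: picks the name under which the old root is preserved.
// `exists` reports whether a name is already present in the top-level directory.
std::string
choose_preserved_name(const std::string& subvol, unsigned int snapshot_num,
                      unsigned long long subvol_id,
                      const std::function<bool(const std::string&)>& exists)
{
    const std::string preferred = subvol + ".rollback." + std::to_string(snapshot_num);

    if (!exists(preferred))
        return preferred;

    // The subvolume ID makes the fallback name unique within the filesystem.
    return subvol + ".rollback.svid." + std::to_string(subvol_id);
}
```

For subvolume `@root`, snapshot number 7 and subvolume ID 300, the sketch yields `@root.rollback.7`, or `@root.rollback.svid.300` when the first name is already occupied.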
I see a check failed. I’m currently on vacation, but will look into it asap when I’m back home.
The failure in Leap is unrelated to the changes.
#ifdef ENABLE_ROLLBACK

enum class RollbackMethod { SET_DEFAULT, SUBVOL_RENAME };
Since the enum and the functions are only used in cmd-rollback.cc, they should not be defined in the library.
case RollbackMethod::SUBVOL_RENAME:
{
    const Btrfs* btrfs = dynamic_cast<const Btrfs*>(filesystem.get());
    if (!btrfs)
There is already a check for btrfs above, so the variable btrfs could also be defined there.
Another possibility would be to add rollbackSubvolRename to Filesystem, as is done for setDefault, especially since renaming subvolumes could perhaps also be used to roll back with LVM (there, the logical volumes would be renamed), although there are currently no plans to implement this for LVM.
I installed Fedora 44 and the mount option in fstab is subvol=root (not subvol=@root). The rollback fails.
#ifdef HAVE_RENAMEAT2
    return ::renameat2(dirfd, name1.c_str(), dirfd, name2.c_str(), RENAME_EXCHANGE);
#else
    return syscall(SYS_renameat2, dirfd, name1.c_str(), dirfd, name2.c_str(),
renameat2 was added in glibc in 2018 so this fallback looks unnecessary. Even without the workaround it builds on all archs currently tested (e.g. https://build.opensuse.org/project/show/home:aschnell:snapper).
Thanks for your comments and looking at the PR, I'm working on the feedback.
On systems where fstab mounts the root filesystem with an explicit subvol= option (e.g. subvol=@root), snapper rollback currently has no effect on the next boot. The BTRFS_IOC_DEFAULT_SUBVOL ioctl changes the default subvolume ID, but when the kernel mounts by name the default ID is ignored.

This affects systems using named-subvolume layouts (Fedora, Ubuntu, many manual btrfs setups) and is the root cause behind #722, #365, #159, and #1011.
Changes
This PR adds a ROLLBACK_METHOD config key with three values:
- auto (default): detects from /proc/mounts whether root uses a named subvol= option and picks the appropriate method; no manual configuration needed on most systems
- set-default: existing behavior, unchanged
- subvol-rename: atomically swaps the named root subvolume using renameat2(RENAME_EXCHANGE) (Linux 3.17+)

The subvol-rename sequence:
1. A read-write snapshot of the rollback target is created as @root.incoming
2. renameat2(RENAME_EXCHANGE) atomically swaps @root.incoming ↔ @root, so there is no window where @root does not exist
3. The old root is renamed to @root.rollback.<N> for recovery

Both CLASSIC and TRANSACTIONAL ambits are supported.
Safety properties
- Nested subvolume paths (e.g. subvol=root/@root) are rejected with a clear error before any disk operations. SDir operations require single-component names and would otherwise abort.
- The old root is preserved as @root.rollback.<N>. If that name already exists (e.g. rolling back to the same snapshot twice), a subvolume-ID-based fallback name @root.rollback.svid.<id> is used, which is guaranteed unique across the btrfs filesystem.
- A stale .incoming from a prior interrupted rollback is renamed (not deleted) before starting. If a previous rollback completed the swap but failed to rename .incoming to .rollback.<N>, that subvolume is the old running root, still mounted by the kernel via subvolume ID. Deleting it risks corruption during the unmount sequence. It is renamed to @root.rollback.svid.<id> instead.
- A rename_exchange failure deletes only the newly-created .incoming copy (safe, because the swap never occurred) and throws a clear error.

Testing
- Unit test for rollback method detection (testsuite/rollback-method.cc)
- Unit test for SDir::rename_exchange (testsuite/rename-exchange.cc)
- Integration test (testsuite-real/rollback-subvol-rename.cc): happy path + .rollback.N collision scenario with subvolid fallback

Documentation
- snapper-configs(5): new ROLLBACK_METHOD entry
- snapper(8): rollback section updated
- doc/rollback.txt: design rationale and method descriptions

Fixes #722, #365, #159. Related: #1011.