Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage: Fix Btrfs local refresh #13133

Merged
merged 3 commits into from Mar 15, 2024

Conversation

roosterfish
Copy link
Contributor

@roosterfish roosterfish commented Mar 13, 2024

Fixes #13085

When copying a Btrfs backed storage volume locally the last target volume's snapshot doesn't have the Received UUID property set which is required to accept a subsequent stream using btrfs receive that is created using btrfs send.

The Received UUID isn't present since a volume created by local copy is a snapshot of its parent volume. The Received UUID gets set only when using the send/receive functions.
It looks that this behavior has changed with Btrfs version 5.14.2 onwards (see https://btrfs.readthedocs.io/en/latest/Subvolumes.html#subvolume-flags) as previously you could just receive a Btrfs subvol stream on top of a snapshot.

This PR takes a somewhat radical approach and replaces the usage of send/receive for local volume refreshes. Instead we can reuse the volume copy logic so that when refreshing a volume we simply delete the target and create a new snapshot from the source. Using this approach we don't have to manually set the Received UUID with a syscall (as done for migrations) and instead create a fresh copy of the source volume including all of the requested snapshots.

@roosterfish
Copy link
Contributor Author

It's still not clear to me why it isn't causing a failure for e.g. busybox based instances so I'll leave it in draft state for now.

Signed-off-by: Julian Pelizäus <julian.pelizaeus@canonical.com>
…lume

Signed-off-by: Julian Pelizäus <julian.pelizaeus@canonical.com>
… volume

Signed-off-by: Julian Pelizäus <julian.pelizaeus@canonical.com>
@roosterfish roosterfish marked this pull request as ready for review March 14, 2024 16:36
@roosterfish
Copy link
Contributor Author

roosterfish commented Mar 14, 2024

@tomponline I am convinced this is now ready for review. I can also confirm it fixes the original bug reported in #13085.

Since Btrfs snapshots are almost instantaneous it looks that this approach (see desc) is much more bulletproof as we don't need to run syscalls to set Btrfs subvol attributes (as it is done currently for migrations).

@tomponline
Copy link
Member

This PR takes a somewhat radical approach and replaces the usage of send/receive for local volume refreshes. Instead we can reuse the volume copy logic so that when refreshing a volume we simply delete the target and create a new snapshot from the source. Using this approach we don't have to manually set the Received UUID with a syscall (as done for migrations) and instead create a fresh copy of the source volume including all of the requested snapshots.

Does this maintain compatibility with previously refreshed volumes?

Also doesn't this change the relation of the refreshed volume so rather than being a discrete volume in its own right its now a snapshot of the source volume? Could that have knock on effects? Or is that the same as lxc copy without doing a refresh and you're making it consistent between lxc copy and lxc copy --refresh?

@tomponline
Copy link
Member

@roosterfish please can you add a PR to restore the refresh check for btrfs on lxd-ci?

@roosterfish
Copy link
Contributor Author

This PR takes a somewhat radical approach and replaces the usage of send/receive for local volume refreshes. Instead we can reuse the volume copy logic so that when refreshing a volume we simply delete the target and create a new snapshot from the source. Using this approach we don't have to manually set the Received UUID with a syscall (as done for migrations) and instead create a fresh copy of the source volume including all of the requested snapshots.

Does this maintain compatibility with previously refreshed volumes?

Also doesn't this change the relation of the refreshed volume so rather than being a discrete volume in its own right its now a snapshot of the source volume? Could that have knock on effects? Or is that the same as lxc copy without doing a refresh and you're making it consistent between lxc copy and lxc copy --refresh?

Yes all this PR is doing is reuse the already existing copy functionality (which is doing a snapshot) also for refreshes.
In case only the the root volume needs to be refreshed (snapshots on both source and target are equal) we just delete the target vol and create a new snapshot from the source vol.

I did some more comprehensive tests and I don't see any issues applying this new concept on subvols that were previously refreshed using the send/receive functions.

roosterfish added a commit to roosterfish/lxd-ci that referenced this pull request Mar 15, 2024
The refresh got fixed in canonical/lxd#13133.

Signed-off-by: Julian Pelizäus <julian.pelizaeus@canonical.com>
roosterfish added a commit to roosterfish/lxd-ci that referenced this pull request Mar 15, 2024
The refresh got fixed in canonical/lxd#13133.

Signed-off-by: Julian Pelizäus <julian.pelizaeus@canonical.com>
@tomponline
Copy link
Member

@roosterfish so you can mark it ready for review or are you still thinking about it further?

@tomponline
Copy link
Member

I did some more comprehensive tests and I don't see any issues applying this new concept on subvols that were previously refreshed using the send/receive functions.

Yes it seems to make sense, do the same as local copy already does.
If local copy was using send/recv i'd have concerns, but its not, so refresh should really be very similar to local copy.

@roosterfish
Copy link
Contributor Author

@roosterfish so you can mark it ready for review or are you still thinking about it further?

This one is ready, the linked one in lxd-ci isn't as it's waiting for this one here :)

@tomponline tomponline merged commit 633bd7d into canonical:main Mar 15, 2024
28 checks passed
@roosterfish roosterfish deleted the fix_btrfs_refresh branch March 15, 2024 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refreshing BTRFS backed instances is causing a segfault
2 participants