Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mkswap: make sure to have a single extent in swap files #1120

Closed
poettering opened this issue Aug 11, 2020 · 11 comments
Closed

mkswap: make sure to have a single extent in swap files #1120

poettering opened this issue Aug 11, 2020 · 11 comments

Comments

@poettering
Copy link
Contributor

I think it would make a ton of sense if mkswap would add some support for verifying that swap files it formats are properly set up to actually work as swap files. Specifically, it should check that the file is a single extent. use the FS_IOC_FIEMAP ioctl on the file to verify that it really is just one extent. For details see:

  1. https://www.kernel.org/doc/Documentation/filesystems/fiemap.txt
  2. https://stackoverflow.com/questions/38669605/how-to-use-ioctl-with-fs-ioc-fiemap

I think this should at least be a warning, but better would be to let the mkswap operation fail if the file is more than one extent.

(It might also be an idea, to allow mkswap to create swap files from scratch, i.e. a new --create switch or so, that creates the file, and then uses fallocate() to allocate space for, verifying with FS_IOC_FIEMAP that everything is in order)

@lczerner
Copy link
Contributor

lczerner commented Sep 7, 2020

Why do you think single extent is important ? What do you mean by single extent anyway ? Maybe you mean that the file should be entirely allocated, without holes ?

@lczerner
Copy link
Contributor

lczerner commented Sep 7, 2020

If I understand it correctly, the idea is to make sure that mkswap will only create the swapfs on the file if that resulting file can be used for swap - meaning that it will not be rejected by swapon, nor kernel.
Possible reasons why it would be rejected is that it either contains holes, shared extents, inline data or delalloc extents. We can check for that with fiemap.
Fsync should get rid of delalloc extents. Using fallocate can fix the case where the file contains holes, which can be done automatically or upon interaction with user.

@poettering
Copy link
Contributor Author

Why do you think single extent is important ? What do you mean by single extent anyway ? Maybe you mean that the file should be entirely allocated, without holes ?

hibernating to a swap file requires to pass offset + length to kernel. Hence a swap file should be a single extent to be compatible with hibernation.

(Yeah, i didn#t mention hibernation before, I should have. I only care about hibernation here for swap files, not otherwise)

@lczerner
Copy link
Contributor

lczerner commented Sep 7, 2020

hibernating to a swap file requires to pass offset + length to kernel. Hence a swap file should be a single extent to be compatible with hibernation.

(Yeah, i didn#t mention hibernation before, I should have. I only care about hibernation here for swap files, not otherwise)

Hmm, that does not seem right. There is no way for a userspace to influence file system allocation decision to the extent to say how the data should be laid out physically. In fact, some file systems do have limited extent size and data blocks interleaved with metadata blocks - so in fact this might not even be possible on something like ext4. There must be some misunderstanding...

I do not know too much about hibernate, but according to this kernel documentation https://www.kernel.org/doc/html/latest/power/swsusp-and-swap-files.html (1) swap files need not be contiguous - so it does not even need to be contiguous, let alone one extent. In fact according to that document you only need to be able to find the header of the swap file on the file system that the swap file is stored:

In principle the location of a swap file’s header may be determined with the help of appropriate filesystem driver. Unfortunately, however, it requires the filesystem holding the swap file to be mounted, and if this filesystem is journaled, it cannot be mounted during resume from disk. For this reason to identify a swap file swsusp uses the name of the partition that holds the file and the offset from the beginning of the partition at which the swap file’s header is located. For convenience, this offset is expressed in <PAGE_SIZE> units.

It says that not the offset + length is what is needed, but rather the swap file partition + offset to the swap file header within that partition - this can be achieved with fiemap on the swap file that can tell you the partition physical block corresponding to the swap file logical block.

Or am I missing something ?

@karelzak
Copy link
Collaborator

karelzak commented Sep 8, 2020

It says that not the offset + length is what is needed, but rather the swap file partition + offset to the swap file header within that partition - this can be achieved with fiemap on the swap file that can tell you the partition physical block corresponding to the swap file logical block.

But I guess this is something we do not need to care about in userspace if we know that the file is without holes (etc.), right?

IMHO hibernate to a regular file is strange :-) It seems better to have swap partition to avoid extra FS layer.

@lczerner
Copy link
Contributor

lczerner commented Sep 8, 2020

It says that not the offset + length is what is needed, but rather the swap file partition + offset to the swap file header within that partition - this can be achieved with fiemap on the swap file that can tell you the partition physical block corresponding to the swap file logical block.

But I guess this is something we do not need to care about in userspace if we know that the file is without holes (etc.), right?

Oh yeah, definitely. I think it is still a good idea for mkswap to make sure the file is ok to use for swap. But apparently this is a different problem from what Lennart is talking about.

IMHO hibernate to a regular file is strange :-) It seems better to have swap partition to avoid extra FS layer.

As is for the swap in general.

karelzak added a commit that referenced this issue Sep 9, 2020
Let's make mkswap(8) more user-friend and report possible swapon
issues already when user initialize a swap file. The extents check
produces warnings, no exit with error.

  # dd if=/dev/zero of=img count=100 bs=1MiB
  # chmod 0600 img

  # fallocate --dig-hole --offset 64520192 --length 1MiB img
  # fallocate --dig-hole --offset 84520960 --length 1MiB img

  # ./mkswap img
  mkswap: extents check failed:
    - hole detected at offset 64520192 (size 1048576 bytes)
    - hole detected at offset 84520960 (size 1048576 bytes)
  file img can be rejected by kernel on swap activation.

  Setting up swapspace version 1, size = 100 MiB (104853504 bytes)
  no label, UUID=92091112-26b5-47aa-a02a-592e52528319

It checks for holes in the file, and for unknown, inline, shared and
deallocated extents.

Addresses: #1120
Signed-off-by: Karel Zak <kzak@redhat.com>
@karelzak
Copy link
Collaborator

karelzak commented Sep 9, 2020

@lczerner please, review: 195025c
The commit message contains an example. Thanks!

@karelzak
Copy link
Collaborator

karelzak commented Sep 9, 2020

Thanks for your review! Bugfix: 29b68b9

(Oh, yes ... next time I need to use a topic branch to avoid such incremental changes in our master branch, ...)

@karelzak
Copy link
Collaborator

OK, so the final version (I hole:-)) on file with holes:

./mkswap img

mkswap: img contains holes or other unsupported extents.
        This swap file can be rejected by kernel on swap activation!
        Use --verbose for more details.

Setting up swapspace version 1, size = 100 MiB (104853504 bytes)
no label, UUID=96a9dd74-0fe2-4c4c-9713-e3e516da09e2

@lczerner
Copy link
Contributor

I like it. Thanks Karel!

@karelzak
Copy link
Collaborator

Thanks for your review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants