nixos/lib/make-ext4-fs: Fix: `resize2fs -M' can leave insufficient slack #125121

Merged
merged 1 commit into from Jun 1, 2021

Conversation

ztzg
Contributor

@ztzg ztzg commented May 31, 2021

Motivation for this change

The root filesystem resizing step, resize2fs -M, does not provide any control over the amount of slack left in the result. It can produce an arbitrarily tight fit, depending on how well the payload aligns with ext4 data structures.

This is problematic, as NixOS must create a few files and directories during its first boot, before the root is enlarged to match the size of the containing SD card.

An overly tight fit can cause failures in the first stage:

mkdir: can't create directory '/mnt-root/proc': No space left on device

or in the second stage:

install: cannot create directory '/var': No space left on device

A previous version of make-ext4-fs (before PR #79368) was explicitly "reserving" 16 MiB of free space in the final filesystem. Manually calculating the size of an ext4 filesystem is a perilous endeavor, however, and the method it employed was apparently unreliable.

Reverting is consequently not a good option.

A solution would be to create some sort of "balloon" occupying inodes and blocks in the image prior to invoking resize2fs -M, and to remove these temporary files/directories before the compression step.

This changeset takes the simpler approach of dropping the resizing step altogether.

Note that this does not result in a larger image in general, as the current procedure does not truncate the .img file anyway. In fact, it has been observed to yield slightly smaller compressed images, probably because of some "noise" left behind by the resizing. E.g., before (first listing) vs. after (second listing):

-r--r--r-- 2 root root 607M  1. Jan 1970  nixos-sd-image-21.11pre-git-x86_64-linux.img.zst

-r--r--r-- 2 root root 606M  1. Jan 1970  nixos-sd-image-21.11pre-git-x86_64-linux.img.zst
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Added a release notes entry if the change is major or breaking
  • Fits CONTRIBUTING.md.

@samueldr
Member

samueldr commented May 31, 2021

How big is the uncompressed ext4 image before/after the change?

Though this change is probably fine either way since it does not change the API of the image builder.

In practice we really need to take the time to do the unification work of all the various bespoke image building code into a common interface.

@ztzg
Contributor Author

ztzg commented May 31, 2021

> How big is the uncompressed ext4 image before/after the change?

The uncompressed .img files always have exactly the same size, as the recipe does not truncate them to match the resized filesystem:

unzstd /nix/store/q3q1zrw5dk5lhd8p87x209dipvchmxq9-nixos-sd-image-21.11pre-git-x86_64-linux.img/sd-image/nixos-sd-image-21.11pre-git-x86_64-linux.img.zst -o before.img

unzstd /nix/store/5qnn1rp75b2qcvyyn504q4lbs1wjkfwk-nixos-sd-image-21.11pre-git-x86_64-linux.img/sd-image/nixos-sd-image-21.11pre-git-x86_64-linux.img.zst -o after.img

ls -l before.img after.img

# -r--r--r-- 1 dash users 2566414336  1. Jan 1970  after.img
# -r--r--r-- 1 dash users 2566414336  1. Jan 1970  before.img

As for the ext4 FS held in the second partition: I will check tomorrow with the armv7l image which exhibited the issue, as it turns out that on my current (x86_64) playground, resize2fs -M just says:

resize2fs 1.45.5 (07-Jan-2020)
The filesystem is already 616838 (4k) blocks long.  Nothing to do!

… which is not very representative.

@samueldr
Member

> The uncompressed .img files always have exactly the same size, as the recipe does not truncate them to match the resized filesystem:

Right, so my main concern here ends up being a no-op. (Though as I write this, I realize it was never much of a concern to begin with.)

@samueldr
Member

There is no reason not to merge this, AFAICT: the image is already as big as it gets, so it's not like it would stop fitting on any storage media.

That said, this might not actually solve an underlying issue where there is simply not enough slack space in the filesystem. In that case, adding more slack space outright would be fine, as long as the compressed image stays in the same size ballpark.

@github-actions
Contributor

github-actions bot commented Jun 1, 2021

Successfully created backport PR #125159 for release-21.05.

@ztzg
Contributor Author

ztzg commented Jun 1, 2021

@samueldr, @Mic92: Great; thanks!

Samuel, given your conclusions, and the fact that I agree with them, I ended up focusing on other things than that ext4 filesystem today. I am still planning to experiment a bit with the SD image creation scripts, and will keep you posted as soon as I do.

Cheers, -D

@ztzg
Contributor Author

ztzg commented Jun 7, 2021

Hi @samueldr,

I had a closer look into this, based on the image on which we observed the issue. For some reason, resize2fs is quite happy to fully trim that specific filesystem.

One thing I had missed is that resize2fs does, in fact, truncate the container it operates on when the latter is an ordinary file, so there can be a significant difference in uncompressed image sizes after all:

| | Before (resize) | After (no resize) | Delta |
|---|---|---|---|
| Uncompressed (bytes) | 592465920 | 778747904 | 186281984 |
| Compressed (bytes) | 165375164 | 165577174 | 202010 |

There isn't much of a difference in the size of the compressed output, so this shouldn't impact Hydra. It might occasionally cause an issue if somebody tries to dump the image on a small-ish SD card, however.
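To put the two deltas in proportion, here is a minimal sketch that expresses each one relative to its "before" value. The helper name `rel_delta` is hypothetical (it is not part of the PR), and it assumes the size grows; it uses integer arithmetic only, so the result is truncated to two decimals rather than rounded.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only (not part of make-ext4-fs).
# Prints the growth from $1 to $2 as a percentage of $1, truncated to two
# decimals. Assumes $2 >= $1 and $1 > 0.
rel_delta() {
  local before=$1 after=$2
  # Work in basis points (1/100 of a percent) to stay in integer arithmetic.
  local bp=$(( (after - before) * 10000 / before ))
  printf '%d.%02d%%\n' $((bp / 100)) $((bp % 100))
}

rel_delta 592465920 778747904   # uncompressed image: prints 31.44%
rel_delta 165375164 165577174   # compressed image:   prints 0.12%
```

With the figures from the table, the uncompressed image grows by roughly a third, while the compressed artifact grows by about a tenth of a percent.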

For the filesystem itself, we have:

| | Before (resize) | After (no resize) | Delta |
|---|---|---|---|
| 1K-blocks | 552272 | 732132 | 179860 |
| Used | 540444 | 540752 | 308 |
| Available | 0 | 138148 | 138148 |
| Use% | 100% | 80% | -0.2 |
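The Use% row can be reproduced from the Used and Available rows, and the same numbers can drive a simple slack check. The sketch below assumes that the figures come from `df`, which rounds the percentage up; the helper names and the 16 MiB default threshold (borrowed from the amount the older make-ext4-fs tried to reserve) are illustrative assumptions, not something this PR implements.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only (not part of make-ext4-fs).

# Percentage of space in use, computed from 1K-block counts and rounded up,
# which is how GNU df derives its Use% column. Assumes used + avail > 0.
use_percent() {
  local used=$1 avail=$2
  local total=$((used + avail))
  echo $(( (used * 100 + total - 1) / total ))
}

# Succeeds if at least min_kib 1K-blocks are available.
# The default of 16384 KiB (16 MiB) is an assumption for this example.
has_slack() {
  local avail=$1 min_kib=${2:-16384}
  [ "$avail" -ge "$min_kib" ]
}

use_percent 540444 0        # resized filesystem:   prints 100
use_percent 540752 138148   # unresized filesystem: prints 80
has_slack 0      || echo "resized image: not enough slack"
has_slack 138148 && echo "unresized image: enough slack"
```

A check along these lines could run at the end of an image build to fail early when the filesystem is too tight for the first boot, whichever way the slack ends up being reserved.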

As with the x86_64 experiment in my earlier comment, the differences (even in the uncompressed form) are far less dramatic with sd-image-aarch64:

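| | Before (resize) | After (no resize) | Delta |
|---|---|---|---|
| Uncompressed (bytes) | 2685890560 | 2685894656 | 4096 |
| Compressed (bytes) | 661521482 | 662132709 | 611227 |

| | Before (resize) | After (no resize) | Delta |
|---|---|---|---|
| 1K-blocks | 2507000 | 2507000 | 0 |
| Used | 1949980 | 1949980 | 0 |
| Available | 409568 | 409568 | 0 |
| Use% | 83% | 83% | 0 |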

(I don't have a clear explanation for the difference in compressed image size despite the fact that resize2fs does not… resize; I'm guessing that it zeroes some areas.)

You wrote this:

> In practice we really need to take the time to do the unification work of all the various bespoke image building code into a common interface.

I agree, and minimizing the size of the filesystem (while explicitly reserving some slack, to avoid the issue fixed by this PR) should ideally be part of the checklist for such an effort. Do we have a document and/or ticket I should watch/contribute to?

@samueldr
Member

samueldr commented Jun 7, 2021

> Do we have a document and/or ticket I should watch/contribute to?

Not yet
