makeInitrdNG: malformed squashfs images when filesize exceeds 2GB #203593
Comments
The Rust bits don't touch the contents of the initrd at all, they're just copied and then fed to cpio. I wonder if you're hitting some cpio bug. |
Sounds plausible, given that cpio is quite old, to my understanding. A bit more context from my triaging: the Linux kernel code for kexec had a bug/feature regarding the 2GB size due to use of the … @K900, do you think this issue should be kept open until the issue is found in the dependencies, or closed as unrelated to NixOS? |
I think we should at least look into this and maybe fix our cpio build to make it work. |
Found a lot of threads where people discuss this and the magic value of 2GB where problems start to occur. Then I found the following, which actually makes a lot of sense: "Downloads are placed in the 32-bit address space (i.e., below 4GB). It is very plausible that a large chunk of this address space is allocated for PCI BARs, and so you may have only ~2GB of actual RAM within this address space." Unfortunately, I think it is game over regarding this; we are stuck with the squashfs method. Or what do you guys think? |
Yeah, the only solution I see for that is a more complex setup where only a small initrd is fetched at PXE time and that initrd then obtains the Nix store squashfs (or otherwise getting the nix store -- depending on the use case nfs or similar might make sense too and allow for faster boots) from the netboot server some other way. That would require setting up networking and stuff in the initrd though. |
That is exactly what we are currently using as a workaround. It required some changes to the stage-1-init script: https://github.com/majbacka-labs/nixpkgs/commits/patch-init1sh. It should be stated that this does not have anything to do with fixing kexec, which is annoying since I would like to have both functionalities for the same output format. If you are also convinced that this does not directly relate to nixpkgs, as far as I am concerned, this issue should be closed. |
The 2GB limit has nothing to do with PXE. The limit is most likely a cpio bug.
The PXE forum thread linked above is irrelevant in this context because: 1) the initrd is created post-boot while using 64-bit addressing, and 2) the squashfs is mounted in the init1 stage, which can use 64-bit addressing. Kexec itself supports images of over 4GB, so that is not the problem.
The problem you linked would only make sense before the kernel is started. But the moment the kernel is started, which is always the case when you are handling squashfs, you have full access to RAM.
To fix this problem, the way cpio is packaged, along with its arguments, has to be reviewed.
Juuso
|
@jhvst Isn't the argument that the initrd itself, which is loaded before the kernel starts, is too big because it contains the squashfs? i.e. The problem is not mounting the squashfs; it's loading the initrd in the first place. |
I do not think so, but I may be wrong: the … FWIW, we wish to eventually upstream the changes for the workaround (see: #203750), but I guess an alternative solution would be to troubleshoot this issue. However, both options are low priority for us, while the required effort seems coincidentally high. My current estimate is that I may take a look at this sometime in June/July, but I cannot promise any resolution. |
@jhvst Look at how this initrd is created: nixpkgs/nixos/modules/installer/netboot/netboot.nix Lines 90 to 99 in fd2ac5b
The ordinary initrd is used in … This means that if the initrd is being truncated during load because it is too long, only the squashfs would be affected. So I think my explanation is still very likely correct. |
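One way to test the truncation hypothesis above is to find where two copies of the squashfs first diverge. The following is a minimal Python sketch, not something taken from nixpkgs: the names `first_difference`, `describe`, and `TWO_GIB` are hypothetical, and the reported offset is relative to the start of the compared files, not to the start of the initrd.

```python
TWO_GIB = 2**31  # the 2GB threshold discussed in this thread (assumed to mean 2 GiB)


def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical.

    If one file is a strict prefix of the other, the length of the
    shorter file is returned.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the common part: one chunk is shorter.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended without a difference
            offset += len(a)


def describe(offset):
    """Summarize an offset relative to the 2 GiB mark."""
    if offset is None:
        return "identical"
    side = "at or beyond" if offset >= TWO_GIB else "below"
    return f"first difference at byte {offset} ({side} the 2 GiB mark)"
```

Reading in chunks keeps memory use flat for multi-gigabyte images. A divergence clustered near one offset would point toward truncation or wrap-around, while scattered differences would suggest something else.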
Now I see, makes sense then. This might be a good test case then. I will most likely start debugging this expression. Thanks! |
It seems that this issue has now resolved itself -- initrds over 2GB boot fine. Thanks to everyone who shared their comments. This has been tested both on kexec and via ipxe netboot. |
Describe the bug
If you create initrd files over 2GB, for example with this configuration, and try to boot one, the bootup will fail when mounting the squashfs image in init-1-stage with the error "squashfs error unable to read id index table". As suggested, e.g., by #26230, this is indeed caused by data corruption. However, the corruption does not happen in transport; it happens when creating the initrd. Furthermore, and this is the real bug, the data corruption only happens when the resulting initrd exceeds 2GB in size. The data corruption can be verified by comparing the sha256sum of the squashfs images: first on the computer that built the image (i.e., from the Nix store path echoed in the final parts of the build process), and then on the computer that tries to boot the image, from the initrd root folder in the emergency shell. The files will be identical in size but differ in their hashes. If one uses the emergency shell to fetch the squashfs image from the Nix store and manually initializes init-2-stage, the OS will boot successfully.
Steps To Reproduce
Steps to reproduce the behavior:
nix-build -A pix.ipxe nvidia.nix -I home-manager=https://github.com/nix-community/home-manager/archive/master.tar.gz -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/refs/heads/nixos-unstable.zip. Alternatively, you can download a pre-built version of mine.
kernelModules in the Nix config file to include the drivers necessary for you to fetch the original squashfs image. This would be keyboard, filesystem, and/or network drivers. Alternatively, you can download a kernel that I used.
linux_latest from nixpkgs will do.
kexec-boot. If you modified the Nix configuration file to include kernelModules for your system, you can execute this script. If you decided to use my prebuilt images, run kexec --load phasedKernel --initrd=initrd -c "boot.shell_on_fail". Then, when you are ready to halt the system, run kexec -e.
f for emergency shell. Then, you will find the malformed squashfs in the root folder. Check its sha256sum. This should differ from your self-built image, or the file at http://boot.ponkila.com/squashfs.img. Bug reproduced. Done.
./init or prepare the filesystem manually to eventually launch switch_root. You can refer to this document for more details. Finally, the OS will boot successfully.
Expected behavior
I expect it to be fine if my initrd is over 2GB. Currently, it is not: I am unable to boot. Kexec shows that this is not a BIOS-related issue.
Screenshots
N/A
Additional context
I wrote more details here. I have triaged this issue a lot, from ruling out BusyBox issues, to UEFI compatibility, to RAM running out, and to kernel configuration issues with tmpfs files over 2GB. I have looked at the current implementation of makeInitrdNG, which I believe is at least related to the bug, but I cannot see how this issue could arise from the current Rust code. This bug does not seem to be an issue with any of the userspace or kernel code; hence, it is specific to the way the images are built on Nix.
This issue is probably not very high in priority -- in practice, the 2GB limit can be circumvented by modifying the init-1-stage script to download a rootfs over the Internet, much like what is done in reproducing this bug. However, I don't think this issue should exist unless there is something awry happening in the initrd build process; hence, it should probably be looked at.
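The size-and-hash check described under "Describe the bug" can be sketched in Python as follows. This is a minimal illustration, not part of nixpkgs: the function names `sha256_of` and `compare_images` are hypothetical, and the two paths stand for the squashfs in the Nix store and the copy found in the initrd root folder.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks so that
    images larger than 2GB do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_images(built_path, booted_path):
    """Compare two copies of an image by size and by SHA-256."""
    return {
        "same_size": os.path.getsize(built_path) == os.path.getsize(booted_path),
        "same_hash": sha256_of(built_path) == sha256_of(booted_path),
    }
```

The symptom reported here corresponds to `same_size` being true while `same_hash` is false. The hex digest should match what `sha256sum` prints for the same file, so values can be compared across the build machine and the emergency shell.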
Notify maintainers
@dasJ @ElvishJerricco @K900 @lheckemann
Metadata
N/A