What is the correct frame format for linux #956
As I remember it, the kernel code employs the legacy frame format. The legacy frame format is specified in this document. It's an old format, which was initially conceived as a demo, and as a consequence is fairly rigid. What it has going for it is its simplicity, and maybe that's why it was preferred. However, note that it depends on an externally provided end-of-file signal to stop the decoding process. I've seen mention in the discussion that a build tool is expected to round the compressed size to the nearest 4-byte boundary. If that's the case, this is unlikely to be the [...] Thing is, the [...] This leaves a few next steps on the table: [...]
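To make the dependence on an external end-of-input signal concrete, here is a minimal Python sketch that walks the block headers of a legacy frame. It assumes the layout from the legacy frame specification (magic number 0x184C2102, then blocks that are each prefixed with a 4-byte little-endian compressed size). The name walk_legacy_frame is made up for illustration, rejecting a zero-size block is a choice of this sketch, and block payloads are not decompressed.

```python
import struct

LEGACY_MAGIC = 0x184C2102  # LZ4 legacy frame magic number, stored little-endian


def walk_legacy_frame(buf):
    """Return (payload_offset, compressed_size) for each block of one frame.

    Hypothetical helper: only the framing is walked, nothing is decompressed.
    """
    if len(buf) < 4 or struct.unpack_from("<I", buf, 0)[0] != LEGACY_MAGIC:
        raise ValueError("not an LZ4 legacy frame")
    pos = 4
    blocks = []
    while pos < len(buf):  # the end of the input is the only end-of-frame signal
        if len(buf) - pos < 4:
            raise ValueError(f"truncated block header at offset {pos}")
        (size,) = struct.unpack_from("<I", buf, pos)
        if size == LEGACY_MAGIC:  # another frame starts here
            break
        pos += 4
        # Assumption of this sketch: a zero or oversized size field is an error.
        if size == 0 or size > len(buf) - pos:
            raise ValueError(f"implausible block size {size} at offset {pos - 4}")
        blocks.append((pos, size))
        pos += size
    return blocks


# One frame holding a single 3-byte block (the payload is a placeholder).
frame = struct.pack("<II", LEGACY_MAGIC, 3) + b"abc"
print(walk_legacy_frame(frame))
```

With the exact length the walk ends cleanly. With two extra bytes appended, the walker cannot tell padding from a truncated block header and raises an error.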
|
|
So on my system I have [...] Thus my understanding is that the total length of the initrd is 0x56b1410, with the last bytes being a 32-bit zero. If one reads beyond that, there should be either an EOF notification or the magic number of the next concatenated lz4 something (block/frame?!). Now, I do wonder how and where the kernel could get the EOF notification from, because it's just reading memory locations as far as I understand, or how it could guess the total size of the initrd. My theory is that there could be anything after the "0000 0000" in memory. Hopefully, most of the time, it is unparsable garbage. Ideally, it would be nice if grub had a way to signal to the kernel what the total length of the initrds is, so that it doesn't try to read beyond that.
|
Hi all, After digging a bit, I'm afraid the analysis and fix in the following patch are wrong: [...] Here is my experiment. First I patched the kernel's decompress_unlz4.c to print the whole in_len, and this is what I got: [...] Then I hacked /usr/bin/unmkinitramfs as follows: [...] and ran it with /boot/initrd.img-5.10.0-rc4+. Finally I ran $ xxd /tmp/tmp | tail -n 5 [...] and ls -l /tmp/tmp [...] So you can see that 723885818 (0x2b259efa) != 723885820 (0x2b259efc). That is the real cause of the decompression failure: in_len is mismatched. I didn't dig any further, but the patch suggested in the thread above is just a workaround and does not address the root cause of the issue. Thanks,
|
@hsiangkao thank you! This is useful. I was suspecting that the initrd on disk is not the initrd in memory. Now I guess I must figure out where the in_len came from.
|
grub_add (size, ALIGN_UP (grub_file_size (files[i]), 4), &size)
|
I think it's a grub bug. It aligns each initrd's size up to 4 bytes, and then places multiple initrds at 4-byte boundaries too. Then it declares the total size as the size of the memory it allocated, rather than the true sizes of the initrds.
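The arithmetic behind that suspicion can be checked with a small Python sketch. It assumes the smaller of the two sizes quoted earlier in the thread (0x2b259efa) is the on-disk file size. align_up is a hypothetical stand-in for the ALIGN_UP macro, written for power-of-two alignments only.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


file_size = 0x2B259EFA              # 723885818, assumed to be the size on disk
reported = align_up(file_size, 4)   # size after rounding up to 4 bytes
print(hex(reported), reported - file_size)  # prints: 0x2b259efc 2
```

The two-byte difference matches the in_len mismatch reported above.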
|
I think something like https://github.com/rhboot/grub2/pull/77/files is the right place to fix this.
|
Actually I'm wrong about grub. It does everything right. The initramfs buffer format is specified at https://www.kernel.org/doc/html/latest/driver-api/early-userspace/buffer-format.html and it explicitly states that any amount of zero padding is allowed. Thus, when an lz4 decompressor is in use, it wrongly tries to consume the whole size passed to it, when in practice it should stop once the zeros stop making any sense and return the consumed input buffer position. So I now think that the unlz4 decompressor in the Linux kernel is implemented incorrectly with respect to the expected Linux decompressor API.
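A minimal Python sketch of the behaviour proposed above: stop at the first point where the input no longer looks like a block, and report how many bytes were consumed. This is an illustration only, not the kernel's unlz4 code and not any existing patch. consumed_by_legacy_frame is a hypothetical name, and treating a zero size field as padding is an assumption of the sketch.

```python
import struct

LEGACY_MAGIC = 0x184C2102  # LZ4 legacy frame magic number, stored little-endian


def consumed_by_legacy_frame(buf):
    """Return how many leading bytes of buf belong to the first legacy frame."""
    if len(buf) < 4 or struct.unpack_from("<I", buf, 0)[0] != LEGACY_MAGIC:
        raise ValueError("not an LZ4 legacy frame")
    pos = 4
    while len(buf) - pos >= 4:
        (size,) = struct.unpack_from("<I", buf, pos)
        if size == 0 or size == LEGACY_MAGIC:
            break  # assumed zero padding, or the start of the next frame
        if size > len(buf) - pos - 4:
            raise ValueError(f"block at offset {pos} overruns the buffer")
        pos += 4 + size  # skip the size field and the block payload
    return pos  # the caller decides what to do with any remaining bytes


# One 11-byte frame (placeholder payload) followed by different kinds of tail.
frame = struct.pack("<II", LEGACY_MAGIC, 3) + b"abc"
print(consumed_by_legacy_frame(frame + b"\x00\x00"))  # short tail is left over
print(consumed_by_legacy_frame(frame + bytes(8)))     # zero size field stops the walk
print(consumed_by_legacy_frame(frame + frame))        # next magic stops the walk
```

The sketch does not resolve the case where fewer than four padding bytes are followed by another archive: the size field read at that position would mix padding with the next magic number.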
No, what's shown is the cpio (initramfs) format, which describes the uncompressed data, not the cpio+lz4 format. In that case, the lz4 legacy frame format is a container for the cpio (initramfs) format, and the lz4 legacy frame format needs the exact in_len of the compressed data (no more and no less). So there is nothing wrong with this format. If grub cannot guarantee that in_len matches, then the lz4 legacy frame format cannot be used.
|
Plus, that is also why "/tmp/tmp" in my previous comment can be decompressed with "lz4 -d", but if you append 2 more zero bytes to "/tmp/tmp", it won't decompress correctly. That is a limitation of the lz4 legacy frame format, and has nothing to do with its content, the initramfs format.
Yes, the lz4 legacy frame format relies on an EOF marker or a size, which is never known here in advance.
The initramfs buffer is a series of zeros, cpio_archives (uncompressed), or compressed_archives. grub passes the whole buffer and its total size, and the buffer may contain more than one lz4-compressed archive. This means there will be outstanding size and input buffer left when lz4 decompression of the first archive completes. Concatenating two compressed cpio archives and loading them as an initrd must work, with or without grub.
There are two issues at play here:

One is a bug in pierrec/lz4 when using the legacy framing format [1]. This bit us when we hit a broken size region with CL:2130, taking hours to debug.

The other is the fact that the Linux LZ4 frame format has significant design issues [2], especially with concatenated initrds.

The first issue could be fixed by switching to a different LZ4 implementation (we do even have the reference impl in the monorepo), but there is no API to generate the legacy frame format, and things like [3], a patch carried by Ubuntu to fix more edge cases, just do not inspire confidence in such a solution.

Thus, this CL switches over to using zstd for compressing initrds. Zstd is slower than LZ4 for decompressing, but it still decompresses at multiple GB/s per core while having a much better compression ratio. It also doesn't have any Linux-specific bits, and Linux uses the reference implementation for decoding, which should make it much more robust. So overall I think this is a good tradeoff.

[1] pierrec/lz4#156
[2] lz4/lz4#956 (comment)
[3] https://launchpadlibrarian.net/507407918/0001-unlz4-Handle-0-size-chunks-discard-trailing-padding-.patch

Change-Id: I69cf69f2f361de325f4b39f2d3644ee729643716
Reviewed-on: https://review.monogon.dev/c/monogon/+/2313
Tested-by: Jenkins CI
Reviewed-by: Serge Bazanski <serge@monogon.tech>
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1835660
It's very long, but the most interesting piece is the attached patch
https://launchpadlibrarian.net/507407918/0001-unlz4-Handle-0-size-chunks-discard-trailing-padding-.patch
I am interested to know whether the lz4-compressed initrd that we produce is correct,
whether the Linux kernel's decompression of the lz4-compressed initrd is correct,
and what the lz4 spec says it should be.
Any help with this would be highly appreciated.