Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What is the correct frame format for linux #956

Closed
xnox opened this issue Nov 25, 2020 · 11 comments
Closed

What is the correct frame format for linux #956

xnox opened this issue Nov 25, 2020 · 11 comments

Comments

@xnox
Copy link

xnox commented Nov 25, 2020

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1835660

It's very long, but the most interesting piece is the attached patch

https://launchpadlibrarian.net/507407918/0001-unlz4-Handle-0-size-chunks-discard-trailing-padding-.patch

I am interested to know if the lz4 compressed initrd that we produce is correct or not.
And if the linux kernel decompression of the lz4 compressed initrd is correct or not.
And what the spec for lz4 says it should be.

Any help with this would be highly appreciated.

@Cyan4973
Copy link
Member

As I remember it, the kernel code employs the legacy frame format,
which is discouraged but still available in the lz4 CLI (both encoding and decoding sides).

The legacy frame format is specified in this document. It's an old format, which was initially conceived as a demo, and as a consequence is fairly rigid. What it has for it is its simplicity, and maybe that's why it was preferred. However, note that it depends on an externally-provided end-of-file signal to stop the decoding process.

I've seen mention in the discussion that it's expected that a build tool would round the compressed size to the nearest 4-byte boundary. If that's the case, this is unlikely to be the lz4 CLI, and it's easy to test : just use lz4 -l on any file, and you'll quickly find a few files which compress to a non-multiple of 4.
If that happens, it means the compressed payload contains a few bytes after the last block, and it may invite the decoder to attempt decoding them too. Since the new block is supposed to start with a 4-bytes header, and there are less than 4-bytes left in the buffer, the code either over-read beyond buffer boundary, or detect the issue and errors out.

Thing is, the liblz4 library doesn't provide any entry point to decode (nor generate) a legacy frame. It's up to the user program to provide the logic, while the liblz4 library only provides entry points to decode or encode the blocks themselves.
For example, the lz4 CLI drives the process directly in its IO unit : https://github.com/lz4/lz4/blob/v1.9.3/programs/lz4io.c#L947
As can be seen, faced with a situation where a few (<4) bytes are left after the last block, it's likely to detect it and error out.
However, this is obviously not the code used by the kernel, which provides its own routine. How the kernel code reacts in these circumstances is unknown to me.

This leaves a few next steps on the table :

  • Check if the tool generating the compressed payload (initramfs?) does pad a few bytes after the end of compressed payload. See if it can be fixed.
  • Check how the legacy frame decoder within the linux kernel behaves when presented with bogus data containing a few bytes after compressed payload. See if it requires fixing. See if it can be made compatible (if need be) with initramfs use case.
  • Longer term : maybe consider transition towards supporting the official frame format, starting with an ability to decode both legacy and official lz4 frame formats. Decoding the official lz4 frame format can be delegated to the liblz4 libraty (lz4frame unit), so the responsibility doesn't depend on some kernel-specific code. (Note : they are easily distinguished through their initial 4-bytes magic number).

@xnox
Copy link
Author

xnox commented Nov 26, 2020

So on my system I have

printf "%x\n" `stat -L -c "%s" /boot/initrd.img`
56b1410

# cat initrd.img | hexdump | tail -n2
56b1400 5245 2121 0021 0001 1cff 0050 0000 0000
56b1410

Thus my understanding is that total length of initrd is 0x56b1410, with the last bytes being the 32-bit zero.

Then if one reads beyond that, it should be EOF notification or it should be the magic of the next concatenated lz4 of something (block/frame?!).

Now, I do wonder if how and where kernel could get the EOF notification from. Cause i'ts just reading memory locations as far as I understand. Or how it could guess the total size of initrd. My theory is that there could be anything after the "0000 0000" in the memory. Hopefully, most of the time, it is unparasable garbage. Ideally, it would be nice if grub had a way to signal to kernel what's the total length of initrds, such that it doesn't try to read beyond that.

@hsiangkao
Copy link

hsiangkao commented Nov 26, 2020

Hi all,
I'm using Ubuntu 20.04 as well. Just saw this, and I also found the message in my dmesg (Although I can boot the system...)
"[ 1.098138] Initramfs unpacking failed: Decoding failed"

After digging a bit, I'm afraid the analysis & fix of the following patch is wrong
https://launchpadlibrarian.net/507407918/0001-unlz4-Handle-0-size-chunks-discard-trailing-padding-.patch

That is my experiment, first patched the kernel decompress_unlz4.c to show the whole in_len

diff --git a/lib/decompress_unlz4.c b/lib/decompress_unlz4.c
index c0cfcfd486be..75fd87a5f260 100644
--- a/lib/decompress_unlz4.c
+++ b/lib/decompress_unlz4.c
@@ -159,6 +159,10 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
                dest_len = ret;
 #endif
                if (ret < 0) {
+#ifndef PREBOOT
+                       pr_err("chunksize %lu(%lx) in_len %lu(%lx) offset %lu(%lx)\n",
+                              chunksize, chunksize, in_len, in_len, inp - input, inp - input);
+#endif
                        error("Decoding failed");
                        goto exit_2;
                }

and that is what I got
[ 0.198971] Trying to unpack rootfs image as initramfs...
[ 1.097727] chunksize 3435921408(cccc0000) in_len 723885820(2b259efc) offset 723885822(2b259efe)
[ 1.097735] fbcon: Taking over console
[ 1.098138] Initramfs unpacking failed: Decoding failed

and then I hacked /usr/bin/unmkinitramfs as follows and ran with /boot/initrd.img-5.10.0-rc4+

                mv "$subarchive" "/tmp/tmp"
#               xcpio "$subarchive" "${dir:+$dir/main}" -i "$@"

finally I ran $ xxd /tmp/tmp | tail -n 5

2b259eb0: 3443 3274 0044 1738 7400 0011 0a0c 7800  4C2t.D.8t.....x.
2b259ec0: 1f33 7800 0c1f 3278 0024 1b44 7800 562f  .3x...2x.$.Dx.V/
2b259ed0: 6468 6370 7c00 0f01 0012 2f31 3001 0023  dhcp|...../10..#
2b259ee0: 1442 0900 bf54 5241 494c 4552 2121 2100  .B...TRAILER!!!.
2b259ef0: 0100 ffc0 5000 0000 0000                 ....P.....

and ls -l /tmp/tmp
-rw------- 1 hsiangkao hsiangkao 723885818 Nov 27 03:14 /tmp/tmp

so you can see 723885818(2b259efa) != 723885820(2b259efc). So that is the real cause of decompression failure (in_len mismatched).

I didn't dig into anymore, but the patch suggested in the thread above is just a workaround, not the root cause of the issue.

Thanks,
Gao Xiang

@xnox
Copy link
Author

xnox commented Nov 26, 2020

@hsiangkao thank you! this is useful. I was suspecting that initrd on disk, is not the initrd in memory. Now I guess i must figure out where the in_len came from.

@xnox
Copy link
Author

xnox commented Nov 26, 2020

grub_add (size, ALIGN_UP (grub_file_size (files[i]), 4), &size)

@xnox
Copy link
Author

xnox commented Nov 26, 2020

I think it's a grub bug. It aligns up by 4 bytes. and then aligns multiple initrds at 4 bytes boundry too. Then it declares the total size as it allocated the memory, rather than the initrds true sizes.

@xnox
Copy link
Author

xnox commented Nov 26, 2020

I think something like https://github.com/rhboot/grub2/pull/77/files is the right place to fix this in.

@xnox
Copy link
Author

xnox commented Dec 1, 2020

Actually I'm wrong about grub. It does everything right.

Initramfs buffer format is specified at https://www.kernel.org/doc/html/latest/driver-api/early-userspace/buffer-format.html and it explicitly states that any number of zero padding is allowed. Thus when compatible lz4 decompressor is in use, it tries in error, to consume all the size specified to it, when in practice it should stop once zeros stop making any sense and return the consumed input buffer position.

Thus imho, I now think that unlz4 decompressor is implemented incorrectly in the linux kernel, w.r.t. the expected linux decompressor api.

@xnox xnox closed this as completed Dec 1, 2020
@hsiangkao
Copy link

Actually I'm wrong about grub. It does everything right.

Initramfs buffer format is specified at https://www.kernel.org/doc/html/latest/driver-api/early-userspace/buffer-format.html and it explicitly states that any number of zero padding is allowed. Thus when compatible lz4 decompressor is in use, it tries in error, to consume all the size specified to it, when in practice it should stop once zeros stop making any sense and return the consumed input buffer position.

Thus imho, I now think that unlz4 decompressor is implemented incorrectly in the linux kernel, w.r.t. the expected linux decompressor api.

No, what's shown is the cpio(initramfs) format (which describes the uncompressed data), not cpio+lz4 format. in which case, lz4 legacy frame format is a container of cpio(initramfs) format. and lz4 legacy frame format needs the exact in_len of the compressed data (not more and less). So there is nothing wrong with this format. If grub cannot guarantee such in_len match, lz4 legacy frame format cannot be used then.

@hsiangkao
Copy link

plus, that is also why "/tmp/tmp" in my previous comment can be decompressed with "lz4 -d", but if you add 2 more 0 bytes to "/tmp/tmp", it won't decompress correctly. that is a limitation to lz4 legacy frame format, not related to its content --- initramfs format at all.

@xnox
Copy link
Author

xnox commented Dec 1, 2020

Actually I'm wrong about grub. It does everything right.
Initramfs buffer format is specified at https://www.kernel.org/doc/html/latest/driver-api/early-userspace/buffer-format.html and it explicitly states that any number of zero padding is allowed. Thus when compatible lz4 decompressor is in use, it tries in error, to consume all the size specified to it, when in practice it should stop once zeros stop making any sense and return the consumed input buffer position.
Thus imho, I now think that unlz4 decompressor is implemented incorrectly in the linux kernel, w.r.t. the expected linux decompressor api.

No, what's shown is the cpio(initramfs) format (which describes the uncompressed data), not cpio+lz4 format. in which case, lz4 legacy frame format is a container of cpio(initramfs) format. and lz4 legacy frame format needs the exact in_len of the compressed data (not more and less). So there is nothing wrong with this format. If grub cannot guarantee such in_len match, lz4 legacy frame format cannot be used then.

Yes, lz4 legacy frame format relies on EOF marker or size, which is never known here in advance.

initramfs := ("\0" | cpio_archive | cpio_gzip_archive)*

initramfs buffer is a series of either zeros, cpio_archives (uncompressed), compressed_archives. Star * at the end means that all of those can be repeated multiple times.

grub passes the whole buffer, and the total size of it, which may contain more than one lz4 compressed archive. Meaning, there will be outstanding size & input buffer when lz4 decompression of the first archive completes.

concatenating two compressed cpio archives and loading as initrd must work, with or without grub.

monogon-bot pushed a commit to monogon-dev/monogon that referenced this issue Nov 14, 2023
There are two issues at play here: One is a bug in pierrec/lz4 when
using the legacy framing format [1]. This bit us when we hit a broken
size region with CL:2130, taking hours to debug.

The other is the fact that the Linux LZ4 frame format has significant
design issues [2], especially with concatenanted initrds.

The first issue could be fixed by switching to a different LZ4
implementation (we do even have the reference impl in the monorepo) but
there is no API to generate the legacy frame format and things like [3],
a patch carried by Ubuntu to fix more edge cases just do not inspire
confidence in such a solution.

Thus, this CL switches over to using zstd for compressing initrds.

Zstd is slower than LZ4 for decompressing, but it still decompresses at
multiple GB/s per core while having a much better compression ratio.
It also doesn't have any Linux-specific bits and Linux uses the
reference implementation for decoding, which should make it much more
robust. So overall I think this is a good tradeoff.

[1] pierrec/lz4#156
[2] lz4/lz4#956 (comment)
[3] https://launchpadlibrarian.net/507407918/0001-unlz4-Handle-0-size-chunks-discard-trailing-padding-.patch

Change-Id: I69cf69f2f361de325f4b39f2d3644ee729643716
Reviewed-on: https://review.monogon.dev/c/monogon/+/2313
Tested-by: Jenkins CI
Reviewed-by: Serge Bazanski <serge@monogon.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants