Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tar-rs fails to decompress some ostree archives #367

Open
mripard opened this issue Jun 18, 2024 · 1 comment
Open

tar-rs fails to decompress some ostree archives #367

mripard opened this issue Jun 18, 2024 · 1 comment

Comments

@mripard
Copy link

mripard commented Jun 18, 2024

Hi,

I've been trying to write a tool that fetches container images and extracts them. Containers are stored as multiple tar.gz files that need to be decompressed in sequence.

It looks like tar-rs struggles to decompress some of those archives, for a reason that isn't entirely clear to me yet:

$ sudo target/debug/ocibootstrap test test-dir 
11:02:37 [INFO] Running ocibootstrap 0.1.0
11:02:38 [INFO] Found layer 81867c2cbb30529f835c9ff79532b7ae0c3585da08963dd398e179a425d887fb
11:02:38 [INFO] Blob retrieved, extracting ...
11:02:39 [INFO] Done
11:02:39 [INFO] Found layer d9a9ed9b0630076fc7fd8052cc6676271e5413084e19c36502793c079af153ea
11:02:40 [INFO] Blob retrieved, extracting ...
11:02:43 [INFO] Done
11:02:43 [INFO] Found layer b16913c15949a37f7740651a76a0b1eeb717c885f577afcd39893160be77c747
11:02:43 [INFO] Blob retrieved, extracting ...
thread 'main' panicked at src/main.rs:458:40:
called `Result::unwrap()` on an `Err` value: Custom { kind: NotFound, error: TarError { desc: "failed to unpack `/var/home/max/Work/experimentations/ocibootstrap/test-dir/usr/lib/.build-id/91/319a27e0de0f6d6f0606ca103fa52f5ab78601`", io: Custom { kind: NotFound, error: "No such file or directory (os error 2) while canonicalizing /var/home/max/Work/experimentations/ocibootstrap/test-dir/sysroot/ostree/repo/objects/1c/a63fafc9ae2c7b25dbdf4dba215d4e0a7a3f9136c5b1e02bfa767da611d965.file" } } }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I've made a test tool that just tries to decompress a tar.gz archive using the same code, and can reproduce the issue:

$ sudo target/debug/examples/test-tar /root/.cache/ocibootstrap/b16913c15949a37f7740651a76a0b1eeb717c885f577afcd39893160be77c747 new-new-new-new-example-test-dir 
Input File /root/.cache/ocibootstrap/b16913c15949a37f7740651a76a0b1eeb717c885f577afcd39893160be77c747
Output Dir: new-new-new-new-example-test-dir
thread 'main' panicked at examples/test-tar.rs:32:31:
called `Result::unwrap()` on an `Err` value: Custom { kind: NotFound, error: TarError { desc: "failed to unpack `new-new-new-new-example-test-dir/usr/lib/.build-id/91/319a27e0de0f6d6f0606ca103fa52f5ab78601`", io: Custom { kind: NotFound, error: "No such file or directory (os error 2) while canonicalizing /var/home/max/Work/experimentations/ocibootstrap/new-new-new-new-example-test-dir/sysroot/ostree/repo/objects/1c/a63fafc9ae2c7b25dbdf4dba215d4e0a7a3f9136c5b1e02bfa767da611d965.file" } } }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

tar works just fine:

$ sudo tar xf /root/.cache/ocibootstrap/b16913c15949a37f7740651a76a0b1eeb717c885f577afcd39893160be77c747
$ echo $?
0

If I have a look at the output from tar though, it appears that usr/lib/.build-id/91/319a27e0de0f6d6f0606ca103fa52f5ab78601 is a symlink:

$ ls -l usr/lib/.build-id/91/319a27e0de0f6d6f0606ca103fa52f5ab78601
lrwxrwxrwx. 2 root root 47  1 janv.  1970 usr/lib/.build-id/91/319a27e0de0f6d6f0606ca103fa52f5ab78601 -> ../../../../usr/lib64/firefox/minidump-analyzer

And a63fafc9ae2c7b25dbdf4dba215d4e0a7a3f9136c5b1e02bfa767da611d965.file seems to be a symlink to the same file, but a broken one:

$ ls -l sysroot/ostree/repo/objects/1c/a63fafc9ae2c7b25dbdf4dba215d4e0a7a3f9136c5b1e02bfa767da611d965.file
lrwxrwxrwx. 2 root root 47  1 janv.  1970 sysroot/ostree/repo/objects/1c/a63fafc9ae2c7b25dbdf4dba215d4e0a7a3f9136c5b1e02bfa767da611d965.file -> ../../../../usr/lib64/firefox/minidump-analyzer

It makes sense for this link to be broken because ostree deploys the actual files as hardlinks to their stored objects under ostree/repo.

And sure enough, in the tar version:

$ ls -i usr/lib/.build-id/91/319a27e0de0f6d6f0606ca103fa52f5ab78601 sysroot/ostree/repo/objects/1c/a63fafc9ae2c7b25dbdf4dba215d4e0a7a3f9136c5b1e02bfa767da611d965.file
56540564 sysroot/ostree/repo/objects/1c/a63fafc9ae2c7b25dbdf4dba215d4e0a7a3f9136c5b1e02bfa767da611d965.file
56540564 usr/lib/.build-id/91/319a27e0de0f6d6f0606ca103fa52f5ab78601

The '/usr/lib one ends up a hardlink to the one under 'sysroot/ostree/repo, but the link is only valid under /usr/lib, and the one under sysroot is kind of expected to be broken.

So it looks to me that tar-rs is a bit to cautious there, or at least more that GNU tar is, and we could possibly relax (with an option?) tar-rs ?

@mripard mripard changed the title tar-rs fails to decompress some archives tar-rs fails to decompress some ostree archives Jun 18, 2024
@cgwalters
Copy link
Collaborator

Hi, I wrote that code (in https://github.com/ostreedev/ostree-rs-ext/ ).

Before I forget I also have some larger questions about what you're thinking of doing with that linked tool https://github.com/mripard/ocibootstrap and its overlap/intersection with http://github.com/containers/bootc and especially bootc install...but those are all side questions better done perhaps as an issue on your repository.

So it looks to me that tar-rs is a bit to cautious there, or at least more that GNU tar is, and we could possibly relax (with an option?) tar-rs ?

Without digging in deeply here I think that's possible...but it needs some deeper analysis.
First, long link support here is relatively new and it's entirely possible I missed something.

Actually of important note here is that for how ostree actually uses these we don't call unpack_in...and in fact the unit test I added only re-parses the tarball.

Handling symlinks during unpacking (especially being robust against malicious tar) is a longstanding issue here (previously #90 etc).

I will say personally my view is that the maximally safe and robust way to handle this is using e.g. https://github.com/bytecodealliance/cap-std - I thought about making that an optional dependency here.

Also I see another issue here which may be a dup of this one or related: #369

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants