Layers with symlinks differ depending on OS #23

Closed
YorikSar opened this issue Jun 6, 2022 · 4 comments · Fixed by #25

Comments

@YorikSar
Contributor

YorikSar commented Jun 6, 2022

I've noticed that when I run patched skopeo on Linux, it succeeds, but when I run it with the same JSON on Darwin, I get

FATA[0001] writing blob: writing to temporary on-disk layer: happened during read: Digest did not match, expected sha256:..., got sha256:...

And the layer in question is a .so library that has symlinks in it.

It looks like nix2container produces different hashes on different OSs if the layer contains symlinks. To check this theory, I added a symlink to the test data and ran the test on two different systems. On Darwin I got:

$ ln -s file1 data/tar-directory/symlink; go test ./nix -run TestTar
--- FAIL: TestTar (0.00s)
    tar_test.go:19: Digest is sha256:9a9402c601020cb24b71d4287b143d7bda875869970a1ed35792caa1702fdb69 while it should be sha256:efccbbe35209d59cfeebd8e73785258d3679fa258f72a7dfbc2eec65695fd5c8
FAIL
FAIL    github.com/nlewo/nix2container/nix      0.244s
FAIL

And on Linux:

$ ln -s file1 data/tar-directory/symlink; go test ./nix -run TestTar
--- FAIL: TestTar (0.00s)
    tar_test.go:19: Digest is sha256:077af73ad0fb226436e92a272318b777b6976b85c3a05d86183274818dd634f8 while it should be sha256:efccbbe35209d59cfeebd8e73785258d3679fa258f72a7dfbc2eec65695fd5c8
FAIL
FAIL    github.com/nlewo/nix2container/nix      0.013s
FAIL

Note that the two digests differ from each other, even though they should match. Without this symlink the test passes on both systems, so the digests are the same.

This prevents me from using nix2container on macOS with a Linux builder.

@nlewo
Owner

nlewo commented Jun 6, 2022

@YorikSar Thx for this nice bug report!

I have never used Darwin and I don't know what could differ regarding symlinks...

If you set reproducible=false in a nix2container image or layer, the produced store path contains the JSON file and the tar file. This could allow us to diff the two produced tar files in order to know what differs (with tar -tvf for instance).

It would be really useful if you could provide such a diff for a simple image (containing a symlink) built on Linux and on Darwin!

@YorikSar
Contributor Author

YorikSar commented Jun 7, 2022

I ran this program on both systems:

package main

import (
	"io"
	"log"
	"os"

	"github.com/nlewo/nix2container/nix"
	"github.com/nlewo/nix2container/types"
)

func main() {
	path := types.Path{
		Path: "data/tar-directory",
	}
	t := nix.TarPaths(types.Paths{path})
	out, err := os.Create("out.tar")
	if err != nil {
		log.Fatal(err)
	}
	_, err = io.Copy(out, t)
	if err != nil {
		log.Fatal(err)
	}
}

Results are (ran through gzip -k9 for GitHub):
darwin.tar.gz
linux.tar.gz

The diff of the od -t x1 -a output for the two files is very small:

--- darwin.txt	2022-06-07 12:09:58.000000000 +0400
+++ linux.txt	2022-06-07 12:10:05.000000000 +0400
@@ -74,18 +74,18 @@
 0003020    72  79  2f  73  79  6d  6c  69  6e  6b  00  00  00  00  00  00
            r   y   /   s   y   m   l   i   n   k nul nul nul nul nul nul
 0003040    00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
          nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
 *
-0003140    00  00  00  00  30  30  30  30  37  35  35  00  30  30  30  30
-         nul nul nul nul   0   0   0   0   7   5   5 nul   0   0   0   0
+0003140    00  00  00  00  30  30  30  30  37  37  37  00  30  30  30  30
+         nul nul nul nul   0   0   0   0   7   7   7 nul   0   0   0   0
 0003160    30  30  30  00  30  30  30  30  30  30  30  00  30  30  30  30
            0   0   0 nul   0   0   0   0   0   0   0 nul   0   0   0   0
 0003200    30  30  30  30  30  30  30  00  30  30  30  30  30  30  30  30
            0   0   0   0   0   0   0 nul   0   0   0   0   0   0   0   0
-0003220    30  30  30  00  30  31  37  30  34  33  00  20  32  66  69  6c
-           0   0   0 nul   0   1   7   0   4   3 nul  sp   2   f   i   l
+0003220    30  30  30  00  30  31  37  30  34  37  00  20  32  66  69  6c
+           0   0   0 nul   0   1   7   0   4   7 nul  sp   2   f   i   l
 0003240    65  31  00  00  00  00  00  00  00  00  00  00  00  00  00  00
            e   1 nul nul nul nul nul nul nul nul nul nul nul nul nul nul
 0003260    00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
          nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
 *

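The first difference (755 vs 777) is the permissions field (confirmed on both systems: ln -s sets different permissions). The second (017043 vs 017047) is the tar header checksum, which changes only because the mode bytes changed. I also checked the problematic Nix store path on both systems, and the symlinks there have different permissions too!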

# Linux
$ ll /nix/store/lrhl751psj7fwfc009wqy2b62pwccnvf-libunistring-0.9.10/lib
total 1.6M
dr-xr-xr-x 2 root 4.0K Jan  1  1970 .
dr-xr-xr-x 3 root 4.0K Jan  1  1970 ..
-r-xr-xr-x 1 root  990 Jan  1  1970 libunistring.la
lrwxrwxrwx 1 root   21 Jan  1  1970 libunistring.so -> libunistring.so.2.1.0
lrwxrwxrwx 1 root   21 Jan  1  1970 libunistring.so.2 -> libunistring.so.2.1.0
-r-xr-xr-x 1 root 1.6M Jan  1  1970 libunistring.so.2.1.0

# Darwin
$ ll /nix/store/lrhl751psj7fwfc009wqy2b62pwccnvf-libunistring-0.9.10/lib
total 3200
dr-xr-xr-x  6 root  nixbld   192B Jan  1  1970 .
dr-xr-xr-x  3 root  nixbld    96B Jan  1  1970 ..
-r-xr-xr-x  1 root  nixbld   990B Jan  1  1970 libunistring.la
lrwxr-xr-x  1 root  nixbld    21B Jan  1  1970 libunistring.so -> libunistring.so.2.1.0
lrwxr-xr-x  1 root  nixbld    21B Jan  1  1970 libunistring.so.2 -> libunistring.so.2.1.0
-r-xr-xr-x  1 root  nixbld   1.6M Jan  1  1970 libunistring.so.2.1.0

Note that this store path was built on this Linux machine and then fetched to the Darwin one. A bug report for Nix is coming 🚂

Running chmod -h 777 data/tar-directory/symlink on Darwin made the script produce identical output on both systems again. While this seems like a bug in Nix, maybe in the meantime we should address it on the nix2container side by chmodding all symlinks in layers to 777 when building on Darwin, or to 755 when building on Linux?

@YorikSar
Contributor Author

YorikSar commented Jun 7, 2022

Digging a bit deeper, I found that the NAR format is only described in Eelco's thesis, which includes the following wording (on page 90 of the thesis, 98 of the PDF):

Fortunately, for our purposes—which is storing components in the Nix store—most of these features are either not relevant (drive letters, device nodes, etc.), can be ignored (hard links), or are never used in practice (streams). Some others, such as permissions, we will ignore to make life simpler.

He then proceeds to describe the NAR format, which indeed ignores all permissions except the executable flag on regular files. Since the NAR format is used to transfer store paths between machines, we can expect such discrepancies in OS-dependent things like symlink permissions (and they are, very much, OS-dependent, as it appears).

I guess since it's impossible to change symlink permissions on Linux, and Docker always (*) runs on Linux, and Nix doesn't care about permissions anyway, we should be forcing symlink permissions to 777 in layers.

(*) Docker can natively run containers on Linux and Windows, and I know of efforts to bring it to FreeBSD. FreeBSD is not officially supported by Docker, and efforts to run Docker on it without emulation haven't gotten anywhere (yet?). Windows is not supported by Nix (yet?), and doesn't really have Unix permissions. I don't know of other container runtimes that can run outside Linux.

@YorikSar
Contributor Author

YorikSar commented Jun 7, 2022

Created a PR with a fix for this.
