Builds for macOS are not reproducible #1230

Closed
giordano opened this issue Sep 19, 2022 · 3 comments

giordano commented Sep 19, 2022

See JuliaBinaryWrappers/HelloWorldC_jll.jl@7478301: notice that the git-tree-sha1 of the macOS artifacts changed, unlike those of all other platforms. The usual suspect is codesigning.

Edit 1: builds are non-reproducible even before codesigning, although codesigning may still make the binaries non-reproducible.

Edit 2: it's using -g which makes the build non-reproducible

sandbox:${WORKSPACE}/srcdir/build # cc -g -o hello_world /usr/share/testsuite/c/hello_world/hello_world.c; sha256sum hello_world
da9fa0c822e251a02a32e8f17b0bcfc60bcceeafdc06eb7e1f80de126bf99be8  hello_world
sandbox:${WORKSPACE}/srcdir/build # cc -g -o hello_world /usr/share/testsuite/c/hello_world/hello_world.c; sha256sum hello_world
e08247be6b32f970bf7c6362658f59a09a89f2105fe6ca0211bb5b057fc046ba  hello_world
sandbox:${WORKSPACE}/srcdir/build # cc -o hello_world /usr/share/testsuite/c/hello_world/hello_world.c; sha256sum hello_world
b6e81450b3ee15bcd92e4c1c84d738e8b92838d489195bb0f4510ed906c3590a  hello_world
sandbox:${WORKSPACE}/srcdir/build # cc -o hello_world /usr/share/testsuite/c/hello_world/hello_world.c; sha256sum hello_world
b6e81450b3ee15bcd92e4c1c84d738e8b92838d489195bb0f4510ed906c3590a  hello_world

With -g, the binary references the temporary object file, which has a seemingly random filename, and that name is embedded in the binary. You can see this by running strings hello_world and comparing the output with that of a different debug build.
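The comparison described above can be scripted. The following is a minimal sketch, not the tooling used in this issue: it mimics the default behaviour of strings (printable ASCII runs of at least 4 characters) and reports which strings appear in only one of two builds. The function names `printable_strings` and `differing_strings` are hypothetical.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII characters, roughly like `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def differing_strings(build_a: bytes, build_b: bytes) -> tuple:
    """Return (only in A, only in B) as two sorted lists of strings."""
    a = set(printable_strings(build_a))
    b = set(printable_strings(build_b))
    return sorted(a - b), sorted(b - a)
```

Reading two debug builds with `Path(...).read_bytes()` and passing them to `differing_strings` should surface the temporary object file names if they are the only difference.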

Edit 2.5: more specifically, this only happens when compiling and linking in a single pass. If object files are built first and then linked, they have the names given on the command line, which are likely deterministic. In most cases we use build systems to drive the build, and these compile object files separately (using deterministic names), so in practice this shouldn't be a worry as long as we use proper build systems.
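For illustration, the two-step approach can be sketched as a pair of command lines: compile with -c to an object file whose name is fixed, then link that object. This is only a sketch under the assumption that the object file sits next to the output; `two_step_commands` is a hypothetical helper, and whether the result is bit-for-bit reproducible still has to be checked on the actual toolchain.

```python
from pathlib import Path


def two_step_commands(source, output, cc="cc", flags=("-g",)):
    """Build argv lists for separate compile and link steps.

    The object file name is derived from the output name, so it does not
    depend on a randomly named temporary file chosen by the compiler driver.
    """
    obj = Path(output).with_suffix(".o")
    compile_cmd = [cc, *flags, "-c", "-o", str(obj), str(source)]
    link_cmd = [cc, "-o", str(output), str(obj)]
    return [compile_cmd, link_cmd]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order.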

Edit 3: Go folks ran into the same issue some time ago: golang/go#40979. The solution seems to be to use -Wl,-S:

sandbox:${WORKSPACE}/srcdir/build # cc -Wl,-S -g -o hello_world /usr/share/testsuite/c/hello_world/hello_world.c; sha256sum hello_world
warning: no debug symbols in executable (-arch x86_64)
b6e81450b3ee15bcd92e4c1c84d738e8b92838d489195bb0f4510ed906c3590a  hello_world
sandbox:${WORKSPACE}/srcdir/build # cc -Wl,-S -g -o hello_world /usr/share/testsuite/c/hello_world/hello_world.c; sha256sum hello_world
warning: no debug symbols in executable (-arch x86_64)
b6e81450b3ee15bcd92e4c1c84d738e8b92838d489195bb0f4510ed906c3590a  hello_world
sandbox:${WORKSPACE}/srcdir/build # cc -o hello_world /usr/share/testsuite/c/hello_world/hello_world.c; sha256sum hello_world
b6e81450b3ee15bcd92e4c1c84d738e8b92838d489195bb0f4510ed906c3590a  hello_world

giordano commented Sep 19, 2022

To recap: the issue arises only when doing debug builds (which we don't always do, especially when using CMake or Meson) and when object files are not compiled with deterministic names. We very often do compile them with deterministic names, through build systems, and that is also the only way to take advantage of ccache, which can't cache combined compilation+linking, as it doesn't cache linking at all.

I'd say let's not worry about this right now, as the reproducibility issue is likely to affect only a small fraction of packages. I'm going to close this issue, but we may reopen it in the future if we realise this non-reproducibility is more common than we'd like and we want to take stronger action to prevent it.

Side note: codesigning doesn't seem to affect the reproducibility of the final tarball.

giordano closed this as not planned on Sep 19, 2022
giordano commented

I haven't looked at how the object file names are written, but Elliot suggested that another possibility would be to add an audit pass, like the one in #1259, which normalises the names.


giordano commented Jan 24, 2023

I can confirm the path of the temporary object file is stored as a literal NULL-delimited string which contains

<NULL>/tmp/<source file>-<6-character random slug>.o<NULL>

when compiling with clang, or

<NULL>/tmp/<8-character random slug>.o<NULL>

when compiling with gcc, so we should indeed be able to normalise the random slug like in #1259.
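
A normalisation pass along these lines could look like the sketch below. It is not the audit pass from #1259, whose implementation is not shown here; it only illustrates the idea on the two path shapes described above. It assumes the random slug is alphanumeric (an assumption, not confirmed in this thread) and overwrites it with a same-length run of "0" so that no offsets in the binary shift. `normalise_object_paths` is a hypothetical name.

```python
import re

# <NULL>/tmp/<source file>-<6-character slug>.o<NULL>  (clang)
CLANG_TMP_OBJ = re.compile(
    rb"(?<=\x00)(/tmp/[^\x00/]+-)([A-Za-z0-9]{6})(\.o)(?=\x00)"
)
# <NULL>/tmp/<8-character slug>.o<NULL>  (gcc)
GCC_TMP_OBJ = re.compile(
    rb"(?<=\x00)(/tmp/)([A-Za-z0-9]{8})(\.o)(?=\x00)"
)


def _zero_slug(match):
    # Keep prefix and suffix, replace the slug with zeros of equal length.
    prefix, slug, suffix = match.groups()
    return prefix + b"0" * len(slug) + suffix


def normalise_object_paths(data: bytes) -> bytes:
    """Replace random slugs in embedded temporary object paths.

    The output has the same length as the input.
    """
    data = CLANG_TMP_OBJ.sub(_zero_slug, data)
    return GCC_TMP_OBJ.sub(_zero_slug, data)
```

Two debug builds that differ only in these slugs should produce identical bytes after normalisation.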
