Bit-for-bit deterministic / reproducible builds #34902
Comments
steveklabnik added the A-compiler label Jul 18, 2016
In general, being reproducible is something we're interested in; we try to tackle it bug by bug. |
This probably has some obvious cause:
If someone wants to look at the code generation differences, it's probably best to start with libcore. The reproducible-builds.org diff isn't that useful for that because it doesn't recognize .rlib files as archives. |
Thanks for the heads up, I just added AR support in diffoscope and hopefully the diff output will see that in the next few weeks, when the website updates. |
If it's a |
I tried replacing HashMap with FnvHashMap in
perhaps i'm Doing It Wrong. |
infinity0 referenced this issue Aug 20, 2016 (Closed): perf.rust-lang.org test for syntex-syntax messed up #35855
As background, HashMap in Rust is non-deterministic by default: its hasher is randomly seeded to protect against certain types of DoS attack, so iteration order can differ from one run to the next. You can switch to the deterministic FnvHashMap if you're sure your code will always be called in a safe manner. This ought to be true for rustc itself. (I notice some online "try-it-yourself" Rust web services let me run "ls /" and other shell commands, so I could also exploit this HashMap ordering issue, but I also assume that they're clever enough to set a ulimit and/or containerise the thing.) |
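To make the point about hashing concrete, here is a minimal sketch using only the standard library. It is not what rustc does internally. `FixedHashMap`, `iteration_order` and `sorted_order` are names made up for this example, and the fixed-key `DefaultHasher` stands in for FnvHashMap (which lives in an external crate). The iteration order should be stable for a given binary, though the hashing algorithm itself is not guaranteed to stay the same across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::BuildHasherDefault;

/// A HashMap whose hasher is built with fixed keys instead of per-process
/// random ones, so iteration order does not change from run to run.
/// Only appropriate for trusted input: this gives up HashDoS protection.
pub type FixedHashMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;

/// Insert the names into a fixed-key map and return them in iteration order.
pub fn iteration_order(names: &[&str]) -> Vec<String> {
    let mut map: FixedHashMap<String, usize> = FixedHashMap::default();
    for (i, name) in names.iter().enumerate() {
        map.insert(name.to_string(), i);
    }
    map.keys().cloned().collect()
}

/// Alternative: an ordered map iterates in sorted key order, which does not
/// depend on any hasher at all.
pub fn sorted_order(names: &[&str]) -> Vec<String> {
    let map: BTreeMap<String, usize> = names
        .iter()
        .enumerate()
        .map(|(i, name)| (name.to_string(), i))
        .collect();
    map.keys().cloned().collect()
}
```

Swapping the hasher keeps O(1) lookups, while the `BTreeMap` variant trades that for an order that is easy to reason about when the map is only iterated to produce output.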
I wrote a small script to diff metadata (I couldn't really get the makefile to work). It looks like even more is changing between two compiles using the current master (or nightly): The crate hash and/or disambiguator, which are stored right after the target triple. @infinity0 can you confirm? EDIT: That might get fixed by #35854 |
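For anyone who wants to repeat this kind of comparison, a rough sketch of a byte-level diff follows. This is not the script mentioned above. `first_difference` and `diff_files` are hypothetical helpers, and the sketch only reports where two extracted metadata blobs first diverge.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Return the offset of the first byte at which two buffers differ, or None
/// if they are identical. A length mismatch counts as a difference at the end
/// of the shorter buffer.
pub fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    a.iter()
        .zip(b.iter())
        .position(|(x, y)| x != y)
        .or_else(|| {
            if a.len() != b.len() {
                Some(a.len().min(b.len()))
            } else {
                None
            }
        })
}

/// Read two files (for example metadata extracted from two builds) and report
/// the first differing offset, if any.
pub fn diff_files(first: &Path, second: &Path) -> io::Result<Option<usize>> {
    let a = fs::read(first)?;
    let b = fs::read(second)?;
    Ok(first_difference(&a, &b))
}
```

An offset shortly after the target triple would be consistent with the crate hash or disambiguator changing, as described above.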
Okay, apparently replacing |
The remaining issue is that (some?) predicates are encoded in non-deterministic order. Maybe we need to do this for all bounds? Not sure what would be the correct way to do that (or why these are nondeterministic in the first place). |
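One generic way to remove this kind of ordering dependence is to sort items into a canonical order right before they are written out. The sketch below only illustrates that idea. `Predicate` and `encode_predicates` are hypothetical stand-ins, not rustc's actual types or encoder, and the byte layout is invented for the example.

```rust
/// Hypothetical stand-in for an item that gets written to crate metadata.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Predicate {
    pub self_ty: String,
    pub trait_name: String,
}

/// Encode predicates in a canonical order so that the output bytes do not
/// depend on the order in which the caller happened to collect them.
pub fn encode_predicates(mut preds: Vec<Predicate>) -> Vec<u8> {
    // The derived Ord compares self_ty first, then trait_name.
    preds.sort();
    let mut out = Vec::new();
    for p in &preds {
        out.extend_from_slice(p.self_ty.as_bytes());
        out.push(b':');
        out.extend_from_slice(p.trait_name.as_bytes());
        out.push(b'\n');
    }
    out
}
```

Sorting at the point of encoding fixes the output even if the source of the nondeterminism (for example a randomly seeded hash map upstream) is never tracked down, at the cost of hiding that source.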
Just for reference, there's this: #34805 |
@michaelwoerister Thanks! Too tired to think about it currently. Will have a look tomorrow :) |
jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 22, 2016
#24473 is the rustdoc subset of this. |
eddyb added a commit to eddyb/rust that referenced this issue Aug 23, 2016
jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 23, 2016
jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 23, 2016
jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 24, 2016
jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 25, 2016
jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 25, 2016
bors added a commit that referenced this issue Aug 26, 2016
jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 27, 2016
jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 27, 2016
bors added a commit that referenced this issue Aug 28, 2016
bors added a commit that referenced this issue Aug 28, 2016
Potentially relevant, from IRC:
though I think that it's not a big issue unless the hashtable is truly random. |
With 1.21.0+dfsg1-3 Debian, rustc the compiler is built deterministically (and build-path-independently) but libstd has some things remaining (112MB diff):
./usr/lib/x86_64-linux-gnu/librustc-2d0e4ac759a96c13.so
objdump --line-numbers --disassemble --demangle --section=.text {}
@@ -39978,25 +39978,25 @@
26491: 48 89 e5 mov %rsp,%rbp
/usr/src/rustc-1.21.0/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/src/float/sub.rs:11
26494: f2 0f 5c c1 subsd %xmm1,%xmm0
/usr/src/rustc-1.21.0/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/src/macros.rs:255
26498: 5d pop %rbp
26499: c3 retq
2649a: 66 90 xchg %ax,%ax
-/build/1st/rustc-1.21.0+dfsg1/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:39
+/build/rustc-1.21.0+dfsg1/2nd/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:39
2649c: 66 0f 6e c7 movd %edi,%xmm0
-/build/1st/rustc-1.21.0+dfsg1/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:40
+/build/rustc-1.21.0+dfsg1/2nd/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:40
264a0: 48 c1 ef 20 shr $0x20,%rdi
-/build/1st/rustc-1.21.0+dfsg1/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:41
+/build/rustc-1.21.0+dfsg1/2nd/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:41
264a4: 48 0b 3d 25 19 00 00 or 0x1925(%rip),%rdi
-/build/1st/rustc-1.21.0+dfsg1/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:42
+/build/rustc-1.21.0+dfsg1/2nd/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:42
264ab: 66 0f 56 05 fd 18 00 orpd 0x18fd(%rip),%xmm0
264b2: 00
What's strange is that some of the paths are already remapped to |
I should mention that I had to disable jemalloc in Debian for other reasons, so builds with that enabled might still hit the nondeterminism I mentioned earlier. |
Nice to hear we almost got rid of nondeterminism! The last bunch of path dependence looks like it comes from compiler-rt asm files that are a part of compiler-builtins - I think the build-script that compiles them all is here. I think it's just that we don't pass debuginfo remapping all the way down to gcc. |
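As a sketch of what "remapping" means here: the two builds in the diff above differ only in their build directory, so rewriting that prefix to a fixed location makes the recorded paths identical. The helper below is hypothetical and only models the idea. For C and assembly sources the equivalent would presumably be passing something like gcc's `-fdebug-prefix-map=old=new` through to the C compiler, which is an assumption on my part rather than something the build script is known to do.

```rust
use std::path::{Path, PathBuf};

/// Replace a leading `from` prefix of `path` with `to`.
/// Paths that do not start with `from` are returned unchanged.
pub fn remap_prefix(path: &Path, from: &Path, to: &Path) -> PathBuf {
    match path.strip_prefix(from) {
        Ok(rest) => to.join(rest),
        Err(_) => path.to_path_buf(),
    }
}
```

Applied to the `/build/1st/...` and `/build/.../2nd/...` paths from the objdump output with the same target prefix, both builds would record the same source path.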
luser referenced this issue Feb 5, 2018 (Closed): rlibs are not bit-identical when building the same crate in different directories with -Zremap-path-prefix #48019
infinity0 referenced this issue Feb 21, 2018 (Open): [regression] [1.24] extern crate proc_macro: found two different crates with name `std` that are not distinguished by differing `-C metadata` #48319
infinity0 referenced this issue Jul 28, 2018 (Closed): reproducible builds: non-deterministic use of cmpq #50556
quentinlesceller referenced this issue Aug 21, 2018 (Closed): Provide 64 bits Mac and Linux binaries #1379
daym commented Sep 25, 2018
Rust 1.22.1 has a reproducibility problem in liblibc's rust.metadata.bin. The symbols in there seem to be shuffled around (example: |
@daym I can't find anything in the metadata encoder that immediately seems to be in error here. It would be interesting to look into this some more. |
daym commented Oct 24, 2018
@michaelwoerister Thanks for checking! Further information: with newer Rust versions, the situation is much better. We work around the remaining reproducibility problems of Rust 1.25, 1.26 and 1.27 by using LLVM 3.9.1 instead of LLVM 6.0.1, which makes their builds reproducible. Rust 1.28.0 is fine with LLVM 6.0.1, although Rust 1.29.2's cargo does not build reproducibly (for the latter see #50556, also https://lists.gnu.org/archive/html/guix-patches/2018-10/msg00491.html ). Our build of Rust 1.22.1 above, the one with the problem observed, used LLVM 3.9.1. Interestingly, older versions of Rust (1.19.0, 1.20.0, 1.21.0) are apparently reproducible! So in principle the fix could be bisected. But we use Rust 1.22.1 as part of a bootstrapping effort, bootstrapping the Rust compiler from C++ via mrustc. mrustc so far only supports compiling the Rust 1.19.0 compiler, so that's how we ended up (eventually) caring about Rust 1.22.1. That said, maybe we will just wait until mrustc supports compiling the Rust 1.23.0 compiler directly. We'll see which comes first. |
elichai commented Jan 17, 2019
What's the status of this? Is it already possible to make a reproducible/deterministic build or not? |
tarcieri commented Jan 17, 2019
My understanding is that builds are now (sometimes?) reproducible and that people are using reprotest successfully in CI. To me it seems like we're not too far away from robust reproducible builds, but the main problem right now, besides the lingering small issues mentioned above, is the lack of easy-to-use tooling for producing and verifying such builds. I opened an issue about that on the Secure Code WG issue tracker: |
Isn't this a significant milestone for reproducibility? https://www.reddit.com/r/rust/comments/afscgo/ripgrep_0100_is_reproducible_in_debian/ |
tarcieri commented Jan 17, 2019
Well, if you're ok with cargo culting a config from a project called https://github.com/kpcyrd/sniffglue/blob/master/ci/reprotest.sh |
Just to prevent some confusion, these are two separate issues: 1) reproducible builds of software written in Rust, and 2) reproducible builds of the Rust compiler. Since the Rust compiler is written in Rust, 2 requires 1. Since this issue lives in the Rust compiler repository, it is mostly about 2. 2 is very close but not done, so this issue remains open. 1 is largely possible now, although tooling is lacking. I think further discussion of 1 should happen in the Secure Code WG. |
tarcieri commented Jan 18, 2019
Thanks for the clarification @sanxiyn ! If anyone is interested in discussing tooling for producing and verifying reproducible builds, particularly for things like CI use cases, I opened an issue on the Secure Code WG repo here: |
infinity0 referenced this issue Jan 20, 2019 (Closed): man page generation should use SOURCE_DATE_EPOCH #57776
Continuing off from last time, more than a year ago:
The cc crate now honours CFLAGS and CPPFLAGS so this problem has gone away and I can confirm the libcompiler_builtins builds reproducibly independently of build-path including the C code. However there is a new regression, dylib_metadata is no longer reproducible. (rlib_metadata is fine.) Please see these samples: These files were extracted from the |
I think this is the last piece so if someone can figure this out in the next month (before Mar 12) we could in theory get a reproducible |
It's not just a binary format, but also compressed on top (e.g. you can't see any strings):
rust/src/librustc_metadata/locator.rs Lines 876 to 889 in f40aaa6
The metadata header is currently 12 bytes:
rust/src/librustc_metadata/schema.rs Lines 45 to 46 in f40aaa6
So you can strip those first 12 bytes then inflate the rest. |
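A minimal sketch of the first half of that recipe, assuming the 12-byte header length quoted above still holds for the revision being inspected. `METADATA_HEADER_LEN` and `split_metadata` are names invented for this example. The inflate step is deliberately left out, since the standard library has no deflate decoder and an external crate would be needed for it.

```rust
/// Header length as stated in the comment above. This is an assumption taken
/// from that comment and may not hold for other compiler revisions.
pub const METADATA_HEADER_LEN: usize = 12;

/// Split a raw metadata blob into (header, compressed body).
/// Returns None if the blob is too short to contain a full header.
pub fn split_metadata(blob: &[u8]) -> Option<(&[u8], &[u8])> {
    if blob.len() < METADATA_HEADER_LEN {
        return None;
    }
    Some(blob.split_at(METADATA_HEADER_LEN))
}
```

The returned body is what would then be handed to a deflate decoder before diffing, so that differences show up in readable strings rather than in compressed bytes.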
My mistake, rlib_metadata is also affected, here is another sample: Note that the other contents of the |
infinity0 commented Jul 18, 2016
It would be good if rustc could generate bit-for-bit reproducible results, even in the presence of minor system environment differences. Currently we have quite a large diff: e.g. see txt diff for 1.9.0 or perhaps txt diff for 1.10.0 a few days after I'm posting this. (You might want to "save link as" instead of displaying it directly in the browser.)
Much of the diff output is due to build-id differences, which can be ignored since they are caused by other deeper issues and will go away once these deeper issues are fixed. One example of a deeper issue is this:
Here are the system variations that might be causing these issues. I myself am not that familiar with ELF, but perhaps someone else here would know why the section header is 4 bytes later in the first vs second builds.