Bit-for-bit deterministic / reproducible builds #34902

Open
infinity0 opened this issue Jul 18, 2016 · 50 comments

@infinity0
Contributor

commented Jul 18, 2016

It would be good if rustc could generate bit-for-bit reproducible results, even in the presence of minor system environment differences. Currently we have quite a large diff: e.g. see txt diff for 1.9.0 or perhaps txt diff for 1.10.0 a few days after I'm posting this. (You might want to "save link as" instead of displaying it directly in the browser.)

Much of the diff output is due to build-id differences, which can be ignored since they are caused by other deeper issues and will go away once these deeper issues are fixed. One example of a deeper issue is this:

│   │   │   ├── ./usr/lib/i386-linux-gnu/librustc_data_structures-e8edd0fd.so
│   │   │   │   ├── readelf --wide --file-header {}
│   │   │   │   │ @@ -6,15 +6,15 @@
│   │   │   │   │    OS/ABI:                            UNIX - System V
│   │   │   │   │    ABI Version:                       0
│   │   │   │   │    Type:                              DYN (Shared object file)
│   │   │   │   │    Machine:                           Intel 80386
│   │   │   │   │    Version:                           0x1
│   │   │   │   │    Entry point address:               0x4620
│   │   │   │   │    Start of program headers:          52 (bytes into file)
│   │   │   │   │ -  Start of section headers:          338784 (bytes into file)
│   │   │   │   │ +  Start of section headers:          338780 (bytes into file)
│   │   │   │   │    Flags:                             0x0
│   │   │   │   │    Size of this header:               52 (bytes)
│   │   │   │   │    Size of program headers:           32 (bytes)
│   │   │   │   │    Number of program headers:         8
│   │   │   │   │    Size of section headers:           40 (bytes)
│   │   │   │   │    Number of section headers:         30
│   │   │   │   │    Section header string table index: 29
│   │   │   │   ├── readelf --wide --program-header {}

Here are the system variations that might be causing these issues. I myself am not that familiar with ELF, but perhaps someone else here would know why the section header is 4 bytes later in the first vs second builds.
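For anyone who wants to track this one field across rebuilds without reading full readelf output, here is a minimal Rust sketch that prints the "Start of section headers" value (`e_shoff`) for each file given on the command line. The names `section_header_offset` and `read_uint` are made up for illustration and do not come from any tool mentioned in this thread. It assumes the standard ELF header layout, where `e_shoff` sits at byte 32 (4 bytes) in ELF32 and at byte 40 (8 bytes) in ELF64.

```rust
use std::{env, fs};

/// Interpret `bytes` as an unsigned integer in the given byte order.
fn read_uint(bytes: &[u8], little_endian: bool) -> u64 {
    let step = |acc: u64, b: &u8| (acc << 8) | u64::from(*b);
    if little_endian {
        bytes.iter().rev().fold(0, step)
    } else {
        bytes.iter().fold(0, step)
    }
}

/// Return e_shoff ("Start of section headers") from a raw ELF image,
/// or None if the buffer does not look like an ELF file.
fn section_header_offset(elf: &[u8]) -> Option<u64> {
    if !elf.starts_with(b"\x7fELF") {
        return None;
    }
    // e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian.
    let little_endian = match *elf.get(5)? {
        1 => true,
        2 => false,
        _ => return None,
    };
    // e_ident[4] is EI_CLASS: 1 = ELF32, 2 = ELF64.
    let field = match *elf.get(4)? {
        1 => elf.get(32..36)?,
        2 => elf.get(40..48)?,
        _ => return None,
    };
    Some(read_uint(field, little_endian))
}

fn main() {
    // Usage (hypothetical binary name): shoff build1/libfoo.so build2/libfoo.so
    for path in env::args().skip(1) {
        let offset = fs::read(&path)
            .ok()
            .and_then(|bytes| section_header_offset(&bytes));
        println!("{}: {:?}", path, offset);
    }
}
```

This only reports where the offsets differ; it does not explain why an earlier section grew or shrank by 4 bytes.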

@steveklabnik

Member

commented Jul 18, 2016

In general, being reproducible is something we're interested in; we try to tackle it bug by bug.

@eefriedman

Contributor

commented Jul 18, 2016

This probably has some obvious cause:

│   │   │   │ -<h4 id='method.unzip' class='method'><code>fn <a href='../../core/iter/trait.Iterator.html#method.unzip' class='fnname'>unzip</a>&lt;A, B, FromA, FromB&gt;(self) -&gt; (FromA, FromB) <span class='where'>where FromB: <a class='trait' href='../../core/default/trait.Default.html' title='core::default::Default'>Default</a> + <a class='trait' href='../../core/iter/trait.Extend.html' title='core::iter::Extend'>Extend</a>&lt;B&gt;, FromA: <a class='trait' href='../../core/default/trait.Default.html' title='core::default::Default'>Default</a> + <a class='trait' href='../../core/iter/trait.Extend.html' title='core::iter::Extend'>Extend</a>&lt;A&gt;, Self: <a class='trait' href='../../core/iter/trait.Iterator.html' title='core::iter::Iterator'>Iterator</a>&lt;Item=(A, B)&gt;</span></code></h4>
│   │   │   │ +<h4 id='method.unzip' class='method'><code>fn <a href='../../core/iter/trait.Iterator.html#method.unzip' class='fnname'>unzip</a>&lt;A, B, FromA, FromB&gt;(self) -&gt; (FromA, FromB) <span class='where'>where FromB: <a class='trait' href='../../core/default/trait.Default.html' title='core::default::Default'>Default</a> + <a class='trait' href='../../core/iter/trait.Extend.html' title='core::iter::Extend'>Extend</a>&lt;B&gt;, Self: <a class='trait' href='../../core/iter/trait.Iterator.html' title='core::iter::Iterator'>Iterator</a>&lt;Item=(A, B)&gt;, FromA: <a class='trait' href='../../core/default/trait.Default.html' title='core::default::Default'>Default</a> + <a class='trait' href='../../core/iter/trait.Extend.html' title='core::iter::Extend'>Extend</a>&lt;A&gt;</span></code></h4>

If someone wants to look at the code generation differences, it's probably best to start with libcore. The reproducible-builds.org diff isn't that useful for that because it doesn't recognize .rlib files as archives.

@infinity0

Contributor Author

commented Jul 18, 2016

Thanks for the heads up, I just added AR support in diffoscope and hopefully the diff output will see that in the next few weeks, when the website updates.

@infinity0

Contributor Author

commented Aug 16, 2016

Here's a diff of rust.metadata.bin from libcore.rlib, from 1.10.0. (2.0 MB, ~1700 lines, might make your browser slow.) Mostly ordering differences. @eddyb has suggested HashMap ordering as a culprit; if so, the offending HashMap could be replaced with FnvHashMap.

@eddyb

Member

commented Aug 16, 2016

If it's a HashMap, I don't know which one. cc @rust-lang/compiler

@infinity0

Contributor Author

commented Aug 20, 2016

I tried replacing HashMap with FnvHashMap in src/librustc_metadata/loader.rs but it didn't seem to help. Here is the Makefile I'm using to automate my rebuilds:

TRIPLET = x86_64-unknown-linux-gnu

all: libcore.diff

libcore.diff: rust.metadata.bin.1 rust.metadata.bin.2
    diffoscope --html-dir "$@" rust.metadata.bin.1 rust.metadata.bin.2 \
        --max-diff-block-lines 5000 \
        --max-diff-input-lines 10000000 \
        --max-report-size 204800000; true # diffoscope returns 1 for "there were diffs"

rust.metadata.bin.1 rust.metadata.bin.2: Makefile
    rm -f $(TRIPLET)/stage2/lib/rustlib/$(TRIPLET)/lib/libcore-*.rlib
    rm -f $(TRIPLET)/stage2/lib/rustlib/$(TRIPLET)/lib/stamp.core
    $(MAKE) $(TRIPLET)/stage2/lib/rustlib/$(TRIPLET)/lib/stamp.core
    ar x $(TRIPLET)/stage2/lib/rustlib/$(TRIPLET)/lib/libcore-*.rlib rust.metadata.bin
    mv rust.metadata.bin "$@"

.PHONY: all rust.metadata.bin.1 rust.metadata.bin.2

Perhaps I'm Doing It Wrong.

@infinity0

Contributor Author

commented Aug 20, 2016

As background, HashMap in rust is non-deterministic to protect against certain types of DoS attack. You can switch it to the deterministic FnvHashMap if you're sure your code will always be called in a safe manner. This ought to be true for rustc itself. (I notice some online "try-it-yourself" rust web services let me run "ls /" and other shell commands, so I could also exploit this HashMap ordering issue, but I also assume that they're clever enough to set a ulimit and/or containerise the thing.)
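To make the difference concrete, here is a small self-contained sketch. It does not use rustc's own FnvHashMap type; instead it hand-rolls a 64-bit FNV-1a hasher (the `Fnv1a` struct, the `DetMap` alias and the `iteration_order` helper are all illustrative names) and plugs it into the standard `HashMap` through `BuildHasherDefault`. Because the hasher has no per-process random seed, the same sequence of inserts should give the same iteration order on every run, which is not something the default `RandomState` hasher promises.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hand-written 64-bit FNV-1a hasher with a fixed starting state (no random seed).
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap whose hashing is the same in every process.
type DetMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Insert the names in the given order and report the order the map yields them in.
fn iteration_order(names: &[&str]) -> Vec<String> {
    let mut map: DetMap<String, usize> = Default::default();
    for (index, name) in names.iter().enumerate() {
        map.insert(name.to_string(), index);
    }
    map.keys().cloned().collect()
}

fn main() {
    let names = ["unzip", "map", "filter", "fold"];
    // Two independently built maps agree on their iteration order.
    assert_eq!(iteration_order(&names), iteration_order(&names));
    println!("{:?}", iteration_order(&names));
}
```

Note that the order is repeatable but still arbitrary: it depends on the insertion sequence and on the standard library's table layout, so output that must be stable across compiler versions is better sorted explicitly.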

@jonas-schievink

Member

commented Aug 22, 2016

I wrote a small script to diff metadata (I couldn't really get the makefile to work).

It looks like even more is changing between two compiles using the current master (or nightly): The crate hash and/or disambiguator, which are stored right after the target triple. @infinity0 can you confirm?

EDIT: That might get fixed by #35854

@jonas-schievink

Member

commented Aug 22, 2016

Okay, apparently replacing Hash{Map,Set} in resolve with FnvHash{Map,Set} got rid of the first (smaller) blocks of the diff. The big block at the bottom is still there. Will continue to randomly sed the compiler and report back ;)

@jonas-schievink

Member

commented Aug 22, 2016

The remaining issue is that (some?) predicates are encoded in non-deterministic order. Maybe we need to do this for all bounds? Not sure what would be the correct way to do that (or why these are nondeterministic in the first place).

@michaelwoerister

Contributor

commented Aug 22, 2016

Just for reference, there's this: #34805

@jonas-schievink

Member

commented Aug 22, 2016

@michaelwoerister Thanks! Too tired to think about it currently. Will have a look tomorrow :)

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 22, 2016

Use `FnvHashMap` in more places
* A step towards rust-lang#34902
* More stable error messages in some places related to crate loading
* Possible slight performance improvements since all `HashMap`s
  replaced had small keys where `FnvHashMap` should be faster
  (although I didn't measure)

@sanxiyn

Member

commented Aug 23, 2016

#24473 is the rustdoc subset of this.

eddyb added a commit to eddyb/rust that referenced this issue Aug 23, 2016

Rollup merge of rust-lang#35909 - jonas-schievink:more-fnv, r=eddyb
Use `FnvHashMap` in more places

* A step towards rust-lang#34902 (see my comments there)
* More stable error messages in some places related to crate loading
* Possible slight performance improvements since all `HashMap`s
  replaced had small keys where `FnvHashMap` should be faster
  (although I didn't measure)

(likewise for `HashSet` -> `FnvHashSet`)

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 23, 2016

Make metadata encoding deterministic
`ty::Predicate` was being used as a key for a hash map, but its hash
implementation indirectly hashed addresses, which vary between each
compiler run. This is fixed by sorting predicates by their ID before
encoding them.

This *should* basically fix issue rust-lang#34902, although landing a test would
be nice (I also didn't test a scenario where multiple crates were
involved, only libcore).
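The idea in this commit message can be illustrated outside the compiler. The sketch below is not rustc's encoder: the `u32` keys stand in for whatever stable ID the real predicates carry, and `encode_sorted` and its byte layout are invented for the example. It only shows the general technique of collecting the entries of a hash map, sorting them by a stable key, and serializing them in that order so the output no longer depends on hash iteration order.

```rust
use std::collections::HashMap;

/// Serialize (id, text) entries in ascending id order, so the bytes do not
/// depend on the map's internal iteration order.
fn encode_sorted(predicates: &HashMap<u32, String>) -> Vec<u8> {
    let mut entries: Vec<(&u32, &String)> = predicates.iter().collect();
    entries.sort_by_key(|(id, _)| **id);

    let mut out = Vec::new();
    for (id, text) in entries {
        // Hypothetical layout: id, then text length, then the text bytes.
        out.extend_from_slice(&id.to_le_bytes());
        out.extend_from_slice(&(text.len() as u32).to_le_bytes());
        out.extend_from_slice(text.as_bytes());
    }
    out
}

fn main() {
    let mut first = HashMap::new();
    first.insert(2u32, "Self: Iterator".to_string());
    first.insert(1u32, "FromA: Default".to_string());

    let mut second = HashMap::new();
    second.insert(1u32, "FromA: Default".to_string());
    second.insert(2u32, "Self: Iterator".to_string());

    // Different insertion order, identical encoded bytes.
    assert_eq!(encode_sorted(&first), encode_sorted(&second));
    println!("{} bytes encoded", encode_sorted(&first).len());
}
```

Sorting costs an extra pass over the entries, but it makes the output independent of both the hasher and the insertion order.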

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 23, 2016

Make metadata encoding deterministic
`ty::Predicate` was being used as a key for a hash map, but its hash
implementation indirectly hashed addresses, which vary between each
compiler run. This is fixed by sorting predicates by their ID before
encoding them.

In my tests, rustc is now able to produce deterministic results when
compiling libcore and libstd.

I've beefed up `run-make/reproducible-build` to compare the produced
artifacts bit-by-bit. This doesn't catch everything, but should be a
good start.

cc rust-lang#34902
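As a rough stand-in for the kind of bit-by-bit check described here (this is not the actual `run-make/reproducible-build` test), the following sketch compares two artifacts and reports the first byte offset at which they diverge. The function name `first_difference` and the command-line usage are assumptions made for the example.

```rust
use std::{env, fs, process};

/// Return the offset of the first differing byte, or None if the inputs are identical.
/// If one input is a strict prefix of the other, the shorter length is returned.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b).position(|(x, y)| x != y) {
        Some(offset) => Some(offset),
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    if args.len() != 2 {
        eprintln!("usage: compare <artifact-1> <artifact-2>");
        return;
    }
    let a = fs::read(&args[0]).expect("cannot read first artifact");
    let b = fs::read(&args[1]).expect("cannot read second artifact");
    match first_difference(&a, &b) {
        None => println!("identical ({} bytes)", a.len()),
        Some(offset) => {
            println!("first difference at byte offset {}", offset);
            process::exit(1);
        }
    }
}
```

A check like this says only whether and where two builds differ; a tool such as diffoscope is still needed to understand the difference.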

bors added a commit that referenced this issue Aug 26, 2016

Auto merge of #35984 - jonas-schievink:reproducible-builds, r=eddyb
Steps towards reproducible builds

cc #34902

Running `make dist` twice will result in a rustc tarball where only `librustc_back.so`, `librustc_llvm.so` and `librustc_trans.so` differ. Building `libstd` and `libcore` twice with the same compiler and flags produces identical artifacts.

The third commit should close #24473

@nikomatsakis

Contributor

commented Sep 2, 2016

Potentially relevant, from IRC:

<eddyb> nmatsakis, infinity0: fwiw, I just found another hash table. not random, but may cause issues around incremental recompilation. see rustc_metadata::encoder::encode_reachable (the reachable NodeSet)

though I think that it's not a big issue unless the hashtable is truly random.

@daym

commented Sep 25, 2018

Rust 1.22.1 has a reproducibility problem in liblibc's rust.metadata.bin. The symbols in there seem to be shuffled around (example: getutxent).

@michaelwoerister

Contributor

commented Oct 24, 2018

@daym I can't find anything in the metadata encoder that immediately seems to be in error here. Would be interesting to look into this some more.

@daym

commented Oct 24, 2018

@michaelwoerister Thanks for checking!

Further information:

With newer rusts, the situation is much better. We work around the remaining reproducibility problems of rust 1.25, rust 1.26, rust 1.27 by using llvm 3.9.1 instead of llvm 6.0.1, which makes their build reproducible then. Rust 1.28.0 is fine with llvm 6.0.1, although rust 1.29.2's cargo does not build reproducibly (for the latter see #50556, also https://lists.gnu.org/archive/html/guix-patches/2018-10/msg00491.html ).

Our build of rust 1.22.1 above (the one with the observed problem) used llvm 3.9.1. Interestingly, older versions of rust (1.19.0, 1.20.0, 1.21.0) are apparently reproducible!

So in principle the fix could be bisected.

But we use rust 1.22.1 as part of a bootstrapping effort, bootstrapping the rust compiler from C++ via mrustc.

mrustc so far only supports compiling the rust 1.19.0 compiler, so that's how we ended up (eventually) caring about rust 1.22.1.

That said, maybe just wait until mrustc supports compiling the rust 1.23.0 compiler directly. We'll see which comes first.

@elichai

commented Jan 17, 2019

What's the status of this? Is it already possible to make a reproducible/deterministic build or not?
(I'm interested in both binaries and staticlibs)

@tarcieri

commented Jan 17, 2019

My understanding is builds are now (sometimes?) reproducible and that people are using reprotest successfully in CI.

To me it seems like we're not too far away from robust reproducible builds, but the main problem right now, besides the lingering small issues mentioned above, is the lack of easy-to-use tooling for producing and verifying such builds.

I opened an issue about that on the Secure Code WG issue tracker:

rust-secure-code/wg#28

@zmanian

Contributor

commented Jan 17, 2019

Isn't this a significant milestone on reproducibility?

https://www.reddit.com/r/rust/comments/afscgo/ripgrep_0100_is_reproducible_in_debian/

@dwijnand

Member

commented Jan 17, 2019

My understanding is builds are now (sometimes?) reproducible and that people are using reprotest successfully in CI.

@tarcieri can you share some links of CI setups for that?

@tarcieri

commented Jan 17, 2019

Well, if you're ok with cargo culting a config from a project called sniffglue, here is an example:

https://github.com/kpcyrd/sniffglue/blob/master/ci/reprotest.sh

@sanxiyn

Member

commented Jan 18, 2019

Just to prevent some confusion: these two issues are separate: 1) reproducible builds of software written in Rust, and 2) reproducible builds of the Rust compiler. Since the Rust compiler is written in Rust, 2 requires 1.

Since this is an issue on the Rust compiler repository, it is mostly about 2. 2 is very close but not done, so this issue remains open.

1 is largely possible now, although tooling is lacking. I think further discussion of 1 should happen in the Secure Code WG.

@tarcieri

commented Jan 18, 2019

Thanks for the clarification @sanxiyn !

If anyone is interested in discussing tooling for producing and verifying reproducible builds, particularly for things like CI use cases, I opened an issue on the Secure Code WG repo here:

rust-secure-code/wg#28

@infinity0

Contributor Author

commented Jan 31, 2019

Continuing off from last time, more than a year ago:

@infinity0

Nice to hear we almost got rid of nondeterminism! The last bunch of path dependence looks like it comes from compiler-rt asm files that are a part of compiler-builtins - I think the build-script that compiles them all is here.

I think it's just that we don't pass debuginfo remapping all the way down to gcc.

The cc crate now honours CFLAGS and CPPFLAGS so this problem has gone away and I can confirm the libcompiler_builtins builds reproducibly independently of build-path including the C code.

However there is a new regression, dylib_metadata is no longer reproducible. (rlib_metadata is fine.) Please see these samples:

These files were extracted from the .rustc ELF section of the respective .so files. I don't know how to parse the binary data into a human-readable format, so I can't continue further by myself. Someone needs to explain it a bit more.

@infinity0

Contributor Author

commented Jan 31, 2019

I think this is the last piece so if someone can figure this out in the next month (before Mar 12) we could in theory get a reproducible rustc into the next Debian Stable, if that motivates anyone to work on it.

@eddyb

Member

commented Jan 31, 2019

I don't know how to parse the binary data into a human-readable format, so I can't continue further by myself.

It's not just a binary format, but also compressed on top (e.g. you can't see any strings):

// The header is uncompressed
let header_len = METADATA_HEADER.len();
debug!("checking {} bytes of metadata-version stamp", header_len);
let header = &buf[..cmp::min(header_len, buf.len())];
if header != METADATA_HEADER {
    return Err(format!("incompatible metadata version found: '{}'",
                       filename.display()));
}
// Header is okay -> inflate the actual metadata
let compressed_bytes = &buf[header_len..];
debug!("inflating {} bytes of compressed metadata", compressed_bytes.len());
let mut inflated = Vec::new();
match DeflateDecoder::new(compressed_bytes).read_to_end(&mut inflated) {

The metadata header is currently 12 bytes:

pub const METADATA_HEADER: &[u8; 12] =
    &[0, 0, 0, 0, b'r', b'u', b's', b't', 0, 0, 0, METADATA_VERSION];

So you can strip those first 12 bytes then inflate the rest.
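A standalone sketch of the "strip the first 12 bytes" step might look like the following. It uses only the standard library, so it stops at handing back the still-compressed payload; the inflate step is left to a DEFLATE decoder such as the `DeflateDecoder` seen in the snippet above. The function name `split_metadata` is made up, and because the value of `METADATA_VERSION` is not given here, the last header byte is returned to the caller rather than checked.

```rust
/// Header length quoted above: 4 zero bytes, b"rust", 3 zero bytes, 1 version byte.
const HEADER_LEN: usize = 12;
/// Everything in the header except the trailing version byte.
const HEADER_PREFIX: [u8; 11] = [0, 0, 0, 0, b'r', b'u', b's', b't', 0, 0, 0];

/// Split a raw metadata blob into (version byte, compressed payload).
/// The payload still has to be inflated by a DEFLATE decoder afterwards.
fn split_metadata(buf: &[u8]) -> Result<(u8, &[u8]), String> {
    if buf.len() < HEADER_LEN {
        return Err(format!("metadata too short: {} bytes", buf.len()));
    }
    let (header, compressed) = buf.split_at(HEADER_LEN);
    if !header.starts_with(&HEADER_PREFIX) {
        return Err("missing rust metadata header".to_string());
    }
    Ok((header[HEADER_LEN - 1], compressed))
}

fn main() {
    // Fake blob: a valid-looking header with version byte 4, then two payload bytes.
    let mut sample = vec![0, 0, 0, 0, b'r', b'u', b's', b't', 0, 0, 0, 4];
    sample.extend_from_slice(&[0xde, 0xad]);

    match split_metadata(&sample) {
        Ok((version, payload)) => {
            println!("version byte {}, {} compressed bytes", version, payload.len())
        }
        Err(message) => eprintln!("error: {}", message),
    }
}
```

Once both builds' payloads are inflated, the results can be diffed as ordinary binary files.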

@infinity0

Contributor Author

commented Jan 31, 2019

(rlib_metadata is fine.)

My mistake, rlib_metadata is also affected, here is another sample:

Note that the other contents of the rlibs and sos are reproducible including build path remappings and the filename hashes. If you look inside the metadata files I just linked, they both contain remapped paths starting with /usr/src/rustc-1.32.0. Something else is different.

@tarcieri

commented May 18, 2019

As a side note to anyone interested in creating reproducible builds of libraries/apps authored in Rust, I've created a cargo-repro project in the Secure Code WG as a place to focus efforts on creating a Cargo-driven workflow for reproducible builds:

https://github.com/rust-secure-code/cargo-repro

(Apologies if this is off-topic for this particular issue, but I thought there might be some interest overlap in the people subscribed)
