Bit-for-bit deterministic / reproducible builds #34902

Open
infinity0 opened this Issue Jul 18, 2016 · 36 comments

Contributor

infinity0 commented Jul 18, 2016

It would be good if rustc could generate bit-for-bit reproducible results, even in the presence of minor system environment differences. Currently we have quite a large diff: e.g. see txt diff for 1.9.0 or perhaps txt diff for 1.10.0 a few days after I'm posting this. (You might want to "save link as" instead of displaying it directly in the browser.)

Much of the diff output is due to build-id differences, which can be ignored since they are caused by other deeper issues and will go away once these deeper issues are fixed. One example of a deeper issue is this:

│   │   │   ├── ./usr/lib/i386-linux-gnu/librustc_data_structures-e8edd0fd.so
│   │   │   │   ├── readelf --wide --file-header {}
│   │   │   │   │ @@ -6,15 +6,15 @@
│   │   │   │   │    OS/ABI:                            UNIX - System V
│   │   │   │   │    ABI Version:                       0
│   │   │   │   │    Type:                              DYN (Shared object file)
│   │   │   │   │    Machine:                           Intel 80386
│   │   │   │   │    Version:                           0x1
│   │   │   │   │    Entry point address:               0x4620
│   │   │   │   │    Start of program headers:          52 (bytes into file)
│   │   │   │   │ -  Start of section headers:          338784 (bytes into file)
│   │   │   │   │ +  Start of section headers:          338780 (bytes into file)
│   │   │   │   │    Flags:                             0x0
│   │   │   │   │    Size of this header:               52 (bytes)
│   │   │   │   │    Size of program headers:           32 (bytes)
│   │   │   │   │    Number of program headers:         8
│   │   │   │   │    Size of section headers:           40 (bytes)
│   │   │   │   │    Number of section headers:         30
│   │   │   │   │    Section header string table index: 29
│   │   │   │   ├── readelf --wide --program-header {}

Here are the system variations that might be causing these issues. I myself am not that familiar with ELF, but perhaps someone else here would know why the section header is 4 bytes later in the first vs second builds.
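
For readers less familiar with ELF: the "Start of section headers" value that readelf prints is the `e_shoff` field of the ELF file header (at byte offset 0x20 in 32-bit files, 0x28 in 64-bit files). The sketch below is not from this thread; the function name `section_header_offset` is made up for illustration. It reads that field from raw header bytes, which is a cheap first check when two builds differ. Since linkers usually place the section header table after the section contents, a 4-byte shift likely means some earlier section changed size or padding.

```rust
/// Reads `e_shoff` (the file offset of the section header table) from the
/// first bytes of an ELF file. Returns `None` if the bytes are not a
/// recognizable ELF header or are too short.
pub fn section_header_offset(header: &[u8]) -> Option<u64> {
    // e_ident: magic, then EI_CLASS (index 4) and EI_DATA (index 5).
    if header.len() < 6 || &header[0..4] != b"\x7fELF" {
        return None;
    }
    let little_endian = match header[5] {
        1 => true,  // ELFDATA2LSB
        2 => false, // ELFDATA2MSB
        _ => return None,
    };
    // Position and width of e_shoff depend on the ELF class.
    let (start, width) = match header[4] {
        1 => (0x20, 4), // ELFCLASS32
        2 => (0x28, 8), // ELFCLASS64
        _ => return None,
    };
    let field = header.get(start..start + width)?;

    // Assemble the integer by feeding bytes most-significant first.
    let mut value: u64 = 0;
    if little_endian {
        for &byte in field.iter().rev() {
            value = (value << 8) | u64::from(byte);
        }
    } else {
        for &byte in field {
            value = (value << 8) | u64::from(byte);
        }
    }
    Some(value)
}
```

Running this over both builds of the same `.so` would show the same two numbers as the readelf diff above; finding which section grew is then a matter of comparing the section tables themselves.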

Member

steveklabnik commented Jul 18, 2016

In general, being reproducible is something we're interested in; we try to tackle it bug by bug.

Contributor

eefriedman commented Jul 18, 2016

This probably has some obvious cause:

│   │   │   │ -<h4 id='method.unzip' class='method'><code>fn <a href='../../core/iter/trait.Iterator.html#method.unzip' class='fnname'>unzip</a>&lt;A, B, FromA, FromB&gt;(self) -&gt; (FromA, FromB) <span class='where'>where FromB: <a class='trait' href='../../core/default/trait.Default.html' title='core::default::Default'>Default</a> + <a class='trait' href='../../core/iter/trait.Extend.html' title='core::iter::Extend'>Extend</a>&lt;B&gt;, FromA: <a class='trait' href='../../core/default/trait.Default.html' title='core::default::Default'>Default</a> + <a class='trait' href='../../core/iter/trait.Extend.html' title='core::iter::Extend'>Extend</a>&lt;A&gt;, Self: <a class='trait' href='../../core/iter/trait.Iterator.html' title='core::iter::Iterator'>Iterator</a>&lt;Item=(A, B)&gt;</span></code></h4>
│   │   │   │ +<h4 id='method.unzip' class='method'><code>fn <a href='../../core/iter/trait.Iterator.html#method.unzip' class='fnname'>unzip</a>&lt;A, B, FromA, FromB&gt;(self) -&gt; (FromA, FromB) <span class='where'>where FromB: <a class='trait' href='../../core/default/trait.Default.html' title='core::default::Default'>Default</a> + <a class='trait' href='../../core/iter/trait.Extend.html' title='core::iter::Extend'>Extend</a>&lt;B&gt;, Self: <a class='trait' href='../../core/iter/trait.Iterator.html' title='core::iter::Iterator'>Iterator</a>&lt;Item=(A, B)&gt;, FromA: <a class='trait' href='../../core/default/trait.Default.html' title='core::default::Default'>Default</a> + <a class='trait' href='../../core/iter/trait.Extend.html' title='core::iter::Extend'>Extend</a>&lt;A&gt;</span></code></h4>

If someone wants to look at the code generation differences, it's probably best to start with libcore. The reproducible-builds.org diff isn't that useful for that because it doesn't recognize .rlib files as archives.

Contributor

infinity0 commented Jul 18, 2016

Thanks for the heads up, I just added AR support in diffoscope and hopefully the diff output will see that in the next few weeks, when the website updates.

Contributor

infinity0 commented Aug 16, 2016

Here's a diff of rust.metadata.bin from libcore.rlib, from 1.10.0. (2.0 MB, ~1700 lines, might make your browser slow.) Mostly ordering differences. @eddyb has suggested HashMap ordering as a culprit, and that would be replaced with FnvHashMap instead.

Member

eddyb commented Aug 16, 2016

If it's a HashMap, I don't know which one. cc @rust-lang/compiler

Contributor

infinity0 commented Aug 20, 2016

I tried replacing HashMap with FnvHashMap in src/librustc_metadata/loader.rs but it didn't seem to help. Here is the Makefile I'm using to automate my rebuilds:

TRIPLET = x86_64-unknown-linux-gnu

all: libcore.diff

libcore.diff: rust.metadata.bin.1 rust.metadata.bin.2
    diffoscope --html-dir "$@" rust.metadata.bin.1 rust.metadata.bin.2 \
        --max-diff-block-lines 5000 \
        --max-diff-input-lines 10000000 \
        --max-report-size 204800000; true # diffoscope returns 1 for "there were diffs"

rust.metadata.bin.1 rust.metadata.bin.2: Makefile
    rm -f $(TRIPLET)/stage2/lib/rustlib/$(TRIPLET)/lib/libcore-*.rlib
    rm -f $(TRIPLET)/stage2/lib/rustlib/$(TRIPLET)/lib/stamp.core
    $(MAKE) $(TRIPLET)/stage2/lib/rustlib/$(TRIPLET)/lib/stamp.core
    ar x $(TRIPLET)/stage2/lib/rustlib/$(TRIPLET)/lib/libcore-*.rlib rust.metadata.bin
    mv rust.metadata.bin "$@"

.PHONY: all rust.metadata.bin.1 rust.metadata.bin.2

Perhaps I'm Doing It Wrong.

Contributor

infinity0 commented Aug 20, 2016

As background, HashMap in rust is non-deterministic to protect against certain types of DoS attack. You can switch it to the deterministic FnvHashMap if you're sure your code will always be called in a safe manner. This ought to be true for rustc itself. (I notice some online "try-it-yourself" rust web services let me run "ls /" and other shell commands, so I could also exploit this HashMap ordering issue, but I also assume that they're clever enough to set a ulimit and/or containerise the thing.)
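
To make the point above concrete, here is a minimal sketch (not code from rustc). It assumes a hand-written FNV-1a hasher so that it compiles without any external crate; `Fnv1a`, `FixedHashMap` and `iteration_order` are hypothetical names, and rustc's own `FnvHashMap` may be defined differently. The default `HashMap` seeds its hasher randomly, so iteration order can change from one run to the next; with a fixed hasher, the same sequence of insertions gives the same iteration order.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hand-written 64-bit FNV-1a hasher, so the sketch needs no external crate.
pub struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap whose hasher has no per-process random seed.
pub type FixedHashMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Inserts the keys in the given order and returns the map's iteration order.
pub fn iteration_order(keys: &[&str]) -> Vec<String> {
    let mut map: FixedHashMap<String, usize> = FixedHashMap::default();
    for (index, key) in keys.iter().enumerate() {
        map.insert((*key).to_string(), index);
    }
    map.keys().cloned().collect()
}
```

Note that a fixed hasher only removes the random seed. The order is still arbitrary rather than sorted, and it stays reproducible only as long as the keys themselves hash to the same values on every run.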

Contributor

jonas-schievink commented Aug 22, 2016

I wrote a small script to diff metadata (I couldn't really get the makefile to work).

It looks like even more is changing between two compiles using the current master (or nightly): The crate hash and/or disambiguator, which are stored right after the target triple. @infinity0 can you confirm?

EDIT: That might get fixed by #35854

Contributor

jonas-schievink commented Aug 22, 2016

Okay, apparently replacing Hash{Map,Set} in resolve with FnvHash{Map,Set} got rid of the first (smaller) blocks of the diff. The big block at the bottom is still there. Will continue to randomly sed the compiler and report back ;)

Contributor

jonas-schievink commented Aug 22, 2016

The remaining issue is that (some?) predicates are encoded in non-deterministic order. Maybe we need to do this for all bounds? Not sure what would be the correct way to do that (or why these are nondeterministic in the first place).

Contributor

michaelwoerister commented Aug 22, 2016

Just for reference, there's this: #34805

Contributor

jonas-schievink commented Aug 22, 2016

@michaelwoerister Thanks! Too tired to think about it currently. Will have a look tomorrow :)

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 22, 2016

Use `FnvHashMap` in more places
* A step towards rust-lang#34902
* More stable error messages in some places related to crate loading
* Possible slight performance improvements since all `HashMap`s
  replaced had small keys where `FnvHashMap` should be faster
  (although I didn't measure)

Member

sanxiyn commented Aug 23, 2016

#24473 is the rustdoc subset of this.

eddyb added a commit to eddyb/rust that referenced this issue Aug 23, 2016

Rollup merge of rust-lang#35909 - jonas-schievink:more-fnv, r=eddyb
Use `FnvHashMap` in more places

* A step towards rust-lang#34902 (see my comments there)
* More stable error messages in some places related to crate loading
* Possible slight performance improvements since all `HashMap`s
  replaced had small keys where `FnvHashMap` should be faster
  (although I didn't measure)

(likewise for `HashSet` -> `FnvHashSet`)

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 23, 2016

Make metadata encoding deterministic
`ty::Predicate` was being used as a key for a hash map, but its hash
implementation indirectly hashed addresses, which vary between each
compiler run. This is fixed by sorting predicates by their ID before
encoding them.

This *should* basically fix issue rust-lang#34902, although landing a test would
be nice (I also didn't test a scenario where multiple crates were
involved, only libcore).
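
The idea in this commit message, sorting by a stable ID before encoding instead of relying on hash-table iteration order, can be sketched as follows. This is an illustration only, not the actual rustc change: `Predicate`, its `id` field and `encode_sorted` are hypothetical stand-ins, and the byte format is invented for the example.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a compiler item that carries a stable numeric ID.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Predicate {
    pub id: u32,
    pub text: String,
}

impl Predicate {
    pub fn new(id: u32, text: &str) -> Self {
        Predicate { id, text: text.to_string() }
    }
}

/// Encodes the predicates in ascending ID order, so the output does not
/// depend on the set's (randomly seeded) iteration order.
/// Assumes IDs are unique; ties would fall back to iteration order.
pub fn encode_sorted(preds: &HashSet<Predicate>) -> Vec<u8> {
    let mut items: Vec<&Predicate> = preds.iter().collect();
    items.sort_by_key(|p| p.id);

    let mut out = Vec::new();
    for p in items {
        // Invented format: ID, text length, then the text bytes.
        out.extend_from_slice(&p.id.to_le_bytes());
        out.extend_from_slice(&(p.text.len() as u32).to_le_bytes());
        out.extend_from_slice(p.text.as_bytes());
    }
    out
}
```

Two sets built from the same predicates in a different insertion order then encode to identical bytes, which is the property the metadata encoder needs.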

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 23, 2016

Make metadata encoding deterministic
`ty::Predicate` was being used as a key for a hash map, but its hash
implementation indirectly hashed addresses, which vary between each
compiler run. This is fixed by sorting predicates by their ID before
encoding them.

In my tests, rustc is now able to produce deterministic results when
compiling libcore and libstd.

I've beefed up `run-make/reproducible-build` to compare the produced
artifacts bit-by-bit. This doesn't catch everything, but should be a
good start.

cc rust-lang#34902

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 24, 2016

Make metadata encoding deterministic
`ty::Predicate` was being used as a key for a hash map, but its hash
implementation indirectly hashed addresses, which vary between each
compiler run. This is fixed by sorting predicates by their ID before
encoding them.

In my tests, rustc is now able to produce deterministic results when
compiling libcore and libstd.

I've beefed up `run-make/reproducible-build` to compare the produced
artifacts bit-by-bit. This doesn't catch everything, but should be a
good start.

cc rust-lang#34902

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 25, 2016

Use `FnvHashMap` in more places
* A step towards rust-lang#34902
* More stable error messages in some places related to crate loading
* Possible slight performance improvements since all `HashMap`s
  replaced had small keys where `FnvHashMap` should be faster
  (although I didn't measure)

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 25, 2016

Make metadata encoding deterministic
`ty::Predicate` was being used as a key for a hash map, but its hash
implementation indirectly hashed addresses, which vary between each
compiler run. This is fixed by sorting predicates by their ID before
encoding them.

In my tests, rustc is now able to produce deterministic results when
compiling libcore and libstd.

I've beefed up `run-make/reproducible-build` to compare the produced
artifacts bit-by-bit. This doesn't catch everything, but should be a
good start.

cc rust-lang#34902

bors added a commit that referenced this issue Aug 26, 2016

Auto merge of #35984 - jonas-schievink:reproducible-builds, r=eddyb
Steps towards reproducible builds

cc #34902

Running `make dist` twice will result in a rustc tarball where only `librustc_back.so`, `librustc_llvm.so` and `librustc_trans.so` differ. Building `libstd` and `libcore` twice with the same compiler and flags produces identical artifacts.

The third commit should close #24473
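
As a rough illustration of what "identical artifacts" means here, the sketch below compares two build outputs byte by byte and reports the first offset at which they differ. It is not the `run-make/reproducible-build` test itself; `first_difference` and `compare_files` are hypothetical helper names.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the offset of the first differing byte, or `None` if the two
/// buffers are bit-for-bit identical. A length mismatch counts as a
/// difference at the end of the shorter buffer.
pub fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b.iter()).position(|(x, y)| x != y) {
        Some(offset) => Some(offset),
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

/// Reads both files fully into memory and compares them.
pub fn compare_files(a: &Path, b: &Path) -> io::Result<Option<usize>> {
    Ok(first_difference(&fs::read(a)?, &fs::read(b)?))
}
```

A result of `None` for every artifact from two independent builds is the goal of this issue; any `Some(offset)` is a starting point for a closer look with a tool such as diffoscope.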

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 27, 2016

Use `FnvHashMap` in more places
* A step towards rust-lang#34902
* More stable error messages in some places related to crate loading
* Possible slight performance improvements since all `HashMap`s
  replaced had small keys where `FnvHashMap` should be faster
  (although I didn't measure)

jonas-schievink added a commit to jonas-schievink/rust that referenced this issue Aug 27, 2016

Make metadata encoding deterministic
`ty::Predicate` was being used as a key for a hash map, but its hash
implementation indirectly hashed addresses, which vary between each
compiler run. This is fixed by sorting predicates by their ID before
encoding them.

In my tests, rustc is now able to produce deterministic results when
compiling libcore and libstd.

I've beefed up `run-make/reproducible-build` to compare the produced
artifacts bit-by-bit. This doesn't catch everything, but should be a
good start.

cc rust-lang#34902

bors added a commit that referenced this issue Aug 28, 2016

Auto merge of #35984 - jonas-schievink:reproducible-builds, r=eddyb
Steps towards reproducible builds

cc #34902

Running `make dist` twice will result in a rustc tarball where only `librustc_back.so`, `librustc_llvm.so` and `librustc_trans.so` differ. Building `libstd` and `libcore` twice with the same compiler and flags produces identical artifacts.

The third commit should close #24473


Contributor

nikomatsakis commented Sep 2, 2016

Potentially relevant, from IRC:

<eddyb> nmatsakis, infinity0: fwiw, I just found another hash table. not random, but may cause issues around incremental recompilation. see rustc_metadata::encoder::encode_reachable (the reachable NodeSet)

though I think that it's not a big issue unless the hashtable is truly random.

Contributor

infinity0 commented May 13, 2017

Assuming that #41508 works properly, there are only two issues left, judging from this diff obtained when not varying the build path across builds.

  • #24473 for rustdoc, with several instances of
├── ./usr/share/doc/rust-doc/html/implementors/core/cmp/trait.Ord.js
├── js-beautify {}
@@ -1,10 +1,9 @@
 (function() {
     var implementors = {};
-    implementors["alloc"] = ["impl&lt;T:&nbsp;? [.. snip ..]
     implementors["rustc_unicode"] = [];
     implementors["collections"] = ["impl&lt;T&gt; [.. snip ..]
     implementors["core"] = [];
 
     if (window.register_implementors) {
         window.register_implementors(implementors);
     } else {
  • one non-build-path debuginfo variation:
     <2ba6>   DW_AT_GNU_dwo_name: (indirect string, offset: 0x33a0): x86_64-unknown-linux-gnu/rustllvm/PassWrapper.dwo
     <2baa>   DW_AT_comp_dir    : .
     <2bac>   DW_AT_GNU_pubnames: 1
     <2bac>   DW_AT_GNU_addr_base: 0x4d90
-    <2bb0>   DW_AT_GNU_dwo_id  : 0xeed3d315f58a4192
+    <2bb0>   DW_AT_GNU_dwo_id  : 0xe31c7153633c9772
     <2bb8>   DW_AT_GNU_ranges_base: 0x57f0
   Compilation Unit @ offset 0x2bbc:
Contributor

infinity0 commented Jul 10, 2017

The doc issue seems to be gone, the dwo_id issue remains.

@@ -4968,40 +4968,40 @@
     <2ef2>   DW_AT_ranges      : 0x51c0
     <2ef6>   DW_AT_low_pc      : 0x0
     <2efe>   DW_AT_stmt_list   : 0x1110
     <2f02>   DW_AT_GNU_dwo_name: (indirect string, offset: 0x3221): x86_64-unknown-linux-gnu/rustllvm/RustWrapper.dwo
     <2f06>   DW_AT_comp_dir    : .
     <2f08>   DW_AT_GNU_pubnames: 1
     <2f08>   DW_AT_GNU_addr_base: 0x0
-    <2f0c>   DW_AT_GNU_dwo_id  : 0x82dcd4a4a86511a
+    <2f0c>   DW_AT_GNU_dwo_id  : 0x94cc2c8c89838e78
     <2f14>   DW_AT_GNU_ranges_base: 0x1290
   Compilation Unit @ offset 0x2f18:
    Length:        0x2e (32-bit)
    Version:       4
    Abbrev Offset: 0x206
    Pointer Size:  8
  <0><2f23>: Abbrev Number: 1 (DW_TAG_compile_unit)
     <2f24>   DW_AT_ranges      : 0x8250
     <2f28>   DW_AT_low_pc      : 0x0
     <2f30>   DW_AT_stmt_list   : 0x4366
     <2f34>   DW_AT_GNU_dwo_name: (indirect string, offset: 0x3253): x86_64-unknown-linux-gnu/rustllvm/PassWrapper.dwo
     <2f38>   DW_AT_comp_dir    : .
     <2f3a>   DW_AT_GNU_pubnames: 1
     <2f3a>   DW_AT_GNU_addr_base: 0x4f08
-    <2f3e>   DW_AT_GNU_dwo_id  : 0x6564472d68267dc6
+    <2f3e>   DW_AT_GNU_dwo_id  : 0xa4b7031387acdfba
     <2f46>   DW_AT_GNU_ranges_base: 0x58a0
   Compilation Unit @ offset 0x2f4a:
    Length:        0x2e (32-bit)
    Version:       4
    Abbrev Offset: 0x223
    Pointer Size:  8
  <0><2f55>: Abbrev Number: 1 (DW_TAG_compile_unit)
     <2f56>   DW_AT_ranges      : 0xaf70
     <2f5a>   DW_AT_low_pc      : 0x0
     <2f62>   DW_AT_stmt_list   : 0x6382
     <2f66>   DW_AT_GNU_dwo_name: (indirect string, offset: 0x3285): x86_64-unknown-linux-gnu/rustllvm/ArchiveWrapper.dwo
     <2f6a>   DW_AT_comp_dir    : .
     <2f6c>   DW_AT_GNU_pubnames: 1
     <2f6c>   DW_AT_GNU_addr_base: 0x7528
-    <2f70>   DW_AT_GNU_dwo_id  : 0xc1d2b489290c9d87
+    <2f70>   DW_AT_GNU_dwo_id  : 0xcd75544460bc0e9c
     <2f78>   DW_AT_GNU_ranges_base: 0x8480

Contributor

michaelwoerister commented Jul 10, 2017

This DW_AT_GNU_dwo_id seems to be introduced by the C++ compiler when compiling rustllvm. @alexcrichton, can we disable split debuginfo here somehow?

Contributor

michaelwoerister commented Jul 10, 2017

cc @brson

Contributor

infinity0 commented Jul 10, 2017

Also FYI on Debian (where these tests are done) rustllvm is dynamically linked against the LLVM shared library, in case this affects the situation.

Member

alexcrichton commented Jul 10, 2017

@michaelwoerister sure yeah we can just pass whatever flags are necessary to the C compiler

Contributor

infinity0 commented Jul 28, 2017

It looks like llvm-config --cxxflags, on Debian at least, outputs -gsplit-dwarf, so I'm applying this patch for the time being:

--- a/src/librustc_llvm/build.rs
+++ b/src/librustc_llvm/build.rs
@@ -137,6 +137,11 @@
     let cxxflags = output(&mut cmd);
     let mut cfg = gcc::Config::new();
     for flag in cxxflags.split_whitespace() {
+        // Split-dwarf gives unreproducible DW_AT_GNU_dwo_id so don't do it
+        if flag == "-gsplit-dwarf" {
+            continue;
+        }
+
         // Ignore flags like `-m64` when we're doing a cross build
         if is_crossed && flag.starts_with("-m") {
             continue;

Not sure if rustc wants to include it, since technically it is an issue with either llvm-config or the C++ compiler.
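
Outside the context of `build.rs`, the filtering done by the patch above can be sketched as a small standalone function. `filter_cxxflags` is a hypothetical name used only for this illustration; the real code pushes the surviving flags into the `gcc::Config` shown in the diff instead of returning them.

```rust
/// Keeps the flags reported by `llvm-config --cxxflags`, minus the ones
/// that should not reach the C++ compiler.
pub fn filter_cxxflags(cxxflags: &str, is_crossed: bool) -> Vec<&str> {
    cxxflags
        .split_whitespace()
        // Split DWARF yields an unreproducible DW_AT_GNU_dwo_id, so drop it.
        .filter(|flag| *flag != "-gsplit-dwarf")
        // Ignore flags like `-m64` when doing a cross build.
        .filter(|flag| !(is_crossed && flag.starts_with("-m")))
        .collect()
}
```

For example, the input `-O2 -gsplit-dwarf -m64` keeps `-O2` and `-m64` in a native build and only `-O2` in a cross build.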

infinity0 Aug 1, 2017

Contributor

Oh, I just had a thought: I recently added codegen-units = 0 to our config.toml, which enables parallel codegen. This might be interfering with reproducibility; I will try disabling that and re-running the builds.

infinity0 Aug 1, 2017

Contributor

OK! After disabling parallel codegen we get much, much further, but there is still a regression: the build of liballoc_jemalloc is no longer deterministic.

tests.r-b.org has also caught up with us and agrees with this result, though a recently fixed bug in diffoscope causes it not to display any output at the time of writing (it should be fixed automatically soon).

This is the jemalloc diff between 1.17 and 1.19. Is it worth me trying to chase this down, or were you guys planning to switch the default to the system allocator soon? I seem to remember an ongoing discussion about that.

alexcrichton Aug 1, 2017

Member

Thanks so much for the ongoing work here @infinity0!

@michaelwoerister do you know if codegen-units > 1 would cause the build to be non-reproducible? I figured that w/ incremental work that wouldn't be the case!

@infinity0 we're unfortunately still a ways away from jettisoning jemalloc. We need the #[global_allocator] attribute to stabilize first, which would still be at least 12 weeks away (ish) even if we stabilized it today. We don't currently have a timeline for that stabilization, but it should be relatively close, in the sense that the design is much closer to "stabilizable" than the previous one.

michaelwoerister Aug 2, 2017

Contributor

> Thanks so much for the ongoing work here @infinity0!

+1, this is awesome :)

> @michaelwoerister do you know if codegen-units > 1 would cause the build to be non-reproducible? I figured that w/ incremental work that wouldn't be the case!

I would have thought so too. There are two things that come to mind here:

  • There is some kind of instability in the new symbol internalizer. In rare cases it seems to oscillate between making symbols hidden and internal. I only discovered this yesterday.
  • Incr. comp. requires that each object file is deterministic but we currently don't test the final binary. There might be something in the linking step that changes things. Cargo seems to build things in random order -- that could affect the linker flags, right?
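
A check on the final binary, as opposed to the individual object files, could look roughly like the following. This is a sketch only; `first_difference` and `compare_artifacts` are hypothetical names, and the two paths would be the outputs of two independent builds.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Offset of the first byte at which `a` and `b` differ, or None if the
/// two buffers are identical. A pure length mismatch is reported at the
/// end of the shorter buffer.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b).position(|(x, y)| x != y) {
        Some(offset) => Some(offset),
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

/// Read two build artifacts and compare them byte for byte.
fn compare_artifacts(first: &Path, second: &Path) -> io::Result<Option<usize>> {
    let a = fs::read(first)?;
    let b = fs::read(second)?;
    Ok(first_difference(&a, &b))
}
```

A result of `Ok(None)` means the two artifacts are bit-for-bit identical; any `Some(offset)` gives a starting point for inspecting the difference with a tool such as diffoscope.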

alexcrichton Aug 2, 2017

Member

> Cargo seems to build things in random order -- that could affect the linker flags, right?

Hm, perhaps! I think we made a change to Cargo a while ago to make any one rustc invocation deterministic, e.g. rust-lang/cargo#3937

infinity0 Nov 10, 2017

Contributor

With Debian's 1.21.0+dfsg1-3, the rustc compiler itself builds deterministically (and build-path-independently), but libstd still has some differences remaining (112MB diff):

./usr/lib/x86_64-linux-gnu/librustc-2d0e4ac759a96c13.so
objdump --line-numbers --disassemble --demangle --section=.text {}
@@ -39978,25 +39978,25 @@
    26491:    48 89 e5                mov    %rsp,%rbp
 /usr/src/rustc-1.21.0/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/src/float/sub.rs:11
    26494:    f2 0f 5c c1             subsd  %xmm1,%xmm0
 /usr/src/rustc-1.21.0/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/src/macros.rs:255
    26498:    5d                      pop    %rbp
    26499:    c3                      retq   
    2649a:    66 90                   xchg   %ax,%ax
-/build/1st/rustc-1.21.0+dfsg1/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:39
+/build/rustc-1.21.0+dfsg1/2nd/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:39
    2649c:    66 0f 6e c7             movd   %edi,%xmm0
-/build/1st/rustc-1.21.0+dfsg1/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:40
+/build/rustc-1.21.0+dfsg1/2nd/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:40
    264a0:    48 c1 ef 20             shr    $0x20,%rdi
-/build/1st/rustc-1.21.0+dfsg1/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:41
+/build/rustc-1.21.0+dfsg1/2nd/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:41
    264a4:    48 0b 3d 25 19 00 00    or     0x1925(%rip),%rdi        
-/build/1st/rustc-1.21.0+dfsg1/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:42
+/build/rustc-1.21.0+dfsg1/2nd/src/rustc/compiler_builtins_shim/../../libcompiler_builtins/compiler-rt/lib/builtins/x86_64/floatundidf.S:42
    264ab:    66 0f 56 05 fd 18 00    orpd   0x18fd(%rip),%xmm0        
    264b2:    00 

What's strange is that some of the paths are already remapped to /usr/src, as I intended with the -Zremap-path-prefix flags I'm passing. However, I'm only passing -Zremap-path-prefix-from=$PWD, and some other experiments with @kpcyrd on other Rust crates suggest that we also need to pass -Zremap-path-prefix-from=$CARGO_HOME in the general case. I'll try that next time and report back; hopefully everything will be reproducible then.
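
To see why a single mapping for $PWD leaves other paths untouched, here is a simplified model of prefix remapping. It is plain string-prefix matching, not rustc's actual implementation, and `remap_path` as well as the example paths are hypothetical.

```rust
/// Rewrite `path` using the first (from, to) pair whose `from` is a prefix.
/// Paths that match no mapping are returned unchanged, which is how a
/// build-specific directory can leak into the output.
fn remap_path(path: &str, mappings: &[(&str, &str)]) -> String {
    for (from, to) in mappings {
        if let Some(rest) = path.strip_prefix(*from) {
            return format!("{}{}", to, rest);
        }
    }
    path.to_string()
}
```

With only a mapping from the build directory to /usr/src, a source file under the build directory is rewritten, while a file under some other root (for example a Cargo home directory) keeps its original, machine-specific path until a second mapping is added for that root.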

infinity0 Nov 10, 2017

Contributor

I should mention that I had to disable jemalloc in Debian for other reasons, so builds with that enabled might still hit the nondeterminism I mentioned earlier.

arielb1 Nov 16, 2017

Contributor

@infinity0

Nice to hear we almost got rid of nondeterminism! The last bunch of path dependence looks like it comes from compiler-rt asm files that are a part of compiler-builtins - I think the build-script that compiles them all is here.

I think it's just that we don't pass debuginfo remapping all the way down to gcc.
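
One way a build script might forward the remapping is via GCC's `-fdebug-prefix-map=old=new` option. The sketch below only constructs the command; the helper names are hypothetical, this is not what the compiler-builtins build script does today, and whether the option also covers paths recorded for assembly (.S) inputs depends on the toolchain version.

```rust
use std::process::Command;

/// Build the C compiler flag that rewrites a path prefix in debug info.
fn debug_prefix_map_flag(from: &str, to: &str) -> String {
    format!("-fdebug-prefix-map={}={}", from, to)
}

/// Assemble (but do not run) a compiler invocation that carries the mapping.
/// `cc` and `source` are supplied by the caller; nothing here is specific
/// to any real build script.
fn compiler_command(cc: &str, source: &str, from: &str, to: &str) -> Command {
    let mut cmd = Command::new(cc);
    cmd.arg("-g")
        .arg(debug_prefix_map_flag(from, to))
        .arg("-c")
        .arg(source);
    cmd
}
```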

daym Sep 25, 2018

Rust 1.22.1 has a reproducibility problem in liblibc's rust.metadata.bin. The symbols in there seem to be shuffled around (example: getutxent).
