[Tracking] Reproducible builds #1666

gendx · 2020-03-05T14:24:03Z

Context

When working on google/OpenSK#67, we realized that some boards failed to build on my desktop, but not on Travis-CI. The reason was that filesystem paths are stored in the binary. Rust includes them, for example, to indicate the line where some code panicked. You can check how many of these are in your binary by running strings on it.

Because the stored paths are absolute, the size they take depends on which directory is used for building Tock, and therefore the resulting binary size will increase or decrease accordingly (can be a few KB).
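To get a rough idea of how many bytes these paths occupy, something like the following sketch could be used. It is only an approximation of what strings would show: the function name and the sample bytes are made up for illustration, and a path is assumed to be the build root followed by printable, non-space ASCII characters.

```python
import re


def count_path_strings(blob: bytes, root: bytes) -> tuple[int, int]:
    """Return (number of embedded paths under `root`, total bytes they occupy)."""
    # Assumption: a path is `root` followed by printable, non-space ASCII.
    pattern = re.escape(root) + rb"[\x21-\x7e]*"
    matches = re.findall(pattern, blob)
    return len(matches), sum(len(m) for m in matches)


# Hypothetical bytes standing in for a kernel image built in /home/alice/tock.
sample = (
    b"\x00panicked at /home/alice/tock/kernel/src/lib.rs\x00"
    b"\x01/home/alice/tock/chips/src/x.rs\x00"
)
count, total = count_path_strings(sample, b"/home/alice/tock")
print(f"{count} paths, {total} bytes")
```

Running it on binaries built in a short and in a long directory should show the total differing by roughly the number of paths times the difference in prefix length.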

But because we use custom linker scripts with a fixed layout such as:

tock/boards/nordic/nrf52840dk/layout.ld

Lines 1 to 10 in 76d1abf

    
    MEMORY
    {
      rom (rx)  : ORIGIN = 0x00000000, LENGTH = 128K
      prog (rx) : ORIGIN = 0x00030000, LENGTH = 832K
      ram (rwx) : ORIGIN = 0x20000000, LENGTH = 256K
    }

    MPU_MIN_ALIGN = 8K;

    INCLUDE ../../kernel_layout.ld

it's possible that someone hits the limit of the rom section's size, whereas Travis-CI and/or other developers don't hit this threshold.
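As a small worked example of that threshold effect, here is a sketch that converts the LENGTH of the rom region into bytes and checks whether an image fits. The image sizes are hypothetical, and the helper names are mine, not part of Tock's tooling.

```python
def parse_length(text: str) -> int:
    """Convert a linker-script size such as '128K' or '1M' into bytes."""
    units = {"K": 1024, "M": 1024 * 1024}
    if text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text, 0)  # plain decimal or 0x-prefixed value


def fits_in_region(image_size: int, region_length: str) -> bool:
    """True if an image of `image_size` bytes fits in the region."""
    return image_size <= parse_length(region_length)


# Hypothetical kernel sizes: the same code built in two directories,
# where the longer build path adds 2 KB of embedded path strings.
short_dir_build = 130_000
long_dir_build = short_dir_build + 2 * 1024

print(fits_in_region(short_dir_build, "128K"))
print(fits_in_region(long_dir_build, "128K"))
```

With a 128K region (131072 bytes), the first build fits and the second does not, even though the source code is identical.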

Reproducible builds

Reproducible builds have multiple advantages.

One is to be able to re-build Tock on another environment and still obtain binaries with the exact same cryptographic hash.
Another advantage stems from Tock's custom linker scripts, which put hard constraints on code size: reproducible builds make sure that all developers see the same code size, rather than having a layout file that works for Alice but not for Bob.

Making builds reproducible isn't trivial, as it requires the Rust compiler (and LLVM) to give us reproducible results. But some things (such as rust-toolchain) are in Tock's control, and therefore I think it makes sense to track what's currently allowing and preventing reproducible builds.


bradjc · 2020-03-05T19:51:29Z

Wow.

Looks like --remap-path-prefix might be able to help with this? https://www.reddit.com/r/rust/comments/95ylb8/rust_release_binary_contains_absolute_paths_from/e3wnemq/

diff --git a/boards/Makefile.common b/boards/Makefile.common
index 0d4eae0f2..3625f467d 100644
--- a/boards/Makefile.common
+++ b/boards/Makefile.common
@@ -16,7 +16,8 @@ RUSTUP    ?= rustup
 # lld the actual page size so it doesn't have to be conservative.
 RUSTFLAGS_FOR_CARGO_LINKING ?= -C link-arg=-Tlayout.ld -C linker=rust-lld \
 -C linker-flavor=ld.lld -C relocation-model=dynamic-no-pic \
--C link-arg=-zmax-page-size=512
+-C link-arg=-zmax-page-size=512 \
+--remap-path-prefix=/Users/bradjc/git=

 # Disallow warnings for continuous integration builds. Disallowing them here
 # ensures that warnings during testing won't prevent compilation from succeeding.

gendx · 2020-03-06T10:49:48Z

Nice find @bradjc! We still need to extract the main Makefile's directory instead of hard-coding a given user's build directory.
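To illustrate the idea, here is a sketch of deriving the flag from a directory instead of hard-coding it. In Tock this would live in the Makefile itself; the Python function below is only a hypothetical stand-in showing the FROM=TO shape of the rustc flag, with an empty TO so that the prefix is stripped.

```python
import os


def remap_flag(repo_root: str) -> str:
    """Build a --remap-path-prefix=FROM=TO flag that strips `repo_root`."""
    # An empty TO part removes the prefix from the paths stored in the binary.
    return f"--remap-path-prefix={os.path.abspath(repo_root)}="


# Hypothetical checkout location; any developer's directory would work.
print(remap_flag("/Users/bradjc/git"))
```

The point is that each developer gets a flag matching their own checkout, so the paths stored in the binary no longer depend on where the checkout lives.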

gendx · 2020-03-06T13:39:43Z

Updated the list to refer to #1657, as the Makefile doesn't seem idempotent.

bradjc · 2020-03-18T16:54:28Z

#1657 was merged, has this been addressed at this point?

gendx · 2020-03-18T19:10:04Z

> #1657 was merged, has this been addressed at this point?

I can check the box, although the reason is still unclear, so it might come back when we update to another nightly.

gendx · 2020-03-25T11:18:14Z

I did a quick test, building the imix board in two build folders (/path/to/tock and /path/to/tock2) on the same machine. The SHA-256 hashes were different.

Using V=1 make in the boards/imix/ folder shows different metadata parameters passed to rustc.

# in /path/to/tock
rustc --crate-name tock_cells ... \
    -C metadata=a72da9d1edb1895f \
    -C extra-filename=-a72da9d1edb1895f \
    ...
# in /path/to/tock2
rustc --crate-name tock_cells ... \
    -C metadata=e0d01c51ad799de2 \
    -C extra-filename=-e0d01c51ad799de2 \
    ...

I think that this metadata parameter is used as a seed for non-deterministic elements in the compiler (e.g. ordering of elements inside HashMaps, maybe also allocation of registers, or choice of equivalent instructions), so ultimately the code is different inside the binary.

Relatedly, rust-lang/cargo#6966 removed --remap-path-prefix in the computation of this metadata parameter. But there are still a bunch of absolute paths passed to rustc, which could be the issue.

gendx · 2020-03-25T11:34:01Z

Following rust-secure-code/wg#28 (comment), I also tried to use --remap-path-prefix for $HOME (but this overrode the current --remap-path-prefix), as well as $HOME/.cargo and $HOME/.rustup (these two had no effect on the SHA-256).

I also tried to remap the sysroot (rust-lang/rust#63505), but again didn't observe any effect.

I also found reproducibility issues related to the linker:

Sort the fat LTO modules to produce deterministic output. rust-lang/rust#63352

as well as debug symbols (we use -C debuginfo=2):

Tweaking these bits of the configuration may make Tock builds more reproducible.

gendx · 2020-03-25T14:16:02Z

I did a few more tests.

If I copy the boards/imix folder into boards/imix2, both build the same binary.
If I copy the boards/components folder into boards/components2, and make boards/imix2 point to ../components2 (in the Cargo.toml), the metadata for the components crate is different, and in turn imix/imix2 have different metadata (the other crates have the same metadata). The resulting binary is different.
As before, if I copy /build-dir/tock into /build-dir/tock2, then /build-dir/tock/boards/imix and /build-dir/tock2/boards/imix build different binaries.
However, if I make /build-dir/tock2/boards/imix have all its dependencies point back to /build-dir/tock (with relative paths in the Cargo.toml), then this builds the same binary as /build-dir/tock/boards/imix.

The invocation for crates internal to Tock looks like the following.

rustc \
    --crate-name components \
    --edition=2018 \
    /build-path/tock/boards/components/src/lib.rs \
    ...

On the other hand, the board invocation looks like the following.

rustc \
    --crate-name imix \
    --edition=2018 \
    src/main.rs \
    ...

What strikes me is the absolute path for local dependencies (/build-path/tock/boards/components/src/lib.rs) vs. a relative path for the current binary (src/main.rs). My guess is that locally defined crates (as opposed to those from crates.io) are identified by their absolute path, even though the Cargo.toml uses a relative path.

This is consistent with the results I observed in my tests.

Update

After a few more tests, it seems that defining a Cargo workspace encapsulating all the Tock crates would make the build reproducible (at least on a given machine), because then Cargo identifies all the crates by a relative path to the workspace root.
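The following toy model illustrates why that would help. It is not Cargo's real metadata algorithm (which I haven't looked at in detail); it only assumes that the hash depends on a crate name plus some source identifier, and compares an absolute identifier with a workspace-relative one.

```python
import hashlib
from pathlib import PurePosixPath


def toy_metadata(crate: str, source_id: str) -> str:
    """Toy stand-in for the -C metadata value; NOT Cargo's actual algorithm."""
    return hashlib.sha256(f"{crate}\0{source_id}".encode()).hexdigest()[:16]


def absolute_id(checkout: str, crate_dir: str) -> str:
    """Identify a crate by its absolute path (depends on the checkout)."""
    return str(PurePosixPath(checkout) / crate_dir)


def workspace_relative_id(checkout: str, crate_dir: str) -> str:
    """Identify a crate by its path relative to the workspace root."""
    return str((PurePosixPath(checkout) / crate_dir).relative_to(checkout))


for checkout in ("/build-dir/tock", "/build-dir/tock2"):
    abs_meta = toy_metadata("components", absolute_id(checkout, "boards/components"))
    rel_meta = toy_metadata("components", workspace_relative_id(checkout, "boards/components"))
    print(checkout, abs_meta, rel_meta)
```

In this model the absolute variant differs between the two checkouts, while the workspace-relative variant is identical, which matches the behavior observed in the tests above.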

gendx · 2020-03-30T15:41:33Z

With #1714 (currently in review), we're getting close to having reproducible builds of Tock!

One remaining discrepancy is due to the TOCK_KERNEL_VERSION value embedded in some panic messages, because this value changes for every git commit. In particular, Travis-CI adds a merge commit when testing any pull-request. There could be some additional rule or option in the Makefiles to use a fixed TOCK_KERNEL_VERSION value instead.

1714: Use a Cargo workspace to make builds reproducible and speed up CI. r=gendx a=gendx

### Pull Request Overview

This pull request adds a [Cargo workspace](https://doc.rust-lang.org/book/ch14-03-cargo-workspaces.html) at the top-level directory to make build reproducible (see the analysis in #1666 (comment)).

With a workspace, Cargo now only writes to a single `target/` directory in the root folder, instead of having a separate `target/` in each board. This means that some changes have to be made in the Makefiles and tools.

Another benefit of a single workspace is that shared dependencies can be built once (per CPU architecture) and cached across boards. For example, the kernel or the capsules only need to be compiled once per CPU architecture, instead of once per board, which can significantly speed up commands like `make allboards`.

### Testing Strategy

This pull request was tested by:

- Duplicating the Tock directory (two identical copies in `/path/to/tock` and `/path/to/tock2`), and running `make allboards`, checking that the SHA-256 checksum match.
- Running `make flash` in the nRF52840-DK board's folder, and checking that flashing the kernel worked properly.
- Travis-CI.

### TODO or Help Wanted

This pull request still needs:

- [x] Fixing netlify. Having a single `target/` folder allows to greatly simplify the `tools/build-all-docs.sh` script. I checked the output on a few files and it looks consistent, but some pages might break. On the other hand, some crates such as the [arty_e21](https://docs.tockos.org/arty_e21/index.html) board are currently not indexed on the navigation bar, and the `arty_e21` chip was not even documented so far on https://docs.tockos.org/.
- [x] A bit of cleanup throughout Makefiles. ~~For now I made just enough changes in `boards/Makefile.common` for `make ci-travis` to work, but other make rules should be checked.~~
- [ ] Checking that the `tools/` still work properly with a single target folder now at the top-level.
  - [x] `build-all-docs.sh` Simplified, as mentioned above, now that all docs are built in the same folder.
  - [x] `post_size_changes_to_github.sh` Updated, but it's currently broken for other reasons (see #1722).
  - [x] Now fixed with a minimum search depth of 2. ~~I had to remove the `tools/check_wildcard_imports.sh` in `make ci-travis` because otherwise it reported the following error. The tool would need some refactoring.~~
    ```
    Wildcard import(s) found in ..
    Tock style rules prohibit this use of wildcard imports.
    The following wildcard imports were found:
    libraries/tock-register-interface/src/macros.rs: use super::super::*;
    tools/check_wildcard_imports.sh: if $(git grep -q 'use .*\*;' -- ':!src/macros.rs'); then
    tools/check_wildcard_imports.sh: git grep 'use .*\*;'
    ```
- [x] Cargo complained that the `[profile.dev]` and `[profile.release]` rules must be in the top-level workspace. I put them there given that all boards had the same profiles, but it's something to keep in mind if we want to use custom workspace for specific boards. I'm also not sure what's the impact of these profiles on the other libraries in the workspace. This didn't seem to cause any issue though.
- [Optional] Should `tools/` also be part of the workspace? Have their own workspace?

### Documentation Updated

- [x] Updated the relevant files in `/docs`, or no updates are required. Now the arty-e21 chip and board appear in the generated docs :)

### Formatting

- [x] Ran `make formatall`.

Co-authored-by: Guillaume Endignoux <guillaumee@google.com>
Co-authored-by: gendx <gendx@users.noreply.github.com>

gendx · 2020-04-08T10:25:37Z

With #1714 merged, we've reached a big milestone in terms of reproducibility. I now have the same hashes when building from master on various folders on the same machine, as well as on various machines.

The only remaining part is the TOCK_KERNEL_VERSION that changes on every commit, including Travis-CI's implicit merge commits, because this is stored inside the binary as some debugging information.

Would it make sense to have some make repro rule that would use a fixed TOCK_KERNEL_VERSION?
Should there be some make reprocheck which would check the checksums against some checksum file that we have to keep updated on every commit - to check that we don't have any regression in terms of reproducibility? That is, this would run make repro, compute the checksum, and compare them to some boards.sha256 file containing expected checksums. Travis-CI, or some other CI tool, would then run this make reprocheck test.
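To make the second question more concrete, here is a minimal sketch of what the checking step could look like. The boards.sha256 format is assumed to be the one produced by sha256sum (digest, whitespace, relative path), and all function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def parse_manifest(text: str) -> dict[str, str]:
    """Parse sha256sum-style lines: '<hex digest>  <relative path>'."""
    expected = {}
    for line in text.splitlines():
        if line.strip():
            digest, name = line.split(maxsplit=1)
            expected[name] = digest
    return expected


def reprocheck(manifest: Path, root: Path) -> list[str]:
    """Return the paths that are missing or whose checksum differs."""
    expected = parse_manifest(manifest.read_text())
    return [
        name
        for name, digest in expected.items()
        if not (root / name).is_file() or sha256_file(root / name) != digest
    ]
```

A CI job would run the fixed-version build, call something like reprocheck(Path("boards.sha256"), Path("target")), and fail if the returned list is not empty.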

ppannuto · 2020-04-08T17:55:49Z

I might argue that TOCK_KERNEL_VERSION is currently doing the right thing -- it captures exactly the state of the repository that was built for a given image. The addition of a merge commit means a different point in the history of repository. Unless we rebase every branch before pulling, it is quite possible/likely that the travis merge commit will introduce meaningful code changes compared to the candidate branch.

Travis has both "pr" and "push" builds:

pr: what would happen if this branch were merged to master (adds a commit)
push: what is the status of this branch (no added commit)

According to the travis docs:

TRAVIS_PULL_REQUEST is set to the pull request number if the current job is a pull request build, or false if it’s not.

So I think what might make the most sense would be enforcing the reproducibility check on the push build but not on the pr build
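A sketch of that decision, based only on the documented values of TRAVIS_PULL_REQUEST. The function name is made up, and treating an unset variable as "not a pull request" (e.g. a local build) is an assumption on my side.

```python
import os


def should_enforce_repro(env=os.environ) -> bool:
    """Enforce the reproducibility check on push builds only."""
    # Travis sets TRAVIS_PULL_REQUEST to the PR number for pr builds,
    # and to the string "false" otherwise.
    # Assumption: if the variable is unset, this is not a pr build.
    return env.get("TRAVIS_PULL_REQUEST", "false") == "false"


print(should_enforce_repro({"TRAVIS_PULL_REQUEST": "1666"}))
print(should_enforce_repro({"TRAVIS_PULL_REQUEST": "false"}))
```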

gendx · 2020-04-09T12:25:15Z

> Unless we rebase every branch before pulling, it is quite possible/likely that the travis merge commit will introduce meaningful code changes compared to the candidate branch.

That's a good point indeed!

> I might argue that TOCK_KERNEL_VERSION is currently doing the right thing -- it captures exactly the state of the repository that was built for a given image. The addition of a merge commit means a different point in the history of repository.

One thing to note is that a commit's hash covers not only the state of the repository (as a filesystem), but also the commit message, commit author, and timestamp. So Travis' merge commit will generally be different from what one can obtain locally by creating their own merge commit.

For example, the latest commit on master contains the following git object.

$ git cat-file -p HEAD
tree 629404bf87e4cc0ac1544ff95cd5b4e242a97640
parent a4ad8070469c8403c81104efefa1cb2bbcba726a
parent d3a6cd58ec31682dba16d56a46269d91f26c51b7
author bors[bot] <26634292+bors[bot]@users.noreply.github.com> 1586357094 +0000
committer GitHub <noreply@github.com> 1586357094 +0000
gpgsig -----BEGIN PGP SIGNATURE-----
 
 wsBcBAABCAAQBQJejeNmCRBK7hj4Ov3rIwAAdHIIAAuCeUnrhZWwbrF4T/8Jjba7
 z+hMeosGRjWh5IpCGh7RvOSajtUrtNDdiVjkET3BoJYTjrg3C3xwAPGFw9OMDRxG
 Z07iT4rBrgv6YnTqpOaqLYlrDcEnOSHfupHX+6CKK3r8ebySaASzT7MGAYRG8+1f
 KwOrC1C5W5BnfFIz7TWbb968r+ZGw7dyrgQWIGPcdoY+vYz4yi4jGTZh4ahJufb4
 xlr6mDFX5KjbDbMZcPn+7qFMHnYQ0tMdaYouuJf9rrOLWUCTz5zWb/cydKDJ9bK4
 1yOEpw73AwN1IBmqEPllsd+Uy2GhtvkAbyz1zM6mK0cvX0oOBpsjxHg2maMxwYY=
 =1AW5
 -----END PGP SIGNATURE-----
 

Merge #1743

1743: Update stm32 boards makefiles r=ppannuto a=alexandruradovici

[...]

The commit hash is f920947, which can be recomputed from the command line by running git hash-object on the commit object.

$ git cat-file -p HEAD | git hash-object -t commit --stdin
f9209471923d4a6dbacdb6464ffe548bd7108aa9
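For reference, here is a sketch of what git hash-object computes, assuming the default SHA-1 object format: the hash of a header "<type> <size>\0" followed by the object's bytes. The author line and messages below are made up; only the tree hash is taken from the output above.

```python
import hashlib


def git_object_hash(kind: str, body: bytes) -> str:
    """SHA-1 of '<kind> <size>\\0' + body, as used for git object ids."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, who: str, timestamp: int, message: str) -> bytes:
    """Minimal commit object without parents or signature (illustration only)."""
    stamp = f"{who} {timestamp} +0000"
    return f"tree {tree}\nauthor {stamp}\ncommitter {stamp}\n\n{message}\n".encode()


tree = "629404bf87e4cc0ac1544ff95cd5b4e242a97640"
alice = "Alice <alice@example.com>"  # hypothetical author
first = git_object_hash("commit", commit_body(tree, alice, 1586357094, "Merge"))
later = git_object_hash("commit", commit_body(tree, alice, 1586357095, "Merge"))
print(first != later)  # same tree, different timestamp, different commit hash
```

Two commits pointing to the very same tree get different hashes as soon as the timestamp (or the author, or the message) differs, which is why a merge commit created by Travis cannot be matched locally.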

Regarding reproducibility, I think that a more interesting description of the state of the repository would be the "tree" (first line in the pretty-printed commit), which is a hash of all the files checked into the repository.

$ git cat-file -p HEAD^{tree}
040000 tree b633c8fd62afc5bfdb32db071e4dd00daa3efce7	.github
100644 blob 5230cda22cc728e454dde2b0e3e9c5361821a60a	.gitignore
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	.gitmodules
100644 blob 6ddd26fc3f7b7964723dec145f26e4d90720c52b	.travis.yml
[...]

> So I think what might make the most sense would be enforcing the reproducibility check on the push build but not on the pr build

Enforcing the push build would be simpler for developers of pull requests. But that would mean that the actual state of the repository on master would almost always be different, due to the merge commits put on top of it.

Requiring pull requests to always be rebased on the current master would avoid the merge commit problem (and I think GitHub can be configured to require rebased pull requests), at the expense of more burden for developers of pull requests (especially when there are unrelated pull requests in parallel, i.e. without any merge conflicts, in which case the current model works smoothly).

ppannuto · 2020-04-09T14:49:20Z

> One thing to note is that a commit's hash includes both the state of the repository (as a filesystem), but also the commit message...

Hmm, this is a very good point. We're partially running into a usability vs. correctness challenge here I think. While we could dig into the git internals, the intent of including a hash in the panic trace is that it's easy for developers to investigate reported errors (i.e. can reproduce by simply git checkout xxx); that falls apart a bit with more nuanced tracing of git.

$ git checkout b633c8fd62af
fatal: Cannot switch branch to a non-commit 'b633c8fd62af'

Of course, you could do something like https://stackoverflow.com/questions/41088069/how-to-find-the-commits-that-point-to-a-git-tree-object -- and we could put something in tools/ that does this, but that's getting complex I think

One idea I thought about briefly would be modifying the label to be git describe --tags --always --all, which would effectively make the label (1) a tag if a tagged release, (2) the branch name [most common case], (3) a hash. But branches are a bit too ephemeral and could easily be pointing to a later/different commit between deployment and bug reporting (particularly since I think the travis build would always report the name heads/staging, as that's the branch bors uses to build)

Stepping back a bit, I think we should ask what we would like the reproducibility mechanism to look like (this became very stream-of-consciousness, but hopefully useful for discussion):

Should there be an in-file-system record of reproduction artifact?
- pro: highly audit-able, deterministic, and strong enforcement of reproducible builds
- con: large amount of churn / noise in git history
  - mitigation: store just one hash-of-hashes of all boards?
- con: possibly annoying for development, effectively requires an allboards build on all commits or pushes (consider the 'fix a spelling error in a comment' type commit having a several minute latency)
- my current thinking: against
Should reproducibility be enforced on every commit, every commit to master?
- every commit too fine-grained for the development reasons listed above
- every commit to master basically necessary to make sure things don't break (this is the ethos of CI really)
- my current thinking: somehow
Can we do this wholly in the cloud? Idea: (ab)using tags
- Add a github action / cloud service / etc that runs on PRs that tags every commit git tag repro-{hash_of_hashes} with a hash of all the hashes of all the boards
- Add a github action that triggers on the creation of a tag that adds a status check to the PR that validates the reproduction of the build
- pro: reproduction artifacts are part of the repository without cluttering the typical log / history mechanism
- con: (maybe?) are lots of lightweight tags bad in any way?
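As a rough sketch of the hash-of-hashes idea (the exact encoding is an open choice, this is just one option): sort the boards by name so the result doesn't depend on build order, then hash the sha256sum-style lines. The board hashes in the example are placeholders, not real Tock checksums.

```python
import hashlib


def hash_of_hashes(board_hashes: dict[str, str]) -> str:
    """Combine per-board hex digests into one order-independent digest."""
    combined = hashlib.sha256()
    for board in sorted(board_hashes):
        combined.update(f"{board_hashes[board]}  {board}\n".encode())
    return combined.hexdigest()


def repro_tag(board_hashes: dict[str, str]) -> str:
    """Tag name in the repro-{hash_of_hashes} form suggested above."""
    return f"repro-{hash_of_hashes(board_hashes)}"


# Placeholder digests for illustration only.
boards = {
    "imix": hashlib.sha256(b"imix image").hexdigest(),
    "nrf52840dk": hashlib.sha256(b"nrf52840dk image").hexdigest(),
}
print(repro_tag(boards))
```

Any change to a single board's binary changes the combined value, so one short tag (or one line in a file) would be enough to detect a regression, at the cost of not telling which board changed.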

gendx · 2020-04-09T16:28:53Z

> Stepping back a bit, I think we should ask what we would like the reproducibility mechanism to look like (this became very stream-of-consciousness, but hopefully useful for discussion):
>
> Should there be an in-file-system record of reproduction artifact?
> - pro: highly audit-able, deterministic, and strong enforcement of reproducible builds
> - con: large amount of churn / noise in git history
>   - mitigation: store just one hash-of-hashes of all boards?
> - con: possibly annoying for development, effectively requires an allboards build on all commits or pushes (consider the 'fix a spelling error in a comment' type commit having a several minute latency)
> - my current thinking: against

Definitely agree with the concerns. That would be a lot of annoyance to keep in sync on every commit.

> Should reproducibility be enforced on every commit, every commit to master?
> - every commit too fine-grained for the development reasons listed above
> - every commit to master basically necessary to make sure things don't break (this is the ethos of CI really)
> - my current thinking: somehow

Similarly, the overhead on developers would be quite high.

> Can we do this wholly in the cloud? Idea: (ab)using tags
> - Add a github action / cloud service / etc that runs on PRs that tags every commit git tag repro-{hash_of_hashes} with a hash of all the hashes of all the boards
> - Add a github action that triggers on the creation of a tag that adds a status check to the PR that validates the reproduction of the build
> - pro: reproduction artifacts are part of the repository without cluttering the typical log / history mechanism
> - con: (maybe?) are lots of lightweight tags bad in any way?

I quite like this idea (providing transparency) - some action builds the boards on each commit (or each master branch commit), and publishes the hashes. Tags may add quite some extra overhead, but I don't think they would be necessary. The UX could be similar to the size reporting tool (#1701). Then anyone can reproduce this on their machine and compare the hashes.

The question that remains is how to enforce reproducibility after each commit? Ideally, there should be some CI workflow checking that builds remain reproducible.

For example, if some commit removes the --remap-path-prefix argument in the Makefile

tock/boards/Makefile.common

Line 32 in b134ba0

--remap-path-prefix=$(TOCK_ROOT_DIRECTORY)= \

then the builds are not reproducible anymore across directories, but there is currently no tool in Tock's CI to prevent that.

In any case, I think that providing transparency by publishing the hashes is a good step forward.
If we don't find a scalable solution for enforcing reproducibility (as @ppannuto mentioned, updating the hashes on each commit wouldn't be scalable in terms of developer overhead), it's not too dramatic: we can notice problems every now and then and do more thorough manual checks on every release. This is similar to the OSX support, which isn't enforced in CI but works most of the time; occasionally an OSX developer notices a problem and we fix the bug.

gendx · 2020-04-15T14:00:14Z

While working on google/OpenSK#94, I noticed that although builds seem reproducible across Linux machines (my local development machine vs. GitHub workflows), the binaries obtained on macOS are currently not the same.

gendx · 2020-04-28T10:50:48Z

I added a reference to tock/elf2tab#16 to track reproducibility of elf2tab packaging.

bradjc · 2023-09-05T14:43:33Z

Does anyone know of any github/rust projects that do something similar in terms of using CI to verify builds are reproducible and publishing SHAs?

bradjc added the tracking label Mar 6, 2020

gendx mentioned this issue Mar 6, 2020

[Tracking] Optimizing Tock for code size #1663

Open

19 tasks

This was referenced Mar 6, 2020

Add --remap-path-prefix to make builds more deterministic and smaller. #1668

Merged

Reproducible builds google/OpenSK#70

Closed

gendx mentioned this issue Mar 25, 2020

Use a Cargo workspace to make builds reproducible and speed up CI. #1714

Merged

9 tasks

This was referenced Apr 20, 2020

Non-reproducible builds depending on the compiling OS rust-lang/rust#71361

Closed

Non-reproducible -C metadata=hash passed to rustc depending on the compiling OS rust-lang/cargo#8140

Open

hudson-ayers mentioned this issue Dec 7, 2022

Can not get "console" example running on OpenTitan (CW310) tock/libtock-rs#437

Closed
