Deterministic/Reproducible builds #2641

Open
cedricwalter opened this Issue Oct 12, 2017 · 26 comments

10 participants

cedricwalter commented Oct 12, 2017

It seems that Monero builds are non-deterministic. Since this is a difficult goal and there are many ways to achieve it, I want to open the discussion here first, before opening a new PR.

I've checked how Bitcoin and Tor are doing it: they use Gitian. I would recommend doing something similar...

Gitian is a thin wrapper around the Ubuntu virtualization tools written in a combination of Ruby and bash. It was originally developed by Bitcoin developers to ensure the build security and integrity of the Bitcoin software.
Gitian uses Ubuntu's python-vmbuilder to create a qcow2 base image for an Ubuntu version and architecture combination and a set of git and tarball inputs that you specify in a 'descriptor', and then proceeds to run a shell script that you provide to build a component inside that controlled environment. This build process produces an output set that includes the compiled result and another "output descriptor" that captures the versions and hashes of all packages present on the machine during compilation.
Gitian requires either Intel VT support (for qemu-kvm), or LXC support, and currently only supports launching Ubuntu build environments from Ubuntu itself.

Bitcoin

TorBrowser

Code base

I went through the source code and have already checked that the non-determinism does not originate from the code base itself, looking at the usual sources:

  • File paths: direct or indirect embedding of non-deterministic source file paths in the final binary; for example, use of the C/C++ macro __FILE__ with absolute file paths instead of relative file paths.
  • File content references: for example, use of the C/C++ macros __LINE__ and __COUNTER__.
  • Timestamps: for example, use of the C/C++ macros __DATE__, __TIME__ and __TIMESTAMP__, or otherwise embedding the compilation time in the binary. If the gyp define DONT_EMBED_BUILD_METADATA is set, these won't be embedded.
  • Source control metadata: checkout revision number embedded in the binary. The fact that the SCM reference changed doesn't mean the content changed, and as such it shouldn't affect the final binary, except for extraneous metadata.
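
A check like the one described above can be roughly automated. The following is only a minimal sketch, not the procedure actually used in this issue: it scans C/C++ source text for the predefined macros named in the list. The function name `find_macro_uses` is hypothetical, and the scan is naive (it also matches inside comments and string literals).

```python
import re

# Predefined C/C++ macros named in the list above as possible sources
# of non-determinism.
SUSPECT_MACROS = (
    "__FILE__", "__LINE__", "__COUNTER__",
    "__DATE__", "__TIME__", "__TIMESTAMP__",
)
_PATTERN = re.compile("|".join(re.escape(name) for name in SUSPECT_MACROS))


def find_macro_uses(source_text):
    """Return (line_number, macro) pairs for every suspect macro in the text.

    Hypothetical helper: a plain text scan, so hits inside comments or
    string literals are reported as well.
    """
    hits = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = 'const char *built = __DATE__ " " __TIME__;\nlog(__FILE__, __LINE__);\n'
    for lineno, macro in find_macro_uses(sample):
        print(f"line {lineno}: {macro}")
```

A hit is only a hint: whether a given macro actually changes the binary depends on the compiler and on how the build is invoked.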

I'm open to any ideas or solutions, let's have a good discussion!

moneromooo-monero commented Oct 12, 2017

Why is __LINE__ non-deterministic?

moneromooo-monero commented Oct 12, 2017

Anyway, pigeons was looking at that, and will find his old notes about it so it can be done. Or at least some more work done towards it.

cedricwalter commented Nov 5, 2017

I could help pigeons if needed; he should just PM me.

cedricwalter commented Nov 5, 2017

__LINE__ is not deterministic according to the Chromium project (https://www.chromium.org/developers/testing/isolated-testing/deterministic-builds); all these macros are implemented differently on each platform (Windows, Linux, macOS) and by each compiler. For an example with __FILE__, see http://blog.mindfab.net/2013/12/on-way-to-deterministic-binariy-gcc.html

It is a huge topic that will need more than one person to complete. It took Debian a lot of time, but we should be able to profit from their experience: https://wiki.debian.org/ReproducibleBuilds

danrmiller commented Nov 5, 2017

Thanks @cedricwalter, are you on freenode?

cedricwalter commented Nov 5, 2017

@danrmiller not yet, but on which channel? #monero-dev?

moneromooo-monero commented Nov 6, 2017

The link doesn't give any info, but OK. We can cross that bridge if and when we end up needing to.

#monero-dev is a good channel for discussing this, yes.

jonathancross commented Nov 6, 2017

@TheCharlatan also mentioned an interest in helping out with deterministic builds.

TheCharlatan commented Nov 6, 2017

The version control metadata should not be a problem if the build is done similarly to Bitcoin's Gitian build. Gitian checks out the source tree in the same way for every build on every platform.
The following is required for a clean Gitian build:

  • Localized, statically compiled dependencies, similar to the depends system in Bitcoin. Since depends uses autotools, it would probably be easier to use something like mxe (https://github.com/mxe/mxe), which already supports cmake but does not yet contain installer scripts for all the Monero dependencies.
  • A script that is executed for every build iteration, using lxc with predefined configurations.

moneromooo-monero commented Nov 19, 2017

For the record, I tried compiling simplewallet.cpp twice, and I got identical object files (after stripping debug info), so it's looking like we're in a good starting position :)
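
A comparison like this can be scripted. The sketch below is an assumption about how one might do it, not the exact commands used here: it hashes two files and reports whether they are byte-identical. The helper names and the example file names are hypothetical, and stripping debug info (for example with `strip --strip-debug`) is assumed to have happened beforehand.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large object files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(path_a, path_b):
    """True if both build outputs have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)


# Hypothetical usage, with object files from two separate compilations:
#   builds_match("build1/simplewallet.cpp.o", "build2/simplewallet.cpp.o")
```

Matching digests for one translation unit on one machine are encouraging, but they say nothing yet about other machines, compilers or build paths.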

dEBRUYNE-1 commented Jan 8, 2018

+proposal

anonimal commented Jan 8, 2018

This issue should be moved to the meta repo as it will affect all applicable monero umbrella projects.

TheCharlatan commented Feb 4, 2018

Some updates: I managed to statically compile a Linux binary with all important dependencies linked, using a modified version of Bitcoin's depends system and an additional cmake toolchain file. It might be a good idea to break this into smaller pieces, since I cannot really estimate the time required to get it running on all platforms. It would be nice to get a minimal set of requirements (e.g. which platforms should be supported, what manual interaction is acceptable) in order to open a pull request for this, so more people can start working on it.
Edit:
To give a taste of how it works in Bitcoin:
A local script calls the Gitian builder, which creates a container in which the following is run:

  • A depends build for every platform with make HOST=PLATFORM_TRIPLET, where the platform triplet is, for example, x86_64-apple-darwin or x86_64-w64-mingw32.
  • A configure script for the source code is then run with CONFIG_SITE=/path/to/depends/PLATFORM_TRIPLET/share/config.site prepended (in Monero this would be something like cmake -DCMAKE_TOOLCHAIN_FILE=/path/to/depends/toolchain_file).
  • This is run for every platform, creating deterministic binaries for each triplet.
  • Once built, there are a few options for additional signing (detached sigs, no signing, check sigs).
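
The per-triplet steps above can be sketched as a small plan generator. This is only an illustration under stated assumptions: the function name `build_plan`, the `toolchain.cmake` file name and its location under `<depends_dir>/<triplet>/share/` are hypothetical, and the commands are only constructed, not executed.

```python
def build_plan(triplets, depends_dir, source_dir="."):
    """Return, per triplet, the (depends command, cmake command) pair.

    Hypothetical sketch: the toolchain file path below is an assumed
    layout, not something confirmed by the depends system itself.
    """
    plan = []
    for triplet in triplets:
        # Step 1: build the dependencies for this host triplet.
        depends_cmd = ["make", "-C", depends_dir, f"HOST={triplet}"]
        # Step 2: configure the project against the depends toolchain file.
        toolchain = f"{depends_dir}/{triplet}/share/toolchain.cmake"  # assumed path
        cmake_cmd = ["cmake", f"-DCMAKE_TOOLCHAIN_FILE={toolchain}", source_dir]
        plan.append((depends_cmd, cmake_cmd))
    return plan


if __name__ == "__main__":
    for depends_cmd, cmake_cmd in build_plan(
        ["x86_64-apple-darwin", "x86_64-w64-mingw32"], "/path/to/depends"
    ):
        print(" ".join(depends_cmd))
        print(" ".join(cmake_cmd))
```

Running the same fixed plan inside a fresh container for every triplet is what keeps the inputs identical between builders.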

moneromooo-monero commented Mar 7, 2018

Are you still working on this (or planning to)?

TheCharlatan commented Mar 8, 2018

Still working on it. The cross compilation is a bit problematic, since the current build system expects vendored sources from external/ to be built. Not quite sure how to properly work around this while keeping native compilation intact. This is why I will probably focus on getting the deterministic build done on Linux for now, and think about cross compilation again at a later stage.

moneromooo-monero commented Mar 10, 2018

Getting there in steps is certainly fine. Thanks for doing this.

TheCharlatan commented Mar 18, 2018

I have now opened #3430 to bring depends to Monero. This should at least take care of deterministically getting dependencies for all platforms.

garlicgambit commented Apr 22, 2018

Any status updates on the progress? Need any support with something?

TheCharlatan commented Apr 22, 2018

The PR is still open; I will continue working on this once it is merged. If you want to move ahead, check out my depends branch and try to set up a Gitian descriptor, probably for 64-bit Linux to start with, like the one in Bitcoin.

h01ger commented Aug 27, 2018

Actually, monero can be (re-)built in a deterministic way if the same build path is used. See https://tests.reproducible-builds.org/monero (the sun icons on the left...) :-)
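
One common reason the build path matters is that absolute paths end up inside the binary, for example through __FILE__ or debug info. The following is a minimal sketch for checking that, assuming the build directory is known; the function name and the example paths are hypothetical.

```python
def count_embedded_paths(binary_bytes, build_dir):
    """Count occurrences of the build directory string in the binary data.

    Hypothetical helper: a plain byte search, so it only finds paths
    stored verbatim (not compressed or otherwise encoded).
    """
    return binary_bytes.count(build_dir.encode("utf-8"))


# Hypothetical usage:
#   with open("build/bin/monerod", "rb") as handle:
#       print(count_embedded_paths(handle.read(), "/home/user/monero"))
```

A non-zero count means that building the same source in a different directory will very likely produce a different binary.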

TheCharlatan commented Sep 10, 2018

@h01ger yes, it can, and I have achieved reproducibility in the past on Linux amd64. The hard part is making the recipe as easy as possible for all architectures and target hosts (including Mac and Windows).

h01ger commented Sep 10, 2018

Right. I guess it would be very nice to have some generic way/toolchain for that, maybe even a tool. And documentation...

TheCharlatan commented Sep 18, 2018

#3430 has been merged now. This adds a generic toolchain for some targets: Mac, Windows, Linux 64-bit and ARM 32-bit. I am now looking at getting a Gitian build script going for it.

TheCharlatan commented Sep 26, 2018

Now that the builds are more or less stable (https://travis-ci.org/TheCharlatan/monero/builds/433563684, hooray!), I'll post a list of issues that still remain and need to be dealt with. Support/input on any of the items is welcome.

  • hidapi and libusb are not yet statically linked into the end libraries; the dynamic libraries of libusb and hidapi are used instead. If static hidapi is used, the linker throws a bunch of errors indicating that libusb needs to be linked correctly.
  • The Linux binaries all still link against the system's libc. When compiling on Ubuntu 18, for example (which is my current preferred host OS), this means that the binary expects a recent libc version on the machine it is running on. Since a binary linked against a newer libc will not run on a system with an older one, this will result in non-portable binaries. Measures to ensure backwards compatibility should therefore be taken. Bitcoin has already dealt with those, so we can again build on their work: https://github.com/bitcoin/bitcoin/tree/master/src/compat . They also check the backwards compatibility of the used symbols at the end of every build.
  • A full Gitian build script needs to be written. This should be similar to: https://github.com/bitcoin/bitcoin/blob/master/contrib/gitian-build.py . Docs on the Gitian build process can be found here: https://github.com/bitcoin-core/docs/blob/master/gitian-building.md .
  • The debug symbols probably need to be split on Linux. This can be done by passing --enable-deterministic-archives for the archiver.
  • Something like autotools' make dist for cmake might be useful as well, to ensure that no git metadata leaks into the source at compile time. This should already be taken care of when building in a separate build directory, though. Just something to keep in mind.
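
The symbol check mentioned in the libc item can be approximated as follows. This is a rough sketch, not Bitcoin's actual check: it assumes the input is a text dump of the dynamic symbol table (for example the output of `objdump -T` on the binary), and the function names and the `(2, 17)` limit are hypothetical.

```python
import re

# Matches versioned glibc symbol tags such as GLIBC_2.2.5 or GLIBC_2.27.
_GLIBC_TAG = re.compile(r"GLIBC_(\d+(?:\.\d+)+)")


def required_glibc(symbol_dump):
    """Return the highest GLIBC version in the dump as a tuple, or None."""
    versions = [
        tuple(int(part) for part in match.group(1).split("."))
        for match in _GLIBC_TAG.finditer(symbol_dump)
    ]
    return max(versions, default=None)


def is_portable(symbol_dump, max_allowed=(2, 17)):
    """True if no symbol needs a glibc newer than `max_allowed`.

    The default limit is an arbitrary example, not a project decision.
    """
    needed = required_glibc(symbol_dump)
    return needed is None or needed <= max_allowed


if __name__ == "__main__":
    dump = "0000 DF *UND* 0000 GLIBC_2.2.5 printf\n0000 DF *UND* 0000 GLIBC_2.27 glob\n"
    print(required_glibc(dump), is_portable(dump))
```

Running such a check at the end of every build catches a newly introduced dependency on a recent libc symbol before a release is cut.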

TheCharlatan commented Oct 3, 2018

Progress on the Gitian build script (gitian-build.py) can be tracked here: https://github.com/TheCharlatan/monero/tree/gitian/contrib/gitian . If you want to participate, check out that branch and submit improvements there.

TheCharlatan commented Oct 8, 2018

I have now opened #4526 to add a Gitian build script to Monero.
