Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-deterministic output according to filesystem path #9790

Closed
mikeshultz opened this issue Sep 13, 2020 · 3 comments
Closed

Non-deterministic output according to filesystem path #9790

mikeshultz opened this issue Sep 13, 2020 · 3 comments

Comments

@mikeshultz
Copy link

Description

I'm seeing non-deterministic output according to filesystem path when compiling libraries. I've been able to reproduce it with multiple versions.

Environment

  • Compiler version: 0.5.11-0.7.1 (tested)
  • Target EVM version (as per compiler settings): default
  • Framework/IDE (e.g. Truffle or Remix): n/a
  • EVM execution environment / backend / blockchain client: n/a
  • Operating system: Linux

Steps to Reproduce

I put together a test case repository. If you run run.sh /path/to/solc it'll compile 3 identical libraries in 3 different directories displaying md5 hashes of the source contract and output.

Example from my last run:

$ ./run.sh ~/.local/bin/solc.0.7.1
Compiled contract a: 253b4238c2f607c39950d7cd2e6f4641/40f8f82ad16272edc8dd616069b22605
Compiled contract b: 253b4238c2f607c39950d7cd2e6f4641/48da6b634e3ee37565a99955d02454a4
Compiled contract c: 253b4238c2f607c39950d7cd2e6f4641/19888f5974c4bb763ed80282f4bbd239

This may be related to #168 which was closed before resolution.

@chriseth
Copy link
Contributor

This is known behaviour. You tell the compiler: "Compile the file called a/SafeMath.sol", so this is how the file is known to the compiler. Any imports performed will also call the file by that name and because of that, this fact has to be hashed into the compilation artefact to make the build reproducible (otherwise, imports might resolve to different files).

If you want to abstract away the directories a / b / c, you have three options:

  • compile from inside the directory and call the compile directly on the file. This will cause the compiler to put the file in the root directory of its virtual filesystem
  • use the --base-path option to tell the compiler where the root of its virtual filesystem is
  • use standard-json, where you have full control about how to name files and are not constrained to how they are called in the file system or how they are arranged relative to each other in directories.

@mikeshultz
Copy link
Author

I'm curious about the hashing logic. What does the bytecode care about source imports? Any interfaces or library code should be effectively integrated or placeholdered at the point of bytecode output, no? I get why placeholder hashes might include file paths since it's very relevant, but not why that might effect contract bytecode. If you happen to move source files around but leave execution instructions all the same, I think most of us would expect the build to output the same bytecode at the end.

Not a big deal to me, it's just unintuitive. Thank you for the options, those will be helpful.

@chriseth
Copy link
Contributor

Of course, there are changes to filenames and paths that should not change the semantics of the contract, but if you exchange the names of to files, then this can have a big influence, especially if they define the same symbols. The idea is that we want a hash in the bytecode that basically is a watermark and allows re-compilation in exactly the same way as it was done before deployment, including the names of all files and all comments in the source code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants