Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solidity 0.5.17 installed with Homebrew produces a different output than the version downloaded from solc-bin.ethereum.org #10183

Closed
alcuadrado opened this issue Nov 2, 2020 · 6 comments · Fixed by ethereum/solc-bin#76
Assignees

Comments

@alcuadrado
Copy link
Member

Hi

@udwig, reported to us that he was unable to verify a contract on Etherscan, even when using the exact same solidity standard input json. We spent some time debugging it, and found this problem with solc or with its distribution:

If you install solc 0.5.17+commit.d19bba13 using Homebrew and downloading it from solc-bin.ethereum.org, you can get a different output for the same input.

Here's a script that can be used to reproduce it:

brew tap ethereum/ethereum
brew install ethereum/ethereum/solidity@5

wget --quiet https://gist.githubusercontent.com/alcuadrado/06c1ad0cf7e23758e17a32e829f26497/raw/b9f864cfda21d0cc9c5365fa1a1b435aec6285e8/input-to-get-different-outputs.json

wget --quiet https://solc-bin.ethereum.org/macosx-amd64/solc-macosx-amd64-v0.5.17+commit.d19bba13
chmod +x solc-macosx-amd64-v0.5.17+commit.d19bba13

echo '\n\n\n'
echo Brew solc
solc --version
cat input-to-get-different-outputs.json | solc --standard-json | md5sum

echo '\n\n\n'
echo Native solc
./solc-macosx-amd64-v0.5.17+commit.d19bba13 --version
cat input-to-get-different-outputs.json | ./solc-macosx-amd64-v0.5.17+commit.d19bba13 --standard-json | md5sum

It will output

Brew solc
solc, the solidity compiler commandline interface
Version: 0.5.17+commit.d19bba13.Darwin.appleclang
b2051480575328ea24cdfa7cbdc71cf5  -




Native solc
solc, the solidity compiler commandline interface
Version: 0.5.17+commit.d19bba13.mod.Darwin.appleclang
a13020de9abb553ca40d7171744a8735  -

Another interesting thing is that Etherscan seems to be running the same version than the one distributed by Homebrew. It may actually be solcjs. I got into this other bug when trying to reproduce it with solcjs.

Any idea what's going on? Are the solc-bin.ethereum.org binaries somehow corrupted? This is really important to us, as it can lead to people deploying contracts that aren't verifiable on Etherscan.

@alcuadrado alcuadrado changed the title Solidity 0.5.17 installed with Homebrew gives different output than the version downloaded from solc-bin.ethereum.org Solidity 0.5.17 installed with Homebrew produces a different output than the version downloaded from solc-bin.ethereum.org Nov 2, 2020
@cameel
Copy link
Member

cameel commented Nov 2, 2020

Version: 0.5.17+commit.d19bba13.Darwin.appleclang
Version: 0.5.17+commit.d19bba13.mod.Darwin.appleclang

Looks like a difference in metadata. Metadata includes compiler version and versions clearly differ. The build system adds the .mod part when the source you're building from has modifications. And in ethereum/solc-bin#53, as can be seen in the script, I had to apply quite a few changes to the older versions to get them to build and pass tests. This is all in the build configuration and tests so there should be no difference in compiler logic but the build system can't discern that.

For example I had to modify the config to stop compiler from treating warnings as errors (there are warnings below 0.6.1) and to link jsoncpp statically (otherwise the executable would not be really self-contained). I might be able to work around it on some versions but with significant effort and in the end not all of them.

The only sure-fire way to fix these builds I can think of would be to add another modification that would remove the .mod part from version string but this seems a bit too hacky...

@cameel
Copy link
Member

cameel commented Nov 2, 2020

I'm going to redo the builds for which the version does not match and just omit the .mod part.

@alcuadrado
Copy link
Member Author

Thanks for your reply, @cameel.

Metadata includes compiler version and versions clearly differ.

I'm actually surprised that the build info is included in the metadata that gets hashed and embedded into the bytecode.

I just checked, and the example in the documentation includes it, but it would be great if you could make it more explicit, as it's not something you'd expect.

While I get the idea behind doing that, I believe it makes it unfeasible to use the metadata hash at all. There's at least 4 different official builds for each solc version, and to be able to verify a contract (including its metadata) you should have access to all of them. Etherescan may be able to have this infra, but it's definitely not something an end-user would be able to do.

We weren't having this kind of issue before because most of us were using the same Emscripten builds, but this will become increasingly common now that you published native versions for all the major platforms, and tools are adopting them.

I'm going to redo the builds for which the version does not match and just omit the .mod part.

I think sticking with the .mod and contacting Etherscan would be a better path forward. If you actually rebuild those versions, removing the .mod, and publish them in solc-bin replacing the existing builds, it would be almost impossible to reproduce some contract builds

@cameel
Copy link
Member

cameel commented Nov 4, 2020

@alcuadrado

I'm actually surprised that the build info is included in the metadata that gets hashed and embedded into the bytecode.

That's actually tripping many people up. See for example #10082. But I think that information about compiler version used to generate the bytecode is pretty important when it comes to reproducing it.

And it isn't really all that extreme. There's a proposal to include much more stuff in the version string: #6228, though AFAIK this isn't on the roadmap right now. EDIT: Looking closer, it's about just printing it with --version, not including that stuff in the version string.

While I get the idea behind doing that, I believe it makes it unfeasible to use the metadata hash at all. There's at least 4 different official builds for each solc version, and to be able to verify a contract (including its metadata) you should have access to all of them.

This is not a problem with official release builds because we do perform extensive checks that the generated byte code is identical on all platforms. We have the t_bytecode_compare CI job that runs for every PR and also on the final binaries that are published on the github release page. The job basically takes all the 4k+ .sol test cases from the repository, compiles them on all platforms and makes sure that the resulting bytecode is byte-for-byte identical.

The situation with these MacOS rebuilds is a special case because they were not originally included in releases and I built them independently in solc-bin which does not share the CI infrastructure. We discussed running bytecode comparison on them but it was before the recent refactor (it was much more obtrusive before then, e.g. it required a special repo) and it would add a lot of extra effort to the task. In the end we decided that it was not worth the effort and that passing all the tests should be good enough. And it would be if I did not miss this slight difference in --version (I think that this is the only difference).

Lesson learned. I'm going to do the bytecode comparison when I rebuild the binaries now.

I think sticking with the .mod and contacting Etherscan would be a better path forward. If you actually rebuild those versions, removing the .mod, and publish them in solc-bin replacing the existing builds, it would be almost impossible to reproduce some contract builds

This really depends on how widely those builds are used. They have only been added recently so I was assuming that replacing them with ones that can produce bytecode identical with other platforms would be the lesser evil here.

AFAIK Etherscan is using the emscripten builds from the bin/ directory in solc-bin. https://etherscan.io/solcversions includes nightlies which are only present in bin/.

We definitely could keep the old binaries in a separate directory in solc-bin in case someone does need them but I'm not sure how widespread that need really is.

@cameel
Copy link
Member

cameel commented Jan 25, 2021

Sorry it took so long but I'm finally done with verifying the rebuilt binaries. Running the bytecode comparison on the historical binaries required adding extra functionality to our scripts and also making them backwards compatible with the CLI interface of older versions. I'm glad I did it because it turned out that there are more problems than just the macOS binaries and that binaries for 0.3.6, 0.4.7 and 0.4.8 won't produce the same bytecode even after the rebuild.

So here's the overview of the situation:

So now getting the binaries out is just a matter of ethereum/solc-bin#76 passing review. After it's merged anyone who needs the old ones for verification will still be able to get them from the commit I tagged as
macosx-binaries-reporting-wrong-versions.

@cameel
Copy link
Member

cameel commented Jan 25, 2021

Turns out that #10846 is not the only issue with the macOS binary for 0.3.6:

  • It's not actually using the same commit hash as v0.3.6 (probably ignoring commit_hash.txt) unlike 0.4.x and above.
  • There are a few input files for which it fails and emscripten binary does not or the other way around.

Since it's such an old version and resolving these issues will be pretty tedious I think it's not really worth it. So the question would be whether we should keep such a binary in solc-bin or is it better to remove it? It still produces valid bytecode and is fully usable. It's just that you won't be able to verify its output using the emscripten binary (and that only if you're using libraries; contracts are fine).

@alcuadrado Do you have an opinion on that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants