Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(v0.8.21) Contract code changes with additional files in compilation #14494

Closed
kuzdogan opened this issue Aug 15, 2023 · 16 comments · Fixed by #14574
Closed

(v0.8.21) Contract code changes with additional files in compilation #14494

kuzdogan opened this issue Aug 15, 2023 · 16 comments · Fixed by #14574
Assignees
Labels
bug 🐛 selected for development It's on our short-term development
Milestone

Comments

@kuzdogan
Copy link
Member

kuzdogan commented Aug 15, 2023

Here I am again with another issue with extra files in compilation causing trouble 🫡

By extra files I mean files that the target contract does not depend on.

Recap

First a recap on previous issues similarly on extra files in compilation:

  1. Different bytecode with same metadata on 0.6.12 #12281
    This was an older issue that seems to affect the AST IDs in 0.6.12 and 0.7.0. We also encountered this and handled in Sourcify's case Handle Solidity v0.6.12 and v0.7.0 extra files bytecode mismatch sourcify#618
  2. Contract bytecode changes vastly when independent contracts added to the compiler  #14250
    I've encountered this in a 0.8.19 contract and got fixed in 0.8.21 with Deterministically choose memory slots for variables during stack-to-memory. #14311.

Current Issue

I've encountered the current issue in ethereum/sourcify#1065.

My initial suspicion was that it's an instance of the 2. case because the contract was a 0.8.19 contract. Unfortunately, I was able to reproduce the case in 0.8.21.

Although the 1. case should have been resolved, I suspect it still can be related because the metadata hashes are identical but the bytecodes differ.

To reproduce

I named inputs as "hardhat" and "sourcify". "Sourcify" uses the minimum number of source files, because it is based on the ones listed in the contract metadata. "Hardhat" uses extra because by default Hardhat inputs everything to the compiler. Sourcify has two different inputs 0.8.19 and 0.8.21 because it includes the evmVersion inside the input i.e. paris vs shanghai.

N2MERC721-hardhat-input.json.txt
N2MERC721-sourcify-input-0.8.21.json.txt
N2MERC721-sourcify-input-0.8.19.json.txt

You can see the two inputs differ only in some extra files + some of the compiler settings

These are the extra files in Hardhat input
[
  '@nfts2me/contracts/interfaces/IN2MBeaconFactory.sol',
  '@openzeppelin/contracts-upgradeable/token/ERC1155/ERC1155Upgradeable.sol',
  '@openzeppelin/contracts-upgradeable/token/ERC1155/extensions/ERC1155SupplyUpgradeable.sol',
  '@openzeppelin/contracts-upgradeable/token/ERC1155/extensions/IERC1155MetadataURIUpgradeable.sol',
  '@openzeppelin/contracts-upgradeable/token/ERC1155/IERC1155ReceiverUpgradeable.sol',
  '@openzeppelin/contracts-upgradeable/token/ERC1155/IERC1155Upgradeable.sol',
  '@openzeppelin/contracts/governance/utils/IVotes.sol',
  '@openzeppelin/contracts/proxy/Proxy.sol',
  '@openzeppelin/contracts/token/ERC1155/extensions/IERC1155MetadataURI.sol',
  '@openzeppelin/contracts/token/ERC1155/IERC1155.sol',
  '@openzeppelin/contracts/token/ERC721/extensions/IERC721Metadata.sol',
  '@openzeppelin/contracts/token/ERC721/IERC721.sol',
  '@openzeppelin/contracts/utils/introspection/IERC165.sol',
  'contracts/beacon/N2MERC1155Upgradeable.sol',
  'contracts/beacon/N2MERC721Upgradeable.sol',
  'contracts/beacon/N2MUpgradeable.sol',
  'contracts/interfaces/IN2M_ERCBase.sol',
  'contracts/interfaces/IN2M_ERCCommon.sol',
  'contracts/interfaces/IN2M_ERCLibrary.sol',
  'contracts/interfaces/IN2M_ERCStorage.sol',
  'contracts/interfaces/IN2MBeaconFactory.sol',
  'contracts/interfaces/IN2MCrossFactory.sol',
  'contracts/interfaces/IN2MERC1155.sol',
  'contracts/interfaces/IN2MERC721.sol',
  'contracts/interfaces/IOperatorFilterRegistry.sol',
  'contracts/interfaces/IReverseRegistrar.sol',
  'contracts/N2MERC1155.sol',
  'contracts/ownable/NFTOwnableUpgradeable.sol',
  'contracts/TextUtils.sol'
]
Also leaving the JS script I used to check the differences of the keys of `.sources` in the JSON inputs

Usage:
node checkFileKeysDifferences.js solcInput-1.json solcInput-2.json

const fs = require("fs");

function findKeyDifferences(file1, file2) {
  const json1 = JSON.parse(fs.readFileSync(file1, "utf8"));
  const json2 = JSON.parse(fs.readFileSync(file2, "utf8"));

  const sources1 = new Set(Object.keys(json1.sources) || []);
  const sources2 = new Set(Object.keys(json2.sources) || []);

  const keysOnlyInFile1 = Array.from(sources1).filter(
    (key) => !sources2.has(key)
  );
  const keysOnlyInFile2 = Array.from(sources2).filter(
    (key) => !sources1.has(key)
  );

  return {
    keysOnlyInFile1,
    keysOnlyInFile2,
  };
}

// Retrieve file paths from command-line arguments
const [file1Path, file2Path] = process.argv.slice(2);

if (!file1Path || !file2Path) {
  console.log("Please provide two file paths as command-line arguments.");
  process.exit(1);
}

const { keysOnlyInFile1, keysOnlyInFile2 } = findKeyDifferences(
  file1Path,
  file2Path
);

console.log("Keys only in file 1:", keysOnlyInFile1);
console.log("Keys only in file 2:", keysOnlyInFile2);

To extract the runtime bytecode:

cat N2MERC721-sourcify-input-0.8.19.json | solc --standard-json | jq '.contracts."contracts/N2MERC721.sol".N2MERC721.evm.deployedBytecode.object' > N2MERC721-0.8.19-sourcify-runtime-bytecode-from-solc.txt

My Bytecode outputs

N2MERC721-0.8.21-sourcify-runtime-bytecode.txt
N2MERC721-0.8.21-hardhat-runtime-bytecode.txt
N2MERC721-0.8.19-sourcify-runtime-bytecode.txt
N2MERC721-0.8.19-hardhat-runtime-bytecode.txt

To check the diff:

git diff --word-diff --word-diff-regex=. N2MERC721-0.8.19-hardhat-runtime-bytecode.txt N2MERC721-0.8.19-sourcify-runtime-bytecode.txt

Will give you a diff like this:
image

Environment

  • Compiler version: 0.8.19 and 0.8.21 (both Emscripten and Darwin.appleclang)
  • Target EVM version (as per compiler settings): paris and shanghai
  • Framework/IDE (e.g. Truffle or Remix): none
  • EVM execution environment / backend / blockchain client:
  • Operating system: MacOS

Again, one particular thing that got my attention was that the metadata hashes were identical in all cases. This leaves me thinking if this is similar to the 1. case I mentioned.

@ekpyron
Copy link
Member

ekpyron commented Aug 15, 2023

Ok, I can so far confirm that this will result in different libevmasm assembly.
The optimized IR differs in two ways, both due to differing AST IDs:

  • identifier names - that should not affect assembling these days, though
  • string identifiers for immutables - that may be worth a look, maybe we missed an effect of that

@ekpyron ekpyron added the selected for development It's on our short-term development label Aug 15, 2023
@ekpyron
Copy link
Member

ekpyron commented Aug 15, 2023

The assembly differs in some details in the stack ordering (I e.g. see different swaps). Haven't been able to trace down a reason yet.

@ekpyron
Copy link
Member

ekpyron commented Aug 15, 2023

Interestingly, the difference in swapping is not equivalent, e.g. I'm seeing stuff like

-      dup2
+      dup3

So this may also be non-determinism in the stack layouts (yielding different stack layouts across basic blocks resulting in differing stack shuffling). But I'd rather expect it to be a secondary effect of something else.

@ekpyron
Copy link
Member

ekpyron commented Aug 15, 2023

But ok - optimized Yul being identical modulo AST IDs, but libevmasm assembly differing points towards the code transform

@ekpyron
Copy link
Member

ekpyron commented Aug 15, 2023

Independently of solving this particular issue, though, since we seem to keep running into, we should look into test coverage for this. So far, the ultimate reason has been offsets in the AST IDs, so we should consider a bytecode comparison run with an artificial offset in the AST IDs - that may catch more of these cases, resp. avoid them in the future.

@ekpyron
Copy link
Member

ekpyron commented Aug 16, 2023

Looking more closely at this again, there's actually a small differences in the optimized Yul that may explain the divergence, so it's may be the yul optimizer after all.

@ekpyron
Copy link
Member

ekpyron commented Aug 16, 2023

The difference is an additional variable in one yul function:

            function fun_upperBinaryLookup(var_self_3885_slot, var_key, var_low, var_high) -> var
            {

                for { }
                                                       lt(var_low, var_high)

                { }
                {

                    let expr := and(var_low, var_high)

                    let _1 := 1
                    let sum := add(expr, shr(_1, xor(var_low, var_high)))

                    if gt(expr, sum) { panic_error_0x11() }

                    let var_mid := sum

                    mstore( 0, var_self_3885_slot)

                    let _2 := 0xffffffff
                    let cleaned := and(sload( add(keccak256( 0, 0x20), sum)), _2)

                    switch gt( cleaned, and( var_key, _2))
                    case 0 {

                        let sum_1 := add(sum, _1)
                        if gt(sum, sum_1) { panic_error_0x11() }

                        var_low := sum_1
                    }
                    default
                    {

                        var_high := var_mid
                    }
                }

                var := var_high
            }

vs

            function fun_upperBinaryLookup(var_self_slot, var_key, var_low, var_high) -> var
            {

                for { }
                                                       lt(var_low, var_high)

                { }
                {

                    let expr := and(var_low, var_high)

                    let _1 := 1
                    let sum := add(expr, shr(_1, xor(var_low, var_high)))

                    if gt(expr, sum) { panic_error_0x11() }

                    mstore( 0, var_self_slot)

                    let _2 := 0xffffffff
                    let cleaned := and(sload( add(keccak256( 0, 0x20), sum)), _2)

                    switch gt( cleaned, and( var_key, _2))
                    case 0 {

                        let sum_1 := add(sum, _1)
                        if gt(sum, sum_1) { panic_error_0x11() }

                        var_low := sum_1
                    }
                    default
                    {

                        var_high := sum
                    }
                }

                var := var_high
            }

var_mid is eliminated in one version.

Eliminating the variable manually yields the same bytecode again (so it's not the naming in yul identifiers or immutables).

@cameel
Copy link
Member

cameel commented Aug 24, 2023

I played with this today with the intention of preparing a bytecode comparison run for #14495. Unfortunately a simple addition of a single file with pragmas or even a simple contract to compilation does not seem to break things - bytecode check passes just fine. Something more complicated is needed to trigger this.

So I tried to reduce this issue to a minimal example to figure out the necessary conditions and here's what I was left with:

mkdir @

cat <<=== > a.sol
    import "@/za.sol";
    import "b.sol";

    contract A is B {
        function a() public pure {
            zb();
        }
    }
===

cat <<=== > b.sol
    import "@/zb.sol";
    import "c.sol";

    abstract contract B is C {}
===

cat <<=== > c.sol
    abstract contract C
    {
        function f() public pure {}

        function c() public view returns (uint) {
            try this.f() { return 0; }
            catch {}
        }
    }
===

cat <<=== > @/za.sol
    import "@/zc.sol";
===

cat <<=== > @/zb.sol
    import "@/zc.sol";

    function zb() pure {
        zc();
    }
===

cat <<=== > @/zc.sol
    function zc() pure returns (bytes memory returndata) {
        if (returndata.length == 0) {
            return "";
        }
    }
===

diff --color --unified \
    <(solc a.sol b.sol --bin --via-ir --optimize | grep '======= a.sol:A =======' --after-context=2 | fold --width 100) \
    <(solc a.sol       --bin --via-ir --optimize | grep '======= a.sol:A =======' --after-context=2 | fold --width 100)

I could not go below 6 files or get rid of a subdirectory, though I suspect this is needed only due to the way it affects the import/compilation order and examples without it should be possible as well.

The import pattern goes like this:

a──►za──►zc
│         ▲
▼         │
b──►zb────┘
│
▼
c

But the pattern itself is not enough. The try/catch and all the other calls are necessary and must also be placed in specific files.

Not sure if I should continue with the bytecode check in that case. Seems like it will be hard to come up with something that does not cause issues by itself, but does when combined with some random code examples and is not simply this specific case.

@ekpyron
Copy link
Member

ekpyron commented Aug 29, 2023

Sadly, I think what you found there is a completely unrelated issue. In your case I see a different order of Yul functions in optimized IR code, whereas in the orginally posted example, I see a single variable not being eliminated from one Yul function. The underlying cause may still be the same, but the symptoms are different enough that I'd actually expect two causes.

@ekpyron
Copy link
Member

ekpyron commented Aug 29, 2023

But yeah, since it then looks like it's non-trivial to guard against these cases by traditional testing, I wonder if fuzzing can help (pinging @bshastry). To summarize the overreaching problem here for @bshastry: We want to guarantee that the bytecode the compiler produces for a contract A is the same, if all the sources and compiler settings involved in generating code for A are the same. However, in via-IR code generation we keep hitting cases, in which adding completely unrelated source files (e.g. for a contract B) to the same compiler run, will affect the bytecode of contract A. The idea would be to generate two random source files, each containing a random contract, and compile those individually (using the optimized via-IR pipeline) first, and then check that the compiler produces identical bytecode for each of them, if both are compiled at the same time.
I'd imagine setting up a fuzzing pipeline for this should be rather simple - and we do have the cases above still open on develop currently, so we could see if we get them reported by the fuzzer as a sanity check.

The target here would not be to find security vulnerabilities, but compiler misbehaviour that's causing issues for tooling.

@cameel
Copy link
Member

cameel commented Sep 4, 2023

Idea from the call: create a NameRandomizer step and use it with fuzzing or bytecode comparison to check if the bytecode changes.

@ekpyron
Copy link
Member

ekpyron commented Sep 5, 2023

I just had another quick look at #14494 (comment) - that indeed has a differing order of functions in unoptimized IR already, so it's indeed a different issue that already occurs in via-IR code generation and not only during optimization (the original case of this issue is confirmed in the optimizer, since it vanishes if we ignore the name hints in the name dispenser of the yul optimizer)

@bshastry
Copy link
Contributor

The idea would be to generate two random source files, each containing a random contract, and compile those individually (using the optimized via-IR pipeline) first, and then check that the compiler produces identical bytecode for each of them, if both are compiled at the same time.

@ekpyron Just so I understand correctly

Source A

contract A { ... }

Source B

contract B { ... }

where the two source files are generated at random. And then,

we compile using optimized via-ir setting

Compilation 1: Source A: contract A
Compilation 2: Source B: contract B
Compilation 3: Source A + Source B: contract A
Compilation 4: Source A + Source B: contract B

and the invariant that should hold is bytecodes of the following pairs are identical?

  • (1) and (3)
  • (2) and (4)

tl;dr In each fuzzing iteration, the fuzzer generates two random source units and performs a total of 4 compilations to check if individual and combined compilation produce the same bytecode.

@bshastry
Copy link
Contributor

I'm also wondering if wrt #14494 (comment), it suffices to compare (1) and (3), thereby limiting it to two compilations per fuzzer iteration since source A and B are generated uniformly at random.

Also, in the original issue above, there seem to be more than two source files. #14494 (comment) says at least 6 source files were required to trigger issue, so I'm wondering if two source fuzzing setup would suffice. Maybe, generate N source files (N > 2), pick one contract from one source at random and check if individual and grouped (all N source files) report identical bytecode?

@ekpyron
Copy link
Member

ekpyron commented Sep 20, 2023

Yes, changing (1) and (3), respectively in general picking one specific contract, should suffice. The case of #14494 (comment) was a bit different and we already solved it in #14562 - that case relied on the files to actually import each other, which would make a fuzzing setup more complicated, but it's also less likely that there's more hidden issues like this.

So I'd focus on cases like originally reported here, which are in the Yul optimizer and won't involve importing, but just additional source files - but additional source files are also just the means for changing AST IDs (our guarantee is that changing AST IDs should not affect compilation result).

So in the meantime @cameel had the idea of #14548, which may work for tracing issues like this down more directly - for fuzzing this would just mean:

  • Generate a random contract A
  • Compile A via-ir optimized to bytecode (artifact 1)
  • Export the AST of the random source file to json.
  • In that json AST randomly permute all AST IDs.
  • Import the result and compile again via-ir optimized to bytecode (artifact 2)
  • Check that artifact 1 and artifact 2 are identical.

We wanted to build that as regular CI run on our test suite first to see if that already catches the issue we have on develop - if not, trying to fuzz with this mechanism on random contracts might be worth a try. The underlying issue is really to avoid the Yul optimizer to be affected by different AST IDs - this setup should catch exactly that.

@cameel
Copy link
Member

cameel commented Sep 22, 2023

I debugged this yesterday and found the problem - the direct cause of the disappearance of the var_mid variable in @ekpyron's snippet is the fact that CommonSubexpressionEliminator chooses a different replacement expression. This in turn is caused by a bug in DataFlowAnalyzer.

Here's the analysis I posted on the channel:

The reason that CSE chooses a different variable is that in one case DataFlowAnalyzer loses knowledge about the value of that variable and in the other that knowledge is present. Without that knowledge it won't substitute that variable in expressions. It instead moves on to choose another candidate.

The bug is in the variable clearing loop in DataFlowAnalyzer::clearValues(). The loop iterates over a set while adding new items to it, which probably results in some items being skipped and not cleared (if they happen to sort before the element we're currently at). Which elements are skipped depends on how they sort and that depends on variable names. In this case it's actually not just different AST IDs but a variable having an ID in one case and not having it in another: var_low vs var_low_7623. I suspect that just a different number would not cause this on their own because the IDs would still retain their order relative to each other.

Fixing the loop makes the variable stay in the generated IR. I haven't compared bytecode yet, but I'm convinced that this is the root cause.

The funny thing is that this bug is something that I already noticed a few months ago, while reviewing @nikola-matic's changes to the DataFlowAnalyser PR that's still pending: https://github.com/ethereum/solidity/pull/14112#discussion\_r1207017088. It looked fishy and broken, but not that serious. I did not even realize that it could have such deep consequences.

Today I prepared a proper fix for it (#14574) and verified that it really works. I also created a smaller repro to include as a regression test. At first I tried from Solidity level but it was too hard because the bug requires naming differences to be triggered and these differences are caused by the AST ID differences. It would probably have to be based on that multi-file repro I posted above. In the and I gave up and instead trimmed down the unoptimized Yul output. That reproduces the problem when passed through solc --strict-assembly --optimize.

Combined with @r0qs's fix, this now compiles the Standard JSON input from the description without bytecode differences

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 selected for development It's on our short-term development
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants