Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ecrecover] Legacy codegen cleans dirty storage bits on read but Sol->Yul on write itself #11602

Closed
bshastry opened this issue Jul 1, 2021 · 13 comments

Comments

@bshastry
Copy link
Contributor

bshastry commented Jul 1, 2021

contract C {
  bytes s0;
  constructor() {
    ecrecover(
        0,
        1,
        bytes32(0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa),
        bytes32(0x1337133713371337133713371337133713371337133713371337133713371337)
    ).code;
    s0=("thisisa33bytearraythisisa33bytear");
  }
  function f() external returns (bytes32 r) {
      assembly {
          mstore(0, s0.slot)
          r := sload(add(keccak256(0, 32), 1))
      }
  }
}
// ====
// compileViaYul: also
// ----
// f() -> 0x7237133713371337133713371337133713371337133713371337133713371337

This test checks the value obtained via inline assembly from the storage slot 32 bytes after the start of storage variable s0.

Ideally, this slot should return the last byte of the 33-byte bytes array literal that is being assigned to s0 (which would be the hex of the ascii value of r i.e., 0x72). However, what is actually returned includes "dirty bits" that seem to be from the last argument to the ecrecover call preceeding the storage variable assignment in the constructor.

The semantic test above passes via legacy code gen but fails via Yul. Via Yul the return value from f() is 0x7200...00.

Please note that the legacy codegen does clean up the dirty bits but lazily such that non inline assembly accesses e.g., r = s0[32]; returns 0x7200...00 as expected i.e., the cleaned up value from the adjacent slot.

@bshastry bshastry added this to New issues in Solidity via automation Jul 1, 2021
@bshastry bshastry changed the title [codegen] Legacy codegen retains dirty storage bits but Sol->Yul does not [codegen] Legacy codegen cleans dirty storage bits on read but Sol->Yul on write itself Jul 1, 2021
@bshastry
Copy link
Contributor Author

bshastry commented Jul 1, 2021

I'm not sure if this is a "bug" as such. Perhaps sufficient to document in IR breaking changes docs? We already document the fact that dirty storage bits may be retained by the legacy codegen using the following example. But there the "dirtying" is visible to the auditor's eyes unlike the present example (via a call to ecrecover)

    contract C {
        bytes x;
        function f() public returns (uint _r) {
            bytes memory m = "tmp";
            assembly {
                mstore(m, 8)
                mstore(add(m, 32), "deadbeef15dead")
            }
            x = m;
            assembly {
                _r := sload(x.slot)
            }
        }
    }

@bshastry bshastry changed the title [codegen] Legacy codegen cleans dirty storage bits on read but Sol->Yul on write itself [ecrecover] Legacy codegen cleans dirty storage bits on read but Sol->Yul on write itself Jul 1, 2021
@chriseth
Copy link
Contributor

chriseth commented Jul 1, 2021

Whether or not the dirtying is visible or not does not change the issue in my opinion.

@bshastry
Copy link
Contributor Author

bshastry commented Jul 1, 2021

Whether or not the dirtying is visible or not does not change the issue in my opinion.

I'm sorry, didn't follow. Is the argument that this issue is not different from the documented example in the comment above (where deadbeef15dead is assigned)? Or otherwise i.e., it is different?

@chriseth
Copy link
Contributor

chriseth commented Jul 1, 2021

Sorry: I think this is the same issue as the one we already have documented. There are numerous places where memory can get dirty, not only in explicit mstore operations.

@bshastry
Copy link
Contributor Author

bshastry commented Jul 1, 2021

Sorry: I think this is the same issue as the one we already have documented. There are numerous places where memory can get dirty, not only in explicit mstore operations.

If I understood it correctly, what is being dirtied here via the ecrecover call is storage, not memory, right? At least this is what the fuzzer reported.

The fuzzer does not compare memory traces anyway.

@bshastry
Copy link
Contributor Author

bshastry commented Oct 1, 2021

Summary of discussion with Chris on gitter:

Most likely explanation for this issue is that the string thisisa33bytearray... is copied to memory next to one of the arguments to ecrecover that are not cleared after the call. Because the 31 bytes next to the string contain garbage, and the bytes copy from memory to storage does not clean dirty bytes, these dirty bytes are reflected in storage as well. These dirty bytes are only readable via inline assembly.

The main difference between this issue and #12050 is that the dirty bytes are not part of the underlying type i.e., the length of bytes is specified by its first byte.

@bshastry
Copy link
Contributor Author

bshastry commented May 19, 2022

In the semantic test below, s2 contains dirty bits via legacy when accessed via inline assembly, IR does not. Essentially

  • call to g() via legacy returns (r1 = 0x3030...62; r2 = 0x37...02)
  • call to g() via IR returns (r1 = 0x3030...62 i.e., same as legacy; but r2 = 0x37...00` i.e., the last byte zero)

I'm not sure where the stray 0x02 (last byte that creeps into storage originates from but my guess would be the length parameter of the function pointer array parameter of emitted event since commenting out the emit call makes the stray byte go away)

contract C{
  event ev0(function (bytes10, uint208) external returns (bytes27)[] i0, string i1);
  string  public s2 ;
  bytes13[] s3;
  constructor() {
    emit ev0(new function (bytes10, uint208) external returns(bytes27)[](2), (""));
    (s2, s3)= (string("00000000000000000000000004e57bdb7"), [ bytes13(0xffffffffffffffffffffffffff)]);
  }
  function g() external returns (uint r1, uint r2) {
    assembly {
      mstore(0, s2.slot)
      let slot := keccak256(0, 32)
      r1 := sload(slot)
      r2 := sload(add(slot, 1))
    }
  }
}
// ====
// compileViaYul: also
// ----
// constructor() ->
// ~ emit ev0(function[],string): 0x40, 0xa0, 0x02, 0x00, 0x00, 0x00
// gas irOptimized: 303941
// gas legacyOptimized: 279801
// g() -> 0x3030303030303030303030303030303030303030303030303034653537626462, 0x3700000000000000000000000000000000000000000000000000000000000002
// gas irOptimized: 277193
// gas legacy: 295222
// gas legacyOptimized: 257286

@ekpyron
Copy link
Member

ekpyron commented May 19, 2022

Slightly simpler repro:

contract C{
  event ev0(function (bytes10, uint208) external returns (bytes27)[] i0, string i1);
  string  public s2 ;
  constructor() {
    emit ev0(new function (bytes10, uint208) external returns(bytes27)[](2), (""));
    s2 = string("00000000000000000000000004e57bdb7");
  }
  function h() external returns (string memory) {
        assembly {
                sstore(s2.slot, 129) // forced resize
        }
        return s2;
  }
}
// ====
// compileViaYul: also
// ----
// constructor() ->
// ~ emit ev0(function[],string): 0x40, 0xa0, 0x02, 0x00, 0x00, 0x00
// gas irOptimized: 277193
// gas legacy: 295222
// gas legacyOptimized: 257286
// h() -> 0x20, 0x40, 0x3030303030303030303030303030303030303030303030303034653537626462, 0x3700000000000000000000000000000000000000000000000000000000000002

I guess there's dirty memory before the assignment to s2 that's not properly cleared...
The desired behaviour is the IR behaviour - storage should be clean at all times...
So looks like a genuine minor bug in legacy codegen to me.

@ekpyron
Copy link
Member

ekpyron commented May 19, 2022

Actually:

contract C {
  event ev0(function (bytes10, uint208) external returns (bytes27)[] i0, string i1);
  bytes public s2 ;
  constructor() {
    emit ev0(new function (bytes10, uint208) external returns(bytes27)[](2), (""));
    s2 = bytes("00000000000000000000000004e57bdb7");
  }
  function h() external returns (bytes memory) {
        for (uint i = 0; i < 31; ++i)
                s2.push();
        return s2;
  }
}
// ====
// compileViaYul: false
// ----
// constructor() ->
// ~ emit ev0(function[],string): 0x40, 0xa0, 0x02, 0x00, 0x00, 0x00
// gas irOptimized: 277193
// gas legacy: 295222
// gas legacyOptimized: 257286
// h() -> 0x20, 0x40, 0x3030303030303030303030303030303030303030303030303034653537626462, 0x3700000000000000000000000000000000000000000000000000000000000002

has the issue without any inline assembly...

@ekpyron ekpyron moved this from New issues to Implementation Backlog Important in Solidity May 19, 2022
@ekpyron
Copy link
Member

ekpyron commented May 19, 2022

Also happens with a less exotic event...

contract C {
  event ev0(uint[] i0, string i1);
  bytes public s2;
  constructor() {
    emit ev0(new uint[](2), (""));
    s2 = bytes("00000000000000000000000004e57bdb7");
  }
  function h() external returns (bytes memory) {
        for (uint i = 0; i < 31; ++i)
                s2.push();
        return s2;
  }
}
// ====
// compileViaYul: false
// ----
// constructor() ->
// ~ emit ev0(function[],string): 0x40, 0xa0, 0x02, 0x00, 0x00, 0x00
// gas irOptimized: 277193
// gas legacy: 340105
// gas legacyOptimized: 257286
// h() -> 0x20, 0x40, 0x3030303030303030303030303030303030303030303030303034653537626462, 0x3700000000000000000000000000000000000000000000000000000000000002

@ekpyron
Copy link
Member

ekpyron commented May 19, 2022

Another simplification:

contract C {
  event ev0(uint[] i0, uint);
  bytes public s2;
  constructor() {
    emit ev0(new uint[](2), 0); // or probably anything else leaving memory dirty in the right spot
    bytes memory m = new bytes(63);
    s2 = m;
  }
  function h() external returns (bytes memory) {
        s2.push();
        return s2;
  }
}
// ====
// compileViaYul: false
// ----
// constructor() ->
// ~ emit ev0(uint256[],string): 0x40, 0xa0, 0x02, 0x00, 0x00, 0x00
// h() -> 0x20, 0x40, 0x00, 0x02

...this is starting to look somewhat serious to me...

@cameel
Copy link
Member

cameel commented Jun 13, 2022

So do we consider #13045 a complete fix for this or are there potentially more cases that we need to explore?

@ekpyron
Copy link
Member

ekpyron commented Jun 15, 2022

Reading through this again, I think #13045 should have been a complete fix, so I'm closing this.

@ekpyron ekpyron closed this as completed Jun 15, 2022
Solidity automation moved this from Implementation Backlog Important to Done Jun 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Solidity
  
Done
Development

No branches or pull requests

4 participants