This repository has been archived by the owner on Oct 20, 2024. It is now read-only.

Bytecode padding directive #301

Open
Philogy opened this issue Oct 2, 2023 · 3 comments


@Philogy
Contributor

Philogy commented Oct 2, 2023

I'd love a directive that allows me to indicate a size n that at compile time should:

  1. Check the resulting size of the section of bytecode
  2. Pad with 0x00 (STOP) bytes up to size n
  3. Give me a compile-time error if the section is larger than n
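
A rough sketch of those three steps, written in Python rather than Huff since the directive does not exist yet. The names `pad_section` and `PaddingError` are hypothetical and not part of any compiler; the section is modelled as raw bytes that have already been assembled.

```python
STOP = 0x00  # EVM STOP opcode, used as the padding byte


class PaddingError(ValueError):
    """Raised when a section is larger than its declared padded size."""


def pad_section(code: bytes, n: int) -> bytes:
    """Pad an assembled section with STOP bytes up to exactly n bytes."""
    size = len(code)  # step 1: check the resulting size of the section
    if size > n:      # step 3: compile-time error if the section is too large
        raise PaddingError(f"section is {size} bytes, exceeds padded size {n}")
    return code + bytes([STOP]) * (n - size)  # step 2: pad with 0x00 up to n
```

For example, `pad_section(b"\x5b\x34", 4)` yields the two original bytes followed by two `0x00` bytes, while a 5-byte section with `n = 4` raises `PaddingError`.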

The use case for this is efficient function dispatchers or internal switch-like statements that require code sections to be padded up to a consistent size. In METH this leads not only to relatively ugly code:

    dest_0x18:
        // 0x18160ddd
        __FUNC_SIG(totalSupply)
        __NON_PAYABLE_SELECTOR_CHECK()
        TOTAL_SUPPLY(callvalue)
        /* padding (45) */ stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop
    dest_0x19: __NO_MATCH()
    dest_0x1a: __NO_MATCH()
    dest_0x1b: __NO_MATCH()
    dest_0x1c: __NO_MATCH()
    dest_0x1d: __NO_MATCH()
    dest_0x1e: __NO_MATCH()
    dest_0x1f: __NO_MATCH()
    dest_0x20:
        // 0x205c2878
        __FUNC_SIG(withdrawTo)
        INVALID_NON_PAYABLE()
        WITHDRAW_TO(callvalue)
        /// @dev Selectors 0x21000000 - 0x21ffffff will exceptionally revert.
        /* padding (38) */ stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop stop stop stop stop stop stop stop

But it is also very error-prone: if I optimize a macro so that its bytecode becomes smaller or larger and I forget to adjust the padding, all subsequent functions break because the block boundaries are corrupted. I've had to write a somewhat janky script that helps me identify where the initial block boundary violation happened.
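
To illustrate the kind of check such a script might perform (this is not the actual script, just a minimal sketch), the Python below scans label offsets and reports the first one that is not on a block boundary. It assumes the label names and byte offsets have already been extracted from the compiled output somehow; `first_misaligned` and its `base` parameter are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def first_misaligned(
    labels: Sequence[Tuple[str, int]],
    block_size: int,
    base: int = 0,
) -> Optional[Tuple[str, int]]:
    """Return the first (label, offset) that is off a block boundary.

    `base` is the offset of the first block; every label is expected to
    sit at base + k * block_size for some integer k >= 0.
    """
    for name, offset in sorted(labels, key=lambda item: item[1]):
        if (offset - base) % block_size != 0:
            return name, offset
    return None  # every label is aligned
```

If the first block ends up one byte short, every later label shifts, and the function points at the first label that moved rather than at the whole cascade of broken ones.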

I'd imagine the syntax for this to be something like this:

#define macro MAIN() = takes(0) returns(0) {
     // ... other logic
     dest_0x18: #define padded(0x3f) {
          // 0x18160ddd
          __FUNC_SIG(totalSupply)
          __NON_PAYABLE_SELECTOR_CHECK()
          TOTAL_SUPPLY(callvalue)
          // padding directive implicitly pads block to be 0x3f (63) bytes large
     }    
}

I'm offering a $100 bounty (to be paid in mainnet ETH) to the contributor who implements this once merged.

@lmanini

lmanini commented Oct 12, 2023

[GIF: Challenge accepted]

@iFrostizz
Contributor

What about making this a new jump table type instead of a new #define token? It seems like a convenient addition alongside the "regular" and "packed" ones. Is there any case in which we wouldn't be able to determine the size of the "stop wall" at compile time? Also, does that mean every jump location has to be less than x bytes long, which might be constraining at some point?

@Philogy
Contributor Author

Philogy commented Oct 30, 2023

To construct a more efficient function dispatcher you want to directly convert a function selector into a jump destination while avoiding having to do a lookup in some table. This typically requires the entry JUMPDESTs for your functions to be at equally spaced intervals e.g.:

jump_offset = (selector % 16) * 64 // implies JUMPDESTs in 64-byte increments
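
As a worked example of that formula (a Python sketch, with `BLOCK_SIZE` and `NUM_SLOTS` as illustrative constant names), here is the offset computed for the two selectors that appear earlier in this issue. It follows the formula exactly as written; which selector bits a real contract keys on is its own design choice.

```python
BLOCK_SIZE = 64  # bytes between consecutive entry JUMPDESTs
NUM_SLOTS = 16   # number of dispatch slots


def jump_offset(selector: int) -> int:
    """Map a 4-byte selector straight to a code offset, with no table lookup."""
    return (selector % NUM_SLOTS) * BLOCK_SIZE


for selector in (0x18160DDD, 0x205C2878):
    print(hex(selector), "->", jump_offset(selector))
# 0x18160ddd -> 832   (0xd = 13, and 13 * 64 = 832)
# 0x205c2878 -> 512   (0x8 = 8, and 8 * 64 = 512)
```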

However, your functions will rarely all be exactly the length you need, so you need to add some padding, e.g. (visualizations, not actual Huff):

dispatcher()
[ dest ]  [            body            ]
  fn1:     <logic> <logic> <pad> <pad>
  fn2:     <logic>  <pad>  <pad> <pad>

The issue is that if you change the logic of one of your functions, because you found an optimization or want to add a feature, the padding is now invalid:

dispatcher()
[ dest ]  [            body            ]
  fn1:     <logic>  <pad> <pad>   fn2:
 <logic>    <pad>   <pad> <pad>    -

Ideally I'd have some directive I can wrap my functions in so that the padding is adjusted automatically and I get a helpful error message if I happen to exceed the set size, e.g.:

dispatcher()
padded (64) { fn1:     <logic> <logic> }
padded (64) { fn2:     <logic>              }
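
A small Python model of what wrapping each function in `padded (64)` would buy (a sketch only; `layout` is a hypothetical helper, and the bodies are placeholder bytes): each block is padded independently, so every label lands on a multiple of the block size no matter how long the body before it is.

```python
from typing import Dict, List, Tuple


def layout(blocks: List[Tuple[str, bytes]], size: int) -> Tuple[bytes, Dict[str, int]]:
    """Concatenate blocks, padding each one to `size` bytes with 0x00 (STOP)."""
    out = bytearray()
    offsets: Dict[str, int] = {}
    for label, body in blocks:
        if len(body) > size:
            # the "helpful error message" case: name the block that overflowed
            raise ValueError(f"{label}: {len(body)} bytes exceeds padded size {size}")
        offsets[label] = len(out)
        out += body + b"\x00" * (size - len(body))
    return bytes(out), offsets


# Placeholder bodies: 0x5b is JUMPDEST, 0x34 is CALLVALUE (stand-in logic).
fn1_long = b"\x5b" + b"\x34" * 40
fn1_short = b"\x5b" + b"\x34" * 10
fn2 = b"\x5b\x34"

_, before = layout([("fn1", fn1_long), ("fn2", fn2)], 64)
_, after = layout([("fn1", fn1_short), ("fn2", fn2)], 64)
print(before, after)  # fn2 stays at offset 64 in both layouts
```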

Not quite sure how a new jump table type would achieve this. I guess it would change how you define the constraint on the distance between labels; do you mean something like this?

#define macro MAIN() = takes(0) returns(0) {
    __DISPATCHER()

    fn1: FN1()
    fn2: FN2()
    no3: REVERT()
    no4: REVERT()
    fn5: FN5()
}

#define jumptable fixed_size(64) {
    fn1 fn2 no3 no4 fn5
}

I like this less as it feels less direct. Having padding be defined in macros seems cleaner and would allow you to reuse such blocks via macros.
