Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-semantics-preserving refactoring for zero_pad() #1610

Closed
daejunpark opened this issue Sep 10, 2019 · 12 comments
Closed

Non-semantics-preserving refactoring for zero_pad() #1610

daejunpark opened this issue Sep 10, 2019 · 12 comments

Comments

@daejunpark
Copy link
Contributor

Version Information

  • vyper Version (output of vyper --version): 0.1.0b12+commit.8663ac5

What's your issue about?

(copied here from #1572 (review) for a better discussion)

The refactoring (09a8af7) made as part of PR #1572 does not preserve the original semantics.

This is the original code:

        new_maxlen = ceil32(source_lll.typ.maxlen)

        holder.append([
            'with', '_ceil32_end', ['ceil32', ['mload', dest_placeholder]], [
                'seq', ['with', '_bytearray_loc', dest_placeholder, [
                    'seq', ['repeat', zero_pad_i, ['mload', '_bytearray_loc'], new_maxlen, [
                        'seq',
                        # stay within allocated bounds
                        ['if', ['ge', ['mload', zero_pad_i], '_ceil32_end'], 'break'],
                        [
                            'mstore8',
                            ['add', ['add', '_bytearray_loc', 32], ['mload', zero_pad_i]],
                            0,
                        ],
                    ]],
                ]],
            ]
        ])

And this is the new code that replaces the above code entirely:

        holder.append(
            zero_pad(dest_placeholder, maxlen, zero_pad_i=zero_pad_i)
        )

where the definition of zero_pad() is:

def zero_pad(bytez_placeholder, maxlen, context=None, zero_pad_i=None):
    zero_padder = LLLnode.from_list(['pass'])
    if maxlen > 0:
        # Iterator used to zero pad memory.
        if zero_pad_i is None:
            zero_pad_i = context.new_placeholder(BaseType('uint256'))
        zero_padder = LLLnode.from_list([
            'repeat', zero_pad_i, ['mload', bytez_placeholder], ceil32(maxlen), [
                'seq',
                # stay within allocated bounds
                ['if', ['gt', ['mload', zero_pad_i], ceil32(maxlen)], 'break'],
                ['mstore8', ['add', ['add', 32, bytez_placeholder], ['mload', zero_pad_i]], 0]
            ]],
            annotation="Zero pad",
        )
    return zero_padder

In order for this refactoring to be semantics-preserving, the following two conditions need to be satisfied, but I have a counter-example that the first condition is not always met. I haven't found a counter-example for the second, but I'm not sure if it is always the case.

  1. maxlen is equal to new_maxlen.
  2. The runtime value of _ceil32_end is equal to new_maxlen.

How can it be fixed?

To fix the first case, add back the new_maxlen computation:

        new_maxlen = ceil32(source_lll.typ.maxlen)

and pass new_maxlen to zero_pad(), like:

        holder.append(
            zero_pad(dest_placeholder, new_maxlen, zero_pad_i=zero_pad_i)
        )

For the second case, either prove that the second condition is always met, or generalize zero_pad() to handle the case that the second condition is not met.

@daejunpark
Copy link
Contributor Author

FYI, for the loop terminating condition, the original code uses ge, but the new code uses gt, while the use of ge is correct. This problem was reported separately in #1599.

@charles-cooper
Copy link
Member

I'm not sure the new code is incorrect, because maxlen is never used directly but only ceil32(maxlen). Actually I don't know that the original code is correct. It seems that it zeroes out at most one word (up to ceil32(runtime_len)) and doesn't zero out anything between ceil32(runtime_len) and ceil32(maxlen).

@daejunpark
Copy link
Contributor Author

daejunpark commented Sep 10, 2019

@charles-cooper There is a counter example that maxlen is 576, and new_maxlen is 64. So, the new code pads ~500 more zeros, i.e., ~500 more iterations.

The important thing is not ceil32, but source_lll.typ.maxlen. The maxlen is the size of the entire buffer, while new_maxlen is the length for the current element. For example, when constructing a log event data, such as https://github.com/ethereum/eth2.0-specs/blob/dev/deposit_contract/contracts/validator_registration.v.py#L85, the maxlen refers to the length of the entire log message, while the new_maxlen refers to the individual argument length.

@charles-cooper
Copy link
Member

Thanks @daejunpark - please see 033d7df

@charles-cooper
Copy link
Member

Thank you for your patience in explaining this issue. For condition 2 I pushed 5469fae.

@daejunpark
Copy link
Contributor Author

@charles-cooper Awesome. Thanks! I'd like to compile the deposit contract to make sure that these changes produce the expected bytecode. How to pull your changes? (I cannot find it when git remote update.)

@charles-cooper
Copy link
Member

@daejunpark I believe you can find it on my fork of vyper

@daejunpark
Copy link
Contributor Author

Thanks! (I didn't know that Github also shows the commits of forks.)

The deposit contract IR generated by your zero_pad branch looks good to me. I'll re-run the verification once a new Vyper version that contains your changes is released.

@charles-cooper
Copy link
Member

@daejunpark I also added the analogous fixes to #1605 so it is semantically on par with my zero_pad branch. It does look a lot clearer to me since we can dispense with maxlen entirely.

@charles-cooper
Copy link
Member

@daejunpark #1611 was released in beta-13. I want to get #1605 into beta-14, what is the process for checking that the eth2.0 deposit contract tests pass?

@daejunpark
Copy link
Contributor Author

@charles-cooper sorry for the delay in replying. If you meant the formal verification, I'll need to re-run the verification against the new compiler version. If you meant the regular testing, you can refer to: #1563 (comment)

@charles-cooper
Copy link
Member

Since both #1611 and #1605 have been merged I think this can be closed. Please reopen if this is still an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants