Non-semantics-preserving refactoring for `zero_pad()` #1610

daejunpark · 2019-09-10T19:41:09Z

Version Information

vyper Version (output of vyper --version): 0.1.0b12+commit.8663ac5

What's your issue about?

(copied here from #1572 (review) for a better discussion)

The refactoring (09a8af7) made as part of PR #1572 does not preserve the original semantics.

This is the original code:

        new_maxlen = ceil32(source_lll.typ.maxlen)

        holder.append([
            'with', '_ceil32_end', ['ceil32', ['mload', dest_placeholder]], [
                'seq', ['with', '_bytearray_loc', dest_placeholder, [
                    'seq', ['repeat', zero_pad_i, ['mload', '_bytearray_loc'], new_maxlen, [
                        'seq',
                        # stay within allocated bounds
                        ['if', ['ge', ['mload', zero_pad_i], '_ceil32_end'], 'break'],
                        [
                            'mstore8',
                            ['add', ['add', '_bytearray_loc', 32], ['mload', zero_pad_i]],
                            0,
                        ],
                    ]],
                ]],
            ]
        ])

And this is the new code that replaces the above code entirely:

        holder.append(
            zero_pad(dest_placeholder, maxlen, zero_pad_i=zero_pad_i)
        )

where the definition of zero_pad() is:

def zero_pad(bytez_placeholder, maxlen, context=None, zero_pad_i=None):
    zero_padder = LLLnode.from_list(['pass'])
    if maxlen > 0:
        # Iterator used to zero pad memory.
        if zero_pad_i is None:
            zero_pad_i = context.new_placeholder(BaseType('uint256'))
        zero_padder = LLLnode.from_list([
            'repeat', zero_pad_i, ['mload', bytez_placeholder], ceil32(maxlen), [
                'seq',
                # stay within allocated bounds
                ['if', ['gt', ['mload', zero_pad_i], ceil32(maxlen)], 'break'],
                ['mstore8', ['add', ['add', 32, bytez_placeholder], ['mload', zero_pad_i]], 0]
            ]],
            annotation="Zero pad",
        )
    return zero_padder

In order for this refactoring to be semantics-preserving, the following two conditions need to be satisfied, but I have a counter-example that the first condition is not always met. I haven't found a counter-example for the second, but I'm not sure if it is always the case.

maxlen is equal to new_maxlen.
The runtime value of _ceil32_end is equal to new_maxlen.

How can it be fixed?

To fix the first case, add back the new_maxlen computation:

        new_maxlen = ceil32(source_lll.typ.maxlen)

and pass new_maxlen to zero_pad(), like:

        holder.append(
            zero_pad(dest_placeholder, new_maxlen, zero_pad_i=zero_pad_i)
        )

For the second case, either prove that the second condition is always met, or generalize zero_pad() to handle the case that the second condition is not met.

The text was updated successfully, but these errors were encountered:

daejunpark · 2019-09-10T19:44:27Z

FYI, for the loop terminating condition, the original code uses ge, but the new code uses gt, while the use of ge is correct. This problem was reported separately in #1599.

charles-cooper · 2019-09-10T20:23:53Z

I'm not sure the new code is incorrect, because maxlen is never used directly but only ceil32(maxlen). Actually I don't know that the original code is correct. It seems that it zeroes out at most one word (up to ceil32(runtime_len)) and doesn't zero out anything between ceil32(runtime_len) and ceil32(maxlen).

daejunpark · 2019-09-10T20:30:48Z

@charles-cooper There is a counter example that maxlen is 576, and new_maxlen is 64. So, the new code pads ~500 more zeros, i.e., ~500 more iterations.

The important thing is not ceil32, but source_lll.typ.maxlen. The maxlen is the size of the entire buffer, while new_maxlen is the length for the current element. For example, when constructing a log event data, such as https://github.com/ethereum/eth2.0-specs/blob/dev/deposit_contract/contracts/validator_registration.v.py#L85, the maxlen refers to the length of the entire log message, while the new_maxlen refers to the individual argument length.

charles-cooper · 2019-09-10T20:37:22Z

Thanks @daejunpark - please see 033d7df

charles-cooper · 2019-09-10T20:44:38Z

Thank you for your patience in explaining this issue. For condition 2 I pushed 5469fae.

daejunpark · 2019-09-10T20:56:12Z

@charles-cooper Awesome. Thanks! I'd like to compile the deposit contract to make sure that these changes produce the expected bytecode. How to pull your changes? (I cannot find it when git remote update.)

charles-cooper · 2019-09-10T21:10:00Z

@daejunpark I believe you can find it on my fork of vyper

daejunpark · 2019-09-10T21:23:29Z

Thanks! (I didn't know that Github also shows the commits of forks.)

The deposit contract IR generated by your zero_pad branch looks good to me. I'll re-run the verification once a new Vyper version that contains your changes is released.

charles-cooper · 2019-09-10T21:55:01Z

@daejunpark I also added the analogous fixes to #1605 so it is semantically on par with my zero_pad branch. It does look a lot clearer to me since we can dispense with maxlen entirely.

charles-cooper · 2019-10-01T12:41:06Z

@daejunpark #1611 was released in beta-13. I want to get #1605 into beta-14, what is the process for checking that the eth2.0 deposit contract tests pass?

daejunpark · 2019-10-07T09:05:10Z

@charles-cooper sorry for the delay in replying. If you meant the formal verification, I'll need to re-run the verification against the new compiler version. If you meant the regular testing, you can refer to: #1563 (comment)

charles-cooper · 2019-10-24T08:35:30Z

Since both #1611 and #1605 have been merged I think this can be closed. Please reopen if this is still an issue.

daejunpark mentioned this issue Sep 10, 2019

Off-by-one error in zero_pad #1599

Closed

This was referenced Sep 16, 2019

Fix termination condition in zero_pad #1611

Merged

Zero pad from calldata #1605

Merged

charles-cooper closed this as completed Oct 24, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Non-semantics-preserving refactoring for `zero_pad()` #1610

Non-semantics-preserving refactoring for `zero_pad()` #1610

daejunpark commented Sep 10, 2019

daejunpark commented Sep 10, 2019

charles-cooper commented Sep 10, 2019

daejunpark commented Sep 10, 2019 •

edited

Loading

charles-cooper commented Sep 10, 2019

charles-cooper commented Sep 10, 2019

daejunpark commented Sep 10, 2019

charles-cooper commented Sep 10, 2019

daejunpark commented Sep 10, 2019

charles-cooper commented Sep 10, 2019

charles-cooper commented Oct 1, 2019

daejunpark commented Oct 7, 2019

charles-cooper commented Oct 24, 2019

Non-semantics-preserving refactoring for zero_pad() #1610

Non-semantics-preserving refactoring for zero_pad() #1610

Comments

daejunpark commented Sep 10, 2019

Version Information

What's your issue about?

How can it be fixed?

daejunpark commented Sep 10, 2019

charles-cooper commented Sep 10, 2019

daejunpark commented Sep 10, 2019 • edited Loading

charles-cooper commented Sep 10, 2019

charles-cooper commented Sep 10, 2019

daejunpark commented Sep 10, 2019

charles-cooper commented Sep 10, 2019

daejunpark commented Sep 10, 2019

charles-cooper commented Sep 10, 2019

charles-cooper commented Oct 1, 2019

daejunpark commented Oct 7, 2019

charles-cooper commented Oct 24, 2019

Non-semantics-preserving refactoring for `zero_pad()` #1610

Non-semantics-preserving refactoring for `zero_pad()` #1610

daejunpark commented Sep 10, 2019 •

edited

Loading