Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow invalid pointers in arrays and tuples #226

Merged
merged 2 commits into from Mar 1, 2024

Conversation

pacrob
Copy link
Contributor

@pacrob pacrob commented Feb 7, 2024

What was wrong?

Incorrect values for in pointers can cause problems. If a pointer value is not large enough, i.e. it points to an area in the payload that is still within the pointers section, the encoding is malformed. In certain situations, ~infinite loops can occur.

How was it fixed?

When decoding pointers, determine the location in the stream that divides pointers and values and make sure all pointers point past that location. Also check for pointers that point beyond the end of the payload.

Added some code comments to make it easier to remember how HeadTailDecoder works.

Added pytest-timeout to dependencies, as if the new tests are run without the added offset checking, they'll spin for a long time before failing.

Todo:

  • Clean up commit history

  • Clear any breakpoints

  • clean up testing

  • Add or update documentation related to these changes

  • Add entry to the release notes

Cute Animal Picture

image

@pacrob pacrob force-pushed the disallow-recursive-pointers branch 2 times, most recently from 401a381 to 9f99b5d Compare February 8, 2024 22:22
@pacrob pacrob force-pushed the disallow-recursive-pointers branch 6 times, most recently from 915b739 to 24afe35 Compare February 16, 2024 19:38
@pacrob pacrob changed the title Disallow recursive pointers in nested dynamic arrays Disallow malformed pointers in nested dynamic arrays Feb 16, 2024
@pacrob pacrob changed the title Disallow malformed pointers in nested dynamic arrays Disallow malformed pointers in dynamic arrays Feb 16, 2024
@pacrob pacrob force-pushed the disallow-recursive-pointers branch 9 times, most recently from 90ee2b7 to f7fcbd8 Compare February 20, 2024 23:21
@pacrob pacrob changed the title Disallow malformed pointers in dynamic arrays Disallow invalid pointers in arrays and tuples Feb 21, 2024
@pacrob pacrob marked this pull request as ready for review February 21, 2024 21:10
@pacrob pacrob force-pushed the disallow-recursive-pointers branch 2 times, most recently from a0ad898 to 04939a6 Compare February 22, 2024 19:36
@@ -131,6 +132,13 @@ def __call__(self, stream: ContextFramesBytesIO) -> Any:


class HeadTailDecoder(BaseDecoder):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

@to_tuple # type: ignore[misc] # untyped decorator
def decode(self, stream: ContextFramesBytesIO) -> Generator[Any, None, None]:
self.validate_pointers(stream)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could use more context here. Could this be called in the loop below and maybe allow removal of the inner decoder loops inside validate_pointers? I'm also curious if the validation is necessary before decoding? Could validation just be part of the decode in HeadTailDecoder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no way to know how long the head section of a dynamic tuple will be until you have stepped through each decoder - if the decoder is for a dynamic type, it will be 32 bytes every time (because it's a pointer), but if it's for a non-dynamic array, there will be a single decoder for multiple chunks of 32 bytes.

I think it would be possible to take the logic from validate_pointers and put it in decode to eliminate the second loop through the decoders (where it actually checks the pointer values against the end_of_offsets). I like the current clarity and separation of concerns, but I can try if you like.

The validation needs to be in the tuple and array decoders, because only they have the context for how long they are. A HeadTailDecoder only has the info for a single dynamic value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now what the difference means, assuming there may never be more than a few decoders at a time I don't have any concerns.

@to_tuple # type: ignore[misc] # untyped decorator
def decode(self, stream: ContextFramesBytesIO) -> Generator[Any, None, None]:
self.validate_pointers(stream)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now what the difference means, assuming there may never be more than a few decoders at a time I don't have any concerns.

end_of_offsets = current_location + 32 * len_of_head
total_stream_length = len(stream.getbuffer())
for decoder in self.decoders:
if isinstance(decoder, HeadTailDecoder):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It would be nice to share this logic across decoders, maybe this could become a utility function that could take the stream and an array_size, which could be called from here using array_size=1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit heard and politely declined. There is enough required difference in how tuples and arrays are checked that any logic extraction have a lot of if tuple/elseif array. And I don't foresee any future datastructures being created that would make use of such shared base methods, thus accept code that is ~repeated twice.

Copy link
Contributor

@kclowes kclowes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Nice work tracking it down! 🐞 I like the comments you made in the decoder too. Very helpful.

@pacrob pacrob merged commit 82c1ad3 into ethereum:main Mar 1, 2024
16 checks passed
@pacrob pacrob deleted the disallow-recursive-pointers branch March 1, 2024 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants