
RPC chunks with streaming SSZ decoding, snappy frames, and stricter DOS limits where possible. #1606

Merged · 3 commits · Mar 9, 2020

Conversation

protolambda (Collaborator):

This is a proposal to focus on length-encoding SSZ contents, enable streaming of chunk contents, and put stricter DOS limits in place.

SSZ without compression stays exactly the same, so testnets can live on.

So far I am aware that Prysm has non-streaming Snappy compression in place as an option, and Lighthouse has a PR open that made a start on snappy support, but it is blocked by questions around Adrian's ongoing network changes. In other clients, such as Lodestar, Artemis and Nimbus, I could find the Snappy dependency, but no implementation of ssz_snappy for RPC.

The motivation here is that it is relatively quick to calculate the encoded length of an SSZ value, since fixed-length values are easily computed from the type, and list values often from multiplying a fixed-length element size with a list length.
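To make that concrete, here is a minimal sketch of why SSZ lengths are cheap to compute (the helper names are illustrative, not from the spec):

```python
# Hypothetical helpers illustrating that serialized SSZ lengths follow from
# type information alone; these names are not part of the spec.

UINT64_LEN = 8  # a uint64 always serializes to 8 bytes

def fixed_list_byte_length(element_size: int, count: int) -> int:
    # A list of fixed-size elements serializes to element_size * count bytes,
    # so the length follows from the element type and the element count alone,
    # without serializing anything.
    return element_size * count

# e.g. a list of 100 uint64 values serializes to 800 bytes
assert fixed_list_byte_length(UINT64_LEN, 100) == 800
```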

Given an SSZ length, an SSZ decoder can read contents directly from the stream, avoiding the need for a temporary buffer. Additionally, snappy frames avoid the need for more than a single frame of buffered bytes to decompress data (worst case 2**16 = 65536 bytes), so the compressed bytes never need to be buffered in full.
The writer just calculates the SSZ length, and can then stream-encode the contents even when using compression, instead of having to fully buffer the compressed data to get a compressed length.

Additionally, this puts better DOS protection in place: we can calculate the maximum size of an SSZ object, and should use it to be stricter about inputs. And given the worst-case snappy-encoded length, we can account for compressed bytes even if we don't know the exact number of compressed bytes in advance. We don't need the exact number, as we can verify the decompressed bytes instead, while checking the bytes read from the stream against the limits we should be checking regardless.

@protolambda (Collaborator, Author):

@arnetheduck @AgeManning @nisdas Please have a look at this proposal. The PR description may not be that clear; the updated networking doc explains it best.

A reader:
- SHOULD NOT read more than `max_encoded_len(n)` bytes (`32 + n + n / 6`) after reading the SSZ length prefix `n` from the header; [this is Snappy's documented worst-case compression result](https://github.com/google/snappy/blob/537f4ad6240e586970fe554614542e9717df7902/snappy.cc#L98).
- SHOULD NOT accept an SSZ length prefix larger than the expected maximum length for the SSZ type (derived from SSZ type information such as vector lengths and list limits).
- MUST consider any bytes remaining after reading the `n` SSZ bytes an invalid input; an EOF is expected.
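The limits above could be enforced along these lines. This is a minimal sketch, not a reference implementation: `check_length_prefix` is a hypothetical helper, and the formula mirrors the linked Snappy source.

```python
def max_encoded_len(n: int) -> int:
    # Snappy's documented worst-case compressed size for n uncompressed bytes.
    return 32 + n + n // 6

def check_length_prefix(n: int, ssz_max_len: int) -> int:
    # Reject a length prefix beyond the type's maximum, and return the hard
    # limit on compressed bytes a reader should accept for this chunk.
    if n > ssz_max_len:
        raise ValueError("SSZ length prefix exceeds the maximum size for this type")
    return max_encoded_len(n)

# A 1000-byte SSZ payload occupies at most 32 + 1000 + 166 = 1198 compressed bytes.
assert check_length_prefix(1000, ssz_max_len=2048) == 1198
```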
Contributor:

This should be clarified: if we read more than `n` bytes, we should return an error on the request/response and ignore the data.

Contributor:

@protolambda can you clean this one up too?

Collaborator (Author):

Yes, looking into this now.

Snappy has two formats: "block" and "frames" (streaming). To support large requests and response chunks, snappy-framing is used.

Since snappy frame contents [have a maximum uncompressed size of `65536` bytes](https://github.com/google/snappy/blob/master/framing_format.txt#L104), and per-frame overhead is just a small header (chunk type + length) plus a 4-byte checksum, the expected buffering of a single frame is acceptable.
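To make the buffering bound concrete, here is a small sketch; the constants come from the framing format, while the helper name is illustrative:

```python
MAX_UNCOMPRESSED_FRAME_DATA = 65536  # framing format: max uncompressed bytes per frame
FRAME_HEADER_LEN = 4                 # chunk type (1 byte) + chunk length (3 bytes)
CHECKSUM_LEN = 4                     # CRC-32C of the uncompressed data

def max_buffered_bytes_per_frame() -> int:
    # Upper bound on what a streaming decoder must hold in memory at once:
    # one frame's header, checksum, and payload. The rest of the stream can
    # be processed frame by frame.
    return FRAME_HEADER_LEN + CHECKSUM_LEN + MAX_UNCOMPRESSED_FRAME_DATA

assert max_buffered_bytes_per_frame() == 65544
```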
Contributor:

So what I understand is that we now have two length prefixes encoded into our data? One for snappy frames in the frame header, and the other for our SSZ objects in their decompressed form?

Collaborator (Author):

Possibly; one Snappy frame doesn't necessarily match one eth2 RPC chunk. With the status quo a chunk does match one snappy block, but then it can't stream within a chunk.

@AgeManning (Contributor):

I think it's reasonable to use the framed format of snappy over the wire. It may not be super important for us now given current objects but could be more useful for larger objects in the future.

In principle, we could have snappy_framed and snappy_block protocols and let clients prefer either, but I'm happy with the current proposal.

@djrtwo (Contributor) commented Feb 15, 2020:

I am pro frames.
@protolambda is there anything we need to do here to get this merged?

specs/phase0/p2p-interface.md — review comments resolved
djrtwo and others added 2 commits on February 27, 2020 (Co-authored-by: Hsiao-Wei Wang <hwwang156@gmail.com>)
@protolambda (Collaborator, Author):

Clarified the invalid-input handling, and cleaned it up to summarize the different cases better. Can I get a final review?

@djrtwo (Contributor) left a comment:

looks good to me
