Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC chunks with streaming SSZ decoding, snappy frames, and stricter DOS limits where possible. #1606

Merged
merged 3 commits into from Mar 9, 2020

Conversation

@protolambda
Copy link
Contributor

protolambda commented Feb 4, 2020

This is a proposal to focus on length-encoding SSZ contents, enable streaming of chunk contents, and put stricter DOS limits in place.

ssz without compression is still exactly the same, testnets can live on.

So far I am aware that Prysm has non-streaming Snappy compression in place as option, and Lighthouse has a PR open that made a start on snappy support, but blocked by questions/ongoing network changes by Adrian. In other clients, such as Lodestar, Artemis and Nimbus, I could find the Snappy dependency, but no implementation of ssz_snappy for RPC.

The motivation here is that it is relatively quick to calculate the encoded length of an SSZ value, since fixed-length values are easily computed from the type, and list values often from multiplying a fixed-length element size with a list length.

Given a SSZ length, a SSZ decoder can directly read contents from the stream, avoiding the need for a temporary buffer. Additionally, snappy-frames avoid the need for more than a single frame of buffered bytes to decompress data (worst case 2**16=65536 bytes), avoiding the need to buffer the compressed bytes fully.
The writer just calculates the SSZ length, and can then stream-encode the contents, even if using compression. Instead of having to fully buffer the compressed data to get the compressed length.

Additionally, this puts better DOS protection in place, as we can calculate the maximum size of a SSZ object, and should use it to be stricter about inputs. And given the worst-case snappy encoded length, we can account for compressed bytes even if we don't know the exact number of compressed bytes in advance. We don't need to know the exact number, as we can just verify the decompressed bytes instead, while checking the bytes read from the stream against the limits we should be checking regardless.

…f chunk contents, and put stricter DOS limits in place
@protolambda

This comment has been minimized.

Copy link
Contributor Author

protolambda commented Feb 4, 2020

@arnetheduck @AgeManning @nisdas Please have a look at this proposal. The PR description may not be that clear, the updated networking doc explains it best.

A reader:
- SHOULD not read more than `max_encoded_len(n)` bytes (`32 + n + n/6`) after reading the SSZ length prefix `n` from the header, [this is considered the worst-case compression result by Snappy](https://github.com/google/snappy/blob/537f4ad6240e586970fe554614542e9717df7902/snappy.cc#L98).
- SHOULD not accept a SSZ length prefix that is bigger than the expected maximum length for the SSZ type (derived from SSZ type information such as vector lengths and list limits).
- MUST consider remaining bytes, after having read the `n` SSZ bytes, as an invalid input. An EOF is expected.

This comment has been minimized.

Copy link
@nisdas

nisdas Feb 4, 2020

Contributor

This should be clarified, if we read more than n bytes, we should return an error on the request/response and ignore the data.

This comment has been minimized.

Copy link
@djrtwo

djrtwo Feb 27, 2020

Contributor

@protolambda can you clean this one up too?

This comment has been minimized.

Copy link
@protolambda

protolambda Feb 27, 2020

Author Contributor

Yes, looking into this now.

Snappy has two formats: "block" and "frames" (streaming). To support large requests and response chunks, snappy-framing is used.

Since snappy frame contents [have a maximum size of `65536` bytes](https://github.com/google/snappy/blob/master/framing_format.txt#L104)
and frame headers are just `identifier (1) + checksum (4)` bytes, the expected buffering of a single frame is acceptable.

This comment has been minimized.

Copy link
@nisdas

nisdas Feb 4, 2020

Contributor

So what I understand is now we have two length prefixes encoded into our data ?
one is for snappy frames in the frame header and the other for our ssz objects in their decompressed form ?

This comment has been minimized.

Copy link
@protolambda

protolambda Feb 4, 2020

Author Contributor

Possibly, 1 Snappy frame doesn't necessarily match 1 eth2 rpc chunk. With status quo it does match one snappy block, but it also can't stream within a chunk.

@AgeManning

This comment has been minimized.

Copy link
Contributor

AgeManning commented Feb 6, 2020

I think it's reasonable to use the framed format of snappy over the wire. It may not be super important for us now given current objects but could be more useful for larger objects in the future.

In principle, we could have a snappy_framed and snappy_block protocols and clients could take preference over either, but i'm happy with this current proposal.

@djrtwo

This comment has been minimized.

Copy link
Contributor

djrtwo commented Feb 15, 2020

I am pro frames.
@protolambda is there anything we need to do here to get this merged?

specs/phase0/p2p-interface.md Outdated Show resolved Hide resolved
specs/phase0/p2p-interface.md Outdated Show resolved Hide resolved
specs/phase0/p2p-interface.md Outdated Show resolved Hide resolved
djrtwo and others added 2 commits Feb 27, 2020
Co-Authored-By: Hsiao-Wei Wang <hwwang156@gmail.com>
@protolambda

This comment has been minimized.

Copy link
Contributor Author

protolambda commented Feb 27, 2020

Clarified the invalid input handling, and cleaned up to sum the different cases better. Can I get a final review?

@djrtwo
djrtwo approved these changes Mar 9, 2020
Copy link
Contributor

djrtwo left a comment

looks good to me

@djrtwo djrtwo merged commit 6230a22 into dev Mar 9, 2020
10 checks passed
10 checks passed
ci/circleci: checkout_specs Your tests passed on CircleCI!
Details
ci/circleci: codespell Your tests passed on CircleCI!
Details
ci/circleci: install_deposit_contract_compiler Your tests passed on CircleCI!
Details
ci/circleci: install_deposit_contract_tester Your tests passed on CircleCI!
Details
ci/circleci: install_pyspec_test Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
ci/circleci: table_of_contents Your tests passed on CircleCI!
Details
ci/circleci: test Your tests passed on CircleCI!
Details
ci/circleci: test_compile_deposit_contract Your tests passed on CircleCI!
Details
ci/circleci: test_deposit_contract Your tests passed on CircleCI!
Details
@djrtwo djrtwo deleted the rpc-snappy-lengths branch Mar 9, 2020
@terencechain terencechain mentioned this pull request Mar 22, 2020
16 of 16 tasks complete
@nisdas nisdas mentioned this pull request Mar 23, 2020
2 of 2 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.