Proposal to use SSZ for consensus only #129

Closed
rawfalafel opened this issue Nov 14, 2018 · 16 comments
Labels
general:RFC Request for Comments

Comments

@rawfalafel commented Nov 14, 2018

Instead of using ssz for both network and consensus, I think we should use ssz only for consensus and standardize around protobuf at the network layer.

Benefits

  • Eliminates the ssz decoder since there's no need for the encoding to be reversible (see the sketch at the end of this comment).
  • Eliminates the need for a length prefix in the ssz encoder. Same reason as above.
  • Allows the ssz encoder to be space inefficient (e.g. padding) since the result doesn't need to be stored or sent over the wire.
  • protobuf is already battle-tested and fits this use case well.
  • protobuf supports schemas and versioning out of the box.
  • protobuf already has solid libraries for many languages.
  • protobuf's code generation provides good ergonomics for developers.

Potential concerns

  • Supporting two serializers increases complexity
    • In my opinion, inventing a serializer that satisfies the needs of both consensus and networking is much more complex.
    • Using protobuf for networking is way simpler. This is definitely true for Prysm, and I imagine this is true for other implementation teams as well, but I'd like to hear other teams' opinions.
    • ssz becomes simpler if it's only used for consensus.
  • Re-encoding a block from the wire is inefficient
    • The cost of re-encoding is negligible when compared to the other steps necessary for verification. The byte array needs to be decoded, the encoding needs to be hashed, and the signature needs to be verified.
  • Protobuf is bad because [insert opinion here]
    • I'm open to other libraries, but protobuf seems like a good fit for the reasons stated above.
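
To make the first three benefits concrete, here's a minimal sketch (in Go, with hypothetical field names - this is not a spec) of what a one-way, hash-only encoder could look like: there's no decoder, no length prefixes, and padding is harmless because the output is only ever fed into a hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Hypothetical consensus object; field names are illustrative only.
type AttestationRecord struct {
	Slot             uint64
	ShardID          uint16
	ShardBlockHash   [32]byte
	AttesterBitfield []byte
}

// hashEncode is one-way: it only needs to be deterministic, not reversible,
// so no length prefixes are written and trailing padding is harmless.
func hashEncode(a *AttestationRecord) []byte {
	var tmp [8]byte
	buf := make([]byte, 0, 128)
	binary.BigEndian.PutUint64(tmp[:], a.Slot)
	buf = append(buf, tmp[:]...) // fixed 8 bytes, no prefix
	binary.BigEndian.PutUint16(tmp[:2], a.ShardID)
	buf = append(buf, tmp[:2]...) // fixed 2 bytes, no prefix
	buf = append(buf, a.ShardBlockHash[:]...)
	buf = append(buf, a.AttesterBitfield...)
	// Pad to a 32-byte boundary; the waste never hits the wire or disk,
	// it only passes through the hasher.
	for len(buf)%32 != 0 {
		buf = append(buf, 0)
	}
	return buf
}

func main() {
	a := &AttestationRecord{Slot: 7, ShardID: 3}
	fmt.Printf("%x\n", sha256.Sum256(hashEncode(a)))
}
```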

Previous discussions

@arnetheduck

Recapping: non-determinism (unspecified field ordering, overlong encodings allowed) and the complexity of an additional dependency were the cited reasons for discarding it the first time around, while the machine-readable schema and the compactness of varint encoding were the main positives. The data-type fit is also not that good, with no native types for fixed-length arrays (hashes, addresses) and unnecessary types like zigzags and floats adding surface area - this can be seen in the maximum value tests @paulhauner points to here: https://ethresear.ch/t/discussion-p2p-message-serialization-standard/2781/16

other teams' opinions

For nimbus, we'll end up writing custom decoders regardless of the chosen format - generated protobuf code tends to be optimized for generic use cases, and given the data-type mismatch it will need a fair bit of "manual" processing anyway. No net simplicity gain - quite the opposite for the client itself (not considering 3rd party tooling).

Wire encoding is an aspect of the protocol that's easy to change once the semantics are nailed down - thus we're also happy to delay a final decision until more hard data is available.

Good points on keeping the two concerns (network/consensus) separate, during the design phase at least.

@paulhauner

So I see two points here:

  • p2p and hashing should use distinct serialization schemes.
  • p2p serialization should be protobuf.

Distinct serialization schemes

Regarding using distinct serialization schemes, I have the following points:

  1. I agree that hashing would be more space/time efficient if we don't serialize the length bytes. I don't have a clear idea as to the magnitude of this efficiency.
  2. I don't see any reasoning here as to why we shouldn't use SSZ for the p2p format (e.g., that it's bloated or slow).
  3. Implementing a SSZ decoder is not very difficult. I've implemented one myself and I know a few other teams have also.

Protobuf

Here are my reservations regarding using protobuf:

  1. It is externally-defined. I like that the RLP and SSZ specifications are designed and maintained by Ethereum and their design scope is limited strictly to Ethereum.
  2. I don't see using protobuf as simplifying the spec by removing the SSZ decoder. Instead, I see it as adding complexity to the spec by effectively importing the entire protobuf spec. With protobuf we'd have 2x encoders and 1x decoder. The protobuf spec is generic and I understand it has features that we don't require.
  3. It is maintained by Google. This is a bit philosophical and murky, but baking Google-maintained code into the P2P layer seems to go somewhat against the grain of the "decentralized web". (It's my understanding that libp2p uses protobuf, so maybe this has already happened?).
  4. It's non-deterministic. "Consensus hashing" is now merkleized, but one could still keep records of non-merkleized SSZ blobs to determine if they're known or not. I'm not sure if this will be required, but it seems like a nice feature to have up our sleeves.

Summary

I can see there would be some gains in hashing speed by dropping length bytes from SSZ, but I'm uncertain as to how significant they are. It's my understanding that this spec targets simplicity over hardcore optimisation, and my concern is that introducing a new p2p serialization format lies too far on the side of optimisation.

I'd like to see the following before I'd consider a switch to separate P2P and hashing serialization:

  • Clear reasoning as to why SSZ shouldn't be used on the wire.
  • Benchmarks that demonstrate the speed gains in merkle hashing that are obtained from dropping the length bytes.

@hwwhww added the general:RFC Request for Comments label Nov 15, 2018
@mkalinin

Considerations

  • IMO, whatever serialization algorithm is used, its implementation should be part of the client's codebase to eliminate the risks that come with using external libs; these risks are pretty big in our context.
  • The hashing algorithm that is about to get merged soon (Added tree hashing algorithm #120).
    This hashing algorithm is a kind of serialization+hashing out of the box. It is also much more memory efficient than plain hash(SSZ(obj)) - see the sketch below.
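
For context, here's a rough sketch (heavily simplified, not the actual #120 algorithm) of why that is: each field is hashed into a fixed-size chunk and the chunks are merkleized pairwise, so no full serialized copy of the object is ever materialized.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// merkleRoot pairwise-hashes 32-byte chunks up to a single root.
// Simplified: an odd chunk is paired with a zero chunk.
func merkleRoot(chunks [][32]byte) [32]byte {
	for len(chunks) > 1 {
		if len(chunks)%2 == 1 {
			chunks = append(chunks, [32]byte{})
		}
		next := make([][32]byte, 0, len(chunks)/2)
		for i := 0; i < len(chunks); i += 2 {
			next = append(next, sha256.Sum256(append(chunks[i][:], chunks[i+1][:]...)))
		}
		chunks = next
	}
	if len(chunks) == 0 {
		return [32]byte{}
	}
	return chunks[0]
}

func main() {
	// Each field is hashed into its own chunk; the object root is the
	// merkle root of the field chunks - no hash(SSZ(obj)) round trip.
	fields := [][32]byte{
		sha256.Sum256([]byte{1}), // e.g. slot
		sha256.Sum256([]byte{2}), // e.g. shard
	}
	fmt.Printf("%x\n", merkleRoot(fields))
}
```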

Benefits

  • Eliminates the ssz decoder.

But it introduces a protobuf spec implementation (in the light of maintaining our own protobuf impls).

  • Eliminates the need for a length prefix.

As I understand it, this relates to faster hashing. The hashing algorithm mentioned above doesn't use length prefixes, and it also does not create copies of the object's data.

  • Allows the ssz encoder to be space inefficient.

The new hashing algorithm uses padding, but it doesn't affect SSZ in any way.

  • protobuf is already battle-tested and fits this use case well.

This is a real concern. There should be a bunch of tests for SSZ, including ones that check overflows, which are effectively discovered by fuzz testing. But in the case of a custom protobuf implementation we get to the same problem as with SSZ.

  • protobuf supports schemas and versioning out of the box.

SSZ serializes objects deterministically without schema definitions, which IMO is enough for eth 2.0 needs. Versioning is a good thing; SSZ is not forward compatible in terms of schema updates.

Summary

As a p2p serialization format, SSZ lacks forward compatibility in message schemas. This could become a real problem in the future, because even after several versions of PoC we might need to add something to a block, an attestation, or other messages. A good example is the chainId field that had to be added to TransactionMessage in eth 1.0 (see the sketch below).
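
To illustrate, here's a sketch of a hypothetical tag-length-value framing (simpler than protobuf's real wire format, but the same principle): an old decoder simply skips tags it doesn't know, which is exactly the forward compatibility a purely positional format like SSZ lacks.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeKnown walks a hypothetical tag-prefixed stream in which every field
// is varint(tag), varint(length), payload. An old decoder simply skips tags
// it doesn't recognize; a positional format has no way to do this.
func decodeKnown(buf []byte, known map[uint64]string) {
	for len(buf) > 0 {
		tag, n := binary.Uvarint(buf)
		if n <= 0 {
			return // malformed; real code would surface an error
		}
		length, m := binary.Uvarint(buf[n:])
		end := n + m + int(length)
		if m <= 0 || end > len(buf) {
			return
		}
		if name, ok := known[tag]; ok {
			fmt.Printf("%s = %x\n", name, buf[n+m:end])
		} // unknown tag (e.g. a newly added chainId): skipped, not an error
		buf = buf[end:]
	}
}

func main() {
	// A stream written by a newer peer: tag 1 (slot) plus unknown tag 99.
	msg := []byte{1, 1, 0x2a, 99, 2, 0xbe, 0xef}
	decodeKnown(msg, map[uint64]string{1: "slot"})
}
```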

The SSZ spec is pretty simple, and with good test coverage it should be easily implementable with a high level of reliability. And in my opinion it's important that the spec is maintained by the Ethereum community.

@vbuterin

Is this advocating that data structures like blocks should be decomposed and then re-serialized using protobuf instead of SSZ over the wire? If so, that seems like a large increase in complexity...

What's wrong with the current approach of sending consensus objects around in the same byte format in which they appear in consensus? The network protocol that's used to wrap the objects could still be protobuf or whatever else if desired.

I agree that hashing would be more space/time efficient if we don't serialize the length bytes. I don't have a clear idea as to the magnitude of this efficiency.

It's tiny. The main hashing cost is state recalculations, and in this case, account objects are fixed size and lists tend to be very long so the marginal contribution of length prefixes is very low.

In general, sacrificing the property that every object has exactly one way to encode it as bytes and going down this path reeks of sacrificing essential simplicity for a little temporary over-optimization....

@rawfalafel

Thanks for the feedback everyone!

What's wrong with the current approach of sending consensus objects around in the same byte format in which they appear in consensus? The network protocol that's used to wrap the objects could still be protobuf or whatever else if desired.

Is it important that the network sends consensus objects in the same format that they're hashed in? I was arguing that it isn't, since the clients themselves should be handling the consensus objects as some data structure, not the encoding. It's handy when monitoring the network, and there may be other important reasons that I'm missing.

It is externally-defined. I like that the RLP and SSZ specifications are designed and maintained by Ethereum and their design scope is limited strictly to Ethereum.

I agree that this is a risk. There's always the chance that we want a feature that isn't provided, and we're forced to hack around it or fork.

A big concern I'm hearing is that protobuf's surface area is too large for our use case. I kinda agree, and we would have to do a more thorough audit if we want to adopt protobuf (or another 3rd party library), but I would also argue that there's less risk from a security perspective, since it's not used for consensus.

The second concern I see is that the benefits I listed above aren't enough to justify switching over from ssz to protobuf. I can only speak about Go, where an implementation of ssz will be slower and harder to use/debug without significant effort. But part of the reason for this is that Golang lacks generics, which isn't relevant to other languages, so maybe I should keep this to myself :) I'm curious how protobuf would compare to ssz in Rust, so I may try that at some point.

The mood I'm getting from this thread is that everyone's okay with using ssz for both consensus and the network. If so, I'm okay with maintaining the status quo. However, my concern is that ssz continues to be used for all encoding/decoding in Ethereum 2.0, including non-consensus objects, and at some point it stops living up to the "simple serializer" name.

@prestonvanloon

It is maintained by Google. This is a bit philosophical and murky, but baking Google-maintained code into the P2P layer seems to go somewhat against the grain of the "decentralized web". (It's my understanding that libp2p uses protobuf, so maybe this has already happened?).

Google can't force you to update your protobuf version and you are free to fork it at any time. Git-based projects are as decentralized as you would like them to be, and I think this is a poor argument against a technology. To take this to the extreme would mean writing everything from the ground up. Go is also maintained by Google and subject to change at any time, but that doesn't mean using it goes against the virtues of decentralized technology.

I agree that there needs to be a clear argument for using protobuf over SSZ, and I would be interested to see benchmarks between the two strategies at scale.

Going forward, Prysm clients may support both wire protocols and prefer protobuf for prysm <> prysm client communication. This might provide some helpful metrics in the future.

@paulhauner commented Nov 18, 2018

...you are free to fork it at any time. Git-based projects are as decentralized as you would like them to be, and I think this is a poor argument against a technology

I realise it can be forked, but developing and maintaining a succinct, application specific component could be much less effort than forking and maintaining a much larger, generalised component.

To take this to the extreme would mean writing everything from the ground up. Go is also maintained by Google and subject to change at any time, but it doesn't mean using it goes against the virtues of decentralized technology.

I mentioned this point was philosophical and I think it’s reasonable that a philosophy becomes nonsensical when taken to the extreme. I was trying to avoid having external components baked into the spec, not avoiding using external components in the implementation at all costs.

Regardless of all of this, it seems that protobuf is baked into the libp2p spec so my objections to it are futile. Paul: 0, Google: 1.

I don’t want to end up bikeshedding about libp2p so I’m happy to accept protobuf as a cool piece of tech that we’re going to use. I honestly do think it’s cool and would use it without hesitation in other scenarios.

I’m still open for technical discussions on why we shouldn’t push consensus-serialised objects around the network, if more information comes to light :)

@arnetheduck

Speaking from experience, reimplementing the wire encoding part of protobuf - which is what you actually need to work with protobuf-encoded data ("minimal protobuf") - is trivial. It's comparable to ssz in terms of complexity. I wouldn't worry too much about it from a "3rd party libraries are bad" perspective.
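
For reference, here's a sketch of roughly what such a "minimal protobuf" reader boils down to - only wire types 0 (varint) and 2 (length-delimited) are handled, which covers most messages; this is a sketch, not a conformant implementation:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// scanFields walks the protobuf wire format: each field is a varint key
// encoding (fieldNumber << 3 | wireType), followed by a type-dependent
// value. Wire types 1 and 5 (fixed 64/32-bit) would just be plain 8- and
// 4-byte reads, so they're omitted here.
func scanFields(buf []byte, visit func(field uint64, value []byte)) error {
	for len(buf) > 0 {
		key, n := binary.Uvarint(buf)
		if n <= 0 {
			return fmt.Errorf("bad key")
		}
		buf = buf[n:]
		field, wireType := key>>3, key&7
		switch wireType {
		case 0: // varint value
			_, m := binary.Uvarint(buf)
			if m <= 0 {
				return fmt.Errorf("bad varint")
			}
			visit(field, buf[:m])
			buf = buf[m:]
		case 2: // length-delimited: bytes, strings, nested messages
			length, m := binary.Uvarint(buf)
			end := m + int(length)
			if m <= 0 || end > len(buf) {
				return fmt.Errorf("bad length")
			}
			visit(field, buf[m:end])
			buf = buf[end:]
		default:
			return fmt.Errorf("unhandled wire type %d", wireType)
		}
	}
	return nil
}

func main() {
	// field 1 = varint 300, field 2 = bytes "hi"
	msg := []byte{0x08, 0xac, 0x02, 0x12, 0x02, 'h', 'i'}
	_ = scanFields(msg, func(f uint64, v []byte) { fmt.Println(f, v) })
}
```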

The fact that libp2p uses protobuf is not a reason to put it in the application protocol, just like you wouldn't consider TCP's packed binary fields to be a constraint on the design.

Protobuf being a key-value format, and not supporting fixed-length arrays, are solid arguments not to go with it, however - you will need to perform "encoding" and "decoding" steps to get from consensus data to protobuf-typed data ("hashes are encoded as variable-length byte buffers" is an encoding step!), and then do the protobuf wire encoding.
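
Concretely, the receiving side needs something like this (a sketch with a hypothetical helper name) for every hash field - a conversion plus a length check that a format with native fixed-length arrays would make unnecessary:

```go
package main

import "fmt"

// hashFromWire converts a protobuf `bytes` field back into the fixed-size
// hash type the consensus code works with - an extra decoding step, plus
// a length check that a native fixed-length array type would avoid.
func hashFromWire(b []byte) ([32]byte, error) {
	var h [32]byte
	if len(b) != len(h) {
		return h, fmt.Errorf("expected 32 bytes, got %d", len(b))
	}
	copy(h[:], b)
	return h, nil
}

func main() {
	h, err := hashFromWire(make([]byte, 32))
	fmt.Println(h[:4], err)
}
```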

I like that Prysmatic, who are suggesting the feature, will be doing both - then we can get some solid numbers and make an informed choice. Until then, the numbers from Lighthouse's simulations speak in SSZ's favor, as do other arguments.

@paulhauner commented Nov 21, 2018

Speaking from experience, reimplementing the wire encoding part of protobuf - which is what you actually need to work with protobuf-encoded data ("minimal protobuf") - is trivial.

Awesome, I would be keen to do this.

@arnetheduck

http://jpa.kapsi.fi/nanopb/ is a fairly easy role-model implementation, though even that one strives to be more "generic" than you need to be if you really want to specialize.

@vbuterin

It's worth noting that if we go this way, then the only part of SSZ we would actually be using is the tree hashing, which is actually a fairly small and compact part of the protocol.

@conor10 commented Dec 2, 2018

It would be great to see protobufs used for the network layer. @rawfalafel does your proposal include gRPC too?

When you throw gRPC into the mix, it's trivial to move beyond simple RPC and have streaming services between components backed by HTTP/2, which is another significant value-add.

I've seen countless developer hours wasted on getting different network serialisation protocols to work between services. Protobufs provide a concise, straightforward interface definition language with great cross-platform support. I've seen multiple projects switch to using it and never look back.

It will also save a lot of time for those of us developing and maintaining Ethereum integration libraries as we transition to eth2.0 support ... 😉

@paulhauner commented Dec 3, 2018

It's worth noting that if we go this way, then the only part of SSZ we would actually be using is the tree hashing, which is actually a fairly small and compact part of the protocol.

We would still need an SSZ encoder to go from a language-specific object to bytes though, right? Protobuf as a whole makes no guarantees about byte-level determinism; however, there seem to be some people who have achieved it in a "hackish" way (their words, not mine). Maybe there's an elegant way?

@conor10 I think everyone agrees about using an established protocol (e.g., protobufs) for talking between "internal" components, like beacon-node to validator-client. As I understand it, this discussion is around whether p2p nodes should send blocks, attestations, etc. to each other as SSZ or protobuf.

If protobuf can elegantly replace SSZ I'm not opposed to doing so. I'm ditching all my "philosophical" concerns after having a look at minimal protobuf. Seems pretty easy to maintain.

@rawfalafel

Protobuf being a key-value format, and not supporting fixed-length arrays, are solid arguments not to go with it, however

Good point. Protobuf not supporting fixed length byte arrays is pretty annoying.

It would be great to see protobufs used for the network layer. @rawfalafel does your proposal include gRPC too?

Prysmatic plans to use gRPC for internal communication, but I don't know where I stand about making it a standard.

@JustinDrake

I understand the consensus (pun unintended) is to use SSZ for both consensus and network. Feel free to reopen :)
