Add initial libp2p standardization #935

AgeManning · 2019-04-16T05:08:50Z

Ahead of multi-client testnets, I think it will be useful to start a discussion on standardizing the libp2p protocols and their respective message formats.

This is a minimal initial draft which outlines some of the protocols, their IDs, and suggested formats that could be used for eth2.0 clients.

The protocols and formats listed here are currently in use by lighthouse but are entirely malleable and currently exist as suggestions for client implementations.

JustinDrake · 2019-04-16T05:13:48Z

Should we standardise on peer ids?

prestonvanloon · 2019-04-16T05:15:51Z

At Prysmatic Labs, we use multiple pubsub topics, where each topic name maps 1:1 to a message type. For example, attestation announcements come across the wire as `/eth/serenity/beacon/v1/attestationannounce`. Likewise, for RPC calls we have a stream for each RPC method/topic.

Has there been any analysis of the trade-offs for this? Subscribing to only the topics you care about may reduce your network traffic while adding some potential complexity.

prestonvanloon · 2019-04-16T05:20:25Z

Having 1:1 topic mappings also reduces the need for a type nipple. The message is either the expected type or not.
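
To make the trade-off concrete, here is a minimal Python sketch (not taken from any client) contrasting a single shared topic, which needs a leading type tag, with one topic per message type. The type-tag values, the block topic name, and the function names are hypothetical; only the attestation topic name comes from the comment above.

```python
from typing import Tuple

# Hypothetical type tags for the single-topic design (values are illustrative).
TYPE_BLOCK = 0x00
TYPE_ATTESTATION = 0x01
TYPE_NAMES = {TYPE_BLOCK: "block", TYPE_ATTESTATION: "attestation"}

# 1:1 topic-to-type mapping. The block topic name is an assumption made
# for symmetry with the attestation topic quoted in the discussion.
TOPIC_TYPES = {
    "/eth/serenity/beacon/v1/attestationannounce": "attestation",
    "/eth/serenity/beacon/v1/blockannounce": "block",
}


def decode_shared_topic(payload: bytes) -> Tuple[str, bytes]:
    """Single topic: the first byte says which message type follows."""
    if not payload:
        raise ValueError("empty payload")
    tag, body = payload[0], payload[1:]
    if tag not in TYPE_NAMES:
        raise ValueError(f"unknown type tag {tag:#x}")
    return TYPE_NAMES[tag], body


def decode_per_type_topic(topic: str, payload: bytes) -> Tuple[str, bytes]:
    """One topic per type: the topic itself implies the type, so no tag is needed."""
    if topic not in TOPIC_TYPES:
        raise ValueError(f"not subscribed to {topic}")
    return TOPIC_TYPES[topic], payload
```

In the per-topic variant the payload is passed through untouched, which is the sense in which the message "is either the expected type or not".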

AgeManning · 2019-04-16T05:22:09Z

@JustinDrake - Peer IDs seem to be specified here: https://github.com/ethereum/eth2.0-specs/blob/dev/specs/networking/node-identification.md#peer-id-generation
So I have left them out of this particular doc. Perhaps we should merge the two.

AgeManning · 2019-04-16T05:32:02Z

@prestonvanloon - Agreed that we potentially want a topic for each message type (which removes the need for a nibble). In gossipsub, we have to maintain a set of mesh peers for each topic, so there may be some overhead in managing two sets of overlay networks for the blocks/attestations. I guess the point of this PR is to agree on how we should proceed. I believe the WhiteBlock guys will be doing some testing for gossipsub in the near future, after which we will be able to form a more informed opinion on which way to proceed.

Having a separate protocol ID for requests/responses in the RPC protocol seems like a useful separation, as we would not need to encode the different types. However, it seems to go against current libp2p protocol designs, which typically have a single protocol ID per protocol. I'd opt for whichever is the most performant.

JustinDrake · 2019-04-16T05:32:18Z

> go-libp2p-crypto contains the canonical implementation of how to hash secp256k1 keys for use as a peer ID

Would it make sense to push for BLS12-381 keys in the longer term, to unify the consensus and networking layers?

prestonvanloon · 2019-04-16T05:39:35Z

@AgeManning Agreed on the RPC part, it seems sub-optimal to open multiple streams between two peers.

cc: @zscole is this on your radar?

AgeManning · 2019-04-16T07:19:10Z

@JustinDrake - Makes sense for consistency. The caveats might be that BLS will likely take longer to verify (compared to secp256k1), and there is talk of generating new key pairs for every message sent across the pubsub network (to anonymize validators). I believe BLS key generation is comparable to secp256k1 key generation, so this shouldn't be an issue.

zscole · 2019-04-16T14:10:28Z

> cc: @zscole is this on your radar?

I don't see why multiple streams would be necessary for anything other than FTP. I don't think it would present any inherent optimization, but this is something we can test for to validate if really necessary.

djrtwo · 2019-04-19T03:31:38Z

specs/networking/libp2p-standardization.md

+
+## Identify
+
+#### Protocol Id: `/ipfs/id/1.0.0` (to be updated to `/p2p/id/1.0.0`)


Is /p2p/ not yet supported? and why not use /eth/ here? Is this something global beyond the eth protocol?

Yep. The protocols that come with libp2p have their own protocol IDs. Currently (to my knowledge) these are hard-coded and not customisable, so for rust-libp2p I'd have to make a PR to libp2p or rewrite the protocol to change the naming.
I assume the same holds for Go, in order for the protocols to be interoperable.
I'm curious whether we want to make the protocol IDs customisable within the libp2p protocols.

mslipper · 2019-04-22T07:20:05Z

Looks like I missed this - I'm going to close my PR (#975) in favor of this one. @AgeManning would you mind if I committed changes to this one in order to merge our efforts?

AgeManning · 2019-04-24T06:18:44Z

> Looks like I missed this - I'm going to close my PR (#975) in favor of this one. @AgeManning would you mind if I committed changes to this one in order to merge our efforts?

@mslipper - Of course. I made this mainly to start the discussion. All edits/commits welcome. :)

jannikluhn · 2019-04-24T10:27:28Z

specs/networking/libp2p-standardization.md

+#### Protocol Id: `/ipfs/id/1.0.0` (to be updated to `/p2p/id/1.0.0`)
+
+The Identify protocol (defined in go - [identify-go](https://github.com/ipfs/go-ipfs/blob/master/core/commands/id.go) and rust [rust-identify](https://github.com/libp2p/rust-libp2p/blob/master/protocols/identify/src/lib.rs))
+allows a node A to query another node B which information B knows about A. This also includes the addresses B is listening on.


What's the relationship to discv5? Is this supposed to be a temporary substitute, a permanent complement, or a replacement?

Yep, temporary placeholder. This is currently what lighthouse is using in place of a proper discovery protocol.

Let's add an explicit note that this is a placeholder

I'm working on discv5 at the moment. Currently, I think this protocol may still be useful for NAT traversal and can be fed into the discv5 protocol. Potentially we don't need it at all. I'll add a placeholder note for now.

jannikluhn · 2019-04-24T10:28:57Z

specs/networking/libp2p-standardization.md

+
+## Discovery
+
+#### Protocol Id: `/eth/serenity/disc/1.0.0`


I think the current plan for discv5 is to use an independent transport.

Yes, agree. This entire document will be updated with discv5 soon.

I'm currently working on this. My current plan is to mimic mDNS in the integration into libp2p. In this case there will be no protocol ID, but implementations will be able to use the discv5 discovery gadget, which listens separately on a UDP port. Messages from this gadget will be available to the user to pass into other libp2p protocols, i.e. when we discover a node, the discovery can be used to dial a TCP connection or be added to a Kademlia DHT in libp2p.
Once I've built an implementation, I will update this document.
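
As a rough illustration of that plan, the sketch below models a discovery component that sits outside libp2p's protocol negotiation and simply hands discovered nodes to callbacks, which can then turn them into dialable addresses. Everything here is hypothetical (the `DiscoveredNode` and `DiscoveryGadget` names, the callback API, the sample addresses); it is not the discv5 wire protocol, and `emit` merely stands in for a result arriving on the UDP socket.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class DiscoveredNode:
    node_id: str
    ip: str
    tcp_port: int


def to_multiaddr(node: DiscoveredNode) -> str:
    """Build a TCP multiaddr string such as /ip4/192.0.2.1/tcp/9000."""
    addr = ipaddress.ip_address(node.ip)
    proto = "ip4" if addr.version == 4 else "ip6"
    return f"/{proto}/{addr}/tcp/{node.tcp_port}"


class DiscoveryGadget:
    """Hypothetical stand-in for a discovery service listening on its own UDP port."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[DiscoveredNode], None]] = []

    def on_discovered(self, callback: Callable[[DiscoveredNode], None]) -> None:
        self._listeners.append(callback)

    def emit(self, node: DiscoveredNode) -> None:
        # In a real implementation this would be triggered by a UDP response.
        for callback in self._listeners:
            callback(node)


# The consumer decides what to do with discoveries, e.g. queue them for dialing.
dial_queue: List[str] = []
gadget = DiscoveryGadget()
gadget.on_discovered(lambda node: dial_queue.append(to_multiaddr(node)))
gadget.emit(DiscoveredNode(node_id="example-node", ip="192.0.2.1", tcp_port=9000))
```

The same callback could just as well feed a DHT routing table instead of a dial queue; the discovery component itself needs no protocol ID.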

It would be nice to have at least the ability to run discv5 over other transports - what would that look like in this setup?

mhchia · 2019-04-30T15:12:55Z

specs/networking/libp2p-standardization.md

+### Topics
+
+*The Go and Js implementations use string topics - This is likely to be
+updated to topic hashes in later versions - https://github.com/libp2p/rust-libp2p/issues/473*


I asked about the topic hash in the issue. It seems not finalized yet, but shouldn't be a big issue.

raulk

Thanks for undertaking this, @AgeManning! I believe this document creates a lot of clarity for the community.

From my end, I'll publish a short primer on connection bootstrapping and protocol negotiation in https://discuss.libp2p.io, as I believe this is the area that has created the most confusion so far.

raulk · 2019-05-02T13:10:33Z

specs/networking/libp2p-standardization.md

+
+Libp2p raw gossipsub messages are sent across the wire as fixed-size length-prefixed byte arrays.
+
+The byte array is prefixed with an unsigned 64 bit length number encoded as an


64-bit varint?

Yep. This is based on the current floodsub implementation in rust-libp2p (I figure all implementations should be standardized). Data is sent via this function:
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L32-L41
Notice that the length of the data is encoded using the unsigned_varint package with a u64_buffer():
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L45 and
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L46
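
For readers unfamiliar with the encoding, here is a minimal Python sketch of a varint length prefix, assuming the common unsigned-varint scheme (7 payload bits per byte, least-significant group first, high bit set on every byte except the last). It is an illustration only, not a port of the Rust code linked above, and the function names are made up for this example.

```python
import io

MAX_VARINT_BYTES = 10  # a u64 needs at most ceil(64 / 7) = 10 bytes


def encode_uvarint(value: int) -> bytes:
    """Encode an unsigned 64-bit integer as a varint."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)  # final group, high bit clear
            return bytes(out)


def decode_uvarint(stream) -> int:
    """Read one varint from a binary stream."""
    result = 0
    for i in range(MAX_VARINT_BYTES):
        raw = stream.read(1)
        if not raw:
            raise EOFError("stream ended inside a varint")
        result |= (raw[0] & 0x7F) << (7 * i)
        if not raw[0] & 0x80:
            if result >= 2**64:
                raise ValueError("varint overflows an unsigned 64-bit integer")
            return result
    raise ValueError("varint is longer than 10 bytes")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length as a varint."""
    return encode_uvarint(len(payload)) + payload
```

Note that the prefix is variable-length: a 2-byte payload gets a 1-byte prefix, while a 300-byte payload gets a 2-byte prefix.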

raulk · 2019-05-02T13:14:30Z

specs/networking/libp2p-standardization.md

+
+#### Eth2.0 Specifics
+
+Each message has a maximum size of 512KB (estimated from expected largest uncompressed


It's worth specifying that this is Eth2.0's choice, as gossipsub does not impose a max message length, to my knowledge.

Speaking of the maximum message size: even though there isn't a specified constant, I played with the max size of DelimitedReader and now suspect that it (1 MB?) is the upper bound on message size in the Go implementation, IIUC.

Floodsub in rust-libp2p has a maximum size of 2048 bytes:
https://github.com/libp2p/rust-libp2p/blob/master/protocols/floodsub/src/protocol.rs#L60
I've made this configurable in the gossipsub implementation.
The max size exists so that if we see a large amount of data coming through a stream, we only read up to a maximum amount. I understand this is to prevent potential DoS vectors from malicious users sending arbitrarily large streams of data.
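
A minimal Python sketch of that guard, assuming varint length-prefixed framing as discussed earlier in this review: the reader checks the declared length against a cap before buffering any payload bytes. The names are hypothetical, and interpreting the 512KB figure as 512 * 1024 bytes is an assumption.

```python
import io

MAX_MESSAGE_SIZE = 512 * 1024  # assumed reading of the 512KB limit in the diff


class MessageTooLarge(Exception):
    pass


def read_uvarint(stream) -> int:
    """Read an unsigned varint (7 bits per byte, low group first)."""
    result = 0
    for shift in range(0, 70, 7):  # at most 10 bytes for a u64
        raw = stream.read(1)
        if not raw:
            raise EOFError("stream ended inside the length prefix")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            return result
    raise ValueError("length prefix is longer than 10 bytes")


def read_message(stream, max_size: int = MAX_MESSAGE_SIZE) -> bytes:
    length = read_uvarint(stream)
    # Reject on the declared length, before reading any of the payload.
    if length > max_size:
        raise MessageTooLarge(f"peer declared {length} bytes, limit is {max_size}")
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside the payload")
    return payload
```

Making `max_size` a parameter mirrors the idea of a configurable limit; the caller, not the transport, decides what counts as too large.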

raulk · 2019-05-02T13:18:05Z

specs/networking/libp2p-standardization.md

+
+## Identify
+
+#### Protocol Id: `/ipfs/id/1.0.0` (to be updated to `/p2p/id/1.0.0`)


You should be able to customise all of these protocol IDs. Doing so will segregate your network from the IPFS or libp2p public networks at the protocol level, such that even if peers happen to cross-connect, they will share no protocols in common, and therefore, won't be able to interact via streams.

Ideally you'll override the protocol ID as low as in multistream-select (to be renamed to multiselect). This is used during connection bootstrapping, and if these IDs don't match, peers will disconnect immediately. This is the logically strictest segregation.
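
The segregation effect can be sketched in a few lines of Python. This models only the outcome of protocol negotiation (the first proposed protocol that the other side also supports), not the actual multistream-select message exchange. The `/eth/serenity/id/1.0.0` and `/eth/serenity/rpc/1.0.0` IDs are hypothetical examples of customised protocol IDs.

```python
from typing import Iterable, Optional


def negotiate(dialer_protocols: Iterable[str], listener_protocols: Iterable[str]) -> Optional[str]:
    """Return the first protocol the dialer proposes that the listener supports."""
    supported = set(listener_protocols)
    for protocol in dialer_protocols:
        if protocol in supported:
            return protocol
    return None  # nothing in common: no stream can be opened


ipfs_peer = ["/ipfs/id/1.0.0", "/ipfs/kad/1.0.0"]
eth2_peer = ["/eth/serenity/id/1.0.0", "/eth/serenity/rpc/1.0.0"]  # hypothetical IDs

# Two eth2 peers agree on a protocol; an eth2 peer and an IPFS peer do not.
same_network = negotiate(eth2_peer, eth2_peer)
cross_network = negotiate(eth2_peer, ipfs_peer)
```

Here `cross_network` is `None`: even if the two peers connect, they share no protocol and cannot interact via streams.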

raulk · 2019-05-02T13:19:21Z

specs/networking/libp2p-standardization.md

+The protocol has two configurable parameters, which can be used to identify the
+type of connecting node. Suggested format:
+```
+	version: `/eth/serenity/1.0.0`


Ok, to my knowledge, this is currently not the case in rust-libp2p. I'll make a PR to make these configurable.

raulk · 2019-05-02T13:19:58Z

specs/networking/libp2p-standardization.md

+
+## Discovery
+
+#### Protocol Id: `/eth/serenity/disc/1.0.0`


I thought there was a plan to implement discv5 as a libp2p protocol?

raulk · 2019-05-02T13:29:59Z

On the topic (pun intended) of dedicated topics per message type and shard vs. wildcard topics per shard, I'd suggest the former. A few reasons off the top of my head:

- It curtails irrelevant gossip amplification. The effort invested by each peer to maintain the mesh overlay is proportional to the traffic it's interested in.
- Validation functions can be type-specific versus coarse-grained god-functions.
- Easier debugging and traceability.
- Allows lighter clients to subscribe only to the data they need.
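
A small Python sketch of the second and fourth points, under stated assumptions: each topic gets its own small validator, and a lighter client simply registers fewer topics. The `Node` class, the block topic name, and the validators are hypothetical, and the checks are placeholders (a real validator would deserialize the message and verify signatures).

```python
from typing import Callable, Dict, Optional

Validator = Callable[[bytes], bool]

ATTESTATION_TOPIC = "/eth/serenity/beacon/v1/attestationannounce"  # from the discussion
BLOCK_TOPIC = "/eth/serenity/beacon/v1/blockannounce"  # hypothetical name


def validate_attestation(payload: bytes) -> bool:
    # Placeholder check standing in for attestation-specific validation.
    return payload.startswith(b"att:")


def validate_block(payload: bytes) -> bool:
    # Placeholder check standing in for block-specific validation.
    return payload.startswith(b"blk:")


class Node:
    def __init__(self) -> None:
        self.validators: Dict[str, Validator] = {}

    def subscribe(self, topic: str, validator: Validator) -> None:
        self.validators[topic] = validator

    def on_gossip(self, topic: str, payload: bytes) -> Optional[bool]:
        """Return the validation result, or None if the node is not subscribed."""
        validator = self.validators.get(topic)
        if validator is None:
            return None
        return validator(payload)


full_node = Node()
full_node.subscribe(ATTESTATION_TOPIC, validate_attestation)
full_node.subscribe(BLOCK_TOPIC, validate_block)

light_client = Node()
light_client.subscribe(BLOCK_TOPIC, validate_block)  # blocks only
```

With a wildcard topic, `on_gossip` would instead need a single function that first works out the message type and then branches, which is the "god-function" shape the comment warns against.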

jrhea · 2019-05-12T03:52:28Z

> Having 1:1 topic mappings also reduces the need for a type nipple. The message is either the expected type or not.

True. We should make sure there are only two type nipples. ✌️😂🤠

djrtwo

Okay @AgeManning, looks good to me. Ready to merge?

AgeManning · 2019-05-13T00:09:46Z

@djrtwo - I still consider this a discussion PR, and it will be updated in a week or so with discv5. However, it might be useful as a guideline to start libp2p co-ordination, so I'm happy to merge this WIP PR if you like. I can make future PRs to update it :).

AgeManning · 2019-05-23T01:56:31Z

I've added a small paragraph for Discv5.
We no longer need the identify protocol, as Discv5 handles IP and topic discovery for us. It does not maintain its own protocol ID, as it has no need for establishing long-lasting streams. For these reasons I use the word standalone, even though it could be built from libp2p components. It's standalone from a multistream-select perspective.

fjl · 2019-05-23T15:31:43Z

This is really cool. Why should clients optionally support yamux?

djrtwo · 2019-05-23T15:34:11Z

Merging.
We can take subsequent convos to issues/PRs

AgeManning · 2019-05-24T11:15:14Z

@fjl - I originally made this as there was no discussion on what clients should use. By default, in Rust, we can support both simultaneously. I've not done any testing or benchmarking of either protocol, but my feeling is that mplex is minimal, while yamux is more complex and potentially used more in prod (I have no real knowledge here, however). I opted for the minimal one by default and left the other libp2p-default multiplexer as optional. This was in part to spark discussion with people who know more about this than me (which didn't happen).

Add initial libp2p standardization

22d4496

hwwhww added the scope:networking label Apr 17, 2019

djrtwo reviewed Apr 19, 2019

arnetheduck reviewed Apr 22, 2019

Add @prestonvanloon and @djrtwo's comments for muliple beacon topics

b83a7c4

jannikluhn reviewed Apr 24, 2019

mhchia reviewed May 1, 2019

ralexstokes reviewed May 2, 2019

AgeManning and others added 2 commits May 2, 2019 16:34

Add Transport and lower-level libp2p specifications

bbca108

Update specs/networking/libp2p-standardization.md

7818183

Co-Authored-By: AgeManning <Age@AgeManning.com>

raulk reviewed May 2, 2019

wemeetagain mentioned this pull request May 6, 2019

First Release ChainSafe/lodestar#194

Closed

33 tasks

AgeManning and others added 4 commits May 6, 2019 12:28

Update libp2p-standardization based on latest comments

c7fea5f

Merge branch 'libp2p' of github.com:AgeManning/eth2.0-specs into libp2p

c33bdfd

change some language to be more declarative rather than about the future

bff71b6

Rename shard topics to explicitly state

3c87754

djrtwo approved these changes May 12, 2019

wemeetagain reviewed May 12, 2019

Correct typo

feb3b5e

AgeManning mentioned this pull request May 13, 2019

Allow custom protocol Id for all protocols libp2p/rust-libp2p#1115

Closed

Update with discv5

ae6d30f

djrtwo merged commit 72e1267 into ethereum:dev May 23, 2019

