Add initial libp2p standardization #935

Merged
merged 10 commits into from
May 23, 2019
192 changes: 192 additions & 0 deletions specs/networking/libp2p-standardization.md
@@ -0,0 +1,192 @@
ETH 2.0 Networking Spec - Libp2p standard protocols
===

# Abstract

Ethereum 2.0 clients plan to use the libp2p networking stack for the
mainnet release. This document aims to standardize the libp2p client protocols,
configuration and messaging formats.

# Libp2p Protocols

## Transport

This section details the libp2p transport layer that underlies the
[protocols](#protocols) that are listed in this document.

Libp2p allows composition of multiple transports. Eth2.0 clients should support
TCP/IP and optionally websockets. Websockets are useful for implementations
running in the browser, so native clients would ideally support websockets in
order to interoperate with them.

An ideal libp2p transport would therefore be TCP/IP with a fallback to
websockets.

### Encryption

Libp2p currently offers [Secio](https://github.com/libp2p/specs/pull/106), which
can upgrade a transport so that all subsequent communication is encrypted. Secio
generates a symmetric ephemeral key which peers use to encrypt their
communication. It can support a range of ciphers and currently supports key
derivation for elliptic curve-based public keys.

Current defaults are:
- Key agreement: `ECDH-P256` (also supports `ECDH-P384`)
- Cipher: `AES-128` (also supports `AES-256`, `TwofishCTR`)
- Digests: `SHA256` (also supports `SHA512`)


## Protocols

This section lists the libp2p protocols required by Ethereum 2.0 clients
running a libp2p network stack.

## Multistream-select

#### Protocol id: `/multistream/1.0.0`

Clients running libp2p should support the [multistream-select](https://github.com/multiformats/multistream-select/)
protocol, which allows clients to negotiate libp2p protocols and establish streams
per protocol.
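
As a rough illustration of the negotiation idea, the sketch below models a dialer proposing protocol ids in order of preference and a listener either echoing the proposal or replying `na`. The function names are hypothetical, and the wire encoding of the messages is deliberately left out; this is a simplified in-memory model, not the behaviour of any particular libp2p implementation.

```python
from typing import Iterable, Optional, Set

NOT_AVAILABLE = "na"  # reply used when a proposed protocol is not supported


def listener_reply(proposal: str, supported: Set[str]) -> str:
    """Echo the proposed protocol id if supported, otherwise reply "na"."""
    return proposal if proposal in supported else NOT_AVAILABLE


def negotiate(preferred: Iterable[str], supported: Set[str]) -> Optional[str]:
    """Return the first protocol id the listener accepts, or None."""
    for proposal in preferred:
        if listener_reply(proposal, supported) == proposal:
            return proposal
    return None


if __name__ == "__main__":
    listener = {"/mplex/6.7.0", "/meshsub/1.0.0"}
    print(negotiate(["/yamux/1.0.0", "/mplex/6.7.0"], listener))
```

The dialer's preference order decides the outcome here, which is one reason clients benefit from agreeing on a common set of protocol ids.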

## Multiplexing

Libp2p allows clients to compose multiple multiplexing methods. Clients should
support [mplex](https://github.com/libp2p/specs/tree/master/mplex) and
optionally [yamux](https://github.com/hashicorp/yamux/blob/master/spec.md)
(these can be composed).

**Mplex protocol id: `/mplex/6.7.0`**

**Yamux protocol id: `/yamux/1.0.0`**

## Gossipsub

#### Protocol id: `/meshsub/1.0.0`

*Note: Parameters listed here are subject to a large-scale network feasibility
study*

The [Gossipsub](https://github.com/libp2p/specs/tree/master/pubsub/gossipsub)
protocol will be used for block and attestation propagation across the
network.

### Configuration Parameters

Gossipsub has a number of internal configuration parameters which directly
affect network performance. Clients can set these independently; however,
we aim to standardize them across clients to optimize the gossip network for
propagation times and message duplication. Current network-related defaults are:

```
(
// The target number of peers in the overlay mesh network (D in the libp2p specs).
mesh_size: 6
// The minimum number of peers in the mesh network before adding more (D_lo in the libp2p specs).
mesh_lo: 4
// The maximum number of peers in the mesh network before removing some (D_high in the libp2p specs).
mesh_high: 12
// The number of peers to gossip to during a heartbeat (D_lazy in the libp2p specs).
gossip_lazy: 6 // defaults to `mesh_size`
// Time to live for fanout peers (seconds).
fanout_ttl: 60
// The number of heartbeats to gossip about.
gossip_history: 3
// Time between each heartbeat (seconds).
heartbeat_interval: 1
)
```

### Topics

*The Go and Js implementations use string topics - This is likely to be
updated to topic hashes in later versions - https://github.com/libp2p/rust-libp2p/issues/473*

@mhchia mhchia Apr 30, 2019

I asked about the topic hash in the issue. It seems not finalized yet, but shouldn't be a big issue.


For Eth2.0 clients, topics will be sent as `SHA2-256` hashes of the topic string.
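
A minimal sketch of the hashing step follows. The text only states that the hash is `SHA2-256` of the topic string; the UTF-8 encoding of the string and the use of the raw 32-byte digest (rather than, say, a hex or base-encoded form) are assumptions made for illustration, and `topic_hash` is a hypothetical helper name.

```python
import hashlib


def topic_hash(topic: str) -> bytes:
    """Return the SHA2-256 digest of a topic string (assumed UTF-8 encoded)."""
    return hashlib.sha256(topic.encode("utf-8")).digest()


if __name__ == "__main__":
    for name in ("beacon_block", "beacon_attestation"):
        print(name, topic_hash(name).hex())
```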

There are two main topics used to propagate attestations and beacon blocks to
all nodes on the network.

- The `beacon_block` topic - This topic is used solely for propagating new
beacon blocks to all nodes on the network.
- The `beacon_attestation` topic - This topic is used to propagate
aggregated attestations to subscribing nodes (typically block proposers) to
be included into future blocks. Attestations will be aggregated in their
respective subnets before publishing on this topic.

Shards will be grouped into their own subnets (defined by a shard topic). The
number of shard subnets will be defined via `SHARD_SUBNET_COUNT`, and each shard
`shard_number` will be assigned to the topic
`shard{shard_number % SHARD_SUBNET_COUNT}`.
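
The shard-to-topic mapping can be sketched as below. The value given to `SHARD_SUBNET_COUNT` is a placeholder chosen purely for the example, since this document does not fix it, and `shard_subnet_topic` is a hypothetical helper name.

```python
SHARD_SUBNET_COUNT = 10  # placeholder value; not specified in this document


def shard_subnet_topic(shard_number: int,
                       subnet_count: int = SHARD_SUBNET_COUNT) -> str:
    """Map a shard number to its subnet topic string."""
    if shard_number < 0:
        raise ValueError("shard_number must be non-negative")
    if subnet_count <= 0:
        raise ValueError("subnet_count must be positive")
    return f"shard{shard_number % subnet_count}"


if __name__ == "__main__":
    # With 10 subnets, shards 3 and 13 share one topic.
    print(shard_subnet_topic(3), shard_subnet_topic(13))
```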

### Messages

#### Libp2p Specification

*This section simply outlines the data sent across the wire as specified by
libp2p - this section is aimed at gossipsub implementers to standardize their implementation of this protocol*

Libp2p raw gossipsub messages are sent across the wire as fixed-size length-prefixed byte arrays.

The byte array is prefixed with an unsigned 64 bit length number encoded as an
Contributor

64-bit varint?

Contributor Author

Yep. This is based off the current floodsub implementation in rust-libp2p (figure all implementations should be standardized). Data is sent via this function:
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L32-L41
Notice that the len of the data is using the unsigned_varint package with a u64_buffer():
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L45 and
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L46

`unsigned varint` (https://github.com/multiformats/unsigned-varint). Gossipsub messages therefore take the form:
```
+--------------------------+
| message length |
+--------------------------+
| |
| body (<1M) |
| |
+--------------------------+
```

The body represents a protobuf-encoded [Message](https://github.com/libp2p/go-libp2p-pubsub/blob/master/pb/rpc.proto#L17-L24).

In the following section we discuss the data being sent in the `data` field of
the protobuf gossipsub `Message`.
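
The framing described above can be sketched as follows. This is an illustration rather than a reference implementation: the helper names are hypothetical, `<1M` is read as "strictly less than 1 MiB", and the decoder accepts up to 10 varint bytes on the assumption that the full unsigned 64-bit range is allowed.

```python
from typing import Tuple

MAX_BODY_SIZE = 1 << 20  # assumed reading of "<1M" in the diagram above


def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uvarint(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes consumed) for a varint at the start of data."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            if value >> 64:
                raise ValueError("varint exceeds 64 bits")
            return value, i + 1
        if i >= 9:
            raise ValueError("varint too long for a 64-bit value")
    raise ValueError("truncated varint")


def frame(body: bytes) -> bytes:
    """Prefix a message body with its varint-encoded length."""
    if len(body) >= MAX_BODY_SIZE:
        raise ValueError("body too large")
    return encode_uvarint(len(body)) + body


def unframe(data: bytes) -> Tuple[bytes, bytes]:
    """Split one framed message off the front of data; return (body, rest)."""
    length, offset = decode_uvarint(data)
    if length >= MAX_BODY_SIZE:
        raise ValueError("declared body too large")
    end = offset + length
    if len(data) < end:
        raise ValueError("truncated body")
    return data[offset:end], data[end:]
```

Checking the declared length before reading the body lets a receiver reject oversized messages without buffering them.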

#### Eth2.0 Specifics

Each message has a maximum size of 512KB (estimated from expected largest uncompressed
Contributor

It's worth specifying that this is Eth2.0's choice, as gossipsub does not impose a max message length, to my knowledge.

@mhchia mhchia May 3, 2019

Speaking of the maximum size of a message, even though there isn't a specified constant, I played with the max size of DelimitedReader, and now suspect it(1 MB?) is the upper bound of the message size in go implementation, IIUC.

Contributor Author

Floodsub in rust-libp2p has a maximum size of 2048 bytes:
https://github.com/libp2p/rust-libp2p/blob/master/protocols/floodsub/src/protocol.rs#L60
I've made this configurable in the gossipsub implementation.
The max size exists so that if we see a large amount of data coming through a stream we only read a maximum amount. I understand this to prevent potential DoS vectors from malicious users sending arbitrarily large streams of data.

block size).

The `data` that is sent in a Gossipsub message is an SSZ-encoded object. For the `beacon_block` topic,
this will be a `beacon_block`. For the `beacon_attestation` topic, this will be
an `attestation`.
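
A receiver could apply the size limit and the topic-to-type mapping before attempting SSZ decoding. The sketch below is hedged: reading 512KB as `512 * 1024` bytes is an assumption, the helper name is hypothetical, and the actual SSZ decoding is left out entirely.

```python
MAX_MESSAGE_SIZE = 512 * 1024  # assumed interpretation of "512KB"

# Topic name -> name of the SSZ object expected in the `data` field.
EXPECTED_OBJECT = {
    "beacon_block": "beacon_block",
    "beacon_attestation": "attestation",
}


def expected_object_for(topic: str, data: bytes) -> str:
    """Check topic and size, then return the SSZ object name to decode as."""
    if topic not in EXPECTED_OBJECT:
        raise ValueError(f"unknown topic: {topic}")
    if len(data) > MAX_MESSAGE_SIZE:
        raise ValueError("message exceeds maximum size")
    return EXPECTED_OBJECT[topic]


if __name__ == "__main__":
    print(expected_object_for("beacon_attestation", b"\x00" * 128))
```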

## Eth-2 RPC

#### Protocol Id: `/eth/serenity/beacon/rpc/1`

The [RPC Interface](./rpc-interface.md) is specified in this repository.

## Identify

#### Protocol Id: `/ipfs/id/1.0.0` (to be updated to `/p2p/id/1.0.0`)
Contributor

Is /p2p/ not yet supported? and why not use /eth/ here? Is this something global beyond the eth protocol?

Contributor Author

Yep so, the protocols that come with libp2p have their own protocol IDs. Currently (to my knowledge) these are hard coded and not customisable. I'd have to make a pr to libp2p or rewrite the protocol to change the naming. (For rust-libp2p).
I assume the same for go in order for the protocols to be interoperable.
I'm curious if we want to make the protocol IDs customisable within the libp2p protocols.

Contributor

@raulk raulk May 2, 2019

You should be able to customise all of these protocol IDs. Doing so will segregate your network from the IPFS or libp2p public networks at the protocol level, such that even if peers happen to cross-connect, they will share no protocols in common, and therefore, won't be able to interact via streams.

Ideally you'll override the protocol ID as low as in multistream-select (to be renamed to multiselect). This is used during connection bootstrapping, and if these IDs don't match, peers will disconnect immediately. This is the logically strictest segregation.


*To be updated to incorporate discv5*


The Identify protocol (defined in Go, [identify-go](https://github.com/ipfs/go-ipfs/blob/master/core/commands/id.go), and in Rust, [rust-identify](https://github.com/libp2p/rust-libp2p/blob/master/protocols/identify/src/lib.rs))
allows a node A to ask another node B what information B knows about A. The response also includes the addresses B is listening on.
Contributor

What's the relationship to discv5? Is this supposed to be a temporary substitute, a permanent complement, or a replacement?

Contributor Author

Yep, temporary placeholder. This is currently what lighthouse is using in place of a proper discovery protocol.

Contributor

Let's add an explicit note that this is a placeholder

Contributor Author

I'm working on discv5 at the moment. Currently, I think this protocol may still be useful for NAT traversal and can be fed into the discv5 protocol. Potentially we don't need it at all. I'll add a placeholder note for now.


This protocol allows nodes to discover the addresses of other nodes, which can then be added to
peer discovery. It further allows a node to determine the capabilities of all of its connected
peers.

### Configuration Parameters

The protocol has two configurable parameters, which can be used to identify the
type of connecting node. Suggested format:
```
version: `/eth/serenity/1.0.0`
Contributor

👍

Contributor Author

Ok, to my knowledge, this is currently not the case in rust-libp2p. I'll make a PR to make these configurable.

user_agent: <client name and version>
```

## Discovery

#### Protocol Id: `/eth/serenity/disc/1.0.0`
Contributor

I think the current plan for discv5 is to use an independent transport.

Contributor Author

Yes, agree. This entire document will be updated with discv5 soon.

Contributor

I thought there was a plan to implement discv5 as a libp2p protocol?

Contributor Author

I'm currently working on this. My current plan is to mimic mDNS in the integration into libp2p. In this case there will be no protocol id, but implementations will be able to use the discv5 discovery gadget, which listens separately on a UDP port. Messages from this gadget will be available to the user to pass into other libp2p protocols, i.e. when we discover a node, this discovery can be used to dial a TCP connection or add the node to a Kademlia DHT in libp2p.
Once I've built an implementation, I will update this document.

Contributor

it would be nice to have at least the ability to run discv5 over other transports - what would that look like, in this setup?


*To be updated to incorporate discv5*

The discovery protocol is yet to be determined.