Simple, minimal peer exchange protocol #222

Open
raulk opened this issue Oct 25, 2019 · 2 comments

@raulk raulk commented Oct 25, 2019

This issue proposes a general-purpose peer exchange (PEX) protocol that is not embedded in any specific protocol like gossipsub/episub.

The goal of PEX is to enable peers to share records about other peers they're connected to in a 1:1, ad-hoc fashion. It does not intend to produce deterministic results like DHTs, nor does it rely on a structured network or shared heuristic. BitTorrent uses PEX to streamline tracker-less peer discovery.

In the context of gossipsub, PEX is useful for finding additional peers in a topic we subscribe to, as a way of strengthening our topic mesh. Through subscription beaconing (i.e. peers gossiping about which topics they're subscribed to), it may even be possible to bootstrap a topic subscription without hitting the DHT or any other structured discovery mechanism at all.

I'm thinking we should spec out a minimal PEX protocol, consisting of a simple advertisement schema and two operations: advertise and lookup.

Advertisement schema

An advertisement struct consists of a peer address record and a set of CIDs we are advertising, signed by the peer's key to prevent MITM attacks.
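As a rough illustration of the schema, here is a minimal sketch of a signed advertisement. Everything in it is an assumption made for the example: the field names, the `Seq` counter, JSON as the encoding, and Ed25519 from the Go standard library as the signature scheme. A real implementation would presumably use libp2p keys and also check that the peer ID matches the signing key, which this sketch does not do.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/json"
	"errors"
)

// Advertisement is a hypothetical record layout; none of these names come from a spec.
type Advertisement struct {
	PeerID string   `json:"peer_id"`
	Addrs  []string `json:"addrs"`
	CIDs   []string `json:"cids"`
	Seq    uint64   `json:"seq"` // assumed counter so receivers can discard stale records
}

// SignedAd carries the serialized advertisement and a signature over those exact bytes.
type SignedAd struct {
	Payload   []byte
	PublicKey ed25519.PublicKey
	Signature []byte
}

// GenerateIdentity creates a throwaway Ed25519 key for the example.
func GenerateIdentity() (ed25519.PrivateKey, error) {
	_, priv, err := ed25519.GenerateKey(rand.Reader)
	return priv, err
}

// SignAd serializes the advertisement and signs the resulting bytes.
func SignAd(ad Advertisement, priv ed25519.PrivateKey) (SignedAd, error) {
	payload, err := json.Marshal(ad)
	if err != nil {
		return SignedAd{}, err
	}
	pub, ok := priv.Public().(ed25519.PublicKey)
	if !ok {
		return SignedAd{}, errors.New("unexpected public key type")
	}
	return SignedAd{Payload: payload, PublicKey: pub, Signature: ed25519.Sign(priv, payload)}, nil
}

// VerifyAd checks the signature before decoding, so tampered payloads are never parsed.
func VerifyAd(s SignedAd) (Advertisement, error) {
	var ad Advertisement
	if len(s.PublicKey) != ed25519.PublicKeySize {
		return ad, errors.New("bad public key size")
	}
	if !ed25519.Verify(s.PublicKey, s.Payload, s.Signature) {
		return ad, errors.New("invalid signature")
	}
	if err := json.Unmarshal(s.Payload, &ad); err != nil {
		return ad, err
	}
	return ad, nil
}
```

Signing the serialized bytes rather than the struct avoids having to agree on a canonical re-encoding on the receiving side.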

Local advertisement record maintenance

Our advertisement record is kept in memory and updated at runtime. It is populated with:

  • our peer ID.
  • own addresses (which may change over time; the PEX protocol can subscribe to updates via the eventbus).
  • advertised CIDs.

The Host API would expose methods so that downstream components (e.g. protocols) can manage advertised CIDs, e.g.:

// We don't want to add an accessor for PEX in the Host interface.
// The host-service refactor is a prerequisite to be able to do this.
svc, ok := host.GetService(&PEX{})
if !ok {
    return nil
}
pexsvc := svc.(PEXService)

ad := pex.NewAd("gossipsub:topic_name")
cancelFn, err := pexsvc.Advertise(ad)
if err != nil {
    return err
}

// ... store the cancelFn in state ...

// atomically replace the advertised value, possibly not useful for gossipsub, 
// but it will be for other protocols.
// Helps mitigate add/remove noise when sending deltas.
ad.Replace("gossipsub:topic_name_b")

// when done / closing down
// (err is already declared above, so scope a fresh one to this check)
if err := cancelFn(); err != nil {
    return err
}
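To make the record maintenance a bit more concrete, here is a minimal in-memory sketch of the local side, independent of any Host API. `LocalRecord`, `Ad`, `Snapshot`, and the sequence counter are hypothetical names invented for this example; the only behaviour taken from the proposal is that a value can be advertised, atomically replaced, and cancelled.

```go
package main

import (
	"sort"
	"sync"
)

// LocalRecord holds the values we currently advertise (hypothetical type).
type LocalRecord struct {
	mu   sync.Mutex
	seq  uint64         // bumped on every change so deltas can be ordered
	refs map[string]int // value -> number of active ads for it
}

// Ad is a handle for one advertised value; its fields are guarded by rec.mu.
type Ad struct {
	rec   *LocalRecord
	value string
	done  bool
}

func NewLocalRecord() *LocalRecord {
	return &LocalRecord{refs: make(map[string]int)}
}

// Advertise starts advertising value and returns a handle to manage it.
func (r *LocalRecord) Advertise(value string) *Ad {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.refs[value]++
	r.seq++
	return &Ad{rec: r, value: value}
}

// Replace swaps the advertised value under one lock, so observers see a
// single change instead of a remove followed by an add.
func (a *Ad) Replace(newValue string) {
	r := a.rec
	r.mu.Lock()
	defer r.mu.Unlock()
	if a.done || a.value == newValue {
		return
	}
	r.dropLocked(a.value)
	r.refs[newValue]++
	a.value = newValue
	r.seq++
}

// Cancel stops advertising the value; calling it twice is harmless.
func (a *Ad) Cancel() {
	r := a.rec
	r.mu.Lock()
	defer r.mu.Unlock()
	if a.done {
		return
	}
	a.done = true
	r.dropLocked(a.value)
	r.seq++
}

// dropLocked releases one reference to value; the caller must hold r.mu.
func (r *LocalRecord) dropLocked(value string) {
	if r.refs[value] <= 1 {
		delete(r.refs, value)
		return
	}
	r.refs[value]--
}

// Snapshot returns the current sequence number and the sorted advertised values.
func (r *LocalRecord) Snapshot() (uint64, []string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make([]string, 0, len(r.refs))
	for v := range r.refs {
		out = append(out, v)
	}
	sort.Strings(out)
	return r.seq, out
}
```

Reference counting lets two components advertise the same value without one cancellation removing it for the other.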

Advertise operation

Upon establishing a libp2p connection:

  1. We open a stream for protocol ID /libp2p/pex/v01.
  2. If successful, we push our local advertisement record.
  3. When receiving a record, we store it in memory.

We repeat the above whenever our advertisements or addresses change. Note that this process looks a lot like the identify protocol logic, so we could extend the identify protocol to support advertised CIDs instead. Protocol IDs alone are insufficient to contextualise an advertisement (e.g. we want to know that a peer is a member of gossipsub topic abc, not just that it supports gossipsub).
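Step 3 (storing received records) could be sketched as below. This only covers the receiving side and does not touch any libp2p stream API. `RecordStore`, `Record`, and the `Seq` field are assumptions for the example; in particular, the proposal does not say how a stale push is told apart from a fresh one, so the sequence number is just one possible answer.

```go
package main

import "sync"

// Record is a hypothetical, already-verified advertisement received from a peer.
type Record struct {
	PeerID string
	Addrs  []string
	CIDs   []string
	Seq    uint64 // assumed monotonic counter set by the advertiser
}

// RecordStore keeps the most recent record per peer in memory.
type RecordStore struct {
	mu     sync.RWMutex
	byPeer map[string]Record
}

func NewRecordStore() *RecordStore {
	return &RecordStore{byPeer: make(map[string]Record)}
}

// Put stores rec unless we already hold a record for that peer with an equal
// or higher sequence number. It reports whether the record was stored.
func (s *RecordStore) Put(rec Record) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if old, ok := s.byPeer[rec.PeerID]; ok && old.Seq >= rec.Seq {
		return false
	}
	s.byPeer[rec.PeerID] = rec
	return true
}

// Get returns the stored record for a peer, if any.
func (s *RecordStore) Get(peerID string) (Record, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	rec, ok := s.byPeer[peerID]
	return rec, ok
}

// Remove drops a peer's record, e.g. when the connection closes.
func (s *RecordStore) Remove(peerID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.byPeer, peerID)
}
```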

Lookup operation

When the local application/protocol intends to look up peers advertising a specific CID, it sends a lookup RPC to all connected neighbours, who reply with the advertisement records of all directly connected peers they know to be advertising the CID.
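Here is a small, network-free sketch of both sides of the lookup: what a neighbour answers, and how the requester merges the answers. `HandleLookup`, `Lookup`, and the map standing in for "all connected neighbours" are hypothetical; real code would send an RPC per neighbour and verify signatures on whatever comes back.

```go
package main

import "sort"

// Record is a hypothetical advertisement record reduced to what lookup needs.
type Record struct {
	PeerID string
	CIDs   []string
}

// HandleLookup is the responder side: it returns the records, among those the
// responder holds for its directly connected peers, that advertise cid.
// The requester itself is left out of the answer.
func HandleLookup(known []Record, cid, requester string) []Record {
	var out []Record
	for _, rec := range known {
		if rec.PeerID == requester {
			continue
		}
		for _, c := range rec.CIDs {
			if c == cid {
				out = append(out, rec)
				break
			}
		}
	}
	return out
}

// Lookup is the requester side: it asks every neighbour and merges the
// answers, deduplicating by peer ID. The neighbours map (neighbour ID ->
// records that neighbour holds) stands in for the actual RPC round trips.
func Lookup(self string, neighbours map[string][]Record, cid string) []Record {
	seen := make(map[string]bool)
	var out []Record
	for _, known := range neighbours {
		for _, rec := range HandleLookup(known, cid, self) {
			if !seen[rec.PeerID] {
				seen[rec.PeerID] = true
				out = append(out, rec)
			}
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].PeerID < out[j].PeerID })
	return out
}
```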

If a peer returns irrelevant, malformed, or badly signed ads, we decrease its score on the grounds of malicious behaviour. Once the score falls below a certain threshold, we blacklist/disconnect the peer.
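The scoring idea can be sketched in a few lines. The penalty size, the threshold, and the `Scorer` type are all made up for illustration; the proposal does not specify a scoring function, and this version is not safe for concurrent use.

```go
package main

// Scorer tracks a per-peer score that only moves down on bad answers.
// Hypothetical type: the values used for penalty and threshold are placeholders.
type Scorer struct {
	scores    map[string]int
	penalty   int // amount subtracted per bad ad
	threshold int // peers scoring below this get disconnected
}

func NewScorer(penalty, threshold int) *Scorer {
	return &Scorer{scores: make(map[string]int), penalty: penalty, threshold: threshold}
}

// Penalize records one irrelevant, malformed, or badly signed ad from peerID
// and reports whether the peer has now dropped below the threshold.
func (s *Scorer) Penalize(peerID string) bool {
	s.scores[peerID] -= s.penalty
	return s.scores[peerID] < s.threshold
}

// Score returns the current score; unknown peers start at zero.
func (s *Scorer) Score(peerID string) int {
	return s.scores[peerID]
}
```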

In its basic form, the lookup operation extends our view of the network to degree 2 (we reach peers of our peers), but it can be further enhanced by a TTL mechanism that allows the request to be relayed for up to N hops. Thus, if a peer knows of zero peers advertising the CID, it could relay the request to its neighbours.
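Purely to illustrate the TTL idea, here is a toy model over a static graph. `Network` and `RelayedLookup` are hypothetical, and the sketch ignores everything that makes relaying hard in practice (rate limiting, quotas, scoring, request deduplication across the wire).

```go
package main

import "sort"

// Network is a toy model of who is connected to whom and who advertises what.
type Network struct {
	Neighbours map[string][]string
	Ads        map[string][]string // peer -> advertised values
}

func (n *Network) advertises(peer, cid string) bool {
	for _, c := range n.Ads[peer] {
		if c == cid {
			return true
		}
	}
	return false
}

// RelayedLookup asks origin's neighbours for peers advertising cid. A queried
// peer answers with its own neighbours that advertise cid; only if it knows
// none, and hops remain, is the request relayed one hop further.
// ttl = 0 corresponds to the basic, non-relayed lookup.
func (n *Network) RelayedLookup(origin, cid string, ttl int) []string {
	found := make(map[string]bool)
	visited := map[string]bool{origin: true}
	frontier := n.Neighbours[origin]
	// Breadth-first, so every peer is first reached with the most hops left.
	for hops := ttl; len(frontier) > 0; hops-- {
		var next []string
		for _, peer := range frontier {
			if visited[peer] {
				continue
			}
			visited[peer] = true
			hits := 0
			for _, nb := range n.Neighbours[peer] {
				if nb != origin && n.advertises(nb, cid) {
					found[nb] = true
					hits++
				}
			}
			if hits == 0 && hops > 0 {
				next = append(next, n.Neighbours[peer]...)
			}
		}
		frontier = next
	}
	out := make([]string, 0, len(found))
	for p := range found {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}
```

On a chain A, B, C, D where only D advertises the value, a lookup from A finds nothing with `ttl = 0` and finds D with `ttl = 1`.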

I propose we don't venture into relayed lookup requests at this stage, as they require thoughtful modelling of rate-limiting, quotas, and scoring to prevent DDoS attacks. But it's definitely something to keep on the radar.

Privacy reflections

Just like with DHTs, it's hard to guarantee reader privacy. PEX could be used to map out how peers interested in a certain subject are effectively connected. We can introduce randomness to deter such attempts.

@vyzo vyzo commented Oct 25, 2019

We might also want to have a push protocol for advertisements instead of relying on poll lookup.

@jbenet jbenet commented Oct 28, 2019

PEX

great to see this here! 👍 -- we've needed something like PEX in libp2p for a long time

> ad := pex.NewAd("gossipsub:topic_name")

oh cool, i didn't recall PEX kept specific topics/swarms associated with each peer. makes sense. We probably want to do something like tags actually:

Get("gossipsub:topic_name") # get all peers related to this gossipsub topic
Get("providers:<selector>") # get all peers related to this ipld selector
Get("transport:QUIC") # get all peers that have QUIC
Get("kad-dht") # get all peers that speak kad-dht
Get("filecoin") # get all peers that speak filecoin
Get("filecoin:retrieval") # get all peers that speak filecoin:retrieval
Get("kad-dht", "gossipsub:topic_name") # get all peers related to this gossipsub topic, and who speak kad-dht
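One way to read the multi-argument `Get` is as an intersection over tags. A minimal sketch under that assumption follows; `TagIndex` and its methods are hypothetical names, and the index is not safe for concurrent use.

```go
package main

import "sort"

// TagIndex maps each tag to the set of peers carrying it (hypothetical type).
type TagIndex struct {
	byTag map[string]map[string]bool
}

func NewTagIndex() *TagIndex {
	return &TagIndex{byTag: make(map[string]map[string]bool)}
}

// Add associates peerID with every given tag.
func (ix *TagIndex) Add(peerID string, tags ...string) {
	for _, t := range tags {
		if ix.byTag[t] == nil {
			ix.byTag[t] = make(map[string]bool)
		}
		ix.byTag[t][peerID] = true
	}
}

// Get returns the peers that carry all of the given tags, sorted by ID.
// With no tags it returns nothing rather than every known peer.
func (ix *TagIndex) Get(tags ...string) []string {
	if len(tags) == 0 {
		return nil
	}
	var out []string
	for peer := range ix.byTag[tags[0]] {
		all := true
		for _, t := range tags[1:] {
			if !ix.byTag[t][peer] {
				all = false
				break
			}
		}
		if all {
			out = append(out, peer)
		}
	}
	sort.Strings(out)
	return out
}
```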

In this sense, maybe we should be doing pathing (/ separated), and re-using the protocol identifiers we already use (for uniqueness and default simplicity):

Get(Path(gossipsub.ProtocolID, "topic_name"))
Get(Path(providers.ProtocolID, selector))
Get(Path("transport", quic.ProtocolID))
Get(Path(filecoin.ProtocolID, filecoin.RetrievalProtocolID))
Get(Path(gossipsub.ProtocolID, "topic_name"), Path("kad-dht"))
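A possible shape for the `Path` helper, assuming labels are plain `/`-separated strings with a leading slash, like protocol IDs. This is only a guess at the intended semantics; the protocol ID strings in the usage note are there for illustration.

```go
package main

import "strings"

// Path joins label segments into one "/"-separated label with a leading slash.
// Leading and trailing slashes on each part are trimmed, so protocol IDs such
// as "/meshsub/1.0.0" can be passed in as-is. Empty parts are skipped.
func Path(parts ...string) string {
	segs := make([]string, 0, len(parts))
	for _, p := range parts {
		if p = strings.Trim(p, "/"); p != "" {
			segs = append(segs, p)
		}
	}
	return "/" + strings.Join(segs, "/")
}
```

For example, `Path("/meshsub/1.0.0", "topic_name")` yields `/meshsub/1.0.0/topic_name`. Topic names that themselves contain `/` would need escaping, which this sketch does not attempt.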

> In its basic form, the lookup operation extends our view of the network by degree 2 (we reach peers of our peers), but it can be further enhanced by a TTL mechanism that allows the request to be relayed N number of hops. Thus, if a peer knows of zero peers advertising the CID, it could relay the request to its neighbours.

not sure we should even reach peers-of-peers, but maybe.

> I propose we don't venture with relayed lookup requests at this stage, as it requires thoughtful modelling of rate-limiting, quotas, and scoring, to prevent DDoS attacks. But it's definitely something to keep in the radar.

Yeah i think this needs to be explicitly out of scope for this protocol. this should be a very simple 1-1 protocol (or just about).

> Just like with DHTs, it's hard to guarantee reader privacy. PEX could be used to map out how peers interested in a certain subject are effectively connected. We can introduce randomness to deter such attempts.

yes 👍

Security Considerations

  • PEX MUST be easy to turn off and never required
  • PEX SHOULD NOT return all the peers one is connected to; this would be a potential attack vector.
    • limiting by topic/protocol is good (the PEX interface should allow making it respond only to certain labels, e.g. only gossipsub peers)
    • limiting by the number of randomly chosen peers per label can also work (e.g. return at most 5 peers per label)
  • PEX SHOULD be able to return peers that are no longer connected: it should be able to use the local cache / peerbook, which may hold lots and lots of peers even if they are not directly connected at the moment. This should be an option that can be turned off.
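
The first two considerations (an off switch, plus limiting by label and by count) could be combined roughly as follows. `Responder` and its fields are hypothetical, and the shuffle uses the global `math/rand` source, which is fine for a sketch but would deserve more thought if the randomness is meant to resist mapping attempts.

```go
package main

import "math/rand"

// Responder decides what a node is willing to reveal in a PEX answer.
// Hypothetical type: field names and semantics are assumptions.
type Responder struct {
	Enabled     bool            // PEX can be turned off entirely
	Allowed     map[string]bool // labels we are willing to answer for
	MaxPerLabel int             // upper bound on peers returned per label
}

// Select returns at most MaxPerLabel randomly chosen peers for label.
// It returns nil when PEX is disabled or the label is not allowed.
func (r Responder) Select(label string, candidates []string) []string {
	if !r.Enabled || !r.Allowed[label] || r.MaxPerLabel <= 0 {
		return nil
	}
	// Shuffle a copy so the caller's slice keeps its order.
	picked := append([]string(nil), candidates...)
	rand.Shuffle(len(picked), func(i, j int) {
		picked[i], picked[j] = picked[j], picked[i]
	})
	if len(picked) > r.MaxPerLabel {
		picked = picked[:r.MaxPerLabel]
	}
	return picked
}
```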