
WIP Kademlia DHT spec #108

Open · wants to merge 4 commits into base: master

Conversation

@raulk (Member) commented Nov 7, 2018

This was a complicated birth.

TODO:

  • Specify that we haven't implemented PINGs.
  • Deviations in bucket behaviour from baseline Kademlia: since we don't use PINGs, we don't test the least recently seen peer before evicting it. We evict it blindly, causing a high degree of bucket thrashing and failing to exploit the heuristic that the longer a peer has been around, the more likely it is to stay around (see the sketch after this list).
  • Revisit RPC messages section. Copy protobufs and explain each field meticulously.
  • Resolve all feedback below.
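For illustration, a minimal, self-contained sketch of the eviction behaviour described in the second item; all names are hypothetical, and this is not the actual go-libp2p-kad-dht code:

```go
package main

import "fmt"

const bucketSize = 20 // k in Kademlia

// bucket keeps peers ordered from most to least recently seen.
type bucket struct{ peers []string }

// update records activity from peer p. When the bucket is full, the least
// recently seen peer (the tail) is evicted unconditionally; baseline
// Kademlia would PING it first and keep it if it responds.
func (b *bucket) update(p string) {
	for i, q := range b.peers {
		if q == p { // already present: move to the front (most recent)
			b.peers = append([]string{p}, append(b.peers[:i], b.peers[i+1:]...)...)
			return
		}
	}
	if len(b.peers) < bucketSize {
		b.peers = append([]string{p}, b.peers...)
		return
	}
	// Blind eviction: drop the tail without testing its liveness.
	b.peers = append([]string{p}, b.peers[:bucketSize-1]...)
}

func main() {
	b := &bucket{}
	for i := 0; i < 25; i++ {
		b.update(fmt.Sprintf("peer-%d", i))
	}
	fmt.Println(b.peers) // peer-0 … peer-4 were evicted without a liveness check
}
```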

@ghost assigned raulk Nov 7, 2018

@ghost added the in progress label Nov 7, 2018

@raulk requested a review from @jhiesey Nov 7, 2018

@raulk changed the title from "WIP Initial Kademlia DHT spec" to "WIP Kademlia DHT spec" Nov 7, 2018

@jhiesey left a comment

Looks pretty good, just a few comments. Do you want me to add the WIP (jhiesey) sections before or after merging this?


The Kademlia Distributed Hash Table (DHT) subsystem in libp2p is a DHT
implementation largely based on the Kademlia [0] whitepaper, augmented with
notions from S/Kademlia [1], Coral [2] and mainlineDHT \[3\].

@jhiesey commented Nov 8, 2018

Shouldn't \[3\] not have the brackets escaped? I'm not a markdown expert though

@raulk (Author, Member) commented Nov 12, 2018

I had to escape it because, for some reason, Github elided it when unescaped. Didn't dig into why, though. Settled for the easy path.


The routing table is unfolded lazily, starting with a single bucket at position 0
(representing the most distant peers), and splitting it subsequently as closer
peers are found and the capacity of the nearmost bucket is exceeded.
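The paragraph above describes the mechanism; the following is a minimal sketch of such lazy unfolding, under the assumption that buckets are indexed by common-prefix length (cpl) with the local ID and only the nearmost bucket is split. Names are hypothetical, not normative:

```go
package main

import "fmt"

const k = 20 // bucket capacity

type peerEntry struct {
	id  string
	cpl int // common prefix length with the local node ID
}

// routingTable starts with a single bucket at position 0 and unfolds it
// lazily: when the nearmost (last) bucket overflows, entries whose cpl
// exceeds the bucket index spill into a new, closer bucket.
type routingTable struct{ buckets [][]peerEntry }

func (rt *routingTable) add(e peerEntry) {
	last := len(rt.buckets) - 1
	i := e.cpl
	if i > last {
		i = last // the nearmost bucket also holds all closer peers
	}
	rt.buckets[i] = append(rt.buckets[i], e)
	for len(rt.buckets[last]) > k { // unfold while the nearmost bucket overflows
		var stay, move []peerEntry
		for _, p := range rt.buckets[last] {
			if p.cpl > last {
				move = append(move, p) // strictly closer: goes to the new bucket
			} else {
				stay = append(stay, p)
			}
		}
		if len(move) == 0 {
			break // cannot split further; a real table would evict instead
		}
		rt.buckets[last] = stay
		rt.buckets = append(rt.buckets, move)
		last++
	}
}

func main() {
	rt := &routingTable{buckets: make([][]peerEntry, 1)}
	for i := 0; i < 30; i++ {
		rt.add(peerEntry{id: fmt.Sprintf("p%d", i), cpl: i % 4})
	}
	fmt.Println(len(rt.buckets)) // bucket count grew as closer peers arrived
}
```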

@jhiesey commented Nov 8, 2018

This doesn't say if we ever split buckets we aren't in. The original Kademlia paper (end of section 2.4) does as an edge case; S/Kademlia doesn't.

@raulk (Author, Member) commented Nov 12, 2018

Good call, will double check


## Record keys

Records in the DHT are keyed by CID [4]. There are intentions to move to

@jhiesey commented Nov 8, 2018

What's going on currently is that the CID or PeerId is sha256 hashed again, right? So we're going to move to sha256 of the multihash?

@anacrolix commented Dec 21, 2018

How does this work if the hash used changes? It seems that this allows you to choose a new hash to generate a new location on the DHT, that still references the same actual object.

@jhiesey commented Dec 21, 2018

Right, all locations will end up changing if we change the hash. So we'll have to have a new (incompatible) DHT protocol at that point.
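To make the behaviour under discussion concrete, a sketch of deriving a record's keyspace location, assuming the outer hash is SHA-256 applied to the already-multihash key (an illustration of the thread, not a normative definition):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// dhtLocation sketches the behaviour described above: the record key — a
// CID or peer ID, itself already a multihash — is hashed again with
// SHA-256 to obtain its location in the keyspace. Changing the outer hash
// would relocate every record, which is why a different hash implies a
// new, incompatible DHT protocol.
func dhtLocation(key []byte) [32]byte {
	return sha256.Sum256(key)
}

func main() {
	loc := dhtLocation([]byte("QmExampleCID...")) // hypothetical key
	fmt.Printf("%x\n", loc[:8])
}
```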

(quorum) responses from distinct nodes to check for consistency before returning
an answer.

Should the responses be different, the `Validator.Select()` function is used to

@jhiesey commented Nov 8, 2018

What about Validator.VerifyRecord? It's called on both put and get in the js version.

@jhiesey commented Nov 8, 2018

Ah, never mind, you cover this below. Still, it might not hurt to mention validation here.

@raulk (Author, Member) commented Nov 12, 2018

👍


*WIP (raulk): lookup timeout.*

1. If we have collected `Q` or more answers, we cancel outstanding requests, return `best`, and we notify the peers holding an outdated value (`Po`) of the best value we discovered, by sending `PUT_VALUE(K, best)` messages. _Return._

@jhiesey commented Nov 8, 2018

There's also a termination condition where we run out of peers to query without getting Q answers.

@raulk (Author, Member) commented Nov 12, 2018

Good call!
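A sketch of both termination conditions, quorum reached or candidate peers exhausted, with assumed names standing in for the real lookup machinery:

```go
package main

import "fmt"

type record struct {
	peer  string
	value []byte
}

// collectQuorum sketches the termination conditions discussed above (names
// are assumptions): stop once Q answers arrive, or once the answer channel
// closes because there are no peers left to query.
func collectQuorum(answers <-chan record, q int) (best []byte, outdated []string) {
	var got []record
	for r := range answers {
		got = append(got, r)
		if len(got) >= q {
			break // quorum reached; cancel outstanding requests
		}
	}
	// Resolve conflicts (a stand-in for Validator.Select) and remember the
	// peers holding anything else, so we can send them PUT_VALUE(K, best).
	for _, r := range got {
		if best == nil || string(r.value) > string(best) {
			best = r.value
		}
	}
	for _, r := range got {
		if string(r.value) != string(best) {
			outdated = append(outdated, r.peer)
		}
	}
	return best, outdated
}

func main() {
	ch := make(chan record, 3)
	ch <- record{"p1", []byte("v2")}
	ch <- record{"p2", []byte("v1")}
	close(ch) // fewer answers than Q: the lookup still terminates
	best, stale := collectQuorum(ch, 3)
	fmt.Printf("best=%s notify=%v\n", best, stale)
}
```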

@vasco-santos (Member) left a comment

Seems a good initial version of this spec! Thanks @raulk

Left just a comment

an answer.

Should the responses be different, the `Validator.Select()` function is used to
resolve the conflict and select the _best_ result.

@vasco-santos (Member) commented Nov 13, 2018

We should note here that Validator.Select() is associated with a specific namespace; see libp2p/go-libp2p-record/validator.go#L56.

Moreover, I have been thinking about this for a while in JS land. If we do not have a Validator.Select() for the namespace of the key being used, shouldn't we fall back to a default Select function? In JS land, using arbitrary keys with unknown namespaces made getting the record fail. We ended up changing to selecting the first record, but I don't know if that is the best approach.

I believe the same happens with Validate().
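For context, a sketch of the interface shape under discussion, with the namespace routing applied in Select; the fallback for unknown namespaces is the open question here, not settled behaviour:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Validator mirrors the shape of go-libp2p-record's interface: Validate
// checks a single record; Select picks the best of several conflicting ones.
type Validator interface {
	Validate(key string, value []byte) error
	Select(key string, values [][]byte) (int, error)
}

// namespaced routes to a per-namespace Validator (e.g. "pk", "ipns").
type namespaced map[string]Validator

func (n namespaced) Select(key string, values [][]byte) (int, error) {
	ns := strings.SplitN(strings.TrimPrefix(key, "/"), "/", 2)[0]
	if v, ok := n[ns]; ok {
		return v.Select(key, values)
	}
	// Open question from the thread: fail here, or fall back to a default
	// Select (e.g. pick the first record)?
	return 0, errors.New("no validator for namespace: " + ns)
}

func main() {
	n := namespaced{}
	_, err := n.Select("/ipns/some-key", nil)
	fmt.Println(err) // no validator for namespace: ipns
}
```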

@anacrolix commented Dec 21, 2018

Can validation be deferred to consumers of the DHT? It's not really a requirement for participating in it, is it?

@jhiesey commented Dec 21, 2018

It's not strictly necessary, no. But it would be nice if nodes can throw out clearly bogus records instead of storing them. So this should be suggested but optional for now.

@jhiesey commented Dec 21, 2018

That is, it would be nice if the node that receives a PUT_VALUE can do some sanity checking.

@anacrolix commented Jan 10, 2019

Hm, per the comment on the refactor proposal, the DHT node implementation could call a registered handler on receiving a PUT_VALUE, which does whatever it wishes with the data.

@jhiesey commented Jan 11, 2019

Yes, that's the idea. We could move this validation outside the DHT itself.

The `addProvider` handler behaves differently across implementations:
* in js-libp2p-kad-dht, the sender is added as a provider unconditionally.
* in go-libp2p-kad-dht, it is added once per instance of that peer in the
`providerPeers` array.

@tomaka (Member) commented Nov 14, 2018

That doesn't really say what the difference between implementations is.
Also the idea of the specs is to remove these differences.

@jhiesey commented Dec 21, 2018

Right, this is a bug.

CID. For each provider `PeerInfo` that matches the sender's id and contains one
or more multiaddrs, that provider info is added to the peerbook and the peer is
added as a provider for the CID.
* `PING() -> ()` Tests connectivity to destination node. Currently never sent.

@tomaka (Member) commented Nov 14, 2018

Shouldn't that be removed entirely?

@jhiesey commented Dec 21, 2018

I'm inclined to leave this in the spec for now, as handlers are implemented in both JS and Go, and it may turn out to be useful soon as an end-to-end connection test. If we decide we don't want it then let's remove it from the code too.
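Returning to the `ADD_PROVIDER` excerpt above, a sketch of the described sender check with assumed names (the go/js divergence noted earlier remains unresolved):

```go
package main

import "fmt"

type peerInfo struct {
	id    string
	addrs []string
}

// handleAddProvider sketches the check quoted above: only provider entries
// matching the sender's ID and carrying at least one multiaddr are accepted
// into the peerbook and the provider records.
func handleAddProvider(sender string, cid string, providers []peerInfo,
	addToPeerbook func(peerInfo), addProvider func(cid, id string)) {
	for _, p := range providers {
		if p.id != sender || len(p.addrs) == 0 {
			continue // ignore entries for other peers or without addresses
		}
		addToPeerbook(p)
		addProvider(cid, p.id)
	}
}

func main() {
	handleAddProvider("p1", "QmFoo",
		[]peerInfo{{"p1", []string{"/ip4/1.2.3.4/tcp/4001"}}, {"p2", nil}},
		func(p peerInfo) { fmt.Println("peerbook +", p.id) },
		func(c, id string) { fmt.Println("provider", id, "for", c) })
}
```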


## Interfaces

The libp2p Kad DHT implementation satisfies the routing interfaces:

@tomaka (Member) commented Nov 14, 2018

Is that really relevant to a spec? That looks pretty specific to Go to me, or to a specific set of programming languages that are capable of fulfilling it. In particular the Rust code has no interest in following these interfaces.

@raulk (Author, Member) commented Nov 14, 2018

I had second thoughts when dumping the interface here, as it should be non-normative, as you say. However, it helps bind things together: it specifies the public calls supported by this component along with their inputs and outputs, i.e. the expected public surface of this component. Also, each exposed behaviour is specified at some point in the doc.

IMHO, we do need to capture an abstract interface outline; I used Go nomenclature and copied our existing one. I'm open to changing this.

@jhiesey commented Dec 21, 2018

We have separate repos defining at least two of these interfaces abstractly; see https://github.com/libp2p/interface-peer-routing https://github.com/libp2p/interface-content-routing

Admittedly this isn't required for interoperability, but I think we should at least suggest relevant interfaces for the DHT's public API.

@anacrolix commented Jan 10, 2019

I think shoehorning the DHT node implementation into these external interfaces is causing it to take on unnecessary complexity. Using the DHT for routing (to fit the interface) should be trivial to provide with a type that wraps the DHT node. This will free up some very bizarre methods and requirements that currently exist directly on the node implementation. Perhaps the spec should say that "The DHT node implementation may implement or provide the required features to implement the following interfaces:" or something to that effect.

@jhiesey commented Jan 11, 2019

I'm inclined to agree, yes

@tomaka (Member) commented Nov 14, 2018

We should also mention how the substreams that use the protocol behave.
Is it expected to open one substream per request? Or should implementations make an effort to only use one at a time and send requests one by one? Should endpoints close their substream after a successful response, or can they continue sending more requests?
(note: I actually have these questions right now for Rust, as I have no clue)

@tomaka (Member) commented Nov 14, 2018

Kademlia is by far the protocol for which I've suffered the most when implementing it in Rust, because of all the differences between the well-documented Kademlia algorithm and the way it is implemented in libp2p. I think we should focus more on the latter and not copy-paste what people can already find on Google.

@tomaka (Member) commented Nov 14, 2018

Also, writing the Kademlia specs is a tremendous task. I don't expect a complete spec to be less than ~2k lines, and expecting a single person to write it is very optimistic.

@raulk (Author, Member) commented Nov 14, 2018

@tomaka

Kademlia is by far the protocol for which I've suffered the most when implementing it in Rust, because of all the differences between the well-documented Kademlia algorithm and the way it is implemented in libp2p. I think we should focus more on the latter and not copy-paste what people can already find on Google.

That was the spirit of this spec: to focus on differential areas vs. regurgitating how Kademlia works (as conveyed in the spec intro). Hence it covers provider records, public key records, conflict resolution, peer correction, etc., which are specific to the libp2p Kad DHT.

I don't expect a complete spec to be less than ~2k lines

Could you enumerate what other aspects are worth covering? Aspects that are unique to libp2p. We don't want to clone the Kademlia and friends literature.


Regarding all the data model comments, there's a request in the doc for @jhiesey to replace these descriptions with the protobuf.

@tomaka (Member) commented Nov 15, 2018

Could you enumerate what other aspects are worth covering? Aspects that are unique to libp2p. We don't want to clone the Kademlia and friends literature.

Well, the 2k would include actually covering Kademlia, which I think we should do anyway, just not urgently.

I think there should be more explanation as to what actually happens on the network, rather than just dumping a protobuf definition file.
For example, which fields need to be filled for which kind of RPC query? What is the format of the fields (bytes doesn't mean much)? Should nodes relay multiaddresses in their raw format, or error/ignore the ones they can't decode? Is there a limit to the number of multiaddresses that a peerinfo should contain?
That's just off the top of my head.
Also, the reason why I didn't implement the record store in rust-libp2p at the time is that the Record definition was way too vague to be helpful. I'd expect more help from a spec.

@raulk (Author, Member) commented Nov 17, 2018

@tomaka I agree that the RPC section needs redoing. The idea is to copy the protobufs, as these are normative for interoperability, and explain how each field is used and serialised (especially for the bytes types, which can be anything). @jhiesey, are you planning to tackle this?

I do recognise you've implemented this from scratch, and therefore your feedback is valuable. However, in all honesty, I don't see the value in reinventing the wheel and re-specifying the Kademlia baseline in this spec. I'd rather make it prerequisite reading (like I've done) and build on top of it.

In a nutshell, I think of this spec as a diff between baseline Kademlia and our implementation, which:

  • deviates from Kademlia in some aspects, e.g. the way we manage buckets, the lack of PINGs, etc. (this needs specifying!)
  • cherry-picks ideas from Coral, mainlineDHT, etc.

Maybe you can compile a list of all the areas you tripped over, and we can make sure to cover them?

Also, Kademlia is abstract, in the sense that it doesn't specify wire messages, RPCs, timeouts, etc. So our spec should define those aspects very clearly.

@mgoelzer referenced this pull request Nov 19, 2018
@tomaka (Member) commented Nov 28, 2018

cc the second part of this comment: #111 (comment)

@raulk referenced this pull request Nov 28, 2018
@raulk (Author, Member) commented Dec 17, 2018

@jhiesey I tagged you in the comments that need your attention; let's try to push this through! ;-)

@jhiesey commented Dec 21, 2018

@raulk sorry for not getting to this before now! Will work on this today.


The concurrency of node and value lookups is limited by the parameter `α`, with a
default value of 3. This implies that each lookup process can perform no more
than 3 in-flight requests at any given time.

@anacrolix commented Dec 21, 2018

I'm not sure why the spec has this. In real-world implementations, a much higher concurrency factor is required to be reasonably fast.

@jhiesey commented Dec 21, 2018

You're right, this probably doesn't belong in the spec. However, with the implementation of libp2p/go-libp2p-kad-dht#146 landing, the number of outgoing requests is multiplied by a factor of the number of paths (currently default of 10). That will change the behavior in practice.

We'll have to do some testing to determine what this should be set to.

@raulk what do you think?
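For reference, capping in-flight requests at `α` is typically just a counting semaphore; a generic sketch, not the go-libp2p-kad-dht code:

```go
package main

import (
	"fmt"
	"sync"
)

const alpha = 3 // default lookup concurrency discussed above

// query fans out requests while keeping at most alpha of them in flight,
// using a buffered channel as a semaphore.
func query(peers []string, send func(string)) {
	sem := make(chan struct{}, alpha)
	var wg sync.WaitGroup
	for _, p := range peers {
		sem <- struct{}{} // blocks while alpha requests are in flight
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			defer func() { <-sem }()
			send(p)
		}(p)
	}
	wg.Wait()
}

func main() {
	query([]string{"a", "b", "c", "d", "e"}, func(p string) {
		fmt.Println("querying", p)
	})
}
```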

@jhiesey commented Dec 21, 2018

I've addressed the issues I see. Left some more comments too.

@jhiesey commented Jan 10, 2019

What's the status on this @raulk? Anything I can do to help?

bytes value = 2;
// Note: These fields were removed from the Record message
// hash of the author's public key

@anacrolix commented Jan 10, 2019

At the network/RPC level, none of these last 3 fields exist in the Record; should they be removed from the spec?

@jhiesey commented Jan 11, 2019

Yes, they should be removed.

@jhiesey commented Jan 11, 2019

After reading @anacrolix's feedback on this and my refactor proposal, I think we should simplify this DHT node spec substantially and move a bunch of the hairier stuff into separate discovery modules with their own specs.

@Mikerah (Contributor) commented Feb 6, 2019

I noticed that some of the links in the bibliography were behind Springer's paywall. It would be awesome to provide links to these papers from the authors' websites, for example. I think this would increase the accessibility of the spec.

@richardschneider referenced this pull request Apr 20, 2019

These are the requirements for each `MessageType`:
* `FIND_NODE`: `key` must be set in the request. `closerPeers` is set in the
response; for an exact match exactly one `Peer` is returned; otherwise `ncp`

@jacobheun (Contributor) commented Apr 26, 2019

ncp should be k (20) per Kademlia and not 6, correct?
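A sketch of the quoted response rule, assuming `ncp` equals `k` as suggested above; helper names are hypothetical:

```go
package main

import (
	"fmt"
	"sort"
)

const k = 20 // per Kademlia; the comment above suggests ncp should equal k

// closestPeers sketches the FIND_NODE response: exactly one peer on an
// exact match, otherwise the k known peers nearest to the key. dist is a
// stand-in for the XOR distance to the key.
func closestPeers(known []string, dist func(string) int, exact func(string) bool) []string {
	for _, p := range known {
		if exact(p) {
			return []string{p} // exact match: return exactly one Peer
		}
	}
	sort.Slice(known, func(i, j int) bool { return dist(known[i]) < dist(known[j]) })
	if len(known) > k {
		known = known[:k]
	}
	return known
}

func main() {
	peers := []string{"a", "b", "c"}
	out := closestPeers(peers, func(p string) int { return int(p[0]) },
		func(p string) bool { return false })
	fmt.Println(out)
}
```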

@cmr commented Jun 25, 2019

S/Kademlia requires either "crypto puzzles" (proofs of work) or a centralized authority issuing certificates to prevent sybil and eclipse attacks from generating lots of node IDs. With node IDs being so easy to generate, how is this DHT protected against DoS abuse? (See section 4.1 of the S/Kademlia paper.)

@Warchant left a comment

The protocol ID is not specified. Implementations use /ipfs/kad/1.0.0. IMO we should drop the ipfs prefix, so let's use /kad/1.0.0.

healthy throughout time. It runs once on startup, then periodically with a
configurable frequency (default: 5 minutes).

On every run, we generate a random node ID and we look it up via the process

@Warchant commented Jul 22, 2019

we generate a random node ID

Specify how node ID is generated.
Is it the same as peer ID?
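A sketch of the refresh process as described, with an assumed `lookup` callback; note that whether the random target is generated like a peer ID is exactly the question raised above:

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// bootstrap sketches the refresh loop described above (names assumed): run
// once at startup, then on a configurable period (default 5 minutes), each
// run looking up a freshly generated random ID to repopulate buckets.
func bootstrap(lookup func(target []byte), period time.Duration, stop <-chan struct{}) {
	run := func() {
		target := make([]byte, 32) // random ID in the same keyspace as peer IDs
		rand.Read(target)
		lookup(target)
	}
	run() // once on startup
	t := time.NewTicker(period)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			run()
		case <-stop:
			return
		}
	}
}

func main() {
	stop := make(chan struct{})
	go bootstrap(func(t []byte) { fmt.Printf("looking up %x…\n", t[:4]) },
		5*time.Minute, stop)
	time.Sleep(10 * time.Millisecond)
	close(stop)
}
```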

Kademlia paper [0]. Peer IDs are normalised through the SHA256 hash function.

To recap, `dXOR(sha256(id1), sha256(id2))` is the number of common leftmost
bits between the SHA256 hashes of the two peer IDs. The `dXOR` between us and a peer X

@bertrandfalguiere commented Aug 7, 2019

Hi. I'm trying to write a bit of doc about the DHT here: ipfs/docs#240 (still heavy WIP)

The dXOR definition differs from the Kademlia paper, which just uses id1.XOR(id2)

With the spec's definition, we don't have the properties of a distance:

  1. dXOR(x, x) = 0 (with the spec's definition, dXOR(x, x) = len(x))
  2. x != y => dXOR(x, y) > 0 (see x = 00 and y = 10)
  3. dXOR(x, y) + dXOR(y, z) >= dXOR(x, z) (see x = z = 00, y = 10)

We don't have 4) dXOR(x, y) = dXOR(x, z) => y = z either (see x = 11, y = 01, z = 00).

I guess implementations use "dXOR = 256 - number of common leftmost bits" to keep 1), 2) and 3)?
Or am I missing something?

@Stebalien (Contributor) commented Aug 8, 2019

This spec is currently incorrect. The actual distance is just the one in the Kademlia paper: id1 xor id2. This section is confusing that with how a peer's bucket is calculated. That uses the number of shared bits:

Bucket 0: no shared bits.
Bucket 1: 1 shared bit.
Bucket 2: 2 shared bits.
...
Last bucket: everything else

@Stebalien (Contributor) commented Aug 8, 2019

(fixed)
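To summarise the corrected definitions in code: XOR distance exactly as in the Kademlia paper, and the bucket index as the shared-prefix length of the SHA-256-normalised IDs (a sketch; the last-bucket cap is omitted):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
)

// distance is the Kademlia metric from the paper: the XOR of the two
// (SHA-256 normalised) IDs, compared as a big-endian integer.
func distance(a, b [32]byte) (d [32]byte) {
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// bucketFor returns the routing-table bucket index for a peer: the number
// of leading bits its hashed ID shares with ours (in practice capped at
// the last bucket).
func bucketFor(self, other [32]byte) int {
	d := distance(self, other)
	cpl := 0
	for _, byt := range d {
		if byt == 0 {
			cpl += 8
			continue
		}
		cpl += bits.LeadingZeros8(byt)
		break
	}
	return cpl
}

func main() {
	a := sha256.Sum256([]byte("peer-A"))
	b := sha256.Sum256([]byte("peer-B"))
	fmt.Println(bucketFor(a, b))
}
```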
