Understanding how the DHT is implemented #1396

Closed
daviddias opened this Issue Jun 19, 2015 · 14 comments

daviddias commented Jun 19, 2015

I'm focusing on figuring out how the DHT implementation in go-ipfs works so I can bring it to the node-ipfs side. This will be a list of questions about things I stumble upon and don't understand right away. I may end up answering some of them myself; nevertheless, I thought it would be better to log the process in an issue than to have only ephemeral discussions on IRC. Here it goes :)

  • Providers - If I understand correctly, "providing" is the process of telling the node responsible for a given hash of a file that I have 1-n blocks of that file. (Ref: ipfs/notes#15)
  • Providers - How often does this process run? Is there any criterion for keeping connections to the provider rendezvous points for each file? Do we do the DHT dance to find the point each time?
  • Everything is stored in memory, but, as @jbenet said, "go-ipfs dht impl will change its ad-hoc value storage for structure records according to iprs as in ipfs/specs#7". I'm not sure I fully understand this statement. What exactly is a structure record?
  • What are the API calls exposed by the DHT? Can someone give me a quick diagram/story of how Bitswap uses the DHT?

Notes:

  • I have to make sure I'm using the same nomenclature as everyone else; please correct/enforce that in what I write :)
jbenet commented Jun 19, 2015

Take a look at http://static.benet.ai/t/ipfs.pdf too -- it may help explain some of this. (But note it's outdated now, so the specs + code tell the full story.)

  • Providers - If I understand correctly, "providing" is the process of telling the node responsible for a given hash of a file that I have 1-n blocks of that file. (Ref: ipfs/notes#15)

Both Coral (sections 1 and 2.1) and mainline DHT use the DHT as a place to store -- not the value itself, but -- pointers to peers who have the actual value.

IPFS uses the DHT the same way (a DSHT, as Coral calls it). When a node advertises a block available for download, it stores a record in the DHT with its own Peer.ID. This is termed "providing"; the node becomes a "provider". Requesters who wish to retrieve the content query the DHT (or DSHT) and need only retrieve a subset of the providers, not all of them. (This works better with huge DHTs, and with latency-aware DHTs like Coral.)

We provide once per block, because every block (even a sub-block) is independently addressable by its hash. (Yes, this is expensive, but we can mitigate the cost with better DHT + record designs, Bloom filters, and more.)

There is an optimistic optimization: if a node is storing a node that is the parent (root/ancestor) of other nodes, it is much more likely to also be storing the children. So when a requester attempts to pull down a large dag, it first queries the DHT for providers of the root. Once the requester finds some and connects directly to retrieve the blocks, Bitswap will optimistically send them the "wantlist", which usually obviates any more DHT queries for that dag. We haven't measured this to be true yet -- we need to -- but in practice it seems to work quite well, else we wouldn't see download speeds as quick as we do. (One way to look at it is "per-dag swarms that overlap", but that's not a fully correct statement, as having a root doesn't necessarily mean a node has any or all of the children.)
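The flow above can be sketched roughly like this (all names are hypothetical, not the go-ipfs API): one DHT lookup finds providers of the root, and the whole wantlist then goes directly to the peers found.

```go
package main

import "fmt"

// fetchDag sketches the optimization described above: query the DHT once
// for the root's providers, then send the full wantlist directly to the
// peers found, skipping per-child DHT queries.
func fetchDag(root string, children []string, findProviders func(string) []string) (dhtQueries int, fetchedFrom map[string]string) {
	providers := findProviders(root) // the single DHT query for this dag
	dhtQueries = 1
	fetchedFrom = map[string]string{}
	wantlist := append([]string{root}, children...)
	for _, block := range wantlist {
		// Optimistically assume a provider of the root also has the
		// children; if it doesn't, we'd fall back to more DHT queries.
		fetchedFrom[block] = providers[0]
	}
	return
}

func main() {
	find := func(key string) []string { return []string{"PeerA"} }
	queries, plan := fetchDag("QmRoot", []string{"QmChild1", "QmChild2"}, find)
	fmt.Println(queries, len(plan)) // one DHT query covers three blocks
}
```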

  • Providers - How often does this process run? Is there any criterion for keeping connections to the provider rendezvous points for each file? Do we do the DHT dance to find the point each time?
  • We provide a block upon adding it (later on we'll be able to optionally mark blocks not to be provided).
  • And we reprovide everything periodically. Right now the period is long: 0.5 * the DHT record timeout, i.e. ~12 hours.

The IPFS DHT is definitely the most resource-intensive part of the IPFS network, but it has barely been optimized. Coral-style clustering, improvements based on data locality, better record coalescing, and so on are low-hanging fruit that can dramatically improve performance. It also hasn't yet been measured to be a problem at all.

  • Everything is stored in memory, but, as @jbenet said, "go-ipfs dht impl will change its ad-hoc value storage for structure records according to iprs as in ipfs/specs#7". I'm not sure I fully understand this statement. What exactly is a structure record?

Currently the DHT records (the values stored at other nodes) are just a trivial value in a protobuf; they're not even signed. This will change shortly: we'll construct Record objects, which will be proper "merkledag" objects following the format specified in the IPRS spec linked above. (That spec is a mess of abstraction upon abstraction; I'll ground it with actual code.) The advantage of records being objects like any other is that they can link to other things (including the values they refer to, other peer IDs, the public keys needed to verify the signature, etc.). The value of IPFS is in linking things, so this can drastically simplify the logic of storing and transferring all the needed pieces.
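A rough sketch of the records-as-objects idea (types invented for illustration, not the IPRS format): a record carries named links to the things it refers to, including the public key needed to verify it.

```go
package main

import "fmt"

// Link points at another merkledag object by hash.
type Link struct {
	Name string
	Hash string
}

// Record is a DHT record modeled as a merkledag object rather than a bare
// protobuf value: it can link to the value it refers to, other peer IDs,
// the public key that verifies its signature, and so on.
type Record struct {
	Links     []Link
	Signature []byte
}

// LinkNamed returns the link with the given name, if present.
func (r Record) LinkNamed(name string) (Link, bool) {
	for _, l := range r.Links {
		if l.Name == name {
			return l, true
		}
	}
	return Link{}, false
}

func main() {
	rec := Record{Links: []Link{
		{Name: "value", Hash: "QmValue"},
		{Name: "pubkey", Hash: "QmPubKey"},
	}}
	if l, ok := rec.LinkNamed("pubkey"); ok {
		fmt.Println(l.Hash) // everything needed for verification is linked
	}
}
```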

  • What are the API calls exposed by the DHT? Can someone give me a quick diagram/story of how Bitswap uses the DHT?

Take a look at the Routing interface. Bitswap only knows about Routing, and it only uses the Provide/FindProviders calls: https://github.com/ipfs/go-ipfs/blob/master/exchange/bitswap/network/ipfs_impl.go#L105-L143

Another thing to note:

IPNS uses the Put/GetValue calls, which are not "sloppy", because these records are small values stored directly in the DHT. (Not sloppy in that we must find a definitive value; in practice, we retrieve records from several peers, compare them, and take the best we find -- hence the records have a comparison function (e.g. "latest", etc.).)

Since the IPNS records (like any DHT records) will now be IPFS objects themselves, we could actually treat them as blocks like any other and really only use the DHT for Provide/FindProviders, but we get much faster name resolution + much less weight on the DHT this way.
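The "compare and take the best" step might look like this in miniature, using a sequence number as a stand-in for whatever comparison function a record type defines ("latest" here means highest sequence number; all names are invented):

```go
package main

import "fmt"

// IPNSRecord is a toy record with a sequence number; "best" means the
// latest, matching the comparison-function idea described above.
type IPNSRecord struct {
	Value string
	Seq   uint64
}

// best returns the record with the highest sequence number among the
// (possibly divergent) copies retrieved from several peers.
func best(records []IPNSRecord) IPNSRecord {
	top := records[0]
	for _, r := range records[1:] {
		if r.Seq > top.Seq {
			top = r
		}
	}
	return top
}

func main() {
	// Copies fetched from three peers disagree; take the latest.
	got := best([]IPNSRecord{{"old", 3}, {"new", 7}, {"stale", 1}})
	fmt.Println(got.Value)
}
```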

jbenet commented Jun 19, 2015

Another distinction worth noting:

  • IPFS Objects are merkledag nodes (nodes in a graph linked by hashes): https://github.com/ipfs/go-ipfs/tree/master/merkledag -- they have structure.
  • IPFS Blocks are serialized byte sequences that represent IPFS Objects (i.e. serialized byte buffers). It's useful to talk about them as "blocks" in Bitswap and other components that do not care about what is being stored. (It is also possible to store arbitrary stuff using ipfs block put/get, as ipfs block does not check for proper IPFS Object formatting. This may be very good or very bad; we haven't decided yet 😄)
daviddias commented Jun 19, 2015

Thank you @jbenet !

Is there a succinct explanation of why every method in the interface gets a context.Context (https://github.com/ipfs/go-ipfs/blob/master/Godeps/_workspace/src/golang.org/x/net/context/context.go) as its first argument?

wking commented Jun 19, 2015

On Fri, Jun 19, 2015 at 03:36:37AM -0700, David Dias wrote:

Is there a succinct explanation of why every method in the interface
gets a context.Context…

See 1 and (more generically) 2. For languages that aren't Go, you
probably don't need them (just use whatever your language prefers for
canceling and timing out workers).

daviddias commented Jun 19, 2015

Bootstrap handles:
1 - Initialisation of the data structures that will support the DHT
2 - Registration of the listeners that will handle queries to the DHT
3 - Querying the network for some random peers, so that we fill our kbucket
Anything missing from this list?

"The Bootstrap process will run a number of queries each time, and run every time signal fires." -- what is this 'signal'?
Update: Got it, it is a clock signal, every 10 minutes - https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_bootstrap.go#L78

daviddias commented Jun 19, 2015

https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_bootstrap.go#L32-L36 -- confused; shouldn't the value be way higher than 1?

daviddias commented Jun 19, 2015

https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_bootstrap.go#L135 Once we find a peer during the bootstrap process, nothing is done with that peer reference. Does that mean it is part of the FindPeer function to update the kbucket if the peer happens to be closer, or if the kbucket doesn't have enough peers? (I didn't manage to validate this hypothesis in the code.)

Where do we feed the bootstrap peers so that we can leverage DHT from start?

whyrusleeping commented Jun 19, 2015

@diasdavid all the bootstrap method is doing is building up a bunch of initial connections to seed the kbucket table.

jbenet commented Jun 19, 2015

@diasdavid

https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_bootstrap.go#L32-L36 confused, shouldn't the value be way higher than 1?

That's "how many random queries" to run against the DHT per "bootstrap event" in order to discover peers.

Does that mean it is part of the FindPeer function to update the kbucket if the peer happens to be closer, or if the kbucket doesn't have enough peers? (I didn't manage to validate this hypothesis in the code.)

Every time we get a message from a peer, we consider them for the routing table: https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_net.go#L129-L132 -- so running a query will update everyone we find on the path. Note that the "peer" searched for with this FindPeer call is random and thus very unlikely to be real. Basically, we issue random queries to sample the DHT space.
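The random-sampling idea in miniature (names invented for illustration): each bootstrap round targets a fresh random key, so successive queries walk different paths through the keyspace, and every peer contacted along the way gets considered for the routing table.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// randomKey draws a random 256-bit key. Looking up such a key will almost
// never find a real peer; the point of the query is the peers met along
// the path, not the target itself.
func randomKey() []byte {
	k := make([]byte, 32)
	if _, err := rand.Read(k); err != nil {
		panic(err)
	}
	return k
}

// sameKey reports whether two keys are identical.
func sameKey(a, b []byte) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	a, b := randomKey(), randomKey()
	// Independent draws almost surely differ, so successive bootstrap
	// rounds sample different regions of the DHT.
	fmt.Println(len(a)*8, sameKey(a, b))
}
```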

Where do we feed the bootstrap peers so that we can leverage DHT from start?

Sorry, that's super unclear. The DHT signs itself up for notifications from the Network[0], and every time we connect to a new peer we consider them for the DHT routing table too. (Once we support peers without DHT, we should verify they can speak DHT first :).) We first connect to bootstrap peers in a different part, the "core" bootstrap[2, 3], which will soon be put into a "discovery" module, along with mdns discovery[4].

Basically, "Discovery" optimistically finds peers across a few systems (e.g. mdns, or a well-known list of bootstrap nodes in our config) to go from 0 to n peers. "DHT Bootstrap" is bootstrapping a well-formed DHT by issuing queries. It will cause more connections, but requires at least one live connection already (provided by discovery).

daviddias commented Jun 26, 2015

Working on the spec here - ipfs/specs#14

muneeb-ali commented Jun 30, 2015

Thanks @jbenet for the explanation and @diasdavid for starting work on a spec. I'd love to see a doc/spec of the DHT implementation as well. We're currently working on similar problems and can exchange notes.

@em-ly em-ly added the discussion label Aug 25, 2016

tensojka commented Feb 21, 2018

Could you, @jbenet, explain how the DHT has changed since 2015? This is still the top result for "ipfs dht data locality". Does IPFS take advantage of data locality yet? If not, why not?

Coral-style clustering, improvements based on data locality, better record coalescing, and so on, are low hanging fruit that can dramatically improve the performance.

Is the "low hanging fruit" implemented now? I feel like IPNS performance is significantly hampering its usability.

I remember that back when there was a vision of IPFS eventually replacing HTTP, a huge "pro" of IPFS was that one would be able to download blocks from nodes that are close to the user.

Stebalien commented Feb 21, 2018

Does IPFS take advantage of data locality (yet? if not, why not?)?

Unfortunately, not yet. We've had quite a few other high priority issues.

I feel like IPNS performance is significantly hampering its usability.

We know. We've implemented IPNS over pubsub to alleviate this a bit for subsequent lookups but this feature is still experimental. We're also working on drastically reducing the amount of time it takes to connect to new nodes (and working on parallelizing this process) which should further improve the DHT. However, all of these improvements will take time.

Stebalien commented Feb 21, 2018

Note: we've moved support discussions to the forums, so I'm going to close this issue.

@Stebalien Stebalien closed this Feb 21, 2018
