Move IPFS/libp2p specific components to ipfs/boxo and/or libp2p/go-libp2p-kad-dht #34

Closed
guillaumemichel opened this issue Jun 28, 2023 · 7 comments
Labels
scope/required Feature required to match go-libp2p-kad-dht

Comments

@guillaumemichel
Contributor

All modules that are specific to the IPFS DHT (i.e. those that cannot or shouldn't be used in other DHT networks/implementations) should move to ipfs/boxo.

These modules include:

  • IPFS Server mechanism, which handles only IPFS requests (basicserver can remain in this repo for testing purposes).
  • IPFSv1 message module, including the IPFS protobuf message format and helpers.

The IpfsDHT struct should be defined directly in ipfs/boxo. This includes instantiating a new Libp2pEndpoint, building a RoutingTable, defining the server's behavior and the message format, and interacting with the query mechanism (to decide when each query should terminate). IPFS constants (e.g. bucket size, number of closer peers to return, IPFS DHT protocol ID, etc.) should also be defined directly in ipfs/boxo.

Note that consumers of the current go-libp2p-kad-dht repository will become consumers of ipfs/boxo/kad-dht, and NOT consumers of go-kademlia directly.

What should stay in go-kademlia:

  • Provider Store implementation. It should be made generic so that other implementations can use a data store.
  • Different routing tables (e.g. FullRT, ClientRT, LazyRT, etc.). Even though they are built to serve the IPFS DHT, they are generic components that could be used in other Kademlia implementations.
  • Libp2p Endpoint, which can be used by other Kademlia implementations.

The goal of this separation is to get the ground ready for the Composable DHT.
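
To illustrate what "generic" could mean for the routing tables listed above, here is a minimal Go sketch. Every name in it (`Key`, `RoutingTable`, `FlatRT`) is hypothetical and chosen for illustration; it is not the go-kademlia API. The table is parameterised over the node identifier type, so nothing in it refers to libp2p peer IDs or IPFS types, and the 8-bit key is a toy stand-in for real Kademlia keys.

```go
package main

import (
	"fmt"
	"sort"
)

// Key is a toy 8-bit Kademlia key (assumption: real keys are far longer).
type Key uint8

// RoutingTable is a hypothetical interface, generic over the node
// identifier type N, so it carries no libp2p or IPFS dependency.
type RoutingTable[N any] interface {
	AddNode(key Key, node N) bool
	NearestNodes(target Key, n int) []N
}

type entry[N any] struct {
	key  Key
	node N
}

// FlatRT is a naive routing table that keeps every node in one slice.
type FlatRT[N any] struct {
	entries []entry[N]
}

// AddNode stores the node unless its key is already present.
func (rt *FlatRT[N]) AddNode(key Key, node N) bool {
	for _, e := range rt.entries {
		if e.key == key {
			return false
		}
	}
	rt.entries = append(rt.entries, entry[N]{key: key, node: node})
	return true
}

// NearestNodes returns up to n nodes ordered by XOR distance to target.
func (rt *FlatRT[N]) NearestNodes(target Key, n int) []N {
	sorted := make([]entry[N], len(rt.entries))
	copy(sorted, rt.entries)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].key^target < sorted[j].key^target
	})
	if n > len(sorted) {
		n = len(sorted)
	}
	out := make([]N, 0, n)
	for _, e := range sorted[:n] {
		out = append(out, e.node)
	}
	return out
}

func main() {
	// The node type is a plain string here; a libp2p-based caller
	// could plug in its own peer identifier type instead.
	var rt RoutingTable[string] = &FlatRT[string]{}
	rt.AddNode(0b0001_0000, "alice")
	rt.AddNode(0b1000_0000, "bob")
	rt.AddNode(0b0001_0011, "carol")
	fmt.Println(rt.NearestNodes(0b0001_0001, 2)) // prints [alice carol]
}
```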

@guillaumemichel guillaumemichel added the scope/required Feature required to match go-libp2p-kad-dht label Jun 28, 2023
@iand
Contributor

iand commented Jun 28, 2023

Thanks for this, I was planning to ask you to expand on your thinking here, so having this issue is really useful.

I think it would be good to work some of this into the design documentation. Currently the design has an IPFS DHT section that you could update to be more explicit about the boundaries between this repo's goals and IPFS-specific goals.

I note that peer routing is currently in that IPFS DHT section, but do you agree that it's a feature that is generally useful across all kad deployments?

@aschmahmann

> All modules that are specific to the IPFS DHT (e.g that cannot/shouldn't be used in other DHT networks/implementations) should move to ipfs/boxo.
>
> These modules include:
>
>   • IPFS Server mechanism, that is only handling IPFS requests (basicserver can remain in this repo for testing purposes).
>   • IPFSv1 message module, including the IPFS protobuf message format and helpers.
>
> Note that consumers of the current go-libp2p-kad-dht repository, will become consumers of ipfs/boxo/kad-dht, and NOT consumers of go-kademlia directly.

As has been mentioned previously, both of these are related to the libp2p DHT spec (https://github.com/libp2p/specs/tree/c733210b3a6c042d01f6b39f23c0c9a3a20d3e88/kad-dht), not to the IPFS Public DHT specifically.

For some of the things specific to the IPFS Public DHT that should probably live in boxo look at libp2p/go-libp2p-kad-dht#597 and linked issues. It includes things like the protocol name(s), put/get validators, the network constants like k, the record expiration times, routing table refresh intervals, etc.

@guillaumemichel IIUC moving all the components to boxo is also inconsistent with your comment in libp2p/go-libp2p-kad-dht#846 (comment), where a libp2p DHT user (who is not using the IPFS Public DHT) reasonably wants to keep using their DHT without bringing in IPFS dependencies.

If you don't want any libp2p components in this repo, then this likely means creating a barebones libp2p dht using this implementation as an alternative client/server implementation in go-libp2p-kad-dht. However, note that this means that it is likely that many PRs to modify DHT behavior will end up as multiple PRs with the associated overhead of bubbling as has been flagged previously as the cost of having a separate repo here.

@guillaumemichel
Contributor Author

The confusion between the IPFS DHT and the libp2p DHT is expected to be addressed by the Composable DHT. Until then, we need to be very careful with naming and dependencies generally.

  • libp2p DHT implementation: a Kademlia implementation defining a message format, a server behavior and offering the following RPCs: FIND_PEER, PUT_PROVIDER, GET_PROVIDERS, PUT_VALUE, GET_VALUE.
  • IPFS DHT Implementation: an instantiation of the libp2p DHT implementation, using custom parameters (such as bucket size, protocol identifier, routing table refresh interval, etc.)
  • It is possible to instantiate a new libp2p DHT network by using a dedicated protocol ID and a set of bootstrap nodes. (like Celestia and others are doing)
  • IPFS DHT network: the swarm of peers running the IPFS DHT implementation (using the IPFS DHT protocol ID). AFAIU libp2p applications that make use of a DHT but do not have a dedicated DHT network use the IPFS DHT network. As long as this is true, default libp2p DHT network == IPFS DHT network. So libp2p peer routing depends on the IPFS network, and also on the IPFS DHT implementation (boxo), because we don't want to have different bucket sizes or refresh intervals in the same network for now. It would be possible to change this by having distinct bootstrap peers (not connected to nodes in the IPFS DHT network) for the libp2p DHT network, but it may be insecure if this DHT network isn't well populated.

@aschmahmann I agree with everything you wrote. go-kademlia is a generic Kademlia implementation (genericity is required, not to build BitTorrent implementations, but to build new features such as the Composable DHT, the Double Hash DHT, and generally facilitate the improvement process of the IPFS DHT). For this reason, and as it doesn't depend on libp2p other than as a possible transport, go-kademlia should not be the libp2p DHT implementation. However, the libp2p DHT implementation (e.g. go-libp2p-kad-dht) should depend on go-kademlia. And finally the IPFS DHT implementation (e.g. boxo) should depend on the libp2p DHT implementation.

The libp2p DHT implementation should define the IpfsDHT (or Libp2pDHT?) struct, the server behavior (request handling), and message format. The IPFS DHT implementation should only define parameters of the libp2p DHT implementation, such as protocol ID, bucket size, refresh interval etc. So the IPFS DHT implementation would be an instantiation of the libp2p DHT implementation, itself depending on go-kademlia for the Kademlia routing logic.
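
The layering described above can be sketched in Go as follows. This is only an illustration under stated assumptions: `Router`, `Params`, `Libp2pDHT`, `NewIPFSDHT` and `staticRouter` are hypothetical names, not existing APIs, and using the bucket size as the number of closer peers to return is a simplification. The protocol ID `/ipfs/kad/1.0.0` and the bucket size of 20 are the values used by the IPFS DHT network.

```go
package main

import "fmt"

// Router stands in for the routing logic that go-kademlia would
// provide (layer 1). Hypothetical interface.
type Router interface {
	Closest(target string, n int) []string
}

// Params holds the network parameters chosen by the top layer.
type Params struct {
	ProtocolID string
	BucketSize int
}

// Libp2pDHT stands in for the libp2p DHT implementation (layer 2):
// it owns request handling and delegates routing to the Router.
type Libp2pDHT struct {
	router Router
	params Params
}

func NewLibp2pDHT(r Router, p Params) *Libp2pDHT {
	return &Libp2pDHT{router: r, params: p}
}

// HandleFindPeer models the server behaviour: reply with up to
// BucketSize closer peers (a simplification for this sketch).
func (d *Libp2pDHT) HandleFindPeer(target string) []string {
	return d.router.Closest(target, d.params.BucketSize)
}

// NewIPFSDHT is what the IPFS layer (layer 3) would contribute:
// parameters only, no routing or request-handling logic.
func NewIPFSDHT(r Router) *Libp2pDHT {
	return NewLibp2pDHT(r, Params{ProtocolID: "/ipfs/kad/1.0.0", BucketSize: 20})
}

// staticRouter is a toy Router that returns a fixed peer list.
type staticRouter []string

func (s staticRouter) Closest(_ string, n int) []string {
	if n > len(s) {
		n = len(s)
	}
	return s[:n]
}

func main() {
	r := staticRouter{"peerA", "peerB", "peerC"}
	ipfs := NewIPFSDHT(r)
	// A separate network reuses layer 2 with its own parameters.
	other := NewLibp2pDHT(r, Params{ProtocolID: "/example/kad/1.0.0", BucketSize: 2})
	fmt.Println(len(ipfs.HandleFindPeer("x")), len(other.HandleFindPeer("x"))) // prints 3 2
}
```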

You are right that we should pay attention to libp2p/go-libp2p-kad-dht#846, but I doubt we will be able to tackle the weird dependency chain (libp2p DHT network -> IPFS DHT network -> IPFS DHT implementation -> libp2p DHT implementation) before the Composable DHT.

> However, note that this means that it is likely that many PRs to modify DHT behavior will end up as multiple PRs with the associated overhead of bubbling as has been flagged previously as the cost of having a separate repo here.

Yes, it is indeed not ideal. go-kademlia's goal is to solve Kademlia routing and to expose a simple interface to its consumers. This interface is simple and generic: the caller can control some parts of the behavior, or supply its own modules that implement the interfaces defined in this repo. The Kademlia routing interface is not expected to change once the repo is functional. The next potential change would come with the Composable DHT (if go-kademlia is transformed to be the new Composable DHT implementation). Alternatively, the Composable DHT could be another repository depending on go-kademlia.

Alternatively, if we don't want to have 3 DHT repos (go-kademlia, go-libp2p-kad-dht and boxo/kad-dht), we could merge the libp2p DHT implementation with go-kademlia. One module of go-kademlia could be the libp2p DHT implementation. We could add more implementations, for instance a simulation implementation that we are using to test the protocol, and example implementations showing how to make use of the go-kademlia repo. So the libp2p DHT implementation would be an example of how to use go-kademlia.

@guillaumemichel guillaumemichel changed the title Move IPFS specific components to ipfs/boxo Move IPFS/libp2p specific components to ipfs/boxo and/or libp2p/go-libp2p-kad-dht Jun 30, 2023
@iand
Contributor

iand commented Jun 30, 2023

Proposal

Split functionality over 3 repos:

  • The focus of go-kademlia remains a generic Kademlia toolkit that can be configured for use by different networks. It provides a more maintainable, better performing and extensible foundation for new ideas like the composable DHT.

  • Keep go-libp2p-kad-dht as the home of the libp2p dht, but refactor it to be built in terms of go-kademlia. The result of this refactoring becomes version 2 (go-libp2p-kad-dht/v2)

  • Create a package in boxo (for example: routing/dht) that contains the configuration of the libp2p dht for IPFS.

Outcome:

  • IPFS applications interacting with the IPFS DHT network (e.g. Kubo) can use boxo
  • Projects using libp2p for non-IPFS dhts (e.g. Celestia and others) can use go-libp2p-kad-dht/v2 directly with new parameters

Tasks

  • create a v2-develop branch in go-libp2p-kad-dht to track the refactor, eventually to become v2
  • rename go-libp2p-kad-dht/IpfsDHT to KadDHT in v2-develop
  • make KadDHT configurable with protocol name, bootstrappers, validators, network constants, expiration times, routing table refresh intervals etc.
  • message formats and server behaviour are implemented in go-libp2p-kad-dht/v2
  • refactor go-kademlia to remove dependency on go-libp2p. Functional dependencies such as Libp2pEndpoint move to go-libp2p-kad-dht/v2 whereas constant/type dependencies like Connectedness are replaced with local equivalents.
  • go-kademlia focuses on kademlia algorithm implementation, searches, event queue management, peer routing
  • Create a DHT type in boxo that instantiates KadDHT with IPFS specific options. This is low configuration with sensible defaults. Applications requiring more control can use KadDHT directly.
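
One possible shape for the configurable KadDHT and the low-configuration boxo type from the task list is the functional options pattern. The sketch below is an assumption made for illustration, not the planned v2 API: `Option`, `WithProtocolID`, `WithBucketSize`, `NewKadDHT` and `NewIPFSDHT` are hypothetical names. The IPFS-flavoured constructor applies its defaults first and lets callers override them.

```go
package main

import (
	"errors"
	"fmt"
)

// options collects the configurable parameters (hypothetical subset).
type options struct {
	protocolID string
	bucketSize int
}

// Option mutates the configuration and may reject invalid values.
type Option func(*options) error

func WithProtocolID(id string) Option {
	return func(o *options) error {
		if id == "" {
			return errors.New("protocol ID must not be empty")
		}
		o.protocolID = id
		return nil
	}
}

func WithBucketSize(k int) Option {
	return func(o *options) error {
		if k <= 0 {
			return errors.New("bucket size must be positive")
		}
		o.bucketSize = k
		return nil
	}
}

// KadDHT stands in for the network-agnostic libp2p DHT type.
type KadDHT struct {
	opts options
}

// NewKadDHT has no network defaults: callers must choose parameters.
func NewKadDHT(opts ...Option) (*KadDHT, error) {
	var o options
	for _, apply := range opts {
		if err := apply(&o); err != nil {
			return nil, err
		}
	}
	if o.protocolID == "" || o.bucketSize == 0 {
		return nil, errors.New("protocol ID and bucket size are required")
	}
	return &KadDHT{opts: o}, nil
}

// NewIPFSDHT sketches the boxo-side constructor: IPFS defaults
// first, then any caller overrides (later options win).
func NewIPFSDHT(overrides ...Option) (*KadDHT, error) {
	defaults := []Option{WithProtocolID("/ipfs/kad/1.0.0"), WithBucketSize(20)}
	return NewKadDHT(append(defaults, overrides...)...)
}

func main() {
	d, err := NewIPFSDHT()
	if err != nil {
		panic(err)
	}
	fmt.Println(d.opts.protocolID, d.opts.bucketSize) // prints /ipfs/kad/1.0.0 20
}
```

An application on a non-IPFS network would skip `NewIPFSDHT` and call `NewKadDHT` with its own protocol ID and bucket size.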

@guillaumemichel
Contributor Author

@iand it makes a lot of sense to me!

A few minor remarks:

  • We may want to keep the KadDHT module, and Libp2pEndpoint, server, message format, etc. in go-kademlia (e.g. in the example folder) while we are actively working on them, as the interfaces may slightly change during development. If we split the code now, we would have to open one PR in each repo whenever an interface is updated. Once we are happy with the KadDHT implementation, we can move it and its associated modules to go-libp2p-kad-dht/v2.
  • IMO having examples for modules in go-kademlia can be useful, especially if they are generic enough. For instance Libp2pEndpoint seems generic enough and could be used in Kademlia implementations other than KadDHT as a message endpoint. And generally, I think it is good to have examples for how to implement an interface in the same repo.
  • KadDHT could also be named Libp2pDHT

@iand
Contributor

iand commented Jun 30, 2023

Agree that prototyping in the example folder can make sense (although practically speaking go.work files make cross-module development trivial)

Libp2pDHT seems redundant to me since it's in a libp2p repository/module. libp2p/go-libp2p-kad-dht#337 suggests naming it Kad. I think KadDHT makes it clear that it's a Kademlia DHT rather than something like Chord 😄

@iand
Contributor

iand commented Jul 11, 2023

Closing as resolved


4 participants