What's the expected use case for ipfs-cluster? #538

Closed
cannium opened this Issue Sep 14, 2018 · 6 comments

cannium commented Sep 14, 2018

I have thought about this for a while and cannot find a suitable use case for ipfs-cluster. Below are my considerations; please correct me where I'm wrong:

  • Inside a data center: The network inside a data center is of high quality, so libp2p does not seem very necessary. Also, we already have many mature distributed storage solutions for the data center.
  • At home: It's rare for a home user to manage that many storage nodes.
  • Between data centers: Raft would suffer from performance issues, as at least 2 RTTs are required for a commit. We might have no better choice of consensus algorithm, though. (But why is strong consistency required between data centers to store information like pins?)
  • P2P storage network management: Peers come and go in a P2P network, and again Raft does not seem to handle this kind of situation well.
  • Filecoin incubation: Maybe, but what Filecoin requires has not been settled yet.

So for me, the design goals of this project are ambiguous. I'd like to see a use case that solves a real problem.

Collaborator

lanzafame commented Sep 14, 2018

> Inside a data center: The network inside a data center is of high quality, so libp2p does not seem very necessary.

libp2p isn't about network quality, and when you talk about ipfs/libp2p, nothing is ever just inside the data center.

> Also, we already have many mature distributed storage solutions for the data center.

True, but they don't speak ipfs, nor do they have native ways of participating in a p2p network.

> Between data centers: Raft would suffer from performance issues, as at least 2 RTTs are required for a commit. We might have no better choice of consensus algorithm, though.

It doesn't perform too badly, but agreed, it isn't ideal. The main reason for Raft is that it is easy to implement and got us off the ground fast. We are looking at other consensus algorithms.
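
To make the "2 RTTs per commit" concern concrete, here is a rough back-of-the-envelope sketch. It takes the figure of two round trips from the question above and multiplies it by a few hypothetical round-trip times; the RTT values are illustrative assumptions, not measurements of ipfs-cluster, and the estimate ignores disk and CPU time.

```python
def raft_commit_latency_ms(rtt_ms: float, round_trips: int = 2) -> float:
    """Lower bound on commit latency: round trips times RTT (network only)."""
    return rtt_ms * round_trips


# Hypothetical RTTs in milliseconds, chosen only for illustration.
scenarios = {
    "same data center": 0.5,
    "same continent": 30.0,
    "intercontinental": 150.0,
}

for name, rtt in scenarios.items():
    latency = raft_commit_latency_ms(rtt)
    # Upper bound on throughput if each commit waits for the previous one.
    sequential_rate = 1000 / latency
    print(f"{name}: >= {latency:.1f} ms per commit, "
          f"<= {sequential_rate:.0f} sequential commits/s")
```

Whether a few hundred milliseconds per commit matters depends on how often the pinset changes, which is workload-specific.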

> (But why is strong consistency required between data centers to store information like pins?)

The pinset is the purpose of IPFS Cluster: making sure that those pins are replicated across a certain number of peers is what it is all about.

> P2P storage network management: Peers come and go in a P2P network, and again Raft does not seem to handle this kind of situation well.

I don't disagree with you here, but apart from the reasons stated above about Raft, the main use case for cluster at the moment is to provide more stable storage providers to the p2p network.

From the website:

  • Facilitating the conservation and replication of data (pinsets) across multiple nodes
  • Supporting the handling of large volumes, where a full DAG does not fit in a single IPFS node

Hope that clarifies things.

cannium commented Sep 14, 2018

I can still see some contradictory points:

  • If the purpose is to handle large files, I think ipfs alone could handle it. What are the additional benefits provided by ipfs-cluster?
  • If the purpose is to accurately manage pinsets, strong consensus algorithms are unavoidable, and they make different assumptions about nodes than a p2p network does.

Collaborator

lanzafame commented Sep 17, 2018

> If the purpose is to handle large files, I think ipfs alone could handle it. What are the additional benefits provided by ipfs-cluster?

Sure, but this means either running a single large ipfs node (which doesn't guarantee replication beyond that node) or manually managing a cluster of ipfs nodes.
Manually managing a cluster of ipfs nodes means that, to get guaranteed replication across the nodes, you have to run ipfs pin add /ipfs/<hash> on each node separately. Then how do you keep track of that hash? Removing the file is the same repetitive process. And how do you track a hash that you only want replicated on some of the nodes?
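
The bookkeeping described above can be sketched as a toy in-memory model. This is only an illustration of the problem, not ipfs-cluster's actual data structures or allocation strategy: the `PinsetTracker` class, the least-loaded allocation rule, the node names, and the `QmExampleHash…` placeholders are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PinsetTracker:
    """Toy shared pinset: which peers should hold which hash (hypothetical)."""
    peers: List[str]
    pins: Dict[str, Set[str]] = field(default_factory=dict)  # hash -> peers

    def load(self, peer: str) -> int:
        """Number of hashes currently allocated to a peer."""
        return sum(peer in holders for holders in self.pins.values())

    def pin(self, cid: str, replication_factor: int) -> Set[str]:
        """Allocate a hash to the least-loaded peers (assumed strategy)."""
        if replication_factor > len(self.peers):
            raise ValueError("not enough peers for the replication factor")
        ranked = sorted(self.peers, key=lambda p: (self.load(p), p))
        self.pins[cid] = set(ranked[:replication_factor])
        return self.pins[cid]

    def unpin(self, cid: str) -> None:
        """Forget a hash everywhere in one step."""
        self.pins.pop(cid, None)

    def commands_for(self, peer: str) -> List[str]:
        """The manual commands an operator would otherwise run on this peer."""
        return [f"ipfs pin add /ipfs/{cid}"
                for cid, holders in sorted(self.pins.items())
                if peer in holders]


tracker = PinsetTracker(peers=["node-a", "node-b", "node-c"])
tracker.pin("QmExampleHash1", replication_factor=2)
tracker.pin("QmExampleHash2", replication_factor=2)
for peer in tracker.peers:
    print(peer, tracker.commands_for(peer))
```

Even this small model has to answer the questions raised above: where each hash lives, how many copies exist, and what to undo on removal. Keeping that state consistent across several machines is the part that calls for some form of agreement between peers.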

> If the purpose is to accurately manage pinsets, strong consensus algorithms are unavoidable, and they make different assumptions about nodes than a p2p network does.

p2p doesn't require transient peers that are always joining and leaving. Think of it as a process-addressable network rather than an ip-addressable network: although a peer id does get resolved to an ip, the ip is not the identifier. The concept is covered in this video.
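
A minimal sketch of the "process-addressable" idea, under loud simplifications: the identifier is derived from a key, and the address is just a lookup result that can change. The `peer_id` helper and `AddressBook` class are hypothetical; a truncated hex SHA-256 digest stands in for a real libp2p peer ID, which uses a different encoding, and the IP addresses are documentation placeholders.

```python
import hashlib


def peer_id(public_key: bytes) -> str:
    """Stand-in identifier derived from a key (not the real libp2p format)."""
    return hashlib.sha256(public_key).hexdigest()[:16]


class AddressBook:
    """Maps a stable peer identifier to its current network address."""

    def __init__(self) -> None:
        self._addrs = {}

    def announce(self, pid: str, addr: str) -> None:
        # A peer re-announces itself whenever its address changes.
        self._addrs[pid] = addr

    def resolve(self, pid: str) -> str:
        return self._addrs[pid]


key = b"hypothetical-public-key-bytes"
pid = peer_id(key)

book = AddressBook()
book.announce(pid, "203.0.113.10:4001")
first = book.resolve(pid)

# The same process moves to another network: new ip, same identifier.
book.announce(pid, "198.51.100.7:4001")
second = book.resolve(pid)

print(pid, first, second)
```

Anything keyed on `pid`, such as membership in a consensus group, is unaffected by the address change in this model.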

Collaborator

lanzafame commented Sep 19, 2018

@cannium I was wondering whether I was able to answer your original question, or whether you have any more. Just so you are aware, we understand that we need to work on documenting these higher-level aspects of the project; we are just strapped for time currently. But these conversations do help us know what we need to cover when we come to write up this documentation, so thank you 🙂.

cannium commented Sep 19, 2018

Actually, I'm still confused. By your definition ipfs-cluster should work, but a concrete use case would be more persuasive.

Collaborator

hsanjuan commented Oct 25, 2018

I'm closing this as it was answered and there haven't been updates for a while. https://discuss.ipfs.io/ is probably a better place for non-technical, general questions.

hsanjuan closed this Oct 25, 2018
