Making IPFS accessible for distributed archival. #210

Open
20zinnm opened this Issue Jan 2, 2017 · 11 comments

4 participants

20zinnm commented Jan 2, 2017

At the Climate Mirror project, we're looking to use IPFS for distributing and archiving climate data. However, the computing power we possess is not enough to ensure the availability of the data. I've had the great pleasure to talk with @flyingzumwalt about the applicability of IPFS to the Climate Mirror project, but one of the key things I need is to make helping accessible to everybody and their dog. The main thing we need for our use-case:

  1. Because most people don't have 2TB drives casually lying around, we need to let people host subsets of the overall data. We want to use ipfs-ringpin to offer a "climate" pin list, but if someone has only 2GB available for the project, their client should determine the rarest blocks out of all the pins and fetch those up to capacity. This "if there is not enough space, get the rarest" behavior is part of what I'd consider an "archival" mode in the ipfs client: p2p for the purpose of archiving, not for streaming a movie.
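
To make the "get the rarest up to capacity" idea concrete, here is a minimal sketch in Python. It is not an existing ipfs-ringpin or ipfs client feature; the `Block` record, the `pick_rarest` function, and the provider counts are all hypothetical, and the sketch assumes that a per-block provider count is already available from somewhere.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    cid: str        # block identifier (placeholder strings in this sketch)
    size: int       # block size in bytes
    providers: int  # number of peers believed to hold the block (assumed known)


def pick_rarest(blocks: List[Block], capacity: int) -> List[Block]:
    """Greedily choose the rarest blocks that fit within `capacity` bytes."""
    chosen = []
    remaining = capacity
    # Fewest providers first; ties go to the smaller block so more blocks fit.
    for block in sorted(blocks, key=lambda b: (b.providers, b.size)):
        if block.size <= remaining:
            chosen.append(block)
            remaining -= block.size
    return chosen


if __name__ == "__main__":
    catalog = [
        Block("blockA", 700, 1),
        Block("blockB", 900, 5),
        Block("blockC", 600, 2),
        Block("blockD", 800, 1),
    ]
    for block in pick_rarest(catalog, 2000):
        print(block.cid, block.providers)
```

The greedy pass is a deliberate simplification: it never exceeds the donated capacity, but it does not try to pack the space optimally.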

Thank you for building the internet of the future, now let's store some climate data :)

flyingzumwalt commented Jan 2, 2017

cc @hsanjuan @jbenet @whyrusleeping this is an interesting use case for ipfs-cluster. Does anyone know of an easy way to inspect the dht in order to identify the "rarest" blocks from a list of hashes? Is it feasible?
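
One rough way to approximate this is sketched below in Python, assuming a local IPFS daemon and the `ipfs dht findprovs` command (newer releases expose the same lookup as `ipfs routing findprovs`). The helper names `find_providers` and `rank_by_rarity` are made up for illustration. The lookup stops after a bounded number of providers and only sees peers that are reachable at query time, so the counts are estimates rather than an authoritative measure of rarity.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def find_providers(cid: str) -> List[str]:
    """Ask the local IPFS daemon which peers advertise `cid` (one peer ID per line)."""
    result = subprocess.run(
        ["ipfs", "dht", "findprovs", cid],
        capture_output=True, text=True, check=True, timeout=120,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def rank_by_rarity(
    cids: Iterable[str],
    lookup: Callable[[str], List[str]] = find_providers,
) -> List[Tuple[str, int]]:
    """Return (cid, provider_count) pairs, rarest first."""
    # Deduplicate peer IDs in case the same provider is reported twice.
    counts = {cid: len(set(lookup(cid))) for cid in cids}
    return sorted(counts.items(), key=lambda item: item[1])
```

The `lookup` parameter exists so the ranking can be tried against recorded data without a running daemon. Querying one hash at a time is slow for large pin lists, which is part of why the feasibility question is a real one.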

20zinnm commented Jan 2, 2017

@flyingzumwalt I think this fits ipfs-ringpin better, but cluster is useful for the team resources.

flyingzumwalt commented Jan 2, 2017

Also - @20zinnm makes a good point that we should figure out the relationship between ipfs-ringpin and ipfs-cluster

20zinnm commented Jan 2, 2017

My understanding thus far is that ipfs-cluster allows for distribution of a pin list across nodes in the cluster, like one pin list that multiple machines coordinate to fulfill, whereas ringpin lets you copy someone else's pin list for yourself, like some sort of IPFS celebrity social media.

EDIT: Accidentally closed.

20zinnm closed this Jan 2, 2017

20zinnm reopened this Jan 2, 2017

hsanjuan commented Jan 3, 2017

> if someone has 2GB available for the project, it should determine the rarest blocks out of all the pins and fetch those up to capacity

Currently this is not a use case supported by ipfs-ringpin or ipfs-cluster, but ipfs-cluster aims to support that sort of pinning. We also have a specific user story on pin rings: ipfs/ipfs-cluster#7. It would be super useful if you could add more comments there to help us shape this feature.

I should add that you could probably use ipfs-ringpin publish (using ipns to publish a list of published content) with cluster. The main advantage is that cluster would automatically pin new stuff, rather than asking all nodes to refresh their lists.

20zinnm commented Jan 3, 2017

@hsanjuan can people join and leave the cluster with rebalancing? And how connected do clusters need to be (within the same network? same country?)

hsanjuan commented Jan 3, 2017

@20zinnm cluster should implement rebalancing at some point, yes.

Connected peers can be anywhere IPFS peers can be.

20zinnm commented Jan 3, 2017

@hsanjuan so is it possible to make an "open" cluster where people can "join" the cluster and help redundantly store the pins? If we can develop a mechanism to determine the rarity of blocks, cluster nodes could be assigned blocks based on how rare they are and how reliable a node has been? I was thinking of a cluster ledger via a blockchain, but it could also be authoritative (whichever node is oldest is the leader).

By "open" I mean anyone on the internet.
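
As a rough illustration of that assignment idea, the Python sketch below hands the rarest blocks to the most reliable nodes first. Everything in it is hypothetical: `assign_blocks`, the reliability scores, and the fixed `slots_per_node` limit are placeholders, and nothing like this exists in ipfs-cluster today.

```python
from typing import Dict, List


def assign_blocks(
    blocks: Dict[str, int],    # block id -> current provider count
    nodes: Dict[str, float],   # node id -> reliability score, higher is better
    slots_per_node: int,       # how many blocks each node is willing to hold
) -> Dict[str, List[str]]:
    """Give the rarest blocks to the most reliable nodes first."""
    assignment: Dict[str, List[str]] = {node: [] for node in nodes}
    by_reliability = sorted(nodes, key=nodes.get, reverse=True)
    by_rarity = sorted(blocks, key=blocks.get)
    index = 0
    for block in by_rarity:
        # Move on to the next most reliable node once the current one is full.
        while (index < len(by_reliability)
               and len(assignment[by_reliability[index]]) >= slots_per_node):
            index += 1
        if index == len(by_reliability):
            break  # No capacity left anywhere; remaining blocks stay unassigned.
        assignment[by_reliability[index]].append(block)
    return assignment
```

Each block is placed only once here. A real open cluster would also need to place several replicas per block and to decide who computes and agrees on the reliability scores, which is where the ledger or leader question comes in.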

flyingzumwalt commented Jan 6, 2017

Another issue relevant to this discussion: @jbenet's proposal of an ipfs-pack format and associated tooling for datasets on ipfs: #205

flyingzumwalt commented Jan 7, 2017

hsanjuan commented Jan 8, 2017

@20zinnm It would be great if you could open an issue in ipfs-cluster (or use ipfs/ipfs-cluster#7) explaining what your dream tool would look like, so we can come up with a list of implementable features. Cluster uses a consensus algorithm (raft) to keep a consistent view of what is pinned across all nodes. There are challenges in making the set of participating nodes dynamic, and in discovering and rebalancing content that is not pinned by enough nodes (especially if those nodes come and go at random). But we can build that functionality little by little, and your ideas and feedback would be appreciated.
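
The "not pinned by enough nodes" check can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how ipfs-cluster works: `under_replicated`, the `pin_holders` map, and the `target` replication count are hypothetical names, and the sketch assumes the shared state already records which nodes hold each pin.

```python
from typing import Dict, Set


def under_replicated(
    pin_holders: Dict[str, Set[str]],  # pin -> nodes that have pinned it
    online: Set[str],                  # nodes currently reachable
    target: int,                       # desired number of live replicas
) -> Dict[str, int]:
    """Return, for each pin below `target`, how many live replicas are missing."""
    deficits = {}
    for pin, holders in pin_holders.items():
        live = len(holders & online)  # only count holders that are reachable now
        if live < target:
            deficits[pin] = target - live
    return deficits
```

With nodes appearing and disappearing at random, the `online` set changes constantly, so a rebalancer built on a check like this would need some damping to avoid re-pinning content every time a node briefly drops out.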
