Call for Participants: Giant Files on IPFS Initiative #209

Closed
flyingzumwalt opened this issue Dec 5, 2016 · 5 comments

Comments

@flyingzumwalt
Contributor

flyingzumwalt commented Dec 5, 2016

Note: If you urgently need to decentralize copies of your giant files ASAP, ahead of the timeline on this project, please let us know. Some organizations have approached us who need to get their large datasets onto IPFS urgently, before Q1 and before we even have time to set up a formal initiative or test plan. If you have this need, please reach out to us. We will help you act quickly.

We're looking for a handful of organizations who have giant files (Terabyte plus). We are forming an initiative to test using IPFS to move around those giant files. This initiative will be structured as a community collaboration involving stakeholders from multiple organizations. It will formally begin in Q1 2017.

If you represent an organization that has giant files, or if you know of organizations that want help with their giant files, please read on.

Who Should Participate?

We are gathering stakeholders who:

  • Have giant files (i.e., 1 terabyte or bigger) and are interested in using IPFS to move those files around
  • Have giant files that are not sensitive data to use in testing
  • Want to participate in the testing process
  • Are willing to participate in a public collaboration (public GitHub repo, public communications except when dealing with sensitive info)

What We Will be Doing

Stakeholders will have a say in the actual structure of this collaboration. The basic idea will be to:

  • Conduct Experiments on a "Testbed" Network
  • Provide Use Cases and User Stories
  • Provide Feedback about User Experience

Design and Conduct Experiments

Together, we will design and conduct a series of experiments that test how well IPFS supports your needs. In their simplest form, these experiments will involve writing giant files to IPFS, passing them to other organizations who are participating in the test, and validating the results.
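As a rough illustration of the "validating the results" step, a receiving organization could compare a checksum of the file it retrieved against one computed by the sender. The sketch below is only an assumption about how such a check might look; it is not part of any agreed test plan. The function names `file_digest` and `transfer_is_valid` are hypothetical, and the check is independent of IPFS's own content addressing. It reads the file in fixed-size chunks so that a terabyte-scale file never has to fit in memory.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    chunk_size is an illustrative default (1 MiB), not a tuned value.
    """
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def transfer_is_valid(original_path, received_path):
    """Hypothetical check: do the sender's and receiver's copies match?"""
    return file_digest(original_path) == file_digest(received_path)
```

In a real experiment the two digests would be computed on different machines and exchanged out of band, rather than by comparing two local paths as this helper does.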

Provide Use Cases and User Stories

We will also collect use cases from participants and craft user stories around them. We are especially looking for use cases that describe the challenges that organizations currently face when dealing with giant files. We are also actively seeking examples of the access-control needs you face.

These use cases, and the user stories based on them, will provide guidance when we develop new features and functionality over the coming months.

Provide Feedback about User Experience

In addition to gathering use cases to drive new functionality, we also want feedback about the experience of using the current alpha version of IPFS. For example:

  • Is our documentation helpful? What is missing from the docs?
  • Did you encounter any bugs?
  • Were there parts of the User Experience that were especially enjoyable?
  • Were there parts of the User Experience that were especially bad?

Clarify Value Propositions

So far, we know that if you have giant files, IPFS can help you:

  • Distribute data
    • HTTP doesn't work for distributing giant files. People with giant files tend to either mail hard drives or dump files on FTP servers.
    • make sure other people can get the data securely
    • spread the burden of distributing data
  • Increase Visibility of engagement with your data
  • Increase Visibility of derivative data (cleaned up versions, results of analysis, merged datasets, extensions to a dataset, etc)
  • Preserve Data -- IPFS makes it easy to pursue a variety of strong Preservation Models
  • What else?

We hope this initiative will allow us to clarify our understanding of value propositions like these so that we can support an explosion of innovation on decentralized technologies like IPFS.

How to Get Involved

If you want to get involved, or if you have questions about this initiative, please contact @flyingzumwalt by sending an email to contact at protocol dot ai.

@flyingzumwalt changed the title from "Call for Participants: Giant Files Initiative" to "Call for Participants: Giant Files on IPFS Initiative" on Dec 5, 2016
@flyingzumwalt
Contributor Author

FYI: I originally posted an inactive email address as the contact email for this initiative. The information is accurate now.

@lukasheinrich

The experiments at the Large Hadron Collider at CERN generate many PBs of data. Normally this is access-controlled to the members of the collaboration, but increasingly we release portions of the data to the public. These are hosted on distributed storage solutions developed by CERN (EOS - http://eos.web.cern.ch/ ) and often accessed via protocols such as XrootD (http://xrootd.org/). But this data is public so it would be a nice project to try to store this on IPFS.

@flyingzumwalt
Contributor Author

@lukasheinrich do you know of any people or organizations who actively want to put LHC data on IPFS and might want to participate in these tests? Is anyone dissatisfied with the existing techniques and looking for an alternate approach? I'm focused on finding stakeholders who need IPFS so we can understand why they need it and how they want to use it.

@lukasheinrich

@flyingzumwalt I think for most use cases we are covered by our internal technologies, but this requires some familiarity with these tools. Not sure if CERN IT is looking at IPFS explicitly, tagging some people who might know more (or may be interested to comment @tiborsimko @jblomer @jirikuncar @RaoOfPhysics)

@flyingzumwalt
Contributor Author

Related: ipfs/notes#218 - a possible strategy for optimizing these transfers.
