This repository has been archived by the owner. It is now read-only.

Replication on IPFS -- Or, the Backing-Up Content Model #47

Closed
jbenet opened this Issue Sep 27, 2015 · 12 comments

10 participants
@jbenet
Member

jbenet commented Sep 27, 2015

Some of the most frequently asked questions about IPFS are "how does IPFS guarantee content sticks around?" and "how do you ensure I do not download bad things?". The short answer is: IPFS by itself doesn't download things you don't ask it to. Thus, backing up content must be done at a layer on top of IPFS, with ipfs-cluster, Filecoin, or similar protocols.

Important Design Goals for Content Distribution:

  • IPFS has as a strict requirement that content be able to move as fast as the underlying network permits. This rules out designs like Freenet's and other oblivious storage platforms as the base case: they are just way too slow for most IPFS use cases. That said, these can be implemented trivially with the use of privacy-focused transports (like Tor), content encryption, and so on.
  • IPFS has as a design requirement that nodes be able to store and/or distribute only the content they explicitly want to store and/or distribute. This means that computers that run IPFS nodes do not have to host "other people's stuff", which is a very important thing when you consider that lots of content on the internet is, in some form or other, illegal under certain jurisdictions.
  • IPFS nodes will be able to express policies, and to subscribe to network allow/denylists and policies that express content storage and distribution requirements. This way, users and groups can express what content should or should not be stored and/or distributed. This is required (a) by users who must comply with legal constraints in their respective countries, and (b) by users with stricter codes of conduct (e.g. content that is legal but undesired by a group, such as a children's website).
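
As a rough illustration of the policy goal above, here is a minimal Python sketch of a node that consults denylists before storing or serving content. Everything in it is an assumption made for illustration: `PolicyNode`, `subscribe`, `allows`, and `content_key` are hypothetical names, not an IPFS API, and a plain SHA-256 hex digest stands in for the real content identifier.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Toy content address: hex SHA-256 of the bytes (a simplification)."""
    return hashlib.sha256(data).hexdigest()


class PolicyNode:
    """Hypothetical node that checks denylists before storing or serving."""

    def __init__(self):
        self.denied = set()  # keys this node refuses to store or distribute
        self.blocks = {}     # key -> bytes held locally

    def subscribe(self, denylist):
        # Merge a (hypothetical) shared denylist into the local policy.
        self.denied |= set(denylist)

    def allows(self, key: str) -> bool:
        return key not in self.denied

    def store(self, data: bytes):
        # Returns the key on success, or None if the policy refuses the content.
        key = content_key(data)
        if not self.allows(key):
            return None
        self.blocks[key] = data
        return key

    def serve(self, key: str):
        # Content stored before a denylist update is no longer distributed.
        if not self.allows(key):
            return None
        return self.blocks.get(key)
```

In this sketch a denylist is just a set of keys, so a group can share one list and every subscribed node applies it locally. How real policies would be expressed and distributed is not specified in this issue.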

Questions and Answers:

  • Q: When I add content, what happens?
    A: It is stored in your local node, and made available to other nodes in your network, via advertising it on the routing system (i.e. the IPFS-DHT). The content is not sent to other nodes until they explicitly request it, though of course some content may already exist in the system (content-addressing).
  • Q: Can peers tell what I have?
    A: In some modes yes, in others no. Peers who request content being advertised from a node can retrieve it and thus see that the node indeed had that content. These advertisements will be configurable through policies in the future, to give users better control over what is published to whom. Obscuring content altogether is addressed a layer above raw ipfs, through the use of (a) encryption and capabilities, (b) transport + routing systems with stronger privacy guarantees, and (c) peer authentication and trust models.
  • Q: Will I store other people's stuff?
    A: No, by default IPFS will not download anything your node doesn't explicitly ask for. This is a strict design constraint. In order to build group archiving, and faster distribution, protocols are layered on top that may download content for the network, but these are optional and built on top of basic IPFS. Examples include bitswap agents, ipfs-cluster, and Filecoin.
  • Q: But bitswap says it may download stuff for others, to do better?
    A: Yes. This is an extension of bitswap that is not implemented yet. It will be either opt-in or easy to opt out of, and it will follow the denylists (to avoid downloading bad bits).
  • Q: How can I ensure something remains online?
    A: Keep one or several ipfs nodes online pinning the content you're interested in backing up. The more ipfs nodes pinning the content, the better the redundancy you get. Tools such as ipfs-persistence-consortium, pincoop, and ipfs-cluster, built on top of ipfs, allow you to share the costs of bandwidth with other people or organizations. Protocols like Filecoin will then allow you to just pay the network to do it for you (i.e. similar to how people pay "the cloud companies", but here you're paying the network itself). (Filecoin is not live yet.)
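
The add / advertise / request flow described in the answers above can be sketched as a toy model in Python. This is a sketch under stated assumptions, not how any IPFS implementation is written: `ToyNetwork` and `ToyNode` are hypothetical names, a dictionary stands in for the routing system, and a SHA-256 hex digest stands in for the real content identifier.

```python
import hashlib


class ToyNetwork:
    """Stand-in for the routing system: content key -> set of provider nodes."""

    def __init__(self):
        self.providers = {}

    def advertise(self, key, node):
        self.providers.setdefault(key, set()).add(node)

    def find_providers(self, key):
        return self.providers.get(key, set())


class ToyNode:
    """Hypothetical node that stores only what it added or explicitly requested."""

    def __init__(self, name, network):
        self.name = name
        self.network = network
        self.blocks = {}  # content key -> bytes held locally

    def add(self, data):
        # Content addressing: the key is derived from the bytes themselves.
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data
        # Only an announcement is published; no data is pushed to other nodes.
        self.network.advertise(key, self)
        return key

    def get(self, key):
        # Data moves only in response to an explicit request like this one.
        if key in self.blocks:
            return self.blocks[key]
        for peer in list(self.network.find_providers(key)):
            data = peer.blocks.get(key)
            # Verify the bytes against the key before accepting them.
            if data is not None and hashlib.sha256(data).hexdigest() == key:
                self.blocks[key] = data
                self.network.advertise(key, self)  # now a second provider
                return data
        raise KeyError(key)


net = ToyNetwork()
alice, bob, carol = (ToyNode(n, net) for n in ("alice", "bob", "carol"))
key = alice.add(b"hello ipfs")
bob.get(key)  # bob asked for it, so bob now holds a copy; carol holds nothing
```

In this model, redundancy is simply the number of nodes that have requested and kept the content, which is why the answer above comes down to having more nodes pin it.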

    Work in Progress

@longears

longears commented Oct 31, 2015

So right now I could discover a random IPFS node, enumerate its hashes, and download the content?

This means if I want to back up my own secret files I need to either

  • encrypt them before they go into IPFS
  • or run my own private IPFS network by changing the bootstrap nodes

For the second case, is communication between IPFS nodes encrypted on the wire or should I encrypt my files anyway to avoid eavesdroppers?

I get that IPFS is mostly designed for public content. Just want to understand what precautions to take for secret content with IPFS as it is today. It sounds like "don't tell people your hashes" is not enough. :)

@jbenet

Member Author

jbenet commented Nov 1, 2015

> enumerate its hashes,

You can't quite enumerate the hashes, but you can maybe find provider records for the hashes a node is willing to serve.

> This means if I want to back up my own secret files I need to either
>
>   • encrypt them before they go into IPFS
>   • or run my own private IPFS network by changing the bootstrap nodes

That's right. Though we'll have encryption built in soon.

> For the second case, is communication between IPFS nodes encrypted on the wire or should I encrypt my files anyway to avoid eavesdroppers?

Encrypted, but not yet audited, so beware. We'll upgrade our security advertisements as we test + audit the pieces; it's better to claim less for now. Still, I consider it already safer than most HTTP (and even some HTTPS) traffic, given that HTTP traffic is not encrypted (and HTTPS does not check the content itself against a hash, the way content addressing does).

> I get that IPFS is mostly designed for public content. Just want to understand what precautions to take for secret content with IPFS as it is today.

It isn't designed only for public content; it's designed for private content too. We just don't have encryption in yet. But yes, definitely pre-encrypt anything personal.

> It sounds like "don't tell people your hashes" is not enough. :)

Certainly not!

@randomshinichi

randomshinichi commented Feb 1, 2016

> The content is not sent to other nodes until they explicitly request it, though of course some content may already exist in the system (content-addressing).

Does this mean that unless someone else explicitly downloads a hash of my file, I will be the only one who has a complete copy of the file on IPFS, and all the other nodes may only have a few blocks of my file?

@lgierth

Member

lgierth commented Feb 1, 2016

> Does this mean that unless someone else explicitly downloads a hash of my file, I will be the only one who has a complete copy of the file on IPFS, and all the other nodes may only have a few blocks of my file?

Yes. Nodes won't fetch anything unless told to.

@dimitarvp

dimitarvp commented Jan 7, 2017

I'm wondering if there's an enhanced BitTorrent-like mode planned for IPFS -- for example if I add a publicly-accessible and legal big blob of data, the IPFS network will automatically ensure that at least 2 other nodes have my content as well. This could probably be called "redundancy guarantee mode" or something, and it won't be the default mode in which your node will be running. You'll have to go out of your way to activate it.

One of the uses I would have for IPFS is exactly this: redundancy and cooperation. Say we host our own Dropbox-like directories, fully encrypted (or DB backups, or fully cached smaller websites, or legally owned and DRM'd movies/music, etc). For this to work, part of the network volunteers would donate disk space and machine uptime for it. I know that I certainly will volunteer if IPFS gains this capability.

(Example: Storj, and maybe MaidSafe as well.)

I am well aware this isn't the original goal of IPFS. I am simply wondering if anybody amongst the designers or implementors ever had this idea.

@hsanjuan

hsanjuan commented Jan 13, 2017

@dimitarvp See https://github.com/ipfs/ipfs-cluster/ , particularly user-story issues. Feel welcome to add your own user-story or to contribute to an existing one, as these will shape ipfs-cluster development.

@dimitarvp

dimitarvp commented Jan 13, 2017

@hsanjuan Thank you.

@flyingzumwalt

Contributor

flyingzumwalt commented May 23, 2017

@arni077

arni077 commented Jul 9, 2018

@lgierth When you say "download a hash of my file", does that mean a node just visited an ipfs website? Or do you have to pin a file in order to download it?

@Stebalien

Stebalien commented Jul 9, 2018

Visiting will download it (well, at least some of it).

@arni077

arni077 commented Jul 9, 2018

@Stebalien In order to be a node, do I have to run a daemon on the command line?

@Stebalien

Stebalien commented Jul 9, 2018

Either that or use ipfs-desktop.
