IPFS & SWARM

also here https://ethereum.stackexchange.com/questions/2138/what-is-the-difference-between-swarm-and-ipfs

SWARM -- IPFS

The question about the similarities/differences between Ethereum's swarm and IPFS and their future plans including various levels of collaboration comes up very regularly. I will try to do justice to this great interest by giving a rather detailed answer (This article will be heavily edited and provide even more technical detail over time as I make more use of my scattered unedited notes).

Disclaimer: As I am the main author of swarm this document is written from my personal perspective and is thus inevitably biased from my point of view. It is a reflection of my understanding of IPFS, which is incomplete or possibly incorrect and subject to change in the future.

tl;dr Swarm and IPFS

The two projects can be classified as "same, same but different": the big-picture perspective and high-level vision connect the two projects under various banners of the next webz.
The two projects are close allies and very open to collaboration on both technical and marketing fronts.
In order to push the highly overlapping agenda of both projects, and to benefit most effectively from both projects' resources, various degrees of integration of the actual technologies are proposed.

Within this document, after summarizing the strikingly aligned motivational context and high-level similarities between the two projects, I will discuss the differences with regard to organizational, ideological and technological aspects of both projects. Finally, I will propose multiple potential avenues for collaboration and technology integration, spelling out the pros and cons of each.

Similarities

tl;dr Swarm and IPFS both offer comprehensive solutions for an efficient decentralized storage layer for the next-generation internet. Both the high-level goals and the technology used are very similar. Both swarm and IPFS aspire to provide:

A generic decentralized distributed storage solution.
Content delivery protocol.

This is carried out by creating a network of cooperating nodes, each running a client conforming to a rigorously defined communication protocol used for storage and retrieval of arbitrary content. Leveraging individual participants' surplus storage and bandwidth, the network nodes collectively provide a serverless hosting platform.

Both projects:

Aspire to offer a layer of (monetary) incentivization for participating nodes to encourage healthy operations and/or insurance/reassurance :) to users as well as to provide compensation for resource use.
Use some sort of block storage model where larger documents are chopped up and the pieces can be fetched in parallel.
Provide integrity protection by content addressing (also for encrypted partial content).
Offer URL schemes and decentralized domain name resolution.
Provide transparent and efficient mapping of file system directories to sets of storage objects.
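
The block storage model and content addressing listed above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 4096-byte chunk size, the use of SHA-256 as the address function, and the in-memory dictionary standing in for a network of nodes. Neither swarm's nor IPFS's actual chunking or hashing scheme is reproduced.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # assumed chunk size in bytes, chosen for illustration only


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Chop a document into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def address(chunk: bytes) -> str:
    """Content address of a chunk: here simply the hex SHA-256 digest."""
    return hashlib.sha256(chunk).hexdigest()


def store(data: bytes, chunkstore: Dict[str, bytes]) -> List[str]:
    """Store every chunk under its own address; return the ordered addresses."""
    addresses = []
    for chunk in split_into_chunks(data):
        key = address(chunk)
        chunkstore[key] = chunk
        addresses.append(key)
    return addresses


def retrieve(addresses: List[str], chunkstore: Dict[str, bytes]) -> bytes:
    """Fetch chunks by address, verifying each one before reassembly."""
    parts = []
    for key in addresses:
        chunk = chunkstore[key]
        if address(chunk) != key:
            raise ValueError("integrity check failed for chunk " + key)
        parts.append(chunk)
    return b"".join(parts)
```

Because each chunk is verified against its own address, the pieces can be fetched independently (and in parallel) from untrusted nodes, and a tampered chunk is detected without needing the rest of the document.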

As a result, both are in principle ideally suited for replacing the data layer of the current broken internet and serving as storage layers for the web3 vision (alongside other kin ventures, notably zeronet, Maidsafe, i2p, storj, etc.), with the usual must-have properties of distributed document storage:

Low-latency retrieval.
Efficient auto-scaling (content caching).
Reliable, fault-tolerant operation, resistant to node disconnections and intermittent availability through redundant storage.
Zero-downtime.
Censorship-resistant.
Potentially permanent versioned archive of content.

Differences

tl;dr Subtle but important differences in both projects' design will likely keep the two projects steady and separate on their own relative tracks. Since the big picture and the high-level solution are so magically aligned, the differences can be found elsewhere. I will group them under:

(A) Development status/popularity/user base. (B) Philosophical/ethical/political. (C) Lower-level technicalities.

(A) Status

tl;dr: IPFS is much further along in code maturity, scaling, adoption, community engagement and interaction with a dedicated developer community. Yet swarm's place in the Ethereum ecosystem translates to inherent infrastructural advantage.

both IPFS and swarm are fully open source and the reference implementations are written in the Go language (swarm has an outdated Java version, IPFS has a JavaScript one)
both IPFS and swarm are alpha software that has not yet reached a production release
IPFS has been proven to scale quite reasonably, swarm is just starting to be tested on a larger scale (though swarm is built on top of devp2p, the Ethereum p2p networking layer, which itself barely needs testing)
IPFS has had its product open for longer and has recruited a decent user base; swarm has not really come out yet, the POC release series just started this year
IPFS has a lot of material out: videos, good docs, papers. Swarm has two devcon talks, scattered docs and two papers about the incentives (the first two in the ethersphere orange paper series) to be published mid-April, and a swarm guide is in the making
IPFS has a working network (no incentivization), swarm just recently launched the first stage of its developer testnet
IPFS already serves as a working solution for real-world businesses and is supported by an enthusiastic user base
swarm benefits from strong synergy with Ethereum, its promising ecosystem, live network of users and its organisational background in the form of reliable continued funding from the non-profit foundation. IPFS also has reliable funding sources and is also used and supported by members of the Ethereum community.

Despite strong voices from the community disapproving of reinventing the wheel, swarm as a comprehensive in-house solution suffered through and survived the toughest of times: the austerity measures of autumn 2015, caused by the financial difficulties of the foundation, slowed development. The favourable circumstances in 2016 made our original vision realistic once again, and development has seen a new surge, likely to be further confirmed by an expanding dev team. Admittedly biased, I am convinced that building our own hand-tailored system is a winning ticket: it enables such a pivotal component of web3 to quickly and flexibly adapt and co-evolve with Ethereum (EVM), with its governance and funding aligned with that of Ethereum.

It is crucial that swarm's privileged infrastructural/organisational status should not be by itself the deciding factor in the predominant adoption among available alternatives when web3 comes to the masses. My intention is that users' choice be based on inherent merits of the particular technology and that the selection is not unduly restricted by arbitrary choices/limitations of Ethereum (e.g., use of devp2p network layer, see below).

Conversely, by bringing more and more discussions about our roadmap to the public, we aspire to counterbalance the advantage IPFS has from having been around longer. Maturity has a rightful place in choosing a technology if you need it now, so the discussion here is relevant to developers with medium-to-long-term plans. Hopefully, by the time both projects are out with production-ready release candidates, the differences in this section will have become insignificant, letting features, efficiency and ease of use dominate the evaluation.

(B) philosophical

tl;dr: no critical mismatch but different enough to predict and justify parallel evolution of the two projects.

Advice: In many places of the world, the cartels of information copyrighting or advocates of restricted freedom of information have the resources to come after you. If the cause of total transparency and unimpeded support for freedom of information is important to you (be it on moral or opportunistic grounds), consider supporting swarm.

Swarm is very specifically meant to be part of the Ethereum ecosystem. From the outset, it was conceived of as one of the three pillars of the next webz: together with Ethereum and Whisper it forms the holy trinity of web3 components. Its development is guided and inspired by Ethereum's needs (most importantly the need to host dapps, contract source/metadata and the blockchain/state/etc). It is developed in the context of Ethereum's capabilities (including potential limitations) and, as long as it is funded by the foundation, it is guaranteed to cater for specific uses arising in the Ethereum ecosystem.

Meanwhile IPFS is a unifying solution catering for integrating many existing protocols. In this respect

Swarm has a very strong anti-censorship stance. It incentivizes content-agnostic collective storage (block propagation/distribution scheme). It implements plausible deniability with implausible accountability through a combination of obfuscation and double masking (not currently done). IPFS believes that wider adoption warrants compromising on censorship resistance by providing tools for blacklisting and source filtering, though using these is entirely voluntary.

(C) technicalities

tl;dr:

swarm's core storage component is an immutable content-addressed chunkstore rather than a generic DHT (distributed hash table).
you can upload to swarm and use it as cloud hosting; in IPFS you can only register/publish content already on your hard drive.
the two systems use different network communication layers and peer management protocols
swarm has deep integration with the Ethereum blockchain and the incentive system benefits from both smart contracts and the semi-stable peer pool. Filecoin, a planned incentivised network over IPFS, aims to use its own altcoin blockchain, with proof of retrievability as part of mining. The consequences of these choices are far reaching.

These properties play a big role in the low-level differences.

devp2p vs libp2p:

swarm heavily relies on the Ethereum p2p network, using ethereum's devp2p (protocol multiplexing, message interleaving by framing, encryption, authentication, handshake and protocol message API standard, peer connection management support, node discovery) and leverages its robustness and most notably inherits its (audited and widely-praised) security properties.

IPFS uses the libp2p network layer, a similarly advanced generic p2p solution. It is an in-house development based on the mainline bittorrent DHT implementation that stood the test of time, improved by state-of-the-art optimisations. For historical accuracy, it seems that devp2p is heavily inspired by libp2p (Devcon0 IPFS talk in Nov 2014 in Berlin and an earlier exchange between Juan Benet (IPFS) and Gav Wood & Alex Leverington (ETH)).

The Ethereum devp2p provides a semipermanent connection pool over TCP. As a result of the Ethereum ecosystem, many nodes are committed long term. These properties turn out to support relatively novel solutions in both incentivisation and storage/retrieval.

Swarm, just like IPFS, implements key-based routing based on xor logarithmic distance (applied to the shared address space of node ids and content hashes). However, swarm uses a hybrid flavour of forwarding kademlia: rather than iterative lookups and filtering performed by the originator of a request relying on a larger pool of peers, swarm recursively outsources the successive steps of the lookup and uses only a smaller pool of active connections. Further aspects of this algorithm are overly technical for the scope of this note.
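
To make the routing idea above more concrete, here is a toy sketch of xor distance and a forwarding-style lookup. It is not swarm's actual algorithm: the 8-bit address space, the static `connections` table and the function names are all assumptions made for illustration. Each node simply hands the request to whichever of its connected peers is closest to the target, instead of the originator querying peers iteratively.

```python
from typing import Dict, List

ADDRESS_BITS = 8  # assumed toy address width; real address spaces are far larger


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance between two addresses."""
    return a ^ b


def proximity_order(a: int, b: int, bits: int = ADDRESS_BITS) -> int:
    """Number of leading bits two addresses share (the 'logarithmic' view)."""
    return bits - xor_distance(a, b).bit_length()


def forward(target: int, start: int, connections: Dict[int, List[int]]) -> List[int]:
    """Follow a request hop by hop until no connected peer is closer to target.

    `connections` maps each node to its small pool of active peers.
    Returns the list of nodes the request passed through.
    """
    path = [start]
    current = start
    while True:
        candidates = connections.get(current, []) + [current]
        best = min(candidates, key=lambda node: xor_distance(node, target))
        if best == current:  # nobody closer: this node is responsible
            return path
        current = best
        path.append(current)
```

The xor distance to the target strictly decreases at every hop, so the walk always terminates. A usage example: with `connections = {1: [128, 64], 128: [192, 1], 192: [208, 128], 208: [192]}`, calling `forward(0b11010110, 1, connections)` passes the request through nodes 1, 128, 192 and 208.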

Swarm is a content-addressed chunk archive while IPFS is more akin to bittorrent with a content-addressed DHT (distributed hash table). A DHT is a distributed index which decentralised storage solutions use to look up content-addressed data. While this data is usually (as in IPFS) metadata about downloading the content, in swarm it is the content itself. Note that the DHT is just one available protocol in IPFS (IPFS's layered design is highly modular). This strict interpretation of an immutable content-addressed chunkstore is a major design feature of swarm, which together with devp2p allows swarm to do:

efficient pairwise accounting offchain (used for fair incentivisation of bandwidth as well as immediate settlement of insured storage)
smoother automatic scaling on popular content
quasi-anonymous browsing
efficient collective auditing of integrity (on rarely accessed content) offchain

Juan commented here that the kademlia DHT is just an optional routing component in IPFS and that it can actually do all these things. (My homework to figure out whether and how).
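
The distinction drawn above, between a DHT that indexes metadata about where content can be downloaded and a chunkstore that holds the content itself, can be sketched as two toy data structures. Both classes, their method names and the use of SHA-256 are hypothetical; they illustrate the contrast as described in this note and are not the API of either project.

```python
import hashlib
from typing import Dict, List, Set


def content_hash(data: bytes) -> str:
    """Assumed address function for this sketch: hex SHA-256 of the data."""
    return hashlib.sha256(data).hexdigest()


class ProviderIndex:
    """DHT-style index: content hash -> peers that announced they hold it."""

    def __init__(self) -> None:
        self.providers: Dict[str, Set[str]] = {}

    def announce(self, data: bytes, peer: str) -> str:
        """Register that `peer` serves `data`; the data stays on that peer."""
        key = content_hash(data)
        self.providers.setdefault(key, set()).add(peer)
        return key

    def lookup(self, key: str) -> List[str]:
        """Return who to download from, not the content itself."""
        return sorted(self.providers.get(key, set()))


class ChunkStore:
    """Immutable content-addressed store: content hash -> the content."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Upload the data itself; an existing key is never overwritten."""
        key = content_hash(data)
        self.chunks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        """Return the content directly."""
        return self.chunks[key]
```

In the first structure a lookup yields a second step (contact a provider); in the second the lookup and the retrieval are the same operation, which is the property the note credits for uploads and per-chunk accounting.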

I believe both IPFS and swarm will offer streaming of encrypted content with integrity protection (even on partial content).

Incentives

Filecoin is a sister project of IPFS; it adds an incentivisation layer to IPFS and relies on its own altchain. Proof of retrievability "mining" on the filecoin blockchain is a scheme providing ongoing compensation to storers for preserving content. Random audits as part of the proof of work task are responded to with proof of retrievability and the winning miners get compensated accordingly. Such a system has inherent limitations: IPFS can only implement positive incentives and relies on collective responsibility.

Swarm exploits the full capabilities of smart contracts to handle registered nodes with a deposit to stake. This allows for punitive measures as deterrents. Swarm provides a scheme to track responsibilities, making storers individually accountable for particular content.

IPFS has no guarantee of storage, while swarm enforces content-agnostic behaviour and offers content-specific levels of security flexibly adjustable by the users. Juan commented here that they have been adding something like this to Filecoin, which will also have a smart-contract blockchain, but these are as yet unpublished ideas and plans.

Swarm will implement efficient automated collective auditing of rarely-accessed content off chain with last-resort litigation on the blockchain as part of content insurance (a crucial feature). Using a pairwise accounting protocol and delayed micropayments off-chain swarm offers substantial savings on transaction costs while maintaining security. IPFS+filecoin's reliance on competitive proof of custody mining means excessive use of the blockchain and an inherently redundant use of resources for normal operation.

With pairwise accounting, delayed payments, and collective audits all off-chain, swarm relies a lot less heavily on the blockchain, restricting its use to registration and last-resort litigation.
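
As a rough illustration of pairwise accounting with delayed payments, the sketch below keeps a per-peer balance of service units and only issues a payment once the debt crosses a threshold. The class, its method names, the threshold value and the idea of recording payments in a list are all hypothetical simplifications; the actual swarm incentive protocol is specified in the orange papers and is not reproduced here.

```python
from typing import Dict, List, Tuple


class PeerLedger:
    """Tracks, per peer, how many service units this node owes (positive)
    or is owed (negative). Nothing here touches a blockchain."""

    def __init__(self, payment_threshold: int) -> None:
        self.payment_threshold = payment_threshold  # assumed tunable limit
        self.balances: Dict[str, int] = {}
        self.payments: List[Tuple[str, int]] = []  # delayed settlements sent

    def received_from(self, peer: str, units: int) -> None:
        """We consumed service (e.g. retrieved chunks) from `peer`."""
        self.balances[peer] = self.balances.get(peer, 0) + units
        self._settle_if_needed(peer)

    def served_to(self, peer: str, units: int) -> None:
        """We provided service to `peer`, which offsets what we owe them."""
        self.balances[peer] = self.balances.get(peer, 0) - units

    def _settle_if_needed(self, peer: str) -> None:
        """Pay only when the accumulated debt reaches the threshold."""
        owed = self.balances[peer]
        if owed >= self.payment_threshold:
            self.payments.append((peer, owed))
            self.balances[peer] = 0
```

As long as two peers use each other's services roughly equally, their balances cancel out and no payment is needed at all, which is where the savings on transaction costs come from in this simplified picture.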

Manifests

Finally the swarm 'manifest' concept (universal routing table/key-value index with integrity protection) allows for

modeling hierarchical file systems on the cloud
serverless servers with routing table and a principled system of metadata (content type, encryption and insurance info etc)
implementing arbitrary DHTs within swarm, so it can support "sidechains" or the db component of traditional webapps (cf. MySQL in the LAMP stack, etc.)

Juan comments here that IPFS can do all these things too. Admittedly both projects are a bit handwavy here without working code or docs...
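
The manifest idea (a key-value index with integrity protection, mapping paths to content plus metadata) can be sketched in a few lines. The JSON layout, the field names `hash` and `content_type`, the use of SHA-256 and the in-memory `store` dictionary are assumptions made for this illustration only, not swarm's actual manifest format.

```python
import hashlib
import json
from typing import Dict, Tuple


def put(store: Dict[str, bytes], data: bytes) -> str:
    """Store data under its content hash and return that hash."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def build_manifest(store: Dict[str, bytes],
                   files: Dict[str, Tuple[bytes, str]]) -> str:
    """Store each file, then store an index of path -> hash and metadata.

    Returns the hash of the manifest itself, which acts as the root
    address of the whole directory tree.
    """
    entries = {
        path: {"hash": put(store, body), "content_type": content_type}
        for path, (body, content_type) in files.items()
    }
    blob = json.dumps(entries, sort_keys=True).encode("utf-8")
    return put(store, blob)


def resolve(store: Dict[str, bytes], root: str, path: str) -> Tuple[bytes, str]:
    """Look up a path in the manifest and return (content, content type)."""
    manifest = json.loads(store[root].decode("utf-8"))
    entry = manifest[path]
    body = store[entry["hash"]]
    if hashlib.sha256(body).hexdigest() != entry["hash"]:
        raise ValueError("integrity check failed for " + path)
    return body, entry["content_type"]
```

Since the manifest is itself content addressed, changing any file changes its hash, which changes the manifest and therefore the root address; a single root hash thus protects the integrity of the whole collection.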

Integration and collaboration

tl;dr though the big picture aligns well, there are challenges in integrating the two projects. A few proposals are outlined offering pros and cons of each.

Juan of IPFS has long been a regular at Ethereum events, and the two communities seem to be mutually appreciative. The similarity of the two endeavors, both in technical detail and with respect to the generic objective, led many to ask whether the two efforts could and should be somehow unified or coordinated. This has led some to believe that swarm is 'just an incentive layer on top of arbitrary decentralised storage'. This misconception may also have been encouraged by the fact that during the tough times last autumn, the foundation had to limit the scope of the project and Vitalik (quite wisely) decided that development effort should be shifted to include incentivisation but exclude R&D for parts that are taken care of by IPFS. The motivation for these restrictions no longer exists. The philosophical/marketing differences, organisational affiliations, business models and technical dependencies make it very unlikely that the two projects will become one any time soon.

Integration of IPFS into swarm

After a lot of thinking and going through IPFS docs and code, I felt like committing to some level of integration. What follows is a preliminary list of various (admittedly rather speculative) approaches.

implement the IPFS plugin into swarm as an implementation of the cloud store abstraction of its network protocol
- Pros:
  - implementing this requires the least development effort
  - As I see now, this approach would allow the swarm incentive system to drive IPFS nodes.
  - This approach promises ways to test the correctness and benchmark the efficiency of the two systems
- Cons:
  - unclear if the swarm chunk-based storage enforced on IPFS style routing and retrieval would have any performance gain
  - unclear if an ethereum-based incentive on top of IPFS would impact the IPFS user base etc.
  - unclear if using IPFS libp2p (alongside devp2p) adds security risk and/or makes network traffic monitoring/load balancing harder
  - unclear if this (ab)use of an IPFS component lends itself to a realistic way of coordinating the parallel development of IPFS and Swarm (here I am thinking of planning software updates etc). Juan comments here that he thinks it would be just fine ;)
This first solution is very likely to be pursued due to its optimal gain/effort ratio
work out and implement a simpler Ethereum based incentive layer better suited for IPFS.
- Pros:
  - integrity of the entire mechanics of IPFS is preserved
  - the parallel development of IPFS and Swarm by separate teams on different schedules makes it somewhat more realistic to maintain long term than the first proposal
- Cons:
  - Such an incentive layer (for IPFS) does not yet exist and frankly I personally find it hard to see it done properly any time soon
  - The couple of crude ideas I had for a workable incentive scheme for IPFS would entail compromise on the desired spec for a permanent web incentive system
some crazy ideas for the lulz:
- implement filecoin as a swarm sidechain
- consider migrating ethereum to libp2p and use IPFS with all its glory
Integrating swarm into IPFS as a subprotocol
- Pros:
  - it's kind of cool
- Cons:
  - real benefits unclear
Mounting IPFS over the transport layer (RLPx) of devp2p - This has been suggested to me very recently by Juan and we have not fully captured and investigated the prerequisites or consequences of this
- Pros:
  - IPFS could potentially be used in all its glory
  - the devp2p integrity would be preserved
- Cons:
  - ??
Cross-pollination of ideas and snatching code: Even with all the above stronger forms of collaboration failing, I still see a possible synergistic effect mutually benefiting the two projects
- go implementation allows quick adoption of relevant ideas now and in the future without pushing for strings attached
- both IPFS and Ethereum being quite sexy projects, they strengthen each other's PR and increase the chances of success
- given the incredible traction and rate of innovation in the crypto 2.0 / web3 space, it is highly likely that some of my proposals above will change and new opportunities present themselves.
All in all there is a solid basis to continue being friends and use each other's resources in whatever (morally sound) way to promote the shared objective of a free, private, resource-efficient serverless web.

Next step: April 27-28, Berlin: IPFS-Swarm synergistic entanglement & interplanetary cross-pollination

All the opinions and mistakes are mine only and I welcome feedback.

Resources

The original Ethereum Stack Exchange question: http://ethereum.stackexchange.com/questions/2138/what-is-the-difference-between-swarm-and-ipfs

IPFS/SWARM on reddit:

IPFS

Swarm

ÐΞVcon talks on swarm

ETHERSPHERE orange papers

Viktor Trón, Aron Fischer, Daniel A Nagy and Zsolt Felföldi: swap, swear and swindle: incentive system for swarm. pdf|html|bibtex entry
Viktor Trón, Aron Fischer, Daniel Varga: smash-proof: auditable storage for swarm secured by masked audit secret hash. pdf|html|bibtex entry

code and status

follow swarm

@ethershere on twitter
gitter swarm room
swarm on swarm: bzz://swarm public gateway (thanks to @TerekJudi|@uwaterloo) once testnet is public
