# Chapter 30: Decentralized Storage

---

Blockchains excel at storing state and executing logic, but they are not designed for storing large files—images, videos, documents, or even extensive metadata. Storing such data on-chain would be prohibitively expensive and inefficient. This is where **decentralized storage** networks come in. They provide a way to store data off-chain while maintaining the benefits of decentralization: censorship resistance, verifiability, and persistence. In this chapter, we'll explore the leading decentralized storage solutions—IPFS, Filecoin, and Arweave—and learn how to integrate them into your DApps for storing NFTs, user content, and application data.

---

## 30.1 The Need for Decentralized Storage

### 30.1.1 Limitations of On-Chain Storage

Storing data directly on a blockchain like Ethereum has severe limitations:

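- **Cost**: Writing data to Ethereum costs gas. Each new 32-byte storage slot costs about 20,000 gas, so storing 1 KB costs roughly 640,000 gas (at 20 gwei that is 0.0128 ETH, or about $25 at an ETH price of $2,000). Storing a 1 MB image the same way would cost tens of thousands of dollars and would need far more gas than a single block allows.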
- **Scalability**: Even if cost were not an issue, the block gas limit restricts how much data can be included in a single block. Large files would bloat the blockchain and make it harder for nodes to sync.
- **Privacy**: On-chain data is public to everyone, which is not always desirable.
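
The arithmetic behind such cost estimates can be sketched in a few lines. This is a rough model, assuming 20,000 gas per new 32-byte storage slot; the function name, the gas price, and the ETH price are illustrative inputs, not fixed values.

```javascript
// Assumptions: 20,000 gas to write each new 32-byte storage slot,
// and caller-supplied gas and ETH prices (the $2,000 below is hypothetical).
const GAS_PER_SLOT = 20000
const BYTES_PER_SLOT = 32

function estimateStorageCost(bytes, gasPriceGwei, ethPriceUsd) {
  const slots = Math.ceil(bytes / BYTES_PER_SLOT)
  const gas = slots * GAS_PER_SLOT
  const eth = (gas * gasPriceGwei) / 1e9 // 1 ETH = 1e9 gwei
  return { slots, gas, eth, usd: eth * ethPriceUsd }
}

const oneKb = estimateStorageCost(1024, 20, 2000)
console.log(oneKb.gas) // 640000

const oneMb = estimateStorageCost(1024 * 1024, 20, 2000)
console.log(oneMb.gas) // 655360000
```

The 1 MB figure is more than an order of magnitude above what fits in one block, which is why the scalability limit applies even when cost is ignored.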

Thus, applications that need to store files—NFT metadata, profile pictures, documents, etc.—must look elsewhere.

### 30.1.2 Decentralized vs. Centralized Storage

Centralized storage (AWS S3, Google Cloud) is cheap and fast, but it introduces a single point of failure and trust. The provider can delete, censor, or alter data, and the service may go down.

Decentralized storage aims to replicate the benefits of blockchains—no central control, data integrity, and availability—while offering low-cost, scalable storage.

**Comparison:**

| Feature | Centralized (AWS S3) | Decentralized (IPFS/Filecoin) |
|---------|----------------------|-------------------------------|
| **Control** | Single company | Community / protocol |
| **Censorship resistance** | Low (provider can remove data) | High (data replicated across peers) |
| **Cost** | Low (pay-per-use) | Very low to zero (incentive-based) |
| **Speed** | Fast (CDN) | Variable (depends on peers) |
| **Persistence** | As long as you pay | Can be permanent (Arweave) or incentive-driven (Filecoin) |
| **Verifiability** | Trust-based (you trust the provider to return the right data) | Content-addressed (hash ensures integrity) |

Decentralized storage is essential for truly decentralized applications.

---

## 30.2 IPFS (InterPlanetary File System)

IPFS is a peer-to-peer hypermedia protocol for storing and sharing data in a distributed file system. It's not a blockchain; it's a network protocol where files are addressed by their content, not their location.

### 30.2.1 Content Addressing

In traditional HTTP, you address a file by its location (URL: `https://example.com/cat.jpg`). If the server moves the file or goes down, the link breaks. In IPFS, files are addressed by a **CID** (Content Identifier), which is derived from a cryptographic hash of the content. For example: `ipfs://QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco`

**Properties of content addressing:**
- **Integrity**: If the file changes, the hash changes, so you always get exactly what you requested.
- **Deduplication**: Identical content always produces the same CID, so the network treats copies uploaded by different users as one object, and a node stores each block only once.
- **Persistence**: As long as at least one node hosts the file, it remains accessible.
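
The integrity property can be illustrated by deriving an identifier from the content itself. The sketch below builds a CIDv1 by hand, assuming the `raw` codec and a SHA-256 multihash; the helper names `base32` and `rawCid` are hypothetical. The default `ipfs add` wraps data in UnixFS before hashing, so it generally prints a different CID for the same bytes.

```javascript
const { createHash } = require('node:crypto')

// RFC 4648 base32, lowercase, no padding (the multibase "b" encoding).
function base32(bytes) {
  const alphabet = 'abcdefghijklmnopqrstuvwxyz234567'
  let bits = 0
  let value = 0
  let out = ''
  for (const byte of bytes) {
    value = (value << 8) | byte
    bits += 8
    while (bits >= 5) {
      out += alphabet[(value >>> (bits - 5)) & 31]
      bits -= 5
    }
    value &= (1 << bits) - 1 // keep only the bits not yet emitted
  }
  if (bits > 0) out += alphabet[(value << (5 - bits)) & 31]
  return out
}

// CIDv1 = multibase prefix + version + codec + multihash (code, length, digest)
function rawCid(content) {
  const digest = createHash('sha256').update(content).digest()
  const header = Buffer.from([0x01, 0x55, 0x12, 0x20]) // v1, raw, sha2-256, 32 bytes
  return 'b' + base32(Buffer.concat([header, digest]))
}

const a = rawCid('hello ipfs')
const b = rawCid('hello ipfs')
const c = rawCid('hello ipfs!')
console.log(a === b) // true: same bytes, same identifier
console.log(a === c) // false: any change yields a different identifier
```

Because the identifier is a function of the bytes alone, anyone who retrieves the data can recompute it and detect tampering without trusting the node that served it.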

**How to get a file from IPFS:**
- Use an IPFS gateway: `https://ipfs.io/ipfs/QmXoypizj...`
- Run your own IPFS node and fetch via `ipfs get <CID>`.
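
Browsers without native IPFS support need an `ipfs://` URI rewritten into a gateway URL. A minimal sketch of that rewrite follows; `toGatewayUrl` is a hypothetical helper, and the default gateway is just one public option.

```javascript
// Assumption: the default gateway below is one public option among many.
function toGatewayUrl(uri, gateway = 'https://ipfs.io') {
  if (!uri.startsWith('ipfs://')) {
    throw new Error(`Not an ipfs:// URI: ${uri}`)
  }
  const path = uri.slice('ipfs://'.length) // "<CID>" or "<CID>/sub/path"
  return `${gateway.replace(/\/+$/, '')}/ipfs/${path}`
}

console.log(toGatewayUrl('ipfs://QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco'))
// https://ipfs.io/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco
```

Keeping the gateway as a parameter makes it easy to fall back to another gateway when one is slow or unavailable.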

### 30.2.2 IPFS Architecture

IPFS consists of several components:

- **Libp2p**: A modular network stack for peer-to-peer applications. Handles peer discovery, transport, and security.
- **IPLD (InterPlanetary Linked Data)**: A data model for content-addressed data structures, allowing linking between CIDs.
- **Bitswap**: A protocol for exchanging data blocks between peers.
- **MerkleDAG**: The data structure underlying IPFS; files are split into blocks and linked in a DAG.

```
IPFS Stack:
┌─────────────────┐
│   Applications  │  (browsers, DApps)
├─────────────────┤
│   IPFS API      │  (HTTP API, CLI)
├─────────────────┤
│   IPLD          │  (data model, linking)
├─────────────────┤
│   Libp2p        │  (networking, peer discovery)
├─────────────────┤
│   Transport     │  (TCP, QUIC, WebSockets)
└─────────────────┘
```

When you add a file to IPFS, it is split into blocks (typically 256 KB each). Each block is hashed, and the hashes are arranged in a DAG. The root CID points to the top-level block, which contains the list of child CIDs.
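
The chunk-and-link idea can be sketched as follows. This is a deliberately simplified two-level structure using bare SHA-256 hex digests; real IPFS encodes nodes as UnixFS/dag-pb and links them by CID, and `buildDag` is a hypothetical helper.

```javascript
const { createHash } = require('node:crypto')

const CHUNK_SIZE = 256 * 1024 // 256 KiB blocks

const sha256 = (data) => createHash('sha256').update(data).digest('hex')

// Simplified two-level DAG: one hash per chunk, plus a root hash
// computed over the ordered list of chunk hashes.
function buildDag(buffer) {
  const leaves = []
  for (let offset = 0; offset < buffer.length; offset += CHUNK_SIZE) {
    leaves.push(sha256(buffer.subarray(offset, offset + CHUNK_SIZE)))
  }
  const root = sha256(leaves.join(''))
  return { root, leaves }
}

const file = Buffer.alloc(600 * 1024, 1) // 600 KiB of sample data
const dag = buildDag(file)
console.log(dag.leaves.length) // 3 (256 + 256 + 88 KiB)

file[file.length - 1] = 2 // change one byte in the last chunk
const changed = buildDag(file)
console.log(changed.leaves[0] === dag.leaves[0]) // true: untouched chunk keeps its hash
console.log(changed.root === dag.root) // false: the root reflects the change
```

In this sample the first two chunks are byte-identical, so they share a hash; that is the same mechanism that lets a node store a repeated block only once.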

### 30.2.3 Uploading and Retrieving Files

**Using the IPFS CLI:**
```bash
# Add a file
ipfs add myimage.jpg
# Output: added Qm... myimage.jpg

# Retrieve the file through your local node
ipfs cat Qm... > copy.jpg

# Pin the file (ensure it stays on your node)
ipfs pin add Qm...
```

**Using JavaScript (ipfs-http-client):**
```javascript
import { create } from 'ipfs-http-client'

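// Point this at an IPFS HTTP API endpoint; hosted endpoints such as Infura
// generally also require an authorization header (omitted here)
const ipfs = create({ url: 'https://ipfs.infura.io:5001' })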

async function uploadFile(file) {
  const { cid } = await ipfs.add(file)
  console.log('CID:', cid.toString())
  return cid
}

async function retrieveFile(cid) {
  const chunks = []
  for await (const chunk of ipfs.cat(cid)) {
    chunks.push(chunk)
  }
  return Buffer.concat(chunks)
}
```

### 30.2.4 Pinning Services

IPFS is a peer-to-peer network; files are only available as long as at least one node hosts them. If your node is the only one hosting a file and it goes offline, the file becomes unreachable. **Pinning services** are third parties that promise to keep your files online, often for a fee.

Popular pinning services:
- **Pinata**: Easy-to-use API, web interface.
- **Infura IPFS**: Free tier with rate limits.
- **Web3.Storage**: Backed by Filecoin; originally offered a free tier.
- **NFT.Storage**: Specialized for NFTs and backed by Filecoin; originally free.

**Example using Pinata:**
```javascript
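import axios from 'axios'

// `file` is a File or Blob in the browser; in Node.js, use a stream with the `form-data` package
const formData = new FormData()
formData.append('file', file)

const response = await axios.post('https://api.pinata.cloud/pinning/pinFileToIPFS', formData, {
  headers: {
    // Let the HTTP client set Content-Type so the multipart boundary is included;
    // with older axios versions in Node.js, spread `formData.getHeaders()` here instead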
    pinata_api_key: process.env.PINATA_API_KEY,
    pinata_secret_api_key: process.env.PINATA_SECRET_KEY
  }
})
const cid = response.data.IpfsHash
```

Pinning services keep your files available through their gateways and the IPFS network for as long as the pin is maintained.

---

## 30.3 Filecoin

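Filecoin is a decentralized storage network that complements IPFS with an incentive layer and shares building blocks such as libp2p and IPLD with it. Miners (also called storage providers) earn FIL tokens by providing storage space, and users pay FIL to store data. Cryptographic proofs and collateral give miners an economic reason to keep data stored and retrievable for the duration of a deal.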

### 30.3.1 Incentive Layer for IPFS

While IPFS provides the protocol for content addressing and retrieval, it lacks incentives to keep data alive. Filecoin adds a blockchain and economic incentives:

- **Storage miners**: Commit storage space and prove they are storing data correctly via cryptographic proofs (Proof-of-Replication and Proof-of-Spacetime).
- **Clients**: Pay miners to store data for a certain duration.
- **Retrieval miners**: Provide fast retrieval (not yet fully implemented).

Filecoin shares IPFS's content addressing: data is identified by CIDs, and storage deals are recorded on the Filecoin blockchain.

### 30.3.2 Storage Market

The storage market is where clients and miners negotiate deals.

- **Deal**: A client proposes a deal to store a file with a specific miner for a duration, offering payment.
- **Sector**: Miners aggregate files into sectors (large batches) to prove storage.
- **Proofs**: Miners submit proofs (PoRep, PoSt) to the chain to verify they are storing the data.

**Process:**
1. Client imports data into a Filecoin client (e.g., Lotus).
2. Client proposes a deal to a miner.
3. Miner accepts, and data is transferred.
4. Miner seals the data into a sector and generates a proof.
5. Miner periodically submits proofs; if a proof is missed or invalid, the miner's collateral is slashed.
6. After the deal expires, the data may be deleted unless the deal is renewed.
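
The deal lifecycle above can be modeled as a small state machine. This is a toy model for intuition only: the state and event names are illustrative and do not correspond to the actual states used by Lotus or the Filecoin actors.

```javascript
// Toy state machine; state and event names are illustrative assumptions.
const TRANSITIONS = {
  proposed: { accept: 'transferring' },
  transferring: { seal: 'active' },
  active: { proofOk: 'active', proofMissed: 'slashed', expire: 'expired' },
  expired: { renew: 'proposed' },
  slashed: {}
}

function nextState(state, event) {
  const next = TRANSITIONS[state] && TRANSITIONS[state][event]
  if (!next) throw new Error(`Invalid transition: ${state} -> ${event}`)
  return next
}

// Replays a list of events starting from a freshly proposed deal.
function runDeal(events) {
  return events.reduce((state, event) => nextState(state, event), 'proposed')
}

console.log(runDeal(['accept', 'seal', 'proofOk', 'proofOk', 'expire'])) // expired
console.log(runDeal(['accept', 'seal', 'proofOk', 'proofMissed'])) // slashed
```

Modeling the flow this way makes the key guarantee visible: a deal stays `active` only while proofs keep arriving.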

### 30.3.3 Retrieval Market

Retrieval is a separate market where miners (or specialized retrieval miners) compete to serve data quickly. This is often done off-chain, with micropayments. Retrieval deals are not yet as mature as storage deals.

**Integration with IPFS:** Filecoin nodes can also serve data via IPFS, making retrieval efficient.

**Example: Using Web3.Storage (which uses Filecoin)**
```javascript
import { Web3Storage } from 'web3.storage'

const client = new Web3Storage({ token: process.env.WEB3STORAGE_TOKEN })

async function storeFiles(files) {
  const cid = await client.put(files)
  console.log('Stored CID:', cid)
  return cid
}
```

Web3.Storage automatically pins to IPFS and stores a copy on Filecoin, providing long-term persistence.

---

## 30.4 Arweave

Arweave is a decentralized storage network focused on **permanent** data storage. Users pay a one-time fee to store data forever. It achieves this through a novel structure called the **blockweave** and an economic model that funds miners indefinitely.

### 30.4.1 Permanent Storage

Unlike Filecoin, where deals run for a fixed term and must be renewed, Arweave aims for **permanent, one-payment** storage. The fee funds an endowment that pays miners over time. The model assumes that storage costs keep falling, so the endowment's purchasing power for storage grows faster than it is spent.
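
To see why a one-time payment can plausibly cover an unbounded period, consider a simplified model in which the yearly cost of storing the data falls by a fixed fraction each year. The yearly costs then form a geometric series with a finite sum. The function names and the 10% decline rate below are illustrative assumptions, not Arweave's actual pricing parameters.

```javascript
// Hypothetical inputs: firstYearCost is the cost of storing the data in year one;
// declineRate is the assumed yearly drop in storage cost (between 0 and 1).
function perpetualCost(firstYearCost, declineRate) {
  if (!(declineRate > 0 && declineRate < 1)) {
    throw new RangeError('declineRate must be between 0 and 1')
  }
  // Closed form of firstYearCost * (1 + (1 - r) + (1 - r)^2 + ...)
  return firstYearCost / declineRate
}

// Explicit sum over a finite number of years, for comparison.
function costOverYears(firstYearCost, declineRate, years) {
  let total = 0
  let yearly = firstYearCost
  for (let i = 0; i < years; i++) {
    total += yearly
    yearly *= 1 - declineRate
  }
  return total
}

console.log(perpetualCost(1, 0.1)) // about 10
console.log(costOverYears(1, 0.1, 50)) // about 9.95, already close to the limit
```

The slower the assumed decline, the larger the upfront payment must be, which is why the choice of decline rate is the critical assumption in this kind of model.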

**Use cases:**
- Permanent archives (e.g., historical records, academic papers).
- Immutable websites (permaweb).
- NFT metadata (ensuring it never disappears).

### 30.4.2 Blockweave Structure

Arweave uses a data structure called **blockweave**, which is similar to a blockchain but with a twist: each new block links not only to the previous block but also to a random previous block (the "recall block"). This incentivizes miners to store historical data to be able to mine new blocks.

**Key concepts:**
- **Proof of Access**: Miners must prove they have access to a random recall block from the weave, ensuring that data is widely replicated.
- **Storage incentive**: Miners that store more of the weave are more likely to hold the recall block, and are therefore more likely to win blocks.
- **Endowment**: Transaction fees go into an endowment that pays miners over time; it is sustained by the expected decline in storage costs.
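
The incentive created by the recall block can be illustrated with a toy simulation. This is not Arweave's consensus code: the way the recall index is derived, the function names, and the block labels are all simplifying assumptions made for illustration.

```javascript
const { createHash } = require('node:crypto')

// Toy model: derive the recall block index from the previous block's hash.
function recallIndex(prevBlockHash, height) {
  const digest = createHash('sha256').update(prevBlockHash).digest()
  return digest.readUInt32BE(0) % height // index in [0, height)
}

// A miner can attempt the next block only if it stores the recall block.
function canMine(storedBlocks, prevBlockHash, height) {
  return storedBlocks.has(recallIndex(prevBlockHash, height))
}

// Fraction of rounds in which a miner is eligible, given what it stores.
function participationRate(storedBlocks, height, rounds) {
  let eligible = 0
  for (let i = 0; i < rounds; i++) {
    if (canMine(storedBlocks, `block-${i}`, height)) eligible++
  }
  return eligible / rounds
}

const height = 100
const fullNode = new Set(Array.from({ length: height }, (_, i) => i))
const partialNode = new Set(Array.from({ length: 25 }, (_, i) => i))
console.log(participationRate(fullNode, height, 1000)) // 1
console.log(participationRate(partialNode, height, 1000)) // near 0.25, the fraction stored
```

Since the recall block is unpredictable, a miner's chance of being eligible tracks the share of the weave it stores, which rewards replication of old data.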

### 30.4.3 Profit Sharing Tokens (PSTs)

Arweave introduces Profit Sharing Tokens (PSTs), which allow developers to earn revenue from their applications. When a user pays a transaction fee to interact with an app, a portion can be distributed to PST holders. This creates a sustainable business model for permaweb apps.
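
How a fee reaches PST holders is up to the application. One approach, assumed here for illustration, is to send a small tip to a single holder chosen at random with probability proportional to their balance. The `balances` object and the `selectWeightedHolder` helper are hypothetical.

```javascript
// Hypothetical balances map: holder address -> PST balance.
// `rand` is a number in [0, 1); it is a parameter so results are reproducible.
function selectWeightedHolder(balances, rand = Math.random()) {
  const entries = Object.entries(balances)
  const total = entries.reduce((sum, [, balance]) => sum + balance, 0)
  if (total <= 0) throw new Error('No PST balances to select from')
  let threshold = rand * total
  for (const [address, balance] of entries) {
    threshold -= balance
    if (threshold < 0) return address
  }
  return entries[entries.length - 1][0] // guard against floating-point edge cases
}

const balances = { alice: 70, bob: 20, carol: 10 }
console.log(selectWeightedHolder(balances, 0.5)) // alice
console.log(selectWeightedHolder(balances, 0.75)) // bob
console.log(selectWeightedHolder(balances, 0.95)) // carol
```

Over many transactions, each holder's expected income is proportional to their share of the token supply, while each individual transaction needs only one extra transfer.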

### 30.4.4 Integrating Arweave

**Using Arweave JS SDK:**
```javascript
import Arweave from 'arweave'

const arweave = Arweave.init({
  host: 'arweave.net',
  port: 443,
  protocol: 'https'
})

async function uploadData(data, jwk) { // jwk is the wallet key (JSON Web Key)
  const transaction = await arweave.createTransaction({ data }, jwk)
  transaction.addTag('Content-Type', 'text/plain')
  await arweave.transactions.sign(transaction, jwk)
  const response = await arweave.transactions.post(transaction)
  if (response.status !== 200) {
    throw new Error(`Upload failed with status ${response.status}`)
  }
  console.log('Transaction ID:', transaction.id)
  return transaction.id
}

async function retrieveData(txId) {
  const data = await arweave.transactions.getData(txId, { decode: true })
  return data
}
```

**Bundlr Network** (since rebranded as Irys) provides a simpler way to upload to Arweave, with multiple currency options (pay in ETH, SOL, etc.) and faster uploads.

---

## 30.5 Integrating Decentralized Storage

Now let's see how to integrate these storage solutions into a DApp.

### 30.5.1 Storing NFT Metadata

NFTs typically have metadata (JSON) and an image. Both should be stored off-chain. The ERC-721 `tokenURI` function returns a URI pointing to the metadata. Storing the metadata and the image on IPFS/Filecoin/Arweave keeps them tamper-evident and available independently of any single server.

**Example metadata (JSON):**
```json
{
  "name": "My NFT #1",
  "description": "A unique digital collectible",
  "image": "ipfs://QmZbH...",
  "attributes": [
    { "trait_type": "Background", "value": "Blue" }
  ]
}
```

**Steps:**
1. Upload image to IPFS, get CID.
2. Create metadata JSON with `image` field pointing to the image CID.
3. Upload metadata JSON to IPFS, get CID.
4. Mint NFT with `tokenURI = ipfs://metadataCID`.
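
Steps 2 and 4 are plain string and object construction, sketched below. The helper names are hypothetical, and the CIDs are placeholders standing in for whatever your upload service returns in steps 1 and 3.

```javascript
// Hypothetical helpers; the CIDs passed in come from whichever upload service you use.
function buildNftMetadata({ name, description, imageCid, attributes = [] }) {
  if (!imageCid) throw new Error('imageCid is required')
  return { name, description, image: `ipfs://${imageCid}`, attributes }
}

function toTokenUri(metadataCid) {
  return `ipfs://${metadataCid}`
}

const nftMetadata = buildNftMetadata({
  name: 'My NFT #1',
  description: 'A unique digital collectible',
  imageCid: 'QmImageHash', // placeholder, not a real CID
  attributes: [{ trait_type: 'Background', value: 'Blue' }]
})
console.log(JSON.stringify(nftMetadata, null, 2)) // the JSON to upload in step 3
console.log(toTokenUri('QmMetadataHash')) // ipfs://QmMetadataHash
```

Using `ipfs://` URIs rather than gateway URLs in both places keeps the token independent of any particular gateway.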

### 30.5.2 Building a Decentralized Website (Permaweb)

Arweave can host permanent websites. The site's HTML, CSS, and JS files are uploaded together, and the resulting Arweave transaction ID gives the site its address on gateways such as `arweave.net/<txid>`.

**Example using Arweave Deploy** (check `arweave --help` for the exact subcommand and flags in your installed version):
```bash
arweave deploy ./dist --key-file wallet.json
```

You can also use **Akord** or **ArDrive** for easy uploads.

### 30.5.3 Code Examples

**Uploading NFT metadata to IPFS via Pinata (Node.js):**
```javascript
const axios = require('axios')

// Pins a JSON object to IPFS via Pinata and returns its CID
async function uploadJSONToIPFS(jsonData) {
  const url = 'https://api.pinata.cloud/pinning/pinJSONToIPFS'
  const response = await axios.post(url, jsonData, {
    headers: {
      'Content-Type': 'application/json',
      pinata_api_key: process.env.PINATA_API_KEY,
      pinata_secret_api_key: process.env.PINATA_SECRET_KEY
    }
  })
  return response.data.IpfsHash
}

// Usage (wrapped in an async function because CommonJS has no top-level await)
async function main() {
  const metadata = {
    name: 'MyNFT',
    description: 'Awesome NFT',
    image: 'ipfs://QmImageHash'
  }
  const cid = await uploadJSONToIPFS(metadata)
  console.log('Metadata CID:', cid)
}

main().catch(console.error)
```

**Uploading to Filecoin via Web3.Storage:**
```javascript
import { Web3Storage } from 'web3.storage'

// Read the API token from the environment rather than hard-coding it
const client = new Web3Storage({ token: process.env.WEB3STORAGE_TOKEN })

async function uploadToFilecoin(file) {
  const cid = await client.put([file], {
    name: 'my-file',
    maxRetries: 3
  })
  return cid
}
```

**Retrieving from Arweave in a frontend:**
```javascript
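async function getArweaveData(txId) {
  const response = await fetch(`https://arweave.net/${txId}`)
  if (!response.ok) {
    throw new Error(`Arweave gateway returned ${response.status}`)
  }
  // Assumes the stored data is JSON; use response.text() or response.blob() otherwise
  return response.json()
}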
```

---

## Chapter Summary

```
┌─────────────────────────────────────────────────────────────────┐
│                    CHAPTER 30 SUMMARY                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Decentralized storage is essential for DApps to store files    │
│  without relying on centralized servers.                        │
│                                                                 │
│  IPFS:                                                          │
│    • Content-addressed (CID)                                   │
│    • Peer-to-peer, no incentives                               │
│    • Pinning services ensure availability                      │
│                                                                 │
│  Filecoin:                                                      │
│    • Incentive layer on top of IPFS                            │
│    • Miners earn FIL for storing data                          │
│    • Proofs (PoRep, PoSt) ensure integrity                     │
│                                                                 │
│  Arweave:                                                       │
│    • Permanent storage (pay once)                              │
│    • Blockweave and Proof of Access                            │
│    • Permaweb for immutable websites                           │
│                                                                 │
│  Integration:                                                   │
│    • NFT metadata and images stored off-chain                  │
│    • Use libraries (ipfs-http-client, web3.storage, arweave-js)│
│    • Choose based on persistence needs: temporary (IPFS only),  │
│      incentivized (Filecoin), or permanent (Arweave)           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

**Next Chapter Preview:** Chapter 31 – Indexing and Querying Blockchain Data. We'll explore The Graph protocol, building subgraphs, and alternative indexing solutions to efficiently query on-chain data.