Cannot retrieve content by addressing with SHA256 (raw binary?) #10426

Closed
3 tasks done
ghost opened this issue May 18, 2024 · 1 comment
Labels
kind/bug: A bug in existing code (including security flaws)
need/triage: Needs initial labeling and prioritization

Comments

ghost commented May 18, 2024

Checklist

Installation method

built from source

Version

Kubo version: 0.29.0-dev
Repo version: 15
System version: amd64/linux
Golang version: go1.22.2

Config

{
  "API": {
    "HTTPHeaders": {}
  },
  "Addresses": {
    "API": "/ip4/127.0.0.1/tcp/5001",
    "Announce": [],
    "AppendAnnounce": [],
    "Gateway": "/ip4/127.0.0.1/tcp/8080",
    "NoAnnounce": [],
    "Swarm": [
      "/ip4/0.0.0.0/tcp/4001",
      "/ip6/::/tcp/4001",
      "/ip4/0.0.0.0/udp/4001/quic-v1",
      "/ip4/0.0.0.0/udp/4001/quic-v1/webtransport",
      "/ip6/::/udp/4001/quic-v1",
      "/ip6/::/udp/4001/quic-v1/webtransport"
    ]
  },
  "AutoNAT": {},
  "Bootstrap": [
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
    "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    "/ip4/104.131.131.82/udp/4001/quic-v1/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ"
  ],
  "DNS": {
    "Resolvers": {}
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": true
    }
  },
  "Experimental": {
    "FilestoreEnabled": false,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "OptimisticProvide": false,
    "OptimisticProvideJobsPoolSize": 0,
    "P2pHttpProxy": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": [],
    "DeserializedResponses": null,
    "DisableHTMLErrors": null,
    "ExposeRoutingAPI": null,
    "HTTPHeaders": {},
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": null,
    "RootRedirect": ""
  },
  "Identity": {
    "PeerID": "12D3KooWQpMv9ZuXEaBHQEmBUKc1FZdXzbQGwqFmtV5AJuSGSs2y"
  },
  "Internal": {},
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Migration": {
    "DownloadSources": [],
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {},
  "Routing": {
    "AcceleratedDHTClient": false,
    "Methods": null,
    "Routers": null
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {},
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "RelayClient": {},
    "RelayService": {},
    "ResourceMgr": {},
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  }
}

Description

After a file is added to kubo with ipfs add, it cannot be retrieved from the API using the added file's hash.

For example

# Create test file
dd if=/dev/zero of=3MB bs=3M count=1

ipfs add 3MB
# added QmekBGp284QtNZoJg812EYkv3w55dNGt5Jxwjbg6g2ZRin 3MB

# Request using returned CID
curl localhost:8080/ipfs/QmekBGp284QtNZoJg812EYkv3w55dNGt5Jxwjbg6g2ZRin
# <a href="http://bafybeihtyhuhusxdrdxfrghtodthjwzroluaqm7zafjyb2a2e4c6h3uba4.ipfs.localhost:8080/">Moved Permanently</a>.

curl http://bafybeihtyhuhusxdrdxfrghtodthjwzroluaqm7zafjyb2a2e4c6h3uba4.ipfs.localhost:8080/
# Warning: Binary output can mess up your terminal. Use "--output -" to tell 
# Warning: curl to output it to your terminal anyway, or consider "--output 
# Warning: <FILE>" to save to a file.

##################
# Request using SHA256
sha256sum 3MB
# bbd05cf6097ac9b1f89ea29d2542c1b7b67ee46848393895f5a9e43fa1f621e5  3MB

curl localhost:8080/ipfs/f01551220bbd05cf6097ac9b1f89ea29d2542c1b7b67ee46848393895f5a9e43fa1f621e5
# <a href="http://bafkreif32bopmcl2zgy7rhvctusufqnxwz7oi2cihe4jl5nj4q72d5rb4u.ipfs.localhost:8080/">Moved Permanently</a>.

curl http://bafkreif32bopmcl2zgy7rhvctusufqnxwz7oi2cihe4jl5nj4q72d5rb4u.ipfs.localhost:8080/
# just hangs

The resulting base32 CIDs are also different

name         type    CID
original     dag-pb  bafybeihtyhuhusxdrdxfrghtodthjwzroluaqm7zafjyb2a2e4c6h3uba4
from SHA256  raw     bafkreif32bopmcl2zgy7rhvctusufqnxwz7oi2cihe4jl5nj4q72d5rb4u
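
For reference, the hex CID requested above can be rebuilt by hand from the sha256sum output. The following is a minimal Python sketch, assuming the usual CIDv1 byte layout (version, content codec, multihash code, digest length, digest) and the multicodec codes 0x55 (raw) and 0x12 (sha2-256). The helper names are illustrative and are not part of Kubo.

```python
import base64

RAW_CODEC = 0x55  # multicodec code for "raw" (assumed from the multicodec table)
SHA2_256 = 0x12   # multihash code for sha2-256


def cid_v1_bytes(digest_hex, codec=RAW_CODEC, hash_code=SHA2_256):
    """Assemble CIDv1 bytes: version, codec, hash code, digest length, digest."""
    digest = bytes.fromhex(digest_hex)
    # Every code used here is below 0x80, so each varint is a single byte.
    return bytes([0x01, codec, hash_code, len(digest)]) + digest


def to_base16(cid):
    """Multibase base16: prefix 'f' followed by lowercase hex."""
    return "f" + cid.hex()


def to_base32(cid):
    """Multibase base32: prefix 'b', lowercase alphabet, no padding."""
    return "b" + base64.b32encode(cid).decode("ascii").lower().rstrip("=")


# Digest reported by sha256sum in the transcript above.
digest = "bbd05cf6097ac9b1f89ea29d2542c1b7b67ee46848393895f5a9e43fa1f621e5"
cid = cid_v1_bytes(digest)
print(to_base16(cid))
print(to_base32(cid))
```

This only shows that the hex string and the base32 string in the redirect are the same raw/sha2-256 CID in two encodings. It says nothing about whether any node actually stores a block under that CID.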

Edit: Doesn't work for raw binary with blake3 hashsum (f01551e200471c2e7ccc927709c1e41e299804f1c2d2c2b757ff5afd5a3172bd68b9bccc2)
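
To double-check what the CIDs in this report actually claim, they can be decoded back into their fields. This is a rough Python sketch, assuming single-byte varints and only the base16 ("f") and base32 ("b") multibase prefixes. The lookup tables cover just the codes that appear in this issue, and `decode_cid` is a hypothetical helper, not a Kubo API.

```python
import base64

# Only the codes seen in this issue (assumed from the multicodec table).
CODECS = {0x55: "raw", 0x70: "dag-pb"}
HASHES = {0x12: "sha2-256", 0x1E: "blake3"}


def decode_cid(text):
    """Split a CIDv1 string into version, codec, hash function and digest."""
    prefix, body = text[0], text[1:]
    if prefix == "f":
        data = bytes.fromhex(body)
    elif prefix == "b":
        # Python's decoder wants uppercase input padded to a multiple of 8.
        padded = body.upper() + "=" * (-len(body) % 8)
        data = base64.b32decode(padded)
    else:
        raise ValueError("unsupported multibase prefix: " + prefix)

    version, codec, hash_code, length = data[0], data[1], data[2], data[3]
    digest = data[4:]
    if len(digest) != length:
        raise ValueError("digest length does not match the multihash header")
    return {
        "version": version,
        "codec": CODECS.get(codec, hex(codec)),
        "hash": HASHES.get(hash_code, hex(hash_code)),
        "digest": digest.hex(),
    }


blake3_cid = "f01551e200471c2e7ccc927709c1e41e299804f1c2d2c2b757ff5afd5a3172bd68b9bccc2"
added_cid = "bafybeihtyhuhusxdrdxfrghtodthjwzroluaqm7zafjyb2a2e4c6h3uba4"
print(decode_cid(blake3_cid))
print(decode_cid(added_cid))
```

Under these assumptions, the blake3 CID decodes as a raw block and the CID returned by ipfs add decodes as dag-pb, which matches the types listed in the table above.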

Edit2: Specifically, it fails when trying to find a block at this point https://github.com/ipfs/go-ds-flatfs/blob/3b1c91bc3097ec7702347d8a419269ce88e450b8/flatfs.go#L658 (found using breakpoints). Maybe there's something wrong with CID parsing? dunno....

ghost added the kind/bug and need/triage labels May 18, 2024

lidel (Member) commented May 28, 2024

I think the short answer is that you are using a made-up CID to retrieve data. It has no providers, so the request hangs while looking for them. If you use the CID returned by ipfs add, it will work.

TLDR:

  • Kubo does not support blocks bigger than 1-2MiB, because only blocks under that limit can be exchanged over Bitswap. Files imported via ipfs add are chunked into smaller blocks that are always under this limit.
  • SHA256 in the CID produced by ipfs add is not guaranteed to be the hash of raw bytes.
  • If the CID is CIDv0, the raw data will be wrapped in a dag-pb protobuf.
  • If the data is bigger than 1MiB, the returned CID contains the hash of the top (root) dag-pb block of the UnixFS DAG, which was created by chunking the data into smaller blocks. The root node links to child nodes, so you can walk the graph, do incremental retrieval/verification, and get deduplication for free.
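
The points above can be illustrated with a small Python sketch. It assumes a fixed-size chunker with 256 KiB chunks (believed to be the `size-262144` default of ipfs add) and hashes the raw chunk bytes directly. That is a simplification: with CIDv0, real leaves are additionally wrapped in dag-pb, so the actual block hashes in the repo will differ from the ones computed here.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # assumed default chunk size (size-262144)


def chunk(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


# Same content as the test file: 3 MiB of zero bytes.
data = bytes(3 * 1024 * 1024)

leaves = chunk(data)
leaf_digests = [hashlib.sha256(c).hexdigest() for c in leaves]
whole_digest = hashlib.sha256(data).hexdigest()

print(len(leaves))                   # 12 chunks of 256 KiB each
print(len(set(leaf_digests)))        # 1: all chunks are identical, so they deduplicate
print(whole_digest in leaf_digests)  # False: no block is keyed by the whole-file hash
```

Since no stored block is keyed by the SHA256 of the whole 3 MiB file, a raw CID built from that hash has nothing to resolve to locally, and the gateway goes looking for providers instead.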

I'm closing this (there is no bug in Kubo), but posting some resources for learning how IPFS handles files below.

The video course at:

And related docs:

For further support, try reaching out to https://discuss.ipfs.tech/c/help/13 (we limit github to bugs and feature requests), thanks!

lidel closed this as not planned May 28, 2024