Create IPIP with Gateway spec for partial CAR exports #348

Closed
@lidel

Context

ipfs/kubo#8758 adds support for CAR export via Gateway.
It exports the entire DAG as a CAR stream, which does not cover all use cases.

For example, thin clients may want to export only a UnixFS directory's root block plus its immediate children, or progressively fetch a big DAG from multiple gateway endpoints.

Why we need selector support

  • Verifiable HTTP Gateway Responses (in-web-browsers#128)
    • for mobile web browsers (content integrity without battery drain caused by full p2p)
      • a mobile browser should be able to traverse a huge UnixFS directory tree without fetching everything (only the root block and the root blocks of immediate children are needed to generate a useful directory listing)
    • for IoT devices and other thin clients
      • fetching bigger DAGs progressively, load-balancing/falling back if some gateways are too slow/unreliable – makes HTTP more useful and pushes back the moment when an expensive p2p retrieval has to be spawned

Scope

  • query param
  • HTTP header
  • TBD configurable size budget for CAR stream + UnixFS downloads
  • TBD allow selectors everywhere? (UnixFS? dag-cbor/json?)

Proposed design (A) 💢

The go-car library supports passing selectors, so the idea is to add a parameter that does just that.

The selector has to be URL-escaped either way,
so the choice is between encodeURIComponent and multibase encoding:

Text (JSON) representation:

/ipfs/{cid}?format=car&selector.json=encodeURIComponent({json serialization of selector})

Binary (CBOR) representation:

/ipfs/{cid}?format=car&selector.cbor=multibase({cbor serialization of selector})
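A minimal sketch of how a client could build both URL forms, and how a gateway could read the JSON form back. Everything here is an assumption for illustration: the helper names are hypothetical, the sample selector only imitates the JSON style of IPLD selectors and has not been validated against go-car, and base64url (multibase prefix "u") is just one possible multibase choice.

```python
import base64
import json
from urllib.parse import parse_qs, quote, urlsplit


def selector_json_url(cid: str, selector: dict) -> str:
    """Text form: compact JSON, percent-encoded like encodeURIComponent."""
    text = json.dumps(selector, separators=(",", ":"))
    # quote() always leaves letters, digits and "_.-~" unescaped; adding
    # "!*'()" to `safe` matches the set encodeURIComponent leaves alone.
    encoded = quote(text, safe="!*'()")
    return f"/ipfs/{cid}?format=car&selector.json={encoded}"


def selector_cbor_url(cid: str, selector_cbor: bytes) -> str:
    """Binary form: multibase base64url ("u" prefix, padding stripped)."""
    b64 = base64.urlsafe_b64encode(selector_cbor).rstrip(b"=").decode("ascii")
    return f"/ipfs/{cid}?format=car&selector.cbor=u{b64}"


def read_selector_json(url: str) -> dict:
    """Gateway side: recover the JSON selector from the query string."""
    query = parse_qs(urlsplit(url).query)
    return json.loads(query["selector.json"][0])


# Illustrative selector only (assumed shape, not checked against go-car).
EXAMPLE_SELECTOR = {"R": {"l": {"depth": 1}, ":>": {"a": {">": {"@": {}}}}}}

if __name__ == "__main__":
    print(selector_json_url("bafyexample", EXAMPLE_SELECTOR))
    # b"\xa0" is an empty CBOR map, used as a stand-in for real selector bytes.
    print(selector_cbor_url("bafyexample", b"\xa0"))
```

The JSON form stays human-readable at the cost of heavy percent-escaping; the multibase form is opaque but avoids escaping entirely.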

Proposed design (B) 💢

/ipfs/{cid}?format=car&selector={cid2}

Here {cid2} is a CID representing selector data. It could be dag-cbor or dag-json.
Small selectors could be inlined (with an identity hash); bigger ones could be fetched once and reused efficiently.
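To make the "inlined with identity hash" idea concrete, here is a sketch that packs selector bytes directly into a CIDv1 using the identity multihash (code 0x00) and the dag-json codec (0x0129), then places it in the selector parameter. The function names are hypothetical, and whether a gateway would impose a size limit on such inline CIDs is not settled in this issue.

```python
import base64

DAG_JSON = 0x0129  # multicodec code for dag-json
IDENTITY = 0x00    # multihash code for the identity "hash"


def varint(n: int) -> bytes:
    """Unsigned LEB128 varint, as used by multiformats."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def inline_cid(data: bytes, codec: int = DAG_JSON) -> str:
    """CIDv1 whose multihash digest is `data` itself (no hashing involved)."""
    multihash = varint(IDENTITY) + varint(len(data)) + data
    raw = varint(1) + varint(codec) + multihash
    b32 = base64.b32encode(raw).decode("ascii").lower().rstrip("=")
    return "b" + b32  # "b" is the multibase prefix for lowercase base32


def selector_cid_url(cid: str, selector_cid: str) -> str:
    """Design (B) URL: the selector is referenced by CID."""
    return f"/ipfs/{cid}?format=car&selector={selector_cid}"


if __name__ == "__main__":
    # "{}" stands in for real dag-json selector bytes.
    print(selector_cid_url("bafyroot", inline_cid(b"{}")))
```

Because the selector bytes travel inside the CID, the gateway needs no extra fetch for small selectors, while a regular hashed CID would let it fetch and cache larger ones.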

Proposed design (C) 🤏

Over time, we realized this is the most useful and safest way.
No selector CIDs, only predefined, most useful "partial CAR export scope" parameters for now:

/ipfs/{cid}/some/subpath/file?format=car&dag-depth=1&include-path=true
  • dag-depth=1 means "root + direct children only": good for fetching a UnixFS directory listing with file sizes and types, or for splitting bigger DAGs into partial retrievals over multiple gateways or threads
  • include-path=true will also include blocks for all parent nodes on the content path (/ipfs/{cid}/some/subpath, /ipfs/{cid}/some, and /ipfs/{cid}); this lets light clients save round trips and get everything in a single request-response
  • leaves and bytes, as proposed by Hannah in #348 (comment)
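A rough sketch of design (C) from the client side: building the request URL from the example above, and listing the parent content paths whose blocks the path-inclusion parameter is meant to add. The helper names are hypothetical, and the parameter names are taken from the example URL; they were still under discussion.

```python
from typing import List
from urllib.parse import urlencode


def partial_car_url(content_path: str, dag_depth: int, include_path: bool) -> str:
    """Build a partial CAR export URL using predefined scope parameters."""
    params = {
        "format": "car",
        "dag-depth": dag_depth,
        "include-path": str(include_path).lower(),  # "true" / "false"
    }
    return f"{content_path}?{urlencode(params)}"


def parent_paths(content_path: str) -> List[str]:
    """Parent content paths, nearest first, ending at /ipfs/{cid}."""
    parts = content_path.strip("/").split("/")
    # parts[0] is the namespace ("ipfs") and parts[1] is the root CID,
    # so the shortest parent keeps exactly those two segments.
    return ["/" + "/".join(parts[:i]) for i in range(len(parts) - 1, 1, -1)]


if __name__ == "__main__":
    path = "/ipfs/bafyroot/some/subpath/file"
    print(partial_car_url(path, dag_depth=1, include_path=True))
    for parent in parent_paths(path):
        print("also covered:", parent)
```

Fixed parameters like these are easier to validate and budget on the gateway than arbitrary selectors, which is the safety argument made for this design.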

Proposed design (D) 🙏

Better ideas would be really welcome here 👀
Please comment below.


My initial thought was to have a single way of passing selectors, but if each approach turns out to bring value to different use cases, we could support more than one.

👉 NOTE: whatever we come up with here, we most likely want Kubo to support the same convention in ipfs dag CLI (and RPC API at /api/v0/dag/*) – see ipfs/kubo#8239

Metadata

Labels

  • P1: High: Likely tackled by core team if no one steps up
  • effort/hours: Estimated to take one or several hours
  • kind/enhancement: A net-new feature or an improvement to an existing feature
  • status/blocked: Unable to be worked further until needs are met
