Description
Context
ipfs/kubo#8758 adds support for CAR export via Gateway.
It exports the entire DAG as a CAR stream, which does not cover all use cases.
For example, thin clients may want to export only a UnixFS directory's root block plus its immediate children, or progressively fetch a big DAG from multiple gateway endpoints.
Why we need selector support
- Verifiable HTTP Gateway Responses (in-web-browsers#128)
- for mobile web browsers (content integrity without battery drain caused by full p2p)
- a mobile browser should be able to traverse a huge UnixFS directory tree without having to fetch everything (only the root block + root blocks of immediate children are needed to generate a useful dir listing)
- for IoT devices and other thin clients
- fetching bigger DAGs progressively, load-balancing/falling back if some gateways are too slow/unreliable – makes HTTP more useful and pushes back the moment when an expensive p2p retrieval has to be spawned
Scope
- query param
- HTTP header
- TBD configurable size budget for CAR stream + UnixFS downloads
- TBD allow selectors everywhere? (UnixFS? dag-cbor/json?)
Proposed design (A) 💢
The go-car library supports passing selectors; the idea is to add a parameter to do just that.
Either way, we have to URL-escape the selector somehow, so the choice is between encodeURIComponent and multibase encoding:
Text (JSON) representation:
/ipfs/{cid}?format=car&selector.json=encodeURIComponent({json serialization of selector})
Binary (CBOR) representation:
/ipfs/{cid}?format=car&selector.cbor=multibase({cbor serialization of selector})
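To make the two encodings in design (A) concrete, here is a minimal Python sketch of how a client might build each URL. It is only an illustration under assumptions: the helper names are hypothetical, `quote` with an extended `safe` set is used to approximate JavaScript's encodeURIComponent, the multibase variant is assumed to be base64url (prefix `u`, no padding), and the CBOR bytes are taken as already serialized.

```python
import base64
import json
from urllib.parse import quote


def encode_uri_component(text: str) -> str:
    # Approximates JavaScript's encodeURIComponent, which leaves
    # A-Z a-z 0-9 - _ . ! ~ * ' ( ) unescaped.
    return quote(text, safe="-_.!~*'()")


def car_url_json(cid: str, selector: dict) -> str:
    # Hypothetical helper: compact JSON, then percent-encode it.
    compact = json.dumps(selector, separators=(",", ":"))
    return f"/ipfs/{cid}?format=car&selector.json={encode_uri_component(compact)}"


def car_url_cbor(cid: str, selector_cbor: bytes) -> str:
    # Hypothetical helper: assumes multibase base64url, i.e. the prefix "u"
    # followed by unpadded URL-safe base64 of the CBOR bytes.
    b64 = base64.urlsafe_b64encode(selector_cbor).rstrip(b"=").decode("ascii")
    return f"/ipfs/{cid}?format=car&selector.cbor=u{b64}"


if __name__ == "__main__":
    # Placeholder CID and placeholder selector payloads, for illustration only.
    print(car_url_json("bafyexample", {"a": 1}))
    print(car_url_cbor("bafyexample", b"\xa0"))  # 0xa0 is an empty CBOR map
```

The multibase form needs no further percent-encoding because the base64url alphabet is already URL-safe, while the JSON form stays human-readable at the cost of a longer URL.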
Proposed design (B) 💢
/ipfs/{cid}?format=car&selector={cid2}
Here {cid2} is a CID representing selector data. It could be dag-cbor or dag-json.
Small ones could be inlined (with identity hash), bigger ones could be fetched once and reused efficiently.
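As a rough sketch of the "inlined with identity hash" idea, the snippet below packs a small, already-serialized dag-cbor selector directly into a CIDv1 using the identity multihash, so no extra fetch is needed to resolve {cid2}. The function names are hypothetical and the payload is a placeholder; the layout follows the CIDv1 structure (version, codec, multihash) with base32 multibase.

```python
import base64

DAG_CBOR = 0x71  # multicodec code for dag-cbor
IDENTITY = 0x00  # multihash code for the identity "hash"


def varint(n: int) -> bytes:
    # Unsigned LEB128 varint, as used by multiformats.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def inline_cid(payload: bytes, codec: int = DAG_CBOR) -> str:
    # Hypothetical helper: the "digest" of an identity multihash is the payload itself.
    multihash = varint(IDENTITY) + varint(len(payload)) + payload
    cid_bytes = varint(1) + varint(codec) + multihash  # CIDv1
    b32 = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + b32  # "b" is the multibase prefix for lowercase base32


if __name__ == "__main__":
    selector_cid = inline_cid(b"\xa0")  # placeholder payload: empty CBOR map
    print(f"/ipfs/{{cid}}?format=car&selector={selector_cid}")
```

Since the CID grows with the payload, this only makes sense for small selectors; larger ones would be published as regular blocks and referenced by a hashed CID.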
Proposed design (C) 🤏
Over time, we realized this is the most useful and safest way.
No selector CIDs, only predefined, most useful "partial CAR export scope" parameters for now:
/ipfs/{cid}/some/subpath/file?format=car&dag-depth=1&include-path=true
- depth=1 means "root + direct children only" – good for fetching a UnixFS dir listing with file sizes / types, or splitting bigger DAGs into partial retrievals over multiple gateways / threads
- with-path will also include blocks for all parent nodes on the content path (/ipfs/{cid}/some/subpath, /ipfs/{cid}/some, and /ipfs/{cid}) – allows light clients to save round trips and take everything in a single request-response
- leaves and bytes, proposed by Hannah in "Create IPIP with Gateway spec for partial CAR exports" #348 (comment)
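To illustrate what the depth and with-path parameters of design (C) might select, here is a small Python sketch over a toy in-memory DAG. The block names, the `partial_export` function, and the exact traversal order are assumptions made for illustration; the real semantics would be defined by the gateway spec.

```python
from collections import deque

# Toy DAG: block id -> {link name: child block id}. Hypothetical data, not real CIDs.
DAG = {
    "root": {"some": "d1"},
    "d1": {"subpath": "d2"},
    "d2": {"file": "f"},
    "f": {"chunk0": "c0", "chunk1": "c1"},
    "c0": {},
    "c1": {},
}


def partial_export(dag, root, path, depth=None, with_path=False):
    """Return block ids in the order a partial CAR might list them."""
    # Resolve the content path, remembering every parent node along the way.
    parents = []
    current = root
    for segment in path:
        parents.append(current)
        current = dag[current][segment]

    blocks = list(parents) if with_path else []
    seen = set(blocks)

    # Breadth-first walk from the terminal node, limited to `depth` levels
    # (depth=None means the whole sub-DAG).
    queue = deque([(current, 0)])
    while queue:
        node, level = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        blocks.append(node)
        if depth is None or level < depth:
            for child in dag[node].values():
                queue.append((child, level + 1))
    return blocks


if __name__ == "__main__":
    # Roughly /ipfs/root/some/subpath/file with depth=1 and with-path
    print(partial_export(DAG, "root", ["some", "subpath", "file"], depth=1, with_path=True))
```

With depth=1 and with-path enabled, the sketch returns the three parent blocks followed by the file root and its direct children, which is everything a light client would need to verify the path in one response.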
Proposed design (D) 🙏
Better ideas would be really welcome here 👀
Please comment below.
My initial thought was to have a single way of passing selectors, but if you find that each approach brings value to different use cases, we could support more than one.
👉 NOTE: whatever we come up with here, we most likely want Kubo to support the same convention in the ipfs dag CLI (and RPC API at /api/v0/dag/*) – see ipfs/kubo#8239