From 1d9ec9c991b40ec1f3cdb058dacb0d2c4d3f9037 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Tue, 18 Oct 2022 09:59:44 -0400 Subject: [PATCH 01/28] feat: Delegated Routing HTTP API --- IPIP/0000-delegated-routing-http-api.md | 88 +++++++++++++++++++++ routing/DELEGATED_ROUTING_HTTP.md | 101 ++++++++++++++++++++++++ 2 files changed, 189 insertions(+) create mode 100644 IPIP/0000-delegated-routing-http-api.md create mode 100644 routing/DELEGATED_ROUTING_HTTP.md diff --git a/IPIP/0000-delegated-routing-http-api.md b/IPIP/0000-delegated-routing-http-api.md new file mode 100644 index 00000000..f0d9f4c0 --- /dev/null +++ b/IPIP/0000-delegated-routing-http-api.md @@ -0,0 +1,88 @@ +# IPIP 0000: Delegated Routing HTTP API + +- Start Date: 2022-10-18 +- Related Issues: + - (add links here) + +## Summary + +This IPIP specifies an HTTP API for delegated routing. + +## Motivation + +Idiomatic and first-class HTTP support for delegated routing is an important requirement for large content routing providers, +and supporting large content providers is a key strategy for driving down IPFS latency. +These providers must handle high volumes of traffic and support many users, so leveraging industry-standard tools and services +such as HTTP load balancers, CDNs, reverse proxies, etc. is a requirement. +To maximize compatibility with standard tools, IPFS needs an HTTP API specification that uses standard HTTP idioms and payload encoding. +The [Reframe spec](https://github.com/ipfs/specs/blob/main/reframe/REFRAME_PROTOCOL.md) for delegated content routing was an experimental attempt at this, +but it has resulted in a very unidiomatic HTTP API which is difficult to implement and is incompatible with many existing tools. +The cost of a proper redesign, implementation, and maintenance of Reframe and its implementation is too high relative to the urgency of having a delegated routing HTTP API. + +Note that this does not supplant nor deprecate Reframe. 
Ideally in the future, Reframe and its implementation would receive the resources needed to map the IDL to idiomatic HTTP, +and this spec could then be rewritten in the IDL, maintaining backwards compatibility. + +## Detailed design + +See the [API design](../routing/DELEGATED_ROUTING_HTTP.md) included with this IPIP. + +## Design rationale +To understand the design rationale, it is important to consider the concrete Reframe limitations that we know about: + +- Reframe methods are encoded inside messages + - This prevents URL-based pattern matching on methods + - Configuring different caching strategies for different methods + - Configuring reverse proxies on a per-method basis + - Routing methods to specific backends + - Method-specific reverse proxy config such as timeouts + - Developer UX is poor as a result, e.g. for CDN caching you must encode the entire request message and pass it as a query parameter + - This was initially done by URL-escaping the raw bytes + - Not possible to consume correctly using standard JavaScript + - Shipped in Kubo 0.16 + - Packing a CID into a struct, encoding it with DAG-CBOR, multibase-encoding that, percent-encoding that, and then passing it in a URL, rather than merely passing the CID in the URL, is needlessly complex from a user's perspective + - Added complexity of "Cacheable" methods supporting both POSTs and GETs +- The required streaming support and message groups add a lot of implementation complexity but isn’t very useful + - Ex for FindProviders, the response is buffered anyway for ETag calculation + - There are no limits on response sizes nor ways to impose limits and paginate + - This is useful for routers that have highly variable resolution time, to send results as soon as possible, but this is not a use case we are focusing on right now and we can add it later +- The Identify method is not implemented because it is not currently useful +- Client and server implementations are difficult to write correctly, because of 
the non-standard wire formats and conventions +- The Go implementation is [complex](https://github.com/ipfs/go-delegated-routing/blob/main/gen/proto/proto_edelweiss.go) and [brittle](https://github.com/ipfs/go-delegated-routing/blame/main/client/provide.go#L51), and is currently maintained by IPFS Stewards who are already over-committed with other priorities +- Only the HTTP transport has been designed and implemented, so it's unclear if the existing design will work for other transports, and what their use cases and requirements are + +So this API proposal makes the following changes: + +- The API is defined in HTTP directly +- "Methods" and cache-relevant parameters are pushed into the URL path +- Streaming support is removed, and optional pagination is added, which limits the response size and provides a scalable mechanism for iterating over arbitrarily-large collections + - We might add streaming support w/ chunked-encoded responses in the future, but it's currently not an important feature for the use cases that an HTTP API will be used for +- Bodies are encoded using standard JSON or CBOR, instead of using IPLD codecs +- The "Identify" method and "message groups" are removed + +### User benefit + +The cost of building and operating content routing services will be much lower, as developers will be able to reuse existing industry-standard tooling. +This will result in more content routing providers, each providing a better experience for users, driving down content routing latency across the IPFS netowrk +and increasing data availability. + +### Compatibility + +#### Backwards Compatibility +IPFS Stewards will implement this API in [go-delegated-routing](https://github.com/ipfs/go-delegated-routing), using breaking changes in a new minor version. +Because the existing Reframe spec can't be safely used in JavaScript, the experimental support for Reframe in Kubo will be removed in the next release, +and delegated routing will subsequently use this HTTP API. 
We may decide to re-add Reframe support in the future once these issues have been resolved. + +#### Forwards Compatibility +Standard HTTP mechanisms for forward compatibility are used--the API is versioned using a version number in the path. The `Accept` and `Content-Type` headers are used for content type negotiation. new methods will result in new paths, and parameters can be added using either new query parameters or new fields in the request/response body. Certain parts of bodies are labeled as "opaque bytes", which are passed through by the implementation, with no schema enforcement. + +### Security + +None + +### Alternatives + +This *is* an alternative. + +### Copyright + +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). diff --git a/routing/DELEGATED_ROUTING_HTTP.md b/routing/DELEGATED_ROUTING_HTTP.md new file mode 100644 index 00000000..a24cf6a2 --- /dev/null +++ b/routing/DELEGATED_ROUTING_HTTP.md @@ -0,0 +1,101 @@ +# ![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square) Delegated Routing HTTP API + +**Author(s)**: +- Gus Eggert + +**Maintainer(s)**: + +* * * + +**Abstract** + +"Delegated routing" is a mechanism for IPFS implementations to use for offloading content routing to another process/server. This spec describes an HTTP API for delegated routing. + +# Organization of this document + +- [Introduction](#introduction) +- [Spec](#spec) + - [Interaction Pattern](#interaction-pattern) + - [Cachability](#cachability) + - [Transports](#transports) + - [Protocol Message Overview](#protocol-message-overview) + - [Known Methods](#known-methods) +- [Method Upgrade Paths](#method-upgrade-paths) +- [Implementations](#implementations) + +# API Specification +By default, the Delegated Routing HTTP API uses the `application/json` content type. Clients and servers may optionally negotiate other content types such as `application/cbor`, `application/vnd.ipfs.rpc+dag-json`, etc. 
using the standard `Accept` and `Content-Type` headers. + +- `GET /v1/providers/{CID}` + - Reframe equivalent: FindProviders + - Response + + ```json + { + "Providers": [ + { + "PeerID": "...", + "Multiaddrs": ["...", "..."] + "Protocols": [ + { + "Codec": 2320, + "Payload": + } + ] + } + ] + "NextPageToken": "" + } + ``` + + - Default limit: 100 providers + - Optional query parameters + - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of multicodec IDs such as `2304,2320`, + - `transport` only return providers whose published multiaddrs explicitly support the passed transport protocols, such as `/quic` or `/tls/ws`. +- `GET /v1/providers/hash/{multihash}` + - This is the same as `GET /v1/providers/{CID}`, but takes a hashed CID encoded as a multihash +- `GET /v1/ipns/{ID}` + - Reframe equivalent: GetIPNS + - Response + - record bytes +- `POST /v1/ipns/{ID}` + - Reframe equivalent: PutIPNS + - Body + - record bytes + - No need for idempotency +- `PUT /v1/providers/{CID}` + - Reframe equivalent: Provide + - Body + + ```json + { + "Keys": ["cid1", "cid2"], + "Timestamp": 1234, + "AdvisoryTTL": 1234, + "Signature": "multibase bytes", + "Provider": { + "Peer": { + "ID": "peerID", + "Addrs": ["multiaddr1", "multiaddr2"] + }, + "Protocols": [ + { + "Codec": 1234, + "Payload": + } + ] + } + } + ``` + + - Idempotent +- `GET /v1/ping` + - This is absent from Reframe but is necessary for supporting e.g. 
the accelerated DHT client which can take many minutes to bootstrap + - Returns 200 once the server is ready to accept requests + - An alternate approach is w/ an orchestration dance in the server by not listening on the socket until the dependencies are ready, but this makes the “dance” easier to implement +- Pagination + - Responses with collections of results must have a default limit on the number of results that will be returned in a single response + - Servers may optionally implement pagination by responding with an opaque page token which, when provided as a subsequent query parameter, will fetch the next page of results. + - Clients may continue paginating until no `NextPageToken` is returned. + - Clients making calls that return collections may limit the number of per-page results returned with the `limit` query parameter, i.e. `GET /v1/providers/{CID}?limit=10` + - Additional filtering/sorting operations may be defined on a per-path basis, as needed From 65d178b98fd33a5337241d3c066155085e2f9c6d Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Wed, 19 Oct 2022 13:55:29 -0400 Subject: [PATCH 02/28] changes based on feedback * Add detailed signature procedure for PUT /v1/providers * Add error codes * Remove pagination and limit field * Many other small changes based on feedback --- IPIP/0000-delegated-routing-http-api.md | 42 ++++++--- routing/DELEGATED_ROUTING_HTTP.md | 116 ++++++++++++++---------- 2 files changed, 97 insertions(+), 61 deletions(-) diff --git a/IPIP/0000-delegated-routing-http-api.md b/IPIP/0000-delegated-routing-http-api.md index f0d9f4c0..db66acf2 100644 --- a/IPIP/0000-delegated-routing-http-api.md +++ b/IPIP/0000-delegated-routing-http-api.md @@ -20,48 +20,57 @@ but it has resulted in a very unidiomatic HTTP API which is difficult to impleme The cost of a proper redesign, implementation, and maintenance of Reframe and its implementation is too high relative to the urgency of having a delegated routing HTTP API. 
Note that this does not supplant nor deprecate Reframe. Ideally in the future, Reframe and its implementation would receive the resources needed to map the IDL to idiomatic HTTP, -and this spec could then be rewritten in the IDL, maintaining backwards compatibility. +and implementations of this spec could then be rewritten in the IDL, maintaining backwards compatibility. ## Detailed design -See the [API design](../routing/DELEGATED_ROUTING_HTTP.md) included with this IPIP. +See the [Delegated Routing HTTP API design](../routing/DELEGATED_ROUTING_HTTP.md) included with this IPIP. ## Design rationale To understand the design rationale, it is important to consider the concrete Reframe limitations that we know about: -- Reframe methods are encoded inside messages - - This prevents URL-based pattern matching on methods +- Reframe [method types](../reframe/REFRAME_KNOWN_METHODS.md) are encoded inside messages + - This prevents URL-based pattern matching on methods, which makes it hard and expensive to do basic HTTP scaling and optimizations: - Configuring different caching strategies for different methods - Configuring reverse proxies on a per-method basis - Routing methods to specific backends - Method-specific reverse proxy config such as timeouts - Developer UX is poor as a result, e.g. 
for CDN caching you must encode the entire request message and pass it as a query parameter - This was initially done by URL-escaping the raw bytes - - Not possible to consume correctly using standard JavaScript + - Not possible to consume correctly using standard JavaScript (see [edelweiss#61](https://github.com/ipld/edelweiss/issues/61)) - Shipped in Kubo 0.16 - Packing a CID into a struct, encoding it with DAG-CBOR, multibase-encoding that, percent-encoding that, and then passing it in a URL, rather than merely passing the CID in the URL, is needlessly complex from a user's perspective - Added complexity of "Cacheable" methods supporting both POSTs and GETs -- The required streaming support and message groups add a lot of implementation complexity but isn’t very useful +- The required streaming support and message groups add a lot of implementation complexity, but streaming does not work for cacheable methods sent over HTTP - Ex for FindProviders, the response is buffered anyway for ETag calculation - There are no limits on response sizes nor ways to impose limits and paginate - This is useful for routers that have highly variable resolution time, to send results as soon as possible, but this is not a use case we are focusing on right now and we can add it later - The Identify method is not implemented because it is not currently useful + - This is because Reframe's ambition is to be a generic catch-all bag of methods across protocols, while the delegated routing use case only requires a subset of its methods.
- Client and server implementations are difficult to write correctly, because of the non-standard wire formats and conventions -- The Go implementation is [complex](https://github.com/ipfs/go-delegated-routing/blob/main/gen/proto/proto_edelweiss.go) and [brittle](https://github.com/ipfs/go-delegated-routing/blame/main/client/provide.go#L51), and is currently maintained by IPFS Stewards who are already over-committed with other priorities + - Example: [bug reported by implementer](https://github.com/ipld/edelweiss/issues/62), and [another one](https://github.com/ipld/edelweiss/issues/61) +- The Go implementation is [complex](https://github.com/ipfs/go-delegated-routing/blob/main/gen/proto/proto_edelweiss.go) and [brittle](https://github.com/ipfs/go-delegated-routing/blame/main/client/provide.go#L51-L100), and is currently maintained by IPFS Stewards who are already over-committed with other priorities - Only the HTTP transport has been designed and implemented, so it's unclear if the existing design will work for other transports, and what their use cases and requirements are + - This means Reframe can't be trusted to be transport-agnostic until at least a second transport is implemented (e.g. as a reframe-over-libp2p protocol).
So this API proposal makes the following changes: -- The API is defined in HTTP directly -- "Methods" and cache-relevant parameters are pushed into the URL path -- Streaming support is removed, and optional pagination is added, which limits the response size and provides a scalable mechanism for iterating over arbitrarily-large collections +- The Delegated Routing API is defined using HTTP semantics, and can be implemented without introducing Reframe concepts +- "Method names" and cache-relevant parameters are pushed into the URL path +- Streaming support is removed, and default response size limits are added along with an optional `limit` parameter for clients to specify response sizes - We might add streaming support w/ chunked-encoded responses in the future, but it's currently not an important feature for the use cases that an HTTP API will be used for + - Pagination could be added to this in the future, if needed - Bodies are encoded using standard JSON or CBOR, instead of using IPLD codecs +- JSON uses human-friendly string encodings of common data types + - CIDs are encoded as CIDv1 strings with a multibase prefix (e.g. base32), for consistency with CLIs, browsers, and [gateway URLs](https://docs.ipfs.io/how-to/address-ipfs-on-web/) + - Multiaddrs use the [human-readable format](https://github.com/multiformats/multiaddr#specification) that is used in existing tools and Kubo CLI commands such as `ipfs id` or `ipfs swarm peers` + - Byte array values, such as signatures, are multibase-encoded strings (with an `m` prefix indicating Base64) - The "Identify" method and "message groups" are removed ### User benefit The cost of building and operating content routing services will be much lower, as developers will be able to reuse existing industry-standard tooling. +They no longer need to learn Reframe-specific concepts to consume or expose the API. 
This will result in more content routing providers, each providing a better experience for users, driving down content routing latency across the IPFS netowrk and increasing data availability. @@ -69,11 +78,18 @@ and increasing data availability. #### Backwards Compatibility IPFS Stewards will implement this API in [go-delegated-routing](https://github.com/ipfs/go-delegated-routing), using breaking changes in a new minor version. -Because the existing Reframe spec can't be safely used in JavaScript, the experimental support for Reframe in Kubo will be removed in the next release, -and delegated routing will subsequently use this HTTP API. We may decide to re-add Reframe support in the future once these issues have been resolved. +Because the existing Reframe spec can't be safely used in JavaScript and we won't be investing time and resources into changing the wire format implemented in edelweiss to fix it, +the experimental support for Reframe in Kubo will be removed in the next release and delegated routing will subsequently use this HTTP API. +We may decide to re-add Reframe support in the future once these issues have been resolved. #### Forwards Compatibility -Standard HTTP mechanisms for forward compatibility are used--the API is versioned using a version number in the path. The `Accept` and `Content-Type` headers are used for content type negotiation. new methods will result in new paths, and parameters can be added using either new query parameters or new fields in the request/response body. Certain parts of bodies are labeled as "opaque bytes", which are passed through by the implementation, with no schema enforcement. 
+Standard HTTP mechanisms for forward compatibility are used: +- The API is versioned using a version number in the path +- The `Accept` and `Content-Type` headers are used for content type negotiation +- New methods will result in new paths +- Parameters can be added using either new query parameters or new fields in the request/response body. + +Certain parts of bodies are labeled as "{ ... }", which are opaque JSON values passed through by the implementation, with no schema enforcement. ### Security diff --git a/routing/DELEGATED_ROUTING_HTTP.md b/routing/DELEGATED_ROUTING_HTTP.md index a24cf6a2..030c869d 100644 --- a/routing/DELEGATED_ROUTING_HTTP.md +++ b/routing/DELEGATED_ROUTING_HTTP.md @@ -24,78 +24,98 @@ - [Implementations](#implementations) # API Specification -By default, the Delegated Routing HTTP API uses the `application/json` content type. Clients and servers may optionally negotiate other content types such as `application/cbor`, `application/vnd.ipfs.rpc+dag-json`, etc. using the standard `Accept` and `Content-Type` headers. +The Delegated Routing HTTP API uses the `application/json` content type by default. Clients and servers *should* support `application/cbor`, which can be negotiated using the standard `Accept` and `Content-Type` headers. +## Common Data Types: + +- CIDs are always encoded using a [multibase](https://github.com/multiformats/multibase)-encoded [CIDv1](https://github.com/multiformats/cid#cidv1). +- Multiaddrs are encoded according to the [human-readable multiaddr specification](https://github.com/multiformats/multiaddr#specification) +- Peer IDs are encoded according to the [PeerID string representation specification](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#string-representation) +- Multibase bytes are encoded according to [the Multibase spec](https://github.com/multiformats/multibase), and *should* use Base64.
+ +## API - `GET /v1/providers/{CID}` - Reframe equivalent: FindProviders - Response ```json { - "Providers": [ - { - "PeerID": "...", - "Multiaddrs": ["...", "..."] - "Protocols": [ - { - "Codec": 2320, - "Payload": - } - ] - } - ] - "NextPageToken": "" + "Providers": [ + { + "PeerID": "12D3K...", + "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."], + "Protocols": [ + { + "Codec": 2320, + "Payload": { ... } + } + ] + } + ] } ``` - Default limit: 100 providers - Optional query parameters - - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of multicodec IDs such as `2304,2320`, - - `transport` only return providers whose published multiaddrs explicitly support the passed transport protocols, such as `/quic` or `/tls/ws`. -- `GET /v1/providers/hash/{multihash}` - - This is the same as `GET /v1/providers/{CID}`, but takes a hashed CID encoded as a multihash + - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of [multicodec codes](https://github.com/multiformats/multicodec/blob/master/table.csv) in decimal form such as `2304,2320` + - `transport` only return providers whose published multiaddrs explicitly support the passed transport protocols, such as `460,478` (`/quic` and `/tls/ws`) + - Servers should treat the multicodec codes used in the `transfer` and `transport` parameters as opaque, and not validate them, for forwards compatibility +- `GET /v1/providers/hashed/{multihash}` + - This is the same as `GET /v1/providers/{CID}`, but takes a hashed CID encoded as a [multihash](https://github.com/multiformats/multihash/) - `GET /v1/ipns/{ID}` - Reframe equivalent: GetIPNS + - `ID`: multibase-encoded bytes - Response - record bytes -- `POST /v1/ipns/{ID}` +- `POST /v1/ipns` - Reframe equivalent: PutIPNS - Body - - record bytes - - No need for idempotency -- `PUT /v1/providers/{CID}` - - Reframe equivalent: Provide - - Body - ```json { - 
"Keys": ["cid1", "cid2"], - "Timestamp": 1234, - "AdvisoryTTL": 1234, - "Signature": "multibase bytes", - "Provider": { - "Peer": { - "ID": "peerID", - "Addrs": ["multiaddr1", "multiaddr2"] - }, - "Protocols": [ - { - "Codec": 1234, - "Payload": - } - ] - } + "Records": [ + { + "ID": "multibase bytes", + "Record": "multibase bytes" + } + ] } ``` - + - Not idempotent (this doesn't really make sense for IPNS) + - Default limit of 100 records per request +- `PUT /v1/providers` + - Reframe equivalent: Provide + - Body + ```json + { + "Signature": "multibase bytes", + "Payload": { + "Keys": ["cid1", "cid2"], + "Timestamp": 1234, + "AdvisoryTTL": 1234, + "Signature": "multibase bytes", + "Provider": { + "PeerID": "12D3K...", + "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."], + "Protocols": [ + { + "Codec": 1234, + "Payload": { ... } + } + ] + } + } + } + ``` + - `Signature` is a multibase-encoded signature of the encoded bytes of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload`. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. Servers must verify the payload using the public key from the Peer ID. If the verification fails, the server must return a 403 status code. - Idempotent + - Default limit of 100 keys per request - `GET /v1/ping` - - This is absent from Reframe but is necessary for supporting e.g. the accelerated DHT client which can take many minutes to bootstrap + - This is absent from Reframe but is necessary for supporting e.g. 
the accelerated DHT client which can take many minutes to bootstrap, and light clients who want to probe multiple HTTP endpoints and use the fastest one - Returns 200 once the server is ready to accept requests - An alternate approach is w/ an orchestration dance in the server by not listening on the socket until the dependencies are ready, but this makes the “dance” easier to implement -- Pagination +- Limits - Responses with collections of results must have a default limit on the number of results that will be returned in a single response - - Servers may optionally implement pagination by responding with an opaque page token which, when provided as a subsequent query parameter, will fetch the next page of results. - - Clients may continue paginating until no `NextPageToken` is returned. - - Clients making calls that return collections may limit the number of per-page results returned with the `limit` query parameter, i.e. `GET /v1/providers/{CID}?limit=10` - - Additional filtering/sorting operations may be defined on a per-path basis, as needed + - Pagination and/or dynamic limit configuration may be added to this spec in the future, once there is a concrete requirement +- Error Codes + - A 404 must be returned if a resource was not found + - A 501 must be returned if a method is not supported From 4c024dd78c7cab370baf99f0d90b7b72c2ba2d61 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Wed, 19 Oct 2022 14:25:05 -0400 Subject: [PATCH 03/28] fix some formatting --- routing/DELEGATED_ROUTING_HTTP.md | 66 ++++++++++++++++--------------- 1 file changed, 34 insertions(+), 32 deletions(-) diff --git a/routing/DELEGATED_ROUTING_HTTP.md b/routing/DELEGATED_ROUTING_HTTP.md index 030c869d..c8c27fe8 100644 --- a/routing/DELEGATED_ROUTING_HTTP.md +++ b/routing/DELEGATED_ROUTING_HTTP.md @@ -64,7 +64,7 @@ The Delegated Routing HTTP API uses the `application/json` content type by defau - This is the same as `GET /v1/providers/{CID}`, but takes a hashed CID encoded as a 
[multihash](https://github.com/multiformats/multihash/) - `GET /v1/ipns/{ID}` - Reframe equivalent: GetIPNS - - `ID`: multibase-encoded bytes + - `ID`: multibase-encoded bytes - Response - record bytes - `POST /v1/ipns` @@ -72,12 +72,12 @@ The Delegated Routing HTTP API uses the `application/json` content type by defau - Body ```json { - "Records": [ - { - "ID": "multibase bytes", - "Record": "multibase bytes" - } - ] + "Records": [ + { + "ID": "multibase bytes", + "Record": "multibase bytes" + } + ] } ``` - Not idempotent (this doesn't really make sense for IPNS) @@ -85,37 +85,39 @@ The Delegated Routing HTTP API uses the `application/json` content type by defau - `PUT /v1/providers` - Reframe equivalent: Provide - Body - ```json - { - "Signature": "multibase bytes", - "Payload": { - "Keys": ["cid1", "cid2"], - "Timestamp": 1234, - "AdvisoryTTL": 1234, - "Signature": "multibase bytes", - "Provider": { - "PeerID": "12D3K...", - "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."], - "Protocols": [ - { - "Codec": 1234, - "Payload": { ... } - } - ] - } - } - } - ``` + ```json + { + "Signature": "multibase bytes", + "Payload": { + "Keys": ["cid1", "cid2"], + "Timestamp": 1234, + "AdvisoryTTL": 1234, + "Signature": "multibase bytes", + "Provider": { + "PeerID": "12D3K...", + "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."], + "Protocols": [ + { + "Codec": 1234, + "Payload": { ... } + } + ] + } + } + } + ``` - `Signature` is a multibase-encoded signature of the encoded bytes of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload`. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. Servers must verify the payload using the public key from the Peer ID. If the verification fails, the server must return a 403 status code. 
- Idempotent - Default limit of 100 keys per request - `GET /v1/ping` - - This is absent from Reframe but is necessary for supporting e.g. the accelerated DHT client which can take many minutes to bootstrap, and light clients who want to probe multiple HTTP endpoints and use the fastest one - Returns 200 once the server is ready to accept requests - - An alternate approach is w/ an orchestration dance in the server by not listening on the socket until the dependencies are ready, but this makes the “dance” easier to implement -- Limits + +## Limits + - Responses with collections of results must have a default limit on the number of results that will be returned in a single response - Pagination and/or dynamic limit configuration may be added to this spec in the future, once there is a concrete requirement -- Error Codes + +## Error Codes + - A 404 must be returned if a resource was not found - A 501 must be returned if a method is not supported From 0acdb01c0c4ffb8aba830a6af4e5536a70229ef3 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Thu, 20 Oct 2022 15:35:54 -0400 Subject: [PATCH 04/28] remove unused signature field --- routing/DELEGATED_ROUTING_HTTP.md | 1 - 1 file changed, 1 deletion(-) diff --git a/routing/DELEGATED_ROUTING_HTTP.md b/routing/DELEGATED_ROUTING_HTTP.md index c8c27fe8..5297e897 100644 --- a/routing/DELEGATED_ROUTING_HTTP.md +++ b/routing/DELEGATED_ROUTING_HTTP.md @@ -92,7 +92,6 @@ The Delegated Routing HTTP API uses the `application/json` content type by defau "Keys": ["cid1", "cid2"], "Timestamp": 1234, "AdvisoryTTL": 1234, - "Signature": "multibase bytes", "Provider": { "PeerID": "12D3K...", "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."], From f7b44374826ee8ddf6962dab7df94b4ca1849364 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Thu, 20 Oct 2022 15:37:48 -0400 Subject: [PATCH 05/28] rename to "delegated content routing" and remove IPNS --- ...P.md => DELEGATED_CONTENT_ROUTING_HTTP.md} | 39 ++----------------- 1 file changed, 3 
insertions(+), 36 deletions(-) rename routing/{DELEGATED_ROUTING_HTTP.md => DELEGATED_CONTENT_ROUTING_HTTP.md} (73%) diff --git a/routing/DELEGATED_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md similarity index 73% rename from routing/DELEGATED_ROUTING_HTTP.md rename to routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 5297e897..d7cd56ca 100644 --- a/routing/DELEGATED_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -1,4 +1,4 @@ -# ![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square) Delegated Routing HTTP API +# ![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square) Delegated Content Routing HTTP API **Author(s)**: - Gus Eggert @@ -9,22 +9,10 @@ **Abstract** -"Delegated routing" is a mechanism for IPFS implementations to use for offloading content routing to another process/server. This spec describes an HTTP API for delegated routing. - -# Organization of this document - -- [Introduction](#introduction) -- [Spec](#spec) - - [Interaction Pattern](#interaction-pattern) - - [Cachability](#cachability) - - [Transports](#transports) - - [Protocol Message Overview](#protocol-message-overview) - - [Known Methods](#known-methods) -- [Method Upgrade Paths](#method-upgrade-paths) -- [Implementations](#implementations) +"Delegated content routing" is a mechanism for IPFS implementations to use for offloading content routing to another process/server. This spec describes an HTTP API for delegated routing. # API Specification -The Delegated Routing HTTP API uses the `application/json` content type by default. Clients and servers *should* support `application/cbor`, which can be negotiated using the standard `Accept` and `Content-Type` headers. +The Delegated Content Routing Routing HTTP API uses the `application/json` content type by default. Clients and servers *should* support `application/cbor`, which can be negotiated using the standard `Accept` and `Content-Type` headers. 
## Common Data Types: @@ -35,7 +23,6 @@ The Delegated Routing HTTP API uses the `application/json` content type by defau ## API - `GET /v1/providers/{CID}` - - Reframe equivalent: FindProviders - Response ```json @@ -62,26 +49,6 @@ The Delegated Routing HTTP API uses the `application/json` content type by defau - Servers should treat the multicodec codes used in the `transfer` and `transport` parameters as opaque, and not validate them, for forwards compatibility - `GET /v1/providers/hashed/{multihash}` - This is the same as `GET /v1/providers/{CID}`, but takes a hashed CID encoded as a [multihash](https://github.com/multiformats/multihash/) -- `GET /v1/ipns/{ID}` - - Reframe equivalent: GetIPNS - - `ID`: multibase-encoded bytes - - Response - - record bytes -- `POST /v1/ipns` - - Reframe equivalent: PutIPNS - - Body - ```json - { - "Records": [ - { - "ID": "multibase bytes", - "Record": "multibase bytes" - } - ] - } - ``` - - Not idempotent (this doesn't really make sense for IPNS) - - Default limit of 100 records per request - `PUT /v1/providers` - Reframe equivalent: Provide - Body From 13d695cadd55939692bdc5dcfdd54444bd12cc9f Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Thu, 20 Oct 2022 15:42:15 -0400 Subject: [PATCH 06/28] use multibase-encoded payload for Provide --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index d7cd56ca..f8390596 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -55,7 +55,12 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte ```json { "Signature": "multibase bytes", - "Payload": { + "Payload": "multibase bytes" + } + ``` + - `Payload` is a multibase-encoded string containing a JSON object with the following schema: + ```json + { "Keys": ["cid1", "cid2"], "Timestamp": 1234, 
"AdvisoryTTL": 1234, @@ -69,10 +74,9 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte } ] } - } - } - ``` - - `Signature` is a multibase-encoded signature of the encoded bytes of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload`. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. Servers must verify the payload using the public key from the Peer ID. If the verification fails, the server must return a 403 status code. + } + ``` + - `Signature` is a multibase-encoded signature of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. Servers must verify the payload using the public key from the Peer ID. If the verification fails, the server must return a 403 status code. - Idempotent - Default limit of 100 keys per request - `GET /v1/ping` From e3e744a74a3c8c1009ce266bdb3a3557cb3d287f Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Thu, 20 Oct 2022 15:50:34 -0400 Subject: [PATCH 07/28] sign the hash of the payload --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index f8390596..de20131f 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -76,7 +76,7 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte } } ``` - - `Signature` is a multibase-encoded signature of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. 
Servers must verify the payload using the public key from the Peer ID. If the verification fails, the server must return a 403 status code. + - `Signature` is a multibase-encoded signature of the sha256 hash of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. Servers must verify the payload using the public key from the Peer ID. If the verification fails, the server must return a 403 status code. - Idempotent - Default limit of 100 keys per request - `GET /v1/ping` From 451b1e901ee97ac517645ac6efce687709a7c1fd Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Thu, 20 Oct 2022 16:13:16 -0400 Subject: [PATCH 08/28] add timestamp type --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 1 + 1 file changed, 1 insertion(+) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index de20131f..b2145372 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -20,6 +20,7 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte - Multiaddrs are encoded according to the [human-readable multiaddr specification](https://github.com/multiformats/multiaddr#specification) - Peer IDs are encoded according [PeerID string representation specification](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#string-representation) - Multibase bytes are encoded according to [the Multibase spec](https://github.com/multiformats/multibase), and *should* use Base64. 
+- Timestamps are Unix millisecond epoch timestamps ## API - `GET /v1/providers/{CID}` From 27d23e875d3339f9ff5071ff8209a9c4307a44bc Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Thu, 20 Oct 2022 16:23:35 -0400 Subject: [PATCH 09/28] adjust provider record --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index b2145372..972a57ce 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -30,8 +30,10 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte { "Providers": [ { - "PeerID": "12D3K...", - "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."], + "Peer": { + "ID": "12D3K...", + "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."] + } "Protocols": [ { "Codec": 2320, @@ -66,8 +68,10 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte "Timestamp": 1234, "AdvisoryTTL": 1234, "Provider": { - "PeerID": "12D3K...", - "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."], + "Peer": { + "ID": "12D3K...", + "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."] + } "Protocols": [ { "Codec": 1234, From a9984a96ecb64b74122d616f0a3c52ee72253dc5 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Thu, 20 Oct 2022 16:37:10 -0400 Subject: [PATCH 10/28] specify /ping not ready status code --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 972a57ce..1f2ce199 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -85,7 +85,7 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte - Idempotent - Default limit of 100 keys per request - `GET /v1/ping` - - Returns 200 once the server is 
ready to accept requests + - Returns 200 once the server is ready to accept requests, otherwise returns 503 ## Limits From fce070f391f11ae45fbd7919668c08b8d46c2979 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Fri, 21 Oct 2022 12:55:30 -0400 Subject: [PATCH 11/28] add note about non-identity-multihashed peer IDs --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 1f2ce199..aa1431e3 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -82,8 +82,9 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte } ``` - `Signature` is a multibase-encoded signature of the sha256 hash of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. Servers must verify the payload using the public key from the Peer ID. If the verification fails, the server must return a 403 status code. + - Note that this only supports Peer IDs expressed as identity multihashes. Peer IDs with older key types that exceed 42 bytes are not verifiable since they only contain a hash of the key, not the key itself. Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block over IPFS). If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme. 
- Idempotent - - Default limit of 100 keys per request + - Default limit of 100 keys per request - `GET /v1/ping` - Returns 200 once the server is ready to accept requests, otherwise returns 503 @@ -95,4 +96,4 @@ The Delegated Content Routing Routing HTTP API uses the `application/json` conte ## Error Codes - A 404 must be returned if a resource was not found - - A 501 must be returned if a method is not supported + - A 501 must be returned if a method is not supported From fff68c3ecfb8827650a5e90fbcac36a4b8336c30 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Fri, 11 Nov 2022 13:45:17 -0500 Subject: [PATCH 12/28] rework API and schema based on feedback --- IPIP/0000-delegated-routing-http-api.md | 96 ++++---- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 271 ++++++++++++++++------ 2 files changed, 251 insertions(+), 116 deletions(-) diff --git a/IPIP/0000-delegated-routing-http-api.md b/IPIP/0000-delegated-routing-http-api.md index db66acf2..56a08d37 100644 --- a/IPIP/0000-delegated-routing-http-api.md +++ b/IPIP/0000-delegated-routing-http-api.md @@ -1,4 +1,4 @@ -# IPIP 0000: Delegated Routing HTTP API +# IPIP 0000: Delegated Content Routing HTTP API - Start Date: 2022-10-18 - Related Issues: @@ -6,71 +6,75 @@ ## Summary -This IPIP specifies an HTTP API for delegated routing. +This IPIP specifies an HTTP API for delegated content routing. ## Motivation Idiomatic and first-class HTTP support for delegated routing is an important requirement for large content routing providers, -and supporting large content providers is a key strategy for driving down IPFS latency. +and supporting large content providers is a key strategy for driving down IPFS content routing latency. These providers must handle high volumes of traffic and support many users, so leveraging industry-standard tools and services such as HTTP load balancers, CDNs, reverse proxies, etc. is a requirement. 
To maximize compatibility with standard tools, IPFS needs an HTTP API specification that uses standard HTTP idioms and payload encoding. -The [Reframe spec](https://github.com/ipfs/specs/blob/main/reframe/REFRAME_PROTOCOL.md) for delegated content routing was an experimental attempt at this, +The [Reframe spec](https://github.com/ipfs/specs/blob/main/reframe/REFRAME_PROTOCOL.md) for delegated content routing is an experimental attempt at this, but it has resulted in a very unidiomatic HTTP API which is difficult to implement and is incompatible with many existing tools. -The cost of a proper redesign, implementation, and maintenance of Reframe and its implementation is too high relative to the urgency of having a delegated routing HTTP API. +The cost of a proper redesign, implementation, and maintenance of Reframe and its implementation is too high relative to the urgency of having a delegated content routing HTTP API. Note that this does not supplant nor deprecate Reframe. Ideally in the future, Reframe and its implementation would receive the resources needed to map the IDL to idiomatic HTTP, and implementations of this spec could then be rewritten in the IDL, maintaining backwards compatibility. +We expect this API to be extended beyond "content routing" in the future, so additional IPIPs may rename this to something more general such as "Delegated Routing HTTP API". + ## Detailed design -See the [Delegated Routing HTTP API design](../routing/DELEGATED_ROUTING_HTTP.md) included with this IPIP. +See the [Delegated Content Routing HTTP API spec](../routing/DELEGATED_CONTENT_ROUTING_HTTP.md) included with this IPIP. 
## Design rationale To understand the design rationale, it is important to consider the concrete Reframe limitations that we know about: -- Reframe [method types](../reframe/REFRAME_KNOWN_METHODS.md) are encoded inside messages - - This prevents URL-based pattern matching on methods, which makes it hard and expensive to do basic HTTP scaling and optimizations: - - Configuring different caching strategies for different methods - - Configuring reverse proxies on a per-method basis - - Routing methods to specific backends - - Method-specific reverse proxy config such as timeouts - - Developer UX is poor as a result, e.g. for CDN caching you must encode the entire request message and pass it as a query parameter - - This was initially done by URL-escaping the raw bytes - - Not possible to consume correctly using standard JavaScript (see [edelweiss#61](https://github.com/ipld/edelweiss/issues/61)) - - Shipped in Kubo 0.16 - - Packing a CID into a struct, encoding it with DAG-CBOR, multibase-encoding that, percent-encoding that, and then passing it in a URL, rather than merely passing the CID in the URL, is needlessly complex from a user's perspective - - Added complexity of "Cacheable" methods supporting both POSTs and GETs -- The required streaming support and message groups add a lot of implementation complexity, but streaming does not work for cachable methods sent over HTTP - - Ex for FindProviders, the response is buffered anyway for ETag calculation - - There are no limits on response sizes nor ways to impose limits and paginate - - This is useful for routers that have highly variable resolution time, to send results as soon as possible, but this is not a use case we are focusing on right now and we can add it later +- Reframe [method types](../reframe/REFRAME_KNOWN_METHODS.md) are encoded inside IPLD-encoded messages + - This prevents URL-based pattern matching on methods, which makes it hard and expensive to do basic HTTP scaling and optimizations: + - 
Configuring different caching strategies for different methods + - Configuring reverse proxies on a per-method basis + - Routing methods to specific backends + - Method-specific reverse proxy config such as timeouts + - Developer UX is poor as a result, e.g. for CDN caching you must encode the entire request message and pass it as a query parameter + - This was initially done by URL-escaping the raw bytes + - Not possible to consume correctly using standard JavaScript (see [edelweiss#61](https://github.com/ipld/edelweiss/issues/61)) + - Shipped in Kubo 0.16 + - Packing a CID into a struct, encoding it with DAG-CBOR, multibase-encoding that, percent-encoding that, and then passing it in a URL, rather than merely passing the CID in the URL, is needlessly complex from a user's perspective, and has already made it difficult to manually construct requests or interpret logs + - Added complexity of "Cacheable" methods supporting both POSTs and GETs +- The required streaming support and message groups add a lot of implementation complexity, but streaming does not currently work for cachable methods sent over HTTP + - Ex for FindProviders, the response is buffered anyway for ETag calculation + - There are no limits on response sizes nor ways to impose limits and paginate + - This is useful for routers that have highly variable resolution time, to send results as soon as possible, but this is not a use case we are focusing on right now and we can add it later - The Identify method is not implemented because it is not currently useful - - This is because Reframe's ambition is to be generic catch-all bag of methods across protocols, while delegated routing use case only requires a subset of its methods. + - This is because Reframe's ambition is to be a generic catch-all bag of methods across protocols, while delegated routing use case only requires a subset of its methods. 
- Client and server implementations are difficult to write correctly, because of the non-standard wire formats and conventions - - Example: [bug reported by implementer](https://github.com/ipld/edelweiss/issues/62), and [another one](https://github.com/ipld/edelweiss/issues/61) + - Example: [bug reported by implementer](https://github.com/ipld/edelweiss/issues/62), and [another one](https://github.com/ipld/edelweiss/issues/61) - The Go implementation is [complex](https://github.com/ipfs/go-delegated-routing/blob/main/gen/proto/proto_edelweiss.go) and [brittle](https://github.com/ipfs/go-delegated-routing/blame/main/client/provide.go#L51-L100), and is currently maintained by IPFS Stewards who are already over-committed with other priorities - Only the HTTP transport has been designed and implemented, so it's unclear if the existing design will work for other transports, and what their use cases and requirements are - - This means Reframe can't be trusted to be transport-agnostic until there is at least second transport implemented (e.g. as a reframe-over-libp2p protocol). + - This means Reframe can't be trusted to be transport-agnostic until there is at least a second transport implemented (e.g. 
as a reframe-over-libp2p protocol) +- There's naming confusion around "Reframe, the protocol" and "Reframe, the set of methods" So this API proposal makes the following changes: -- The Delegated Routing API is defined using HTTP semantics, and can be implemented without introducing Reframe concepts +- The Delegated Content Routing API is defined using HTTP semantics, and can be implemented without introducing Reframe concepts nor IPLD +- There is a clear distinction between the RPC protocol (HTTP) and the API (Delegated Content Routing) - "Method names" and cache-relevant parameters are pushed into the URL path -- Streaming support is removed, and default response size limits are added along with an optional `limit` parameter for clients to specify response sizes - - We might add streaming support w/ chunked-encoded responses in the future, but it's currently not an important feature for the use cases that an HTTP API will be used for - - Pagination could be added to this in the future, if needed -- Bodies are encoded using standard JSON or CBOR, instead of using IPLD codecs -- JSON uses human-friendly string encodings of common data types - - CIDs are encoded as CIDv1 strings with a multibase prefix (e.g.
base32), for consistency with CLIs, browsers, and [gateway URLs](https://docs.ipfs.io/how-to/address-ipfs-on-web/) - - Multiaddrs use the [human-readable format](https://github.com/multiformats/multiaddr#specification) that is used in existing tools and Kubo CLI commands such as `ipfs id` or `ipfs swarm peers` - - Byte array values, such as signatures, are multibase-encoded strings (with an `m` prefix indicating Base64) -- The "Identify" method and "message groups" are removed +- Streaming support is removed, and default response size limits are added along with an optional `pageLimit` parameter for clients to specify response sizes + - We will add streaming support in a subsequent IPIP, but we are trying to minimize the scope of this IPIP to what is immediately useful +- Bodies are encoded using idiomatic JSON, instead of using IPLD codecs, and are compatible with OpenAPI specifications +- The JSON uses human-readable string encodings of common data types + - CIDs are encoded as CIDv1 strings with a multibase prefix (e.g. base32), for consistency with CLIs, browsers, and [gateway URLs](https://docs.ipfs.io/how-to/address-ipfs-on-web/) + - Multiaddrs use the [human-readable format](https://github.com/multiformats/multiaddr#specification) that is used in existing tools and Kubo CLI commands such as `ipfs id` or `ipfs swarm peers` + - Byte array values, such as signatures, are multibase-encoded strings (with an `m` prefix indicating Base64) +- The "Identify" method and "message groups" are not included +- The "GetIPNS" and "PutIPNS" methods are not included ### User benefit -The cost of building and operating content routing services will be much lower, as developers will be able to reuse existing industry-standard tooling. -They no longer need to learn Reframe-specific concepts to consume or expose the API. 
+The cost of building and operating content routing services will be much lower, as developers will be able to maximally reuse existing industry-standard tooling. +Users will not need to learn a new RPC protocol and tooling to consume or expose the API. This will result in more content routing providers, each providing a better experience for users, driving down content routing latency across the IPFS network and increasing data availability. @@ -79,17 +83,21 @@ and increasing data availability. #### Backwards Compatibility IPFS Stewards will implement this API in [go-delegated-routing](https://github.com/ipfs/go-delegated-routing), using breaking changes in a new minor version. Because the existing Reframe spec can't be safely used in JavaScript and we won't be investing time and resources into changing the wire format implemented in edelweiss to fix it, -the experimental support for Reframe in Kubo will be removed in the next release and delegated routing will subsequently use this HTTP API. -We may decide to re-add Reframe support in the future once these issues have been resolved. +the experimental support for Reframe in Kubo will be deprecated in the next release and delegated content routing will subsequently use this HTTP API.
+We may decide to re-add Reframe support in the future once these issues have been resolved.- #### Forwards Compatibility Standard HTTP mechanisms for forward compatibility are used: -- The API is versioned using a version number in the path -- The `Accept` and `Content-Type` headers are used for content type negotiation -- New methods will result in new paths +- The API is versioned using a version number prefix in the path +- The `Accept` and `Content-Type` headers are used for content type negotiation, allowing for backwards-compatible additions of new MIME types, hypothetically such as: + - `application/cbor` for binary-encoded responses + - `application/x-ndjson` for streamed responses + - `application/octet-stream` if the content router can provide the content/block directly +- New paths+methods can be introduced in a backwards-compatible way - Parameters can be added using either new query parameters or new fields in the request/response body. +- Provider records are both opaque and versioned to allow evolution of schemas and semantics for the same transfer protocol -Certain parts of bodies are labeled as "{ ... }", which are opaque JSON values passed through by the implementation, with no schema enforcement. +As a proof-of-concept, the tests for the initial implementation of this HTTP API were successfully tested with a libp2p transport using [libp2p/go-libp2p-http](https://github.com/libp2p/go-libp2p-http), demonstrating viability for also using this API over libp2p. ### Security @@ -97,7 +105,7 @@ None ### Alternatives -This *is* an alternative. +TODO ### Copyright diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index aa1431e3..462f0359 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -9,91 +9,218 @@ **Abstract** -"Delegated content routing" is a mechanism for IPFS implementations to use for offloading content routing to another process/server. 
This spec describes an HTTP API for delegated routing. +"Delegated content routing" is a mechanism for IPFS implementations to use for offloading content routing to another process/server. This spec describes an HTTP API for delegated content routing. # API Specification -The Delegated Content Routing Routing HTTP API uses the `application/json` content type by default. Clients and servers *should* support `application/cbor`, which can be negotiated using the standard `Accept` and `Content-Type` headers. +The Delegated Content Routing HTTP API uses the `application/json` content type by default. + +As such, human-readable encodings of types are preferred. This spec may be updated in the future with a compact `application/cbor` encoding, in which case compact encodings of the various types would be used. ## Common Data Types: -- CIDs are always encoded using a [multibase](https://github.com/multiformats/multibase)-encoded [CIDv1](https://github.com/multiformats/cid#cidv1). -- Multiaddrs are encoded according to the [human-readable multiaddr specification](https://github.com/multiformats/multiaddr#specification) -- Peer IDs are encoded according [PeerID string representation specification](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#string-representation) -- Multibase bytes are encoded according to [the Multibase spec](https://github.com/multiformats/multibase), and *should* use Base64. +- CIDs are always string-encoded using a [multibase](https://github.com/multiformats/multibase)-encoded [CIDv1](https://github.com/multiformats/cid#cidv1).
+- Multiaddrs are string-encoded according to the [human-readable multiaddr specification](https://github.com/multiformats/multiaddr#specification) +- Peer IDs are string-encoded according to the [PeerID string representation specification](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#string-representation) +- Multibase bytes are string-encoded according to [the Multibase spec](https://github.com/multiformats/multibase), and *should* use Base64. - Timestamps are Unix millisecond epoch timestamps +Until required for business logic, servers should treat these types as opaque strings, and should preserve unknown JSON fields. + +### Versioning +This API uses a standard version prefix in the path, such as `/v1/...`. If a backwards-incompatible change must be made, then the version number should be increased. + +### Provider Records +A provider record contains information about a content provider, including the transfer protocol and any protocol-specific information useful for fetching the content from the provider. + +The information required to write a record to a router (*"write" provider records*) may be different than the information contained when reading provider records (*"read" provider records*). + +For example, indexers may require a signature in `bitswap` write records for authentication of the peer contained in the record, but the read records may not include this authentication information. + +Both read and write provider records have a minimal required schema as follows: + +```json +{ + "Protocol": "", + ... +} +``` + +where: + +- `Protocol` is the name of a transfer protocol + - these typically map to multicodec codes (such as bitswap => 2304) + - these don't use multicodec codes directly to allow for versioning of provider records, e.g.
to allow for "bitswap-v2" provider records (which would also map to 2304 code but with different schema/semantics) +- `...` denotes opaque JSON, which may contain information specific to the transfer protocol + +Specifications for some transfer protocols are provided in the "Transfer Protocols" section. + + ## API -- `GET /v1/providers/{CID}` - - Response - - ```json +### `GET /routing/v1/providers/{CID}` +- Response codes + - `200`: the response body contains 0 or more records + - `404`: must be returned if no matching records are found + - `422`: request does not conform to schema or semantic constraints +- Response Body +```json +{ + "Providers": [ { - "Providers": [ - { - "Peer": { - "ID": "12D3K...", - "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."] - } - "Protocols": [ - { - "Codec": 2320, - "Payload": { ... } - } - ] - } - ] + "Protocol": "", + ... } - ``` + ] +} +``` - - Default limit: 100 providers - - Optional query parameters - - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of [multicodec codes](https://github.com/multiformats/multicodec/blob/master/table.csv) in decimal form such as `2304,2320` - - `transport` only return providers whose published multiaddrs explicitly support the passed transport protocols, such as `460,478` (`/quic` and `/tls/ws`) - - Servers should treat the multicodec codes used in the `transfer` and `transport` parameters as opaque, and not validate them, for forwards compatibility -- `GET /v1/providers/hashed/{multihash}` - - This is the same as `GET /v1/providers/{CID}`, but takes a hashed CID encoded as a [multihash](https://github.com/multiformats/multihash/) -- `PUT /v1/providers` - - Reframe equivalent: Provide - - Body - ```json +- Default limit: 100 providers +- Optional query parameters + - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of transfer protocol names such as 
`transfer=bitswap,filecoin-graphsync` + - `transport` for provider records with multiaddrs, only return records with multiaddrs explicitly supporting the passed transport protocols, encoded as decimal multicodec codes such as `transport=460,478` (`/quic` and `/tls/ws` respectively) +- Implements pagination according to the Pagination section + +Each object in the `Providers` list is a *read provider record*. + +- `PUT /routing/v1/providers` + - Response Codes + - `200`: the server processed the full list of provider records (possibly unsuccessfully, depending on the semantics of the particular records) + - `400`: + - Request Body +```json +{ + "Keys": ["cid1", "cid2"], + "Providers": [ { - "Signature": "multibase bytes", - "Payload": "multibase bytes" + "Protocol": "bitswap", + ... } - ``` - - `Payload` is a multibase-encoded string containing a JSON object with the following schema: - ```json - { - "Keys": ["cid1", "cid2"], - "Timestamp": 1234, - "AdvisoryTTL": 1234, - "Provider": { - "Peer": { - "ID": "12D3K...", - "Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."] - } - "Protocols": [ - { - "Codec": 1234, - "Payload": { ... } - } - ] - } - } - ``` - - `Signature` is a multibase-encoded signature of the sha256 hash of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. Servers must verify the payload using the public key from the Peer ID. If the verification fails, the server must return a 403 status code. - - Note that this only supports Peer IDs expressed as identity multihashes. Peer IDs with older key types that exceed 42 bytes are not verifiable since they only contain a hash of the key, not the key itself. Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block over IPFS). 
If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme. - - Idempotent - - Default limit of 100 keys per request -- `GET /v1/ping` - - Returns 200 once the server is ready to accept requests, otherwise returns 503 - -## Limits - - - Responses with collections of results must have a default limit on the number of results that will be returned in a single response - - Pagination and/or dynamic limit configuration may be added to this spec in the future, once there is a concrete requirement + ] +} +``` + +Each object in the `Providers` list is a *write provider record*. + + - Response Body +```json +{ + "ProvideResults": [ + { ... } + ] +} +``` + - `ProvideResults` is a list of results in the same order as the `Providers` in the request, and the schema of each object is determined by the `Protocol` of the corresponding write object (called "Write Provider Records Response" in the Known Transfer Protocols section) + - This may contain output information such as TTLs, errors, etc. 
+ - It is undefined whether the server will allow partial results +- The work for processing each provider record should be idempotent so that it can be retried without excessive cost in the case of full or partial failure of the request +- Default limit of 100 keys per request +- Implements pagination according to the Pagination section + +## Pagination + +APIs that return collections of results should support pagination as follows: + +- If there are more results, then a `NextPageToken` field should include an opaque string value, otherwise it should be undefined +- The value of `NextPageToken` can be specified as the value of a `pageToken` query parameter to fetch the next page + - Character set is restricted to the regex `[a-zA-Z0-9-_.~]+`, since this is intended to be used in URLs +- The client continues this process until `NextPageToken` is undefined or doesn't care to continue +- A `pageLimit` query parameter specifies the maximum size of a single page + +### Implementation Notes +Servers are required to return *at most* `pageLimit` results in a page. It is recommended for pages to be as dense as possible, but it is acceptable for them to return any number of items in the closed interval [0, pageLimit]. This is dependent on the capabilities of the backing database implementation. For example, a query specifying a `transfer` filter for a rare transfer protocol should not *require* the server to perform a very expensive database query for a single request. Instead, this is left to the server implementation to decide based on the constraints of the database. + +Implementations should encode into the token whatever information is necessary for fetching the next page. This could be a base32-encoded JSON object like `{"offset":3,"limit":10}`, an object ID of the last scanned item, etc. 
## Error Codes - - A 404 must be returned if a resource was not found - - A 501 must be returned if a method is not supported +- `501`: must be returned if a method/path is not supported +- `429`: may be returned to indicate to the caller that it is issuing requests too quickly +- `400`: must be returned if an unknown path is requested + +## Known Transfer Protocols +This section contains a non-exhaustive list of known transfer protocols (by name) that may be supported by clients and servers. + +### bitswap +Multicodec code: 0x0900 + +#### Read Provider Records + +```json +{ + "Protocol": "bitswap", + "ID": "12D3K...", + "Addrs": ["/ip4/..."] +} +``` + +- `ID`: the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md) to contact +- `Addrs`: a list of known multiaddrs for the peer + - This list may be incomplete or incorrect and should only be treated as *hints* to improve performance by avoiding extra peer lookups + +The server should respect a passed `transport` query parameter by filtering against the `Multiaddrs` list. + + +#### Write Provider Records + +```json +{ + "Protocol": "bitswap", + "Signature": "", + "Payload": "" +} +``` + +- `Signature`: a multibase-encoded signature of the sha256 hash of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON + - Servers may ignore this field if they do not require signature verification +- `Payload`: a multibase-encoded JSON object which conforms with the following schema: +```json +{ + "Keys": ["cid1", "cid2"], + "Timestamp": 0, + "AdvisoryTTL": 0, + "PeerID": "12D3K...", + "Multiaddrs": ["/ip4/..."], +} +``` + - `Keys` is a list of the CIDs being provided + - `Timestamp` is the current time + - `AdvisoryTTL` is the time by which the caller expects the server to keep the record available + - If this value is unknown, the caller may use a value of 0 + +A 403 response code should be returned if the signature check fails. 
+ +Note that this only supports Peer IDs expressed as identity multihashes. Peer IDs with older key types that exceed 42 bytes are not verifiable since they only contain a hash of the key, not the key itself. Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block via IPFS). If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme. + +#### Write Provider Records Response +```json +{ + "AdvisoryTTL": 0 +} +``` + +- `AdvisoryTTL` is the time at which the server expects itself to drop the record + - If less than the `AdvisoryTTL` in the request, then the client should re-issue the request by that point + - If greater than the `AdvisoryTTL` in the request, then the server expects the client to be responsible for the content for up to that amount of time (TODO: this is ambiguous) + - If 0, the server makes no claims about the lifetime of the record + + +### filecoin-graphsync +Multicodec code: 0x0910 + +#### Read Provider Records + +```json +{ + "Protocol": "filecoin-graphsync", + "PieceCID": "", + "VerifiedDeal": true, + "FastRetrieval": true +} +``` + +- `PieceCID`: +- `VerifiedDeal`: +- `FastRetrieval`: + +#### Write Provider Records + +There is currently no specified schema. 
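The pagination flow described in the Pagination section of the patch above can be sketched from the client side as follows. This is a minimal illustration under stated assumptions, not part of the patch or of any reference implementation: `iter_providers`, `fetch_page`, `max_pages`, `encode_token`, and `decode_token` are hypothetical names, and `fetch_page` stands in for whatever HTTP client call returns the decoded JSON body. Only the `Providers` and `NextPageToken` fields and the `pageToken` and `pageLimit` query parameters come from the spec text.

```python
import base64
import json
import re
from typing import Callable, Dict, Iterator, Optional

# Character set the spec allows for page tokens, since tokens travel in URLs.
TOKEN_RE = re.compile(r"[a-zA-Z0-9\-_.~]+")


def iter_providers(
    fetch_page: Callable[[Dict[str, str]], dict],  # hypothetical transport hook
    page_limit: int = 100,
    max_pages: Optional[int] = None,  # hypothetical client-side cutoff
) -> Iterator[dict]:
    """Yield provider records page by page until NextPageToken is absent."""
    token: Optional[str] = None
    pages = 0
    while True:
        params = {"pageLimit": str(page_limit)}
        if token is not None:
            params["pageToken"] = token
        body = fetch_page(params)
        # A page may legally hold anywhere from 0 to pageLimit items, so an
        # empty page does not mean the result set is exhausted.
        yield from body.get("Providers", [])
        pages += 1
        token = body.get("NextPageToken")
        if token is None:
            return  # no more results
        if not TOKEN_RE.fullmatch(token):
            raise ValueError(f"page token has characters outside the allowed set: {token!r}")
        if max_pages is not None and pages >= max_pages:
            return  # the client no longer cares to continue


def encode_token(state: dict) -> str:
    """One possible server-side token: base32-encoded JSON with '=' padding removed."""
    raw = json.dumps(state, separators=(",", ":")).encode("utf-8")
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def decode_token(token: str) -> dict:
    """Restore the padding that encode_token stripped, then decode."""
    padded = token + "=" * (-len(token) % 8)
    return json.loads(base64.b32decode(padded))
```

Standard base32 pads its output with `=`, which falls outside the token character set given in the spec, so this sketch strips the padding when encoding and restores it when decoding. The token remains opaque to the client, which only checks the character set and passes the value back unchanged.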
From 11f4ca5b4e26dcbee55327273d2f125a2d4fb370 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Fri, 11 Nov 2022 13:48:32 -0500 Subject: [PATCH 13/28] formatting fix --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 462f0359..5400a464 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -108,9 +108,9 @@ Each object in the `Providers` list is a *write provider record*. ] } ``` - - `ProvideResults` is a list of results in the same order as the `Providers` in the request, and the schema of each object is determined by the `Protocol` of the corresponding write object (called "Write Provider Records Response" in the Known Transfer Protocols section) - - This may contain output information such as TTLs, errors, etc. - - It is undefined whether the server will allow partial results +- `ProvideResults` is a list of results in the same order as the `Providers` in the request, and the schema of each object is determined by the `Protocol` of the corresponding write object (called "Write Provider Records Response" in the Known Transfer Protocols section) + - This may contain output information such as TTLs, errors, etc. 
+ - It is undefined whether the server will allow partial results - The work for processing each provider record should be idempotent so that it can be retried without excessive cost in the case of full or partial failure of the request - Default limit of 100 keys per request - Implements pagination according to the Pagination section From 39c467eba505e6b341ba63dd22652c602fe78530 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Fri, 11 Nov 2022 13:50:51 -0500 Subject: [PATCH 14/28] use a JSON string for payload, no reason to base-encode --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 5400a464..833902dd 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -171,25 +171,29 @@ The server should respect a passed `transport` query parameter by filtering agai - `Signature`: a multibase-encoded signature of the sha256 hash of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON - Servers may ignore this field if they do not require signature verification -- `Payload`: a multibase-encoded JSON object which conforms with the following schema: +- `Payload`: a string containing a serialized JSON object which conforms with the following schema: ```json { "Keys": ["cid1", "cid2"], "Timestamp": 0, "AdvisoryTTL": 0, - "PeerID": "12D3K...", - "Multiaddrs": ["/ip4/..."], + "ID": "12D3K...", + "Addrs": ["/ip4/..."], } ``` - `Keys` is a list of the CIDs being provided - `Timestamp` is the current time - `AdvisoryTTL` is the time by which the caller expects the server to keep the record available - If this value is unknown, the caller may use a value of 0 + - `ID` is the peer ID that was used to sign the record + - `Addrs` is a list of string-encoded multiaddrs A 403 response code should be returned if the signature check 
fails. Note that this only supports Peer IDs expressed as identity multihashes. Peer IDs with older key types that exceed 42 bytes are not verifiable since they only contain a hash of the key, not the key itself. Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block via IPFS). If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme. +The `Payload` field is a string, not a proper JSON object, to prevent its contents from being accidentally parsed and re-encoded by intermediaries, which may change the order of JSON fields and thus cause the record to fail validation. + #### Write Provider Records Response ```json { From 87ff0acadb87c97a6f5245ac274b29fdd5cd18de Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Fri, 11 Nov 2022 13:53:36 -0500 Subject: [PATCH 15/28] s/Multiaddrs/Addrs --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 833902dd..fc26f905 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -156,7 +156,7 @@ Multicodec code: 0x0900 - `Addrs`: a list of known multiaddrs for the peer - This list may be incomplete or incorrect and should only be treated as *hints* to improve performance by avoiding extra peer lookups -The server should respect a passed `transport` query parameter by filtering against the `Multiaddrs` list. +The server should respect a passed `transport` query parameter by filtering against the `Addrs` list. 
#### Write Provider Records From 96d55d0bf02a7df987b2dc4f1f560d8e29600b8e Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Fri, 11 Nov 2022 13:58:32 -0500 Subject: [PATCH 16/28] properly distinguish Reframe HTTP transport from Reframe --- IPIP/0000-delegated-routing-http-api.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/IPIP/0000-delegated-routing-http-api.md b/IPIP/0000-delegated-routing-http-api.md index 56a08d37..638e69dc 100644 --- a/IPIP/0000-delegated-routing-http-api.md +++ b/IPIP/0000-delegated-routing-http-api.md @@ -31,7 +31,7 @@ See the [Delegated Content Routing HTTP API spec](../routing/DELEGATED_CONTENT_R ## Design rationale To understand the design rationale, it is important to consider the concrete Reframe limitations that we know about: -- Reframe [method types](../reframe/REFRAME_KNOWN_METHODS.md) are encoded inside IPLD-encoded messages +- Reframe [method types](../reframe/REFRAME_KNOWN_METHODS.md) using the HTTP transport are encoded inside IPLD-encoded messages - This prevents URL-based pattern matching on methods, which makes it hard and expensive to do basic HTTP scaling and optimizations: - Configuring different caching strategies for different methods - Configuring reverse proxies on a per-method basis From 4264a2d2c130a6cd6b2935ea281a3c6edc6b3367 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Fri, 11 Nov 2022 14:13:26 -0500 Subject: [PATCH 17/28] remove dangling status code --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 1 - 1 file changed, 1 deletion(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index fc26f905..cace2a35 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -84,7 +84,6 @@ Each object in the `Providers` list is a *read provider record*. 
- `PUT /routing/v1/providers` - Response Codes - `200`: the server processed the full list of provider records (possibly unsuccessfully, depending on the semantics of the particular records) - - `400`: - Request Body ```json { From 0f49dcffcb3811901a36c7037fa34406f7687030 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Tue, 15 Nov 2022 11:14:15 -0500 Subject: [PATCH 18/28] add -v1 suffix to filecoin-graphsync protocol name --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index cace2a35..57937617 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -75,7 +75,7 @@ Specifications for some transfer protocols are provided in the "Transfer Protoco - Default limit: 100 providers - Optional query parameters - - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of transfer protocol names such as `transfer=bitswap,filecoin-graphsync` + - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of transfer protocol names such as `transfer=bitswap,filecoin-graphsync-v1` - `transport` for provider records with multiaddrs, only return records with multiaddrs explicitly supporting the passed transport protocols, encoded as decimal multicodec codes such as `transport=460,478` (`/quic` and `/tls/ws` respectively) - Implements pagination according to the Pagination section @@ -206,14 +206,14 @@ The `Payload` field is a string, not a proper JSON object, to prevent its conten - If 0, the server makes no claims about the lifetime of the record -### filecoin-graphsync +### filecoin-graphsync-v1 Multicodec code: 0x0910 #### Read Provider Records ```json { - "Protocol": "filecoin-graphsync", + "Protocol": "filecoin-graphsync-v1", "PieceCID": "", "VerifiedDeal": true, 
"FastRetrieval": true From 7238e637d35ebf33f0000537322bd4768e1893c6 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Tue, 15 Nov 2022 11:17:32 -0500 Subject: [PATCH 19/28] Add ID and Addrs fields to filecoin-graphsync-v1 read record --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 57937617..4c709c42 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -214,12 +214,16 @@ Multicodec code: 0x0910 ```json { "Protocol": "filecoin-graphsync-v1", + "ID": "12D3K...", + "Addrs": ["/ip4/..."], "PieceCID": "", "VerifiedDeal": true, "FastRetrieval": true } ``` +- `ID`: the peer ID of the provider +- `Addrs`: a list of known multiaddrs for the peer - `PieceCID`: - `VerifiedDeal`: - `FastRetrieval`: From e823d9ebd5bd1efa65fd7d537342156b02364f4d Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Tue, 22 Nov 2022 20:07:49 +0100 Subject: [PATCH 20/28] docs(http-routing): CORS and Web Browsers --- IPIP/0000-delegated-routing-http-api.md | 12 ++++++++---- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 22 +++++++++++++++++++++- 2 files changed, 29 insertions(+), 5 deletions(-) diff --git a/IPIP/0000-delegated-routing-http-api.md b/IPIP/0000-delegated-routing-http-api.md index 638e69dc..f85ab5ea 100644 --- a/IPIP/0000-delegated-routing-http-api.md +++ b/IPIP/0000-delegated-routing-http-api.md @@ -1,8 +1,8 @@ -# IPIP 0000: Delegated Content Routing HTTP API +# IPIP-337: Delegated Content Routing HTTP API - Start Date: 2022-10-18 - Related Issues: - - (add links here) + - https://github.com/ipfs/specs/pull/337 ## Summary @@ -29,6 +29,7 @@ We expect this API to be extended beyond "content routing" in the future, so add See the [Delegated Content Routing HTTP API spec](../routing/DELEGATED_CONTENT_ROUTING_HTTP.md) included with this IPIP. 
## Design rationale + To understand the design rationale, it is important to consider the concrete Reframe limitations that we know about: - Reframe [method types](../reframe/REFRAME_KNOWN_METHODS.md) using the HTTP transport are encoded inside IPLD-encoded messages @@ -81,12 +82,14 @@ and increasing data availability. ### Compatibility #### Backwards Compatibility + IPFS Stewards will implement this API in [go-delegated-routing](https://github.com/ipfs/go-delegated-routing), using breaking changes in a new minor version. Because the existing Reframe spec can't be safely used in JavaScript and we won't be investing time and resources into changing the wire format implemented in edelweiss to fix it, the experimental support for Reframe in Kubo will be deprecated in the next release and delegated content routing will subsequently use this HTTP API. We may decide to re-add Reframe support in the future once these issues have been resolved.- #### Forwards Compatibility + Standard HTTP mechanisms for forward compatibility are used: - The API is versioned using a version number prefix in the path - The `Accept` and `Content-Type` headers are used for content type negotiation, allowing for backwards-compatible additions of new MIME types, hypothetically such as: @@ -101,11 +104,12 @@ As a proof-of-concept, the tests for the initial implementation of this HTTP API ### Security -None +- TODO: cover user privacy +- TODO: parsing best practices: what are limits (e.g., per message / field)? ### Alternatives -TODO +- Reframe (general-purpose RPC) was evaluated, see "Design rationale" section for rationale why it was not selected. 
### Copyright diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 4c709c42..d5ac2efa 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -1,4 +1,6 @@ -# ![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square) Delegated Content Routing HTTP API +# Delegated Content Routing HTTP API + +![wip](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square) Delegated Content Routing HTTP API **Author(s)**: - Gus Eggert @@ -125,6 +127,7 @@ APIs that return collections of results should support pagination as follows: - A `pageLimit` query parameter specifies the maximum size of a single page ### Implementation Notes + Servers are required to return *at most* `pageLimit` results in a page. It is recommended for pages to be as dense as possible, but it is acceptable for them to return any number of items in the closed interval [0, pageLimit]. This is dependent on the capabilities of the backing database implementation. For example, a query specifying a `transfer` filter for a rare transfer protocol should not *require* the server to perform a very expensive database query for a single request. Instead, this is left to the server implementation to decide based on the constraints of the database. Implementations should encode into the token whatever information is necessary for fetching the next page. This could be a base32-encoded JSON object like `{"offset":3,"limit":10}`, an object ID of the last scanned item, etc. @@ -135,6 +138,23 @@ Implementations should encode into the token whatever information is necessary f - `429`: may be returned to indicate to the caller that it is issuing requests too quickly - `400`: must be returned if an unknown path is requested +## CORS and Web Browsers + +Browser interoperability requires implementations to support +[CORS](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS). 
+ +JavaScript client running on a third-party Origin must be able to send HTTP +request to the endpoints defined in this specification, and read the received +values. This means HTTP server implementing this API must (1) support +[CORS preflight requests](https://developer.mozilla.org/en-US/docs/Glossary/Preflight_request) +sent as HTTP OPTIONS, and (2) always respond with headers that remove CORS +limits, allowing every website to query the API for results: + +```plaintext +Access-Control-Allow-Origin: * +Access-Control-Allow-Methods: GET, PUT, POST, DELETE, OPTIONS +``` + ## Known Transfer Protocols This section contains a non-exhaustive list of known transfer protocols (by name) that may be supported by clients and servers. From 19fff936c51a8daca6e70b20e20dad7e8c94668e Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Wed, 7 Dec 2022 10:51:31 -0500 Subject: [PATCH 21/28] Decouple schema from protocol in records --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 30 ++++++++++++++--------- 1 file changed, 19 insertions(+), 11 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index d5ac2efa..19e2aca3 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -43,15 +43,17 @@ Both read and write provider records have a minimal required schema as follows: ```json { "Protocol": "", + "Schema": "", ... } ``` where: -- `Protocol` is the name of a transfer protocol - - these typically map to multicodec codes (such as bitswap => 2304) - - these don't use multicodec codes directly to allow for versioning of provider records, e.g. 
to allow for "bitswap-v2" provider records (which would also map to 2304 code but with different schema/semantics) +- `Protocol` is the multicodec name of the transfer protocol +- `Schema` denotes the schema to use for encoding/decoding the record + - This is separate from the `Protocol` to allow this HTTP API to evolve independently of the transfer protocol + - Implementations should switch on this when parsing records, not on `Protocol` - `...` denotes opaque JSON, which may contain information specific to the transfer protocol Specifications for some transfer protocols are provided in the "Transfer Protocols" section. @@ -69,6 +71,7 @@ Specifications for some transfer protocols are provided in the "Transfer Protoco "Providers": [ { "Protocol": "", + "Schema": "", ... } ] @@ -89,10 +92,10 @@ Each object in the `Providers` list is a *read provider record*. - Request Body ```json { - "Keys": ["cid1", "cid2"], "Providers": [ { - "Protocol": "bitswap", + "Protocol": "", + "Schema": "bitswap", ... } ] @@ -158,14 +161,16 @@ Access-Control-Allow-Methods: GET, PUT, POST, DELETE, OPTIONS ## Known Transfer Protocols This section contains a non-exhaustive list of known transfer protocols (by name) that may be supported by clients and servers. 
-### bitswap -Multicodec code: 0x0900 +### Bitswap +Multicodec name: `transport-bitswap` +Schema: `bitswap` #### Read Provider Records ```json { - "Protocol": "bitswap", + "Protocol": "transport-bitswap", + "Schema": "bitswap", "ID": "12D3K...", "Addrs": ["/ip4/..."] } @@ -182,7 +187,8 @@ The server should respect a passed `transport` query parameter by filtering agai ```json { - "Protocol": "bitswap", + "Protocol": "transport-bitswap", + "Schema": "bitswap", "Signature": "", "Payload": "" } @@ -226,14 +232,16 @@ The `Payload` field is a string, not a proper JSON object, to prevent its conten - If 0, the server makes no claims about the lifetime of the record -### filecoin-graphsync-v1 -Multicodec code: 0x0910 +### Filecoin Graphsync +Multicodec name: `transport-graphsync-filecoinv1` +Schema: `graphsync-filecoinv1` #### Read Provider Records ```json { "Protocol": "filecoin-graphsync-v1", + "Schema": "graphsync-filecoinv1", "ID": "12D3K...", "Addrs": ["/ip4/..."], "PieceCID": "", From 1aac44ceb39dcf71490b9b329a10c681ca8fbf83 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Mon, 16 Jan 2023 14:20:37 +0100 Subject: [PATCH 22/28] ipip-337: apply suggestions from review Co-authored-by: Antonio Navarro Perez Co-authored-by: Masih H. Derkani --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 19e2aca3..385184fe 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -86,10 +86,10 @@ Specifications for some transfer protocols are provided in the "Transfer Protoco Each object in the `Providers` list is a *read provider record*. 
-- `PUT /routing/v1/providers` - - Response Codes - - `200`: the server processed the full list of provider records (possibly unsuccessfully, depending on the semantics of the particular records) - - Request Body +### `PUT /routing/v1/providers` +- Response Codes + - `200`: the server processed the full list of provider records (possibly unsuccessfully, depending on the semantics of the particular records) +- Request Body ```json { "Providers": [ @@ -240,7 +240,7 @@ Schema: `graphsync-filecoinv1` ```json { - "Protocol": "filecoin-graphsync-v1", + "Protocol": "transport-graphsync-filecoinv1", "Schema": "graphsync-filecoinv1", "ID": "12D3K...", "Addrs": ["/ip4/..."], From acc397b8a2adc00081f5af0f410625731a6d2214 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Mon, 16 Jan 2023 14:24:59 +0100 Subject: [PATCH 23/28] chore: fix typo Co-authored-by: Masih H. Derkani --- IPIP/0000-delegated-routing-http-api.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/IPIP/0000-delegated-routing-http-api.md b/IPIP/0000-delegated-routing-http-api.md index f85ab5ea..2325a734 100644 --- a/IPIP/0000-delegated-routing-http-api.md +++ b/IPIP/0000-delegated-routing-http-api.md @@ -76,7 +76,7 @@ So this API proposal makes the following changes: The cost of building and operating content routing services will be much lower, as developers will be able to maximally reuse existing industry-standard tooling. Users will not need to learn a new RPC protocol and tooling to consume or expose the API. -This will result in more content routing providers, each providing a better experience for users, driving down content routing latency across the IPFS netowrk +This will result in more content routing providers, each providing a better experience for users, driving down content routing latency across the IPFS network and increasing data availability. ### Compatibility From 325ca1e892aa7cd51535ffbc86a6130700177553 Mon Sep 17 00:00:00 2001 From: "Masih H. 
Derkani" Date: Tue, 24 Jan 2023 13:41:45 +0000 Subject: [PATCH 24/28] Reduce the scope of IPIP-337 by excluding write operations Reduce the scope of IPIP-337 by temporarily excluding write operation. The write operations are to be added in a separate PR once the IPIP-337 is merged. See: - https://github.com/ipfs/specs/pull/337 --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 97 ++--------------------- 1 file changed, 5 insertions(+), 92 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 385184fe..9e2d2e04 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -86,39 +86,6 @@ Specifications for some transfer protocols are provided in the "Transfer Protoco Each object in the `Providers` list is a *read provider record*. -### `PUT /routing/v1/providers` -- Response Codes - - `200`: the server processed the full list of provider records (possibly unsuccessfully, depending on the semantics of the particular records) -- Request Body -```json -{ - "Providers": [ - { - "Protocol": "", - "Schema": "bitswap", - ... - } - ] -} -``` - -Each object in the `Providers` list is a *write provider record*. - - - Response Body -```json -{ - "ProvideResults": [ - { ... } - ] -} -``` -- `ProvideResults` is a list of results in the same order as the `Providers` in the request, and the schema of each object is determined by the `Protocol` of the corresponding write object (called "Write Provider Records Response" in the Known Transfer Protocols section) - - This may contain output information such as TTLs, errors, etc. 
- - It is undefined whether the server will allow partial results -- The work for processing each provider record should be idempotent so that it can be retried without excessive cost in the case of full or partial failure of the request -- Default limit of 100 keys per request -- Implements pagination according to the Pagination section - ## Pagination APIs that return collections of results should support pagination as follows: @@ -155,7 +122,7 @@ limits, allowing every website to query the API for results: ```plaintext Access-Control-Allow-Origin: * -Access-Control-Allow-Methods: GET, PUT, POST, DELETE, OPTIONS +Access-Control-Allow-Methods: GET, OPTIONS ``` ## Known Transfer Protocols @@ -182,56 +149,6 @@ Schema: `bitswap` The server should respect a passed `transport` query parameter by filtering against the `Addrs` list. - -#### Write Provider Records - -```json -{ - "Protocol": "transport-bitswap", - "Schema": "bitswap", - "Signature": "", - "Payload": "" -} -``` - -- `Signature`: a multibase-encoded signature of the sha256 hash of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON - - Servers may ignore this field if they do not require signature verification -- `Payload`: a string containing a serialized JSON object which conforms with the following schema: -```json -{ - "Keys": ["cid1", "cid2"], - "Timestamp": 0, - "AdvisoryTTL": 0, - "ID": "12D3K...", - "Addrs": ["/ip4/..."], -} -``` - - `Keys` is a list of the CIDs being provided - - `Timestamp` is the current time - - `AdvisoryTTL` is the time by which the caller expects the server to keep the record available - - If this value is unknown, the caller may use a value of 0 - - `ID` is the peer ID that was used to sign the record - - `Addrs` is a list of string-encoded multiaddrs - -A 403 response code should be returned if the signature check fails. - -Note that this only supports Peer IDs expressed as identity multihashes. 
Peer IDs with older key types that exceed 42 bytes are not verifiable since they only contain a hash of the key, not the key itself. Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block via IPFS). If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme. - -The `Payload` field is a string, not a proper JSON object, to prevent its contents from being accidentally parsed and re-encoded by intermediaries, which may change the order of JSON fields and thus cause the record to fail validation. - -#### Write Provider Records Response -```json -{ - "AdvisoryTTL": 0 -} -``` - -- `AdvisoryTTL` is the time at which the server expects itself to drop the record - - If less than the `AdvisoryTTL` in the request, then the client should re-issue the request by that point - - If greater than the `AdvisoryTTL` in the request, then the server expects the client to be responsible for the content for up to that amount of time (TODO: this is ambiguous) - - If 0, the server makes no claims about the lifetime of the record - - ### Filecoin Graphsync Multicodec name: `transport-graphsync-filecoinv1` Schema: `graphsync-filecoinv1` @@ -251,11 +168,7 @@ Schema: `graphsync-filecoinv1` ``` - `ID`: the peer ID of the provider -- `Addrs`: a list of known multiaddrs for the peer -- `PieceCID`: -- `VerifiedDeal`: -- `FastRetrieval`: - -#### Write Provider Records - -There is currently no specified schema. 
+- `Addrs`: a list of known multiaddrs for the provider +- `PieceCID`: the CID of the [piece](https://spec.filecoin.io/systems/filecoin_files/piece/#section-systems.filecoin_files.piece) within which the data is stored +- `VerifiedDeal`: whether the deal corresponding to the data is verified +- `FastRetrieval`: whether the provider claims there is an unsealed copy of the data available for fast retrieval From 9c47a311bc18e71493a42d48938259962e321c0a Mon Sep 17 00:00:00 2001 From: "Masih H. Derkani" Date: Tue, 24 Jan 2023 14:04:47 +0000 Subject: [PATCH 25/28] Address lint issues Fix lint issues as reported by GitHub `super-liner`, including: - empty lines surrounding headers. - maximum line length. - unique headers - natural language. - tailing space. --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 56 +++++++++++++---------- 1 file changed, 33 insertions(+), 23 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 9e2d2e04..285d4fb9 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -3,6 +3,7 @@ ![wip](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square) Delegated Content Routing HTTP API **Author(s)**: + - Gus Eggert **Maintainer(s)**: @@ -13,25 +14,28 @@ "Delegated content routing" is a mechanism for IPFS implementations to use for offloading content routing to another process/server. This spec describes an HTTP API for delegated content routing. -# API Specification -The Delegated Content Routing Routing HTTP API uses the `application/json` content type by default. +## API Specification + +The Delegated Content Routing Routing HTTP API uses the `application/json` content type by default. As such, human-readable encodings of types are preferred. This spec may be updated in the future with a compact `application/cbor` encoding, in which case compact encodings of the various types would be used. 
-## Common Data Types: +## Common Data Types - CIDs are always string-encoded using a [multibase](https://github.com/multiformats/multibase)-encoded [CIDv1](https://github.com/multiformats/cid#cidv1). - Multiaddrs are string-encoded according to the [human-readable multiaddr specification](https://github.com/multiformats/multiaddr#specification) - Peer IDs are string-encoded according [PeerID string representation specification](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#string-representation) -- Multibase bytes are string-encoded according to [the Multibase spec](https://github.com/multiformats/multibase), and *should* use Base64. +- Multibase bytes are string-encoded according to [the Multibase spec](https://github.com/multiformats/multibase), and *should* use base64. - Timestamps are Unix millisecond epoch timestamps Until required for business logic, servers should treat these types as opaque strings, and should preserve unknown JSON fields. ### Versioning + This API uses a standard version prefix in the path, such as `/v1/...`. If a backwards-incompatible change must be made, then the version number should be increased. ### Provider Records + A provider record contains information about a content provider, including the transfer protocol and any protocol-specific information useful for fetching the content from the provider. The information required to write a record to a router (*"write" provider records*) may be different than the information contained when reading provider records (*"read" provider records*). @@ -48,7 +52,7 @@ Both read and write provider records have a minimal required schema as follows: } ``` -where: +Where: - `Protocol` is the multicodec name of the transfer protocol - `Schema` denotes the schema to use for encoding/decoding the record @@ -58,26 +62,28 @@ where: Specifications for some transfer protocols are provided in the "Transfer Protocols" section. 
- ## API + ### `GET /routing/v1/providers/{CID}` + - Response codes - `200`: the response body contains 0 or more records - `404`: must be returned if no matching records are found - `422`: request does not conform to schema or semantic constraints - Response Body -```json -{ - "Providers": [ - { - "Protocol": "", - "Schema": "", - ... - } - ] -} -``` - + + ```json + { + "Providers": [ + { + "Protocol": "", + "Schema": "", + ... + } + ] + } + ``` + - Default limit: 100 providers - Optional query parameters - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of transfer protocol names such as `transfer=bitswap,filecoin-graphsync-v1` @@ -92,13 +98,14 @@ APIs that return collections of results should support pagination as follows: - If there are more results, then a `NextPageToken` field should include an opaque string value, otherwise it should be undefined - The value of `NextPageToken` can be specified as the value of a `pageToken` query parameter to fetch the next page - - Character set is restricted to the regex `[a-zA-Z0-9-_.~]+`, since this is intended to be used in URLs + - Character set is restricted to the regular expression `[a-zA-Z0-9-_.~]+`, since this is intended to be used in URLs - The client continues this process until `NextPageToken` is undefined or doesn't care to continue - A `pageLimit` query parameter specifies the maximum size of a single page ### Implementation Notes -Servers are required to return *at most* `pageLimit` results in a page. It is recommended for pages to be as dense as possible, but it is acceptable for them to return any number of items in the closed interval [0, pageLimit]. This is dependent on the capabilities of the backing database implementation. For example, a query specifying a `transfer` filter for a rare transfer protocol should not *require* the server to perform a very expensive database query for a single request. 
Instead, this is left to the server implementation to decide based on the constraints of the database. +Servers are required to return *at most* `pageLimit` results in a page. It is recommended for pages to be as dense as possible, but it is acceptable for them to return any number of items in the closed interval [0, pageLimit]. This is dependent on the capabilities of the backing database implementation. +For example, a query specifying a `transfer` filter for a rare transfer protocol should not *require* the server to perform a very expensive database query for a single request. Instead, this is left to the server implementation to decide based on the constraints of the database. Implementations should encode into the token whatever information is necessary for fetching the next page. This could be a base32-encoded JSON object like `{"offset":3,"limit":10}`, an object ID of the last scanned item, etc. @@ -118,7 +125,7 @@ request to the endpoints defined in this specification, and read the received values. This means HTTP server implementing this API must (1) support [CORS preflight requests](https://developer.mozilla.org/en-US/docs/Glossary/Preflight_request) sent as HTTP OPTIONS, and (2) always respond with headers that remove CORS -limits, allowing every website to query the API for results: +limits, allowing every site to query the API for results: ```plaintext Access-Control-Allow-Origin: * @@ -126,13 +133,15 @@ Access-Control-Allow-Methods: GET, OPTIONS ``` ## Known Transfer Protocols + This section contains a non-exhaustive list of known transfer protocols (by name) that may be supported by clients and servers. ### Bitswap + Multicodec name: `transport-bitswap` Schema: `bitswap` -#### Read Provider Records +#### Bitswap Read Provider Records ```json { @@ -150,10 +159,11 @@ Schema: `bitswap` The server should respect a passed `transport` query parameter by filtering against the `Addrs` list. 
### Filecoin Graphsync + Multicodec name: `transport-graphsync-filecoinv1` Schema: `graphsync-filecoinv1` -#### Read Provider Records +#### Filecoin Graphsync Read Provider Records ```json { From 655b1f2fc37ed8eac3fc1ab2c3376e1907527003 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Tue, 24 Jan 2023 22:37:25 +0100 Subject: [PATCH 26/28] Rename 0000-delegated-routing-http-api.md to 0337-delegated-routing-http-api.md --- ...ted-routing-http-api.md => 0337-delegated-routing-http-api.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename IPIP/{0000-delegated-routing-http-api.md => 0337-delegated-routing-http-api.md} (100%) diff --git a/IPIP/0000-delegated-routing-http-api.md b/IPIP/0337-delegated-routing-http-api.md similarity index 100% rename from IPIP/0000-delegated-routing-http-api.md rename to IPIP/0337-delegated-routing-http-api.md From d3431895a4d0af5aea19c6b249d1a8b479ce1fd3 Mon Sep 17 00:00:00 2001 From: Gus Eggert Date: Thu, 2 Feb 2023 10:14:48 -0500 Subject: [PATCH 27/28] Remove pagination and transport & transfer filters These were not implemented so do not belong in this spec. If we implement them later, we can reintroduce them at that time. --- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 41 +++++++++-------------- 1 file changed, 16 insertions(+), 25 deletions(-) diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index 285d4fb9..f3cfada8 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -71,36 +71,25 @@ Specifications for some transfer protocols are provided in the "Transfer Protoco - `404`: must be returned if no matching records are found - `422`: request does not conform to schema or semantic constraints - Response Body - - ```json - { - "Providers": [ - { - "Protocol": "", - "Schema": "", - ... 
- } - ] - } - ``` - -- Default limit: 100 providers -- Optional query parameters - - `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of transfer protocol names such as `transfer=bitswap,filecoin-graphsync-v1` - - `transport` for provider records with multiaddrs, only return records with multiaddrs explicitly supporting the passed transport protocols, encoded as decimal multicodec codes such as `transport=460,478` (`/quic` and `/tls/ws` respectively) -- Implements pagination according to the Pagination section +```json +{ + "Providers": [ + { + "Protocol": "", + "Schema": "", + ... + } + ] +} +``` + +- Response limit: 100 providers Each object in the `Providers` list is a *read provider record*. ## Pagination -APIs that return collections of results should support pagination as follows: - -- If there are more results, then a `NextPageToken` field should include an opaque string value, otherwise it should be undefined -- The value of `NextPageToken` can be specified as the value of a `pageToken` query parameter to fetch the next page - - Character set is restricted to the regular expression `[a-zA-Z0-9-_.~]+`, since this is intended to be used in URLs -- The client continues this process until `NextPageToken` is undefined or doesn't care to continue -- A `pageLimit` query parameter specifies the maximum size of a single page +This API does not support pagination, but optional pagination can be added in a backwards-compatible spec update. 
### Implementation Notes @@ -140,6 +129,7 @@ This section contains a non-exhaustive list of known transfer protocols (by name Multicodec name: `transport-bitswap` Schema: `bitswap` +Specification: [ipfs/specs/BITSWAP.md](https://github.com/ipfs/specs/blob/main/BITSWAP.md) #### Bitswap Read Provider Records @@ -162,6 +152,7 @@ The server should respect a passed `transport` query parameter by filtering agai Multicodec name: `transport-graphsync-filecoinv1` Schema: `graphsync-filecoinv1` +Specification: [ipfs/go-graphsync/blob/main/docs/architecture.md](https://github.com/ipfs/go-graphsync/blob/main/docs/architecture.md) #### Filecoin Graphsync Read Provider Records From 573417e807ae2f487247670e2066ae03c3b34dc1 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Sat, 11 Feb 2023 01:20:16 +0100 Subject: [PATCH 28/28] ipip-337: final editorial changes --- IPIP/0337-delegated-routing-http-api.md | 16 +++++++----- routing/DELEGATED_CONTENT_ROUTING_HTTP.md | 32 +++++++++++------------ 2 files changed, 25 insertions(+), 23 deletions(-) diff --git a/IPIP/0337-delegated-routing-http-api.md b/IPIP/0337-delegated-routing-http-api.md index 2325a734..ab9df067 100644 --- a/IPIP/0337-delegated-routing-http-api.md +++ b/IPIP/0337-delegated-routing-http-api.md @@ -15,14 +15,14 @@ and supporting large content providers is a key strategy for driving down IPFS c These providers must handle high volumes of traffic and support many users, so leveraging industry-standard tools and services such as HTTP load balancers, CDNs, reverse proxies, etc. is a requirement. To maximize compatibility with standard tools, IPFS needs an HTTP API specification that uses standard HTTP idioms and payload encoding. 
-The [Reframe spec](https://github.com/ipfs/specs/blob/main/reframe/REFRAME_PROTOCOL.md) for delegated content routing is an experimental attempt at this, +The [Reframe spec](https://github.com/ipfs/specs/blob/main/reframe/REFRAME_PROTOCOL.md) for delegated content routing is an experimental attempt at this, but it has resulted in a very unidiomatic HTTP API which is difficult to implement and is incompatible with many existing tools. The cost of a proper redesign, implementation, and maintenance of Reframe and its implementation is too high relative to the urgency of having a delegated content routing HTTP API. Note that this does not supplant nor deprecate Reframe. Ideally in the future, Reframe and its implementation would receive the resources needed to map the IDL to idiomatic HTTP, and implementations of this spec could then be rewritten in the IDL, maintaining backwards compatibility. -We expect this API to be extended beyond "content routing" in the future, so additional IPIPs may rename this to something more general such as "Delegated Routing HTTP API". +We expect this API to be extended beyond "content routing" in the future, so additional IPIPs may rename this to something more general such as "Delegated Routing HTTP API". ## Detailed design @@ -62,7 +62,7 @@ So this API proposal makes the following changes: - The Delegated Content Routing API is defined using HTTP semantics, and can be implemented without introducing Reframe concepts nor IPLD - There is a clear distinction between the RPC protocol (HTTP) and the API (Deleged Content Routing) - "Method names" and cache-relevant parameters are pushed into the URL path -- Streaming support is removed, and default response size limits are added along with an optional `pageLimit` parameter for clients to specify response sizes +- Streaming support is removed, and default response size limits are added. 
- We will add streaming support in a subsequent IPIP, but we are trying to minimize the scope of this IPIP to what is immediately useful - Bodies are encoded using idiomatic JSON, instead of using IPLD codecs, and are compatible with OpenAPI specifications - The JSON uses human-readable string encodings of common data types @@ -84,13 +84,14 @@ and increasing data availability. #### Backwards Compatibility IPFS Stewards will implement this API in [go-delegated-routing](https://github.com/ipfs/go-delegated-routing), using breaking changes in a new minor version. -Because the existing Reframe spec can't be safely used in JavaScript and we won't be investing time and resources into changing the wire format implemented in edelweiss to fix it, -the experimental support for Reframe in Kubo will be deprecated in the next release and delegated content routing will subsequently use this HTTP API. +Because the existing Reframe spec can't be safely used in JavaScript and we won't be investing time and resources into changing the wire format implemented in edelweiss to fix it, +the experimental support for Reframe in Kubo will be deprecated in the next release and delegated content routing will subsequently use this HTTP API. We may decide to re-add Reframe support in the future once these issues have been resolved.- #### Forwards Compatibility Standard HTTP mechanisms for forward compatibility are used: + - The API is versioned using a version number prefix in the path - The `Accept` and `Content-Type` headers are used for content type negotiation, allowing for backwards-compatible additions of new MIME types, hypothetically such as: - `application/cbor` for binary-encoded responses @@ -104,8 +105,9 @@ As a proof-of-concept, the tests for the initial implementation of this HTTP API ### Security -- TODO: cover user privacy -- TODO: parsing best practices: what are limits (e.g., per message / field)? 
+- All CID requests are sent to a central HTTPS endpoint as plain text, with TLS being the only protection against third-party observation. +- While privacy is not a concern in the current version, plans are underway to add a separate endpoint that prioritizes lookup privacy. Follow the progress in related pre-work in [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) and [ipni#5 (reader privacy in indexers)](https://github.com/ipni/specs/pull/5). +- The usual JSON parsing rules apply. To prevent potential Denial of Service (DoS) attack, clients should ignore responses larger than 100 providers and introduce a byte size limit that is applicable to their use case. ### Alternatives diff --git a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md index f3cfada8..1f441230 100644 --- a/routing/DELEGATED_CONTENT_ROUTING_HTTP.md +++ b/routing/DELEGATED_CONTENT_ROUTING_HTTP.md @@ -1,6 +1,6 @@ # Delegated Content Routing HTTP API -![wip](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square) Delegated Content Routing HTTP API +![reliable](https://img.shields.io/badge/status-reliable-green.svg?style=flat-square) Delegated Content Routing HTTP API **Author(s)**: @@ -54,7 +54,7 @@ Both read and write provider records have a minimal required schema as follows: Where: -- `Protocol` is the multicodec name of the transfer protocol +- `Protocol` is the multicodec name of the transfer protocol or an opaque string (for experimenting with novel protocols without a multicodec) - `Schema` denotes the schema to use for encoding/decoding the record - This is separate from the `Protocol` to allow this HTTP API to evolve independently of the transfer protocol - Implementations should switch on this when parsing records, not on `Protocol` @@ -66,11 +66,14 @@ Specifications for some transfer protocols are provided in the "Transfer Protoco ### `GET /routing/v1/providers/{CID}` -- Response codes - - `200`: the response body 
contains 0 or more records - - `404`: must be returned if no matching records are found - - `422`: request does not conform to schema or semantic constraints -- Response Body +#### Response codes + +- `200` (OK): the response body contains 0 or more records +- `404` (Not Found): must be returned if no matching records are found +- `422` (Unprocessable Entity): request does not conform to schema or semantic constraints + +#### Response Body + ```json { "Providers": [ @@ -83,7 +86,7 @@ Specifications for some transfer protocols are provided in the "Transfer Protoco } ``` -- Response limit: 100 providers +Response limit: 100 providers Each object in the `Providers` list is a *read provider record*. @@ -91,18 +94,15 @@ Each object in the `Providers` list is a *read provider record*. This API does not support pagination, but optional pagination can be added in a backwards-compatible spec update. -### Implementation Notes - -Servers are required to return *at most* `pageLimit` results in a page. It is recommended for pages to be as dense as possible, but it is acceptable for them to return any number of items in the closed interval [0, pageLimit]. This is dependent on the capabilities of the backing database implementation. -For example, a query specifying a `transfer` filter for a rare transfer protocol should not *require* the server to perform a very expensive database query for a single request. Instead, this is left to the server implementation to decide based on the constraints of the database. +## Streaming -Implementations should encode into the token whatever information is necessary for fetching the next page. This could be a base32-encoded JSON object like `{"offset":3,"limit":10}`, an object ID of the last scanned item, etc. +This API does not currently support streaming, however it can be added in the future through a backwards-compatible update by using a content type other than `application/json`. 
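The response handling described for `GET /routing/v1/providers/{CID}` can be sketched from the client side as follows. This is a minimal illustration and not part of the patch. The names `parse_providers_response`, `group_by_schema`, and `MAX_BODY_BYTES` are hypothetical. The limit of 100 providers comes from the spec text, while the byte cap is an arbitrary placeholder, since the spec leaves that value to each client. Treating `404` as an empty result and raising an error for oversized responses are one possible reading of "ignore".

```python
import json

MAX_PROVIDERS = 100        # response limit stated in the spec
MAX_BODY_BYTES = 1 << 20   # hypothetical byte cap; the spec leaves this to the client


def parse_providers_response(status: int, body: bytes) -> list:
    """Return the read provider records from a providers response.

    `status` and `body` are whatever the HTTP client of your choice returned;
    no network access happens here.
    """
    if status == 404:
        return []  # no matching records were found
    if status != 200:
        raise ValueError(f"unexpected status code {status}")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("response body exceeds the client byte limit")

    doc = json.loads(body)
    if not isinstance(doc, dict):
        raise ValueError("response body is not a JSON object")

    providers = doc.get("Providers", [])
    if not isinstance(providers, list):
        raise ValueError("Providers is not a list")
    if len(providers) > MAX_PROVIDERS:
        # Assumption: reject the whole response rather than truncate it.
        raise ValueError("response contains more than 100 providers")
    return providers


def group_by_schema(records: list) -> dict:
    """Group records by `Schema`, since parsers should switch on it, not on `Protocol`."""
    grouped = {}
    for record in records:
        if not isinstance(record, dict):
            continue  # skip malformed entries
        grouped.setdefault(record.get("Schema", ""), []).append(record)
    return grouped
```

Unknown fields inside each record are left untouched, which matches the guidance to preserve unknown JSON fields until they are needed for business logic.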
## Error Codes -- `501`: must be returned if a method/path is not supported -- `429`: may be returned to indicate to the caller that it is issuing requests too quickly -- `400`: must be returned if an unknown path is requested +- `501` (Not Implemented): must be returned if a method/path is not supported +- `429` (Too Many Requests): may be returned along with optional [Retry-After](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After) header to indicate to the caller that it is issuing requests too quickly +- `400` (Bad Request): must be returned if an unknown path is requested ## CORS and Web Browsers