IPIP-378: Delegated Routing HTTP POST API #378

Open
wants to merge 19 commits into base: main
Changes from 8 commits
75 changes: 75 additions & 0 deletions src/ipips/ipip-0378.md
@@ -0,0 +1,75 @@
---
title: "IPIP-0378: Delegated Content Routing HTTP Provide Records API"
date: 2023-02-14
ipip: ratified
editors:
- name: Masih H. Derkani
github: masih
- name: Marcin Rataj
github: lidel
url: https://lidel.org/
relatedIssues:
- https://github.com/ipfs/specs/pull/378
order: 389
tags: ['ipips']
---

## Summary

This IPIP extends the [IPIP-337 HTTP Delegated Routing API](0337-delegated-routing-http-api.md) to provide records over `PUT` requests.

The work here was originally proposed as part of IPIP-337 and was later separated into its own IPIP to reduce the scope of the original work while enabling iterative release of the HTTP delegated routing APIs.

## Motivation

IPFS interaction with the DHT includes both read and write operations.
A user can provide records, advertising the presence of content, as well as look up providers for a given CID.
The specification proposed by [IPIP-337](0337-delegated-routing-http-api.md) offers idiomatic first-class support for offloading the lookup portion of this interaction onto other processes and/or servers.
Following the same motivations that inspired [IPIP-337](0337-delegated-routing-http-api.md), this document expands the HTTP APIs to also
offload the ability to provide records onto a third-party system.

## Detailed design

The API extensions are added to the [Delegated Content Routing HTTP API spec/`PUT`](../routing/DELEGATED_CONTENT_ROUTING_HTTP.md#put-routingv1providers) section, along with complementary sections that outline known formats followed by example payloads.

## Design rationale

The rationale for the design of `PUT` operations closely follows the reasoning listed in [IPIP-337](0337-delegated-routing-http-api.md#design-rationale).
The design uses a human-readable request/response structure with extensibility in mind.
The specification imposes no restrictions on the schema nor the protocol advertised in provider records.
The hope is that such extensibility will encourage and inspire innovation for better transfer protocols.
In order to reduce barrier for adoption, the existing

### User benefit

Expanding on the user benefits listed as part of [IPIP-337](0337-delegated-routing-http-api.md#user-benefit): in the context of content routing, write operations are typically more expensive than read operations. They involve bookkeeping such as TTL, gossip propagation, etc.
Therefore, it is highly desirable to reduce the burden of advertising provider records onto the network by means of delegation through simple-to-use HTTP APIs.

### Compatibility

#### Backwards Compatibility

##### DHT

The `PUT` APIs proposed here require a new data format for specifying provider records.
Since the records must include a valid signature, records published through HTTP delegated routing must be re-signed.

##### Reframe

See [IPIP-337/Backwards Compatibility](0337-delegated-routing-http-api.md#backwards-compatibility).

#### Forwards Compatibility
Member

@lidel lidel Feb 11, 2024

TODO

As this IPIP is right now, the routing announcement signature is used only for the remote HTTP server to decide if announcement should be accepted or not. This blocks announcing with PeerID that you don't own.

However, when user asks for providers, they won't get original payload+signature back, and won't be able to verify provider records themselves.

We don't have to solve or require it in this IPIP, but we should add a section to "Forwards Compatibility" explaining if/how the signature type introduced in this IPIP could be stored and returned to end clients when someone wants that functionality.

If we don't write it down, we will be having multiple competing conventions, because people who want end-to-end routing integrity will invent something.


Initial thought on this is that we already have a concept of opaque metadata fields for opaque protocols in Peer Schema.

If we reuse it here, this section could hint at doing something like:

{
  "Schema": "peer",
  "ID": "bafz...",
  "Addrs": ["/ip4/..."],
  "Protocols": ["transport-bitswap", "signed-routing-v1"],
  "signed-routing-v1": {
    "Payload": "[mbase64-dag-cbor-blob]", 
    "Signature": "[mbase64-blob]"
  }
}

Any concerns @masih @willscott @hacdias ?

This is just for the "Forwards Compatibility" section; don't expect this to be implemented by cid.contact any time soon (since storing payload and signature will add cost), but if we have it, we could also implement it in someguy, allowing people end-to-end integrity when they self-host their own router (e.g. in smaller private swarms).

Member Author

No immediate concern on my part 👍


See [IPIP-337/Forwards Compatibility](0337-delegated-routing-http-api.md#forwards-compatibility).

### Security

See [IPIP-337/Security](0337-delegated-routing-http-api.md#security).

### Alternatives

- Reframe (general-purpose RPC) was evaluated; see the "Design rationale" section for why it was not selected.

### Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
169 changes: 163 additions & 6 deletions src/routing/http-routing-v1.md
@@ -61,17 +61,17 @@ This API uses a standard version prefix in the path, such as `/v1/...`. If a bac

### `GET /routing/v1/providers/{cid}`

#### Path Parameters
#### `GET` Path Parameters

- `cid` is the [CID](https://github.com/multiformats/cid) to fetch provider records for.

#### Response Status Codes
#### `GET` Response Status Codes

- `200` (OK): the response body contains 0 or more records.
- `404` (Not Found): must be returned if no matching records are found.
- `422` (Unprocessable Entity): request does not conform to schema or semantic constraints.

#### Response Body
#### `GET` Response Body

```json
{
@@ -93,6 +93,56 @@ The client SHOULD be able to make a request with `Accept: application/x-ndjson`

Each object in the `Providers` list is a record conforming to a schema, usually the [Peer Schema](#peer-schema).

### `PUT /routing/v1/providers`

#### `PUT` Response codes

- `200` (OK): the server processed the full list of provider records (possibly unsuccessfully, depending on the semantics of the particular records)
- `400` (Bad Request): the server deems the request to be invalid and cannot process it
- `422` (Unprocessable Entity): request does not conform to schema or semantic constraints
- `501` (Not Implemented): the server does not support providing records
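
As an illustration of how a server might choose between these codes, here is a minimal Python sketch. The function name `put_providers_status` and the exact split between `400` (body is not parseable JSON) and `422` (JSON parses but does not match the expected shape) are assumptions made for this example; the draft does not prescribe them.

```python
import json


def put_providers_status(raw_body: bytes, provide_supported: bool = True) -> int:
    """Pick a status code for a PUT /routing/v1/providers body (illustrative only)."""
    if not provide_supported:
        return 501  # server does not support providing records
    try:
        body = json.loads(raw_body)
    except ValueError:
        # Covers both invalid JSON and invalid text encoding.
        return 400
    providers = body.get("Providers") if isinstance(body, dict) else None
    if not isinstance(providers, list):
        return 422  # valid JSON, but not the expected request shape
    for record in providers:
        # Every record is expected to name its schema, e.g. "announcement".
        if not isinstance(record, dict) or not isinstance(record.get("Schema"), str):
            return 422
    return 200  # the full list was processed; per-record outcomes go in the body
```

A real server would also verify signatures and apply its own policy before answering `200`.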

#### `PUT` Request Body

```json
{
"Providers": [
{
"Schema": "announcement",
...
}
]
}
```


Each object in the `Providers` list is a *write provider record* entry.

Server SHOULD accept representing writes is [Announcement Schema](#announcement-schema).
Member

Suggested change
Server SHOULD accept representing writes is [Announcement Schema](#announcement-schema).
Server SHOULD accept representing writes as [Announcement Schema](#announcement-schema).


:::warn

TODO: is below a sensible limit?

There SHOULD be no more than 100 `Providers` per request.
Contributor

this distinction implies we won't want to use this same protocol to sync overall routing tables between nodes - if there are more than 100 known providers for data, would we expect a different protocol to be used?

Member

@lidel lidel Dec 4, 2023

We could:


:::
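
A client that has more records than fit in one request could split them into several request bodies. The sketch below assumes the tentative limit of 100 `Providers` per request (still marked TODO above); `batch_request_bodies` and `MAX_PROVIDERS_PER_REQUEST` are hypothetical names introduced for illustration.

```python
import json
from typing import Iterator

# Tentative limit taken from the draft text; it is still an open TODO.
MAX_PROVIDERS_PER_REQUEST = 100


def batch_request_bodies(records: list, limit: int = MAX_PROVIDERS_PER_REQUEST) -> Iterator[str]:
    """Yield JSON request bodies, each holding at most `limit` records."""
    if limit < 1:
        raise ValueError("limit must be positive")
    for start in range(0, len(records), limit):
        # Each body has the same top-level shape as the PUT request body.
        yield json.dumps({"Providers": records[start:start + limit]})
```

For 250 records this yields three bodies carrying 100, 100 and 50 records.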

#### `PUT` Response Body

```json
{
"ProvideResults": [
{ ... }
]
}
```

- `ProvideResults` is a list of results in the same order as the `Providers` in the request, and the schema of each object is determined by the `Protocol` of the corresponding write object
- This may contain output information such as TTLs, errors, etc.
- It is undefined whether the server will allow partial results <!-- TODO: maybe add Error field to `announcement` schema and use it in responses ? -->
- The work for processing each provider record should be idempotent so that it can be retried without excessive cost in the case of full or partial failure of the request
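
Because `ProvideResults` follows the order of `Providers`, a client can match results to records by index and retry only what failed. This is a sketch under assumptions: the `Error` key is hypothetical (the draft only floats it in a TODO), and treating a length mismatch as an error is one possible reading, since partial results are left undefined.

```python
def pair_results(providers: list, response: dict) -> list:
    """Pair each submitted record with the result at the same index."""
    results = response.get("ProvideResults")
    if not isinstance(results, list) or len(results) != len(providers):
        # Assumption: without a one-to-one mapping the client cannot tell
        # which record a result belongs to, so the whole request is retried.
        raise ValueError("ProvideResults does not match Providers one-to-one")
    return list(zip(providers, results))


def records_to_retry(pairs: list) -> list:
    """Select records whose result carries the hypothetical `Error` key."""
    return [record for record, result in pairs
            if isinstance(result, dict) and "Error" in result]
```

Retrying is safe only because processing each record is expected to be idempotent.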

## Peer Routing API

### `GET /routing/v1/peers/{peer-id}`
@@ -114,10 +164,10 @@ represented as a CIDv1 encoded with `libp2p-key` codec.
{
"Peers": [
{
"Schema": "<schema>",
"Protocols": ["<protocol-a>", "<protocol-b>", ...],
"Schema": "peer",
"ID": "bafz...",
"Addrs": ["/ip4/..."],
"Protocols": ["<protocol-a>", "<protocol-b>", ...],
...
},
...
@@ -131,6 +181,34 @@ The client SHOULD be able to make a request with `Accept: application/x-ndjson`

Each object in the `Peers` list is a record conforming to the [Peer Schema](#peer-schema).

### `PUT /routing/v1/peers`

#### `PUT` Response codes

- `200` (OK): the server processed the full list of provider records (possibly unsuccessfully, depending on the semantics of the particular records)
- `400` (Bad Request): the server deems the request to be invalid and cannot process it
- `422` (Unprocessable Entity): request does not conform to schema or semantic constraints
- `501` (Not Implemented): the server does not support providing records

#### `PUT` Request Body

```json
{
"Providers": [
{
"Schema": "announcement",
...
}
]
}
```


Each object in the `Providers` list is a *write provider record* entry.

Server SHOULD accept writes represented with [Announcement Schema](#announcement-schema)
objects without the `CID` list.

## IPNS API

### `GET /routing/v1/ipns/{name}`
@@ -225,7 +303,7 @@ limits, allowing every site to query the API for results:

```plaintext
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, OPTIONS
Access-Control-Allow-Methods: GET, PUT, OPTIONS
```

## Known Schemas
@@ -277,6 +355,81 @@ the case, the field MUST be ignored.

:::

### Announcement Schema

The `announcement` schema can be used in `PUT` operations to announce content providers or peer routing information.


```json
{
"Schema": "announcement",
"Payload": {
"CID": ["cid1", "cid2", ...],
"Scope": "block",
"Timestamp": "YYYY-MM-DDT23:59:59Z",
"TTL": 0,
"ID": "12D3K...",
"Addrs": ["/ip4/...", ...],
"Protocols": ["foo", ...],
"Metadata": "mbase64-blob"
},
"Signature": "mbase64-signature"
}
```
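
A client could assemble the `Payload` object along the following lines. This is only a sketch: `build_payload` and its parameters are hypothetical, the RFC 3339 timestamp is rendered in UTC with second precision as one valid choice, and empty fields are omitted following the review discussion in this thread rather than any normative text.

```python
from datetime import datetime, timezone


def build_payload(peer_id, cids=None, addrs=None, protocols=None,
                  ttl_ms=None, scope=None, now=None):
    """Assemble an announcement Payload, leaving out fields with no value."""
    # `now` must be timezone-aware; it defaults to the current UTC time.
    now = now or datetime.now(timezone.utc)
    candidate = {
        "CID": list(cids or []),
        "Scope": scope,
        "Timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "TTL": ttl_ms,  # milliseconds; 0 is kept and means "use server default"
        "ID": peer_id,
        "Addrs": list(addrs or []),
        "Protocols": list(protocols or []),
    }
    # Drop None and empty lists so defaults are never ambiguous on the wire.
    return {key: value for key, value in candidate.items()
            if value is not None and value != []}
```

Omitting `cids` produces a payload without `CID`, which matches how the schema is used for `PUT /routing/v1/peers`.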

- `Schema`: tells the server to interpret the JSON object as an announcement record
- `Payload`: is a DAG-JSON-compatible object with a subset of the below fields
- `CID` is a list of the CIDs being provided.
- Skipped when used for `PUT /routing/v1/peers`
- `Scope` is an optional hint that provides semantic meaning about announced identifiers:
- `block` announces only the individual blocks (implicit default if `Scope` is missing).
- `entity` announces enumerable entities behind the CIDs (e.g.: all blocks for UnixFS file or a minimum set of blocks to enumerate HAMT-sharded UnixFS directory).
- `recursive` announces entire DAGs behind the CIDs (e.g.: entire DAG-CBOR DAG, or everything in UnixFS directory, including all files in all subdirectories).
- `Timestamp` is the current time, formatted as an ASCII string that follows notation from [rfc3339](https://specs.ipfs.tech/ipns/ipns-record/#ref-rfc3339).
- `TTL` is a caching and expiration hint informing the server how long to keep the record available, specified in milliseconds.
- If this value is unknown, the caller may skip this field, or use a value of 0. The server's default will be used.
- `ID` is the Peer ID of the node that provides the content and also indicates the `libp2p-key` that SHOULD be used for verifying the `Signature` field.
- `Addrs` is a list of string-encoded multiaddrs without the `/p2p/peerID` suffix.
- `Protocols` is a list of protocols supported by `ID` and/or `Addrs`, if known upfront.
- `Metadata` is a string with multibase-encoded binary metadata that should be passed as-is.
- `Signature` is a string with multibase-encoded binary signature that provides integrity and authenticity of the `Payload` field.
- Signature is created by following below steps:
1. Convert `Payload` to deterministic, ordered [DAG-JSON](https://ipld.io/specs/codecs/dag-json/spec/) map notation
2. Prefix the DAG-JSON bytes with ASCII string `PUT /routing/v1 announcement:`
3. Sign the bytes with the private key of the Peer ID specified in the `Payload.ID`.
- Signing details for specific key types should follow [libp2p/peerid specs](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#key-types), unless stated otherwise.
Member

By normalizing Payload to DAG-JSON here, as a step in signature generation, we no longer have to have Payload sent as base64-encoded string, and can have human-readable JSON object instead.

I think this is a net positive change, as it makes debugging/tracing easier, payload is no longer obfuscated on the wire.

@willscott @masih @hacdias – any concerns with this change?

Member Author

No concerns other than the usual pitfalls of verifying signature of JSON object, which a reference implementation should clarify.

Contributor

looks reasonable to me

Member

sounds fine

Member

@lidel what do we do with empty fields? Are they not set? Are they set as null? Are they set as empty value (depending on their value, e.g., empty array)?

Member

Empty fields should not be present, just to remove ambiguity on the default value.

Member

@lidel lidel Jan 20, 2024

I realized going with DAG-JSON introduces inconsistency across specs.
IPNS Records do the same signature dance but normalize key-value map with DAG-CBOR. If we can, we should do the same here.

Let's switch to DAG-CBOR, this way there is no ambiguity, implementers don't try to be clever and reuse JSON strings, and signature does not require JSON representation in non-JSON context.

Suggested change
- `Signature` is a string with multibase-encoded binary signature that provides integrity and authenticity of the `Payload` field.
- Signature is created by following below steps:
1. Convert `Payload` to deterministic, ordered [DAG-JSON](https://ipld.io/specs/codecs/dag-json/spec/) map notation
2. Prefix the DAG-JSON bytes with ASCII string `PUT /routing/v1 announcement:`
3. Sign the bytes with the private key of the Peer ID specified in the `Payload.ID`.
- Signing details for specific key types should follow [libp2p/peerid specs](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#key-types), unless stated otherwise.
- `Signature` is a string with multibase-encoded binary signature that provides integrity and authenticity of the `Payload` field.
- Signature is created by following below steps:
1. Convert `Payload` to deterministic, ordered [DAG-CBOR](https://ipld.io/specs/codecs/dag-cbor/spec/) map notation
- Intention here is to use similar signature normalization as with DAG-CBOR `Data` field in IPNS Records, allowing for partial code and dependency reuse.
2. Prefix the DAG-CBOR bytes with ASCII string `routing-record:` to avoid signature reuse attacks.
3. Sign the bytes with the private key of the Peer ID specified in the `Payload.ID`.
- Signing details for specific key types should follow [libp2p/peerid specs](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#key-types), unless stated otherwise.

Member

Switched to DAG-CBOR in 05dceba

- Client SHOULD sign every announcement.
- Servers SHOULD verify signature before accepting a record, unless running in a trusted environment.
- ED25519 and other small public keys MUST be inlined inside of the `ID` field with the identity multihash type.
- Key types that exceed 42 bytes (e.g. RSA) SHOULD NOT be inlined, the `ID` field should only include the multihash of the key. The key itself SHOULD be obtained out-of-band (e.g. by fetching the block via IPFS) and cached.
If support for big keys is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme.
- A [400 Bad Request](https://httpwg.org/specs/rfc9110.html#status.400) response code SHOULD be returned if the `Signature` check fails.
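
The signing steps listed in this revision (DAG-JSON plus the `PUT /routing/v1 announcement:` prefix) can be sketched as below. Several things are assumptions: sorted-key, whitespace-free JSON only approximates DAG-JSON and is adequate only for plain string, integer and list values; the `sign` callable stands in for a real libp2p key; and `multibase_base64` assumes the `m` (unpadded base64) multibase prefix suggested by the `mbase64` placeholders. The review thread later moves to DAG-CBOR with a different prefix, which this sketch does not cover.

```python
import base64
import json
from typing import Callable

# Prefix from step 2 of the signing procedure in this revision of the draft.
SIGNATURE_PREFIX = b"PUT /routing/v1 announcement:"


def signing_bytes(payload: dict) -> bytes:
    """Return the bytes to sign: prefix + deterministic JSON of the payload."""
    # Approximation of DAG-JSON: keys sorted, no insignificant whitespace.
    canonical = json.dumps(payload, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return SIGNATURE_PREFIX + canonical.encode("utf-8")


def multibase_base64(data: bytes) -> str:
    """Encode bytes as multibase base64 ('m' prefix, padding removed)."""
    return "m" + base64.b64encode(data).decode("ascii").rstrip("=")


def sign_announcement(payload: dict, sign: Callable[[bytes], bytes]) -> dict:
    """Wrap a payload and its signature in an announcement object."""
    # `sign` is supplied by the caller and must use the key behind Payload.ID.
    signature = sign(signing_bytes(payload))
    return {"Schema": "announcement", "Payload": payload,
            "Signature": multibase_base64(signature)}
```

A verifier would rebuild the same bytes from the received `Payload` and check them against the public key derived from `Payload.ID`, which is why both sides must agree on the exact serialization.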

:::warn

TODO: what should be the limits? Max number of CIDs per `announcement` ?

:::

#### Use in PUT responses

Server MAY return additional TTL information if the TTL is not provided in the request,
or if server policy is to provide a TTL different from the requested one.

```json
{
"Schema": "announcement",
"Payload": {
"TTL": 17280000
}
}
```
Member

nit: should we create a separate announcement-response schema, or using announcement is ok? @hacdias what is a smaller headache for implementers?

Member

@lidel I think it is fine to use the same one, and mention that when getting it as response, there may be an Error field too for https://github.com/ipfs/specs/pull/378/files#r1418770825.

Member

Ack. Maybe overthinking this IPIP, but I am trying to keep some future doors open.

This IPIP is an opportunity to create a signed routing record convention that could be reused outside of HTTP API. For example, Amino DHT could start accepting signed announcements from this spec in addition to routing puts where PeerID matched one from libp2p connection, allowing light clients to delegate DHT puts.

I think with switch to DAG-CBOR done in 05dceba it is generic enough to use it for both (a signed DAG-CBOR map similar to IPNS).

But for errors, it felt like mixing abstractions, so I've added explicit error schema in ccbc085.


- `TTL` in response is the time at which the server expects itself to drop the record
- If less than the `TTL` in the request, then the client SHOULD repeat announcement earlier, before the announcement TTL expires and is forgotten by the routing system
- If greater than the `TTL` in the request, then the server client SHOULD save resources and not repeat announcement until the announcement TTL expires and is forgotten by the routing system
Member

Suggested change
- If greater than the `TTL` in the request, then the server client SHOULD save resources and not repeat announcement until the announcement TTL expires and is forgotten by the routing system
- If greater than the `TTL` in the request, then the client SHOULD save resources and not repeat announcement until the announcement TTL expires and is forgotten by the routing system

- If `0`, the server makes no claims about the lifetime of the record
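
The TTL rules above can be read as a small client-side scheduling decision. The sketch below treats both TTL values as durations in milliseconds, which is an assumption, and re-announces slightly before expiry; `next_announce_delay_ms` and the 90% margin are hypothetical choices for illustration.

```python
from typing import Optional


def next_announce_delay_ms(requested_ttl_ms: int, response_ttl_ms: int,
                           margin_percent: int = 90) -> Optional[int]:
    """Return how long to wait before re-announcing, or None if unknown."""
    # A response TTL of 0 means the server makes no claim about lifetime,
    # so fall back to what the client asked for.
    effective = response_ttl_ms if response_ttl_ms > 0 else requested_ttl_ms
    if effective <= 0:
        return None  # neither side specified a lifetime; caller picks a default
    # Shorter server TTL: repeat sooner. Longer server TTL: wait longer.
    return effective * margin_percent // 100
```

With a requested TTL of 1000 ms, a response TTL of 500 ms yields a 450 ms delay, while a response TTL of 2000 ms yields 1800 ms.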

### Legacy Schemas

Legacy schemas include `ID` and optional `Addrs` list just like
@@ -318,6 +471,10 @@ libp2p protocol.
}
```

#### Filecoin Graphsync Write Provider Records

There is currently no specified schema.

[multibase]: https://github.com/multiformats/multibase
[CIDv1]: https://github.com/multiformats/cid#cidv1
[multiaddr]: https://github.com/multiformats/multiaddr#specification