Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

routing api: clarify string in the protocol field #403

Closed
wants to merge 3 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 6 additions & 3 deletions routing/ROUTING_V1_HTTP.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,10 +54,13 @@ Both read and write provider records have a minimal required schema as follows:

Where:

- `Protocol` is the multicodec name of the transfer protocol or an opaque string (for experimenting with novel protocols without a multicodec)
- `Protocol` is a globally unique opaque string that represents the transfer protocol used by a content provider.
Copy link
Member

@SgtPooki SgtPooki May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

globally unique and opaque are somewhat conflicting, aren't they? If it's opaque, how do we guarantee that it's globally unique? maybe i'm misunderstanding how opaque applies here, but maybe opaque is a "golang-esque" terminology that shouldn't be in the spec?

side note: I think we over(mis)use the term "opaque"

- If a multicodec has been registered for the protocol, it can be referred to by either its `name` or `code` in `0xHEX` notation. Using the `code` is preferred since it is what is sent on the wire and won't break if the plain text `name` label is ever changed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly out of energy to care about what we name things, however a couple things I'd like to call out in relation to the fact that this discussion has been had multiple times in the last year alone and it's not obvious to me what's changed/changing.

  • The protocol in Routing V1 IIRC was initially using the code in the table, but that was scrapped because people wanted prettier names
  • When the protocol in Routing V1 was based on the code it was using decimal rather than hex since that's more native to JSON and saves on bytes
  • At the time that it was decimal it was also brought up that the same argument could be made about multiaddrs which are represented in Routing V1 as text rather than as representations of the binary encoding

So high level questions I'd like to see answered for when someone proposes RoutingV2 which ends up revisiting this again:

  1. Why are codes preferred over text?
  2. Why a string hex representation rather than something else like decimal?
  3. Why 0x rather than multibase and f?
  4. Why use text for multiaddrs but not data transfer?

To be clear, I don't feel strongly about the answers to any of these and I think there are viable answers for all of them. However, for when this inevitably ends up revisited in the future (we've done it at least twice in the last year or so) I'd like to have a comment I can point at where someone can respond with "we are doing it differently because in retrospect we disagree with the logic around decision X".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are codes preferred over text?

I think text names have been changed at least one time before (I think for json and dag-json, or something like that) and it caused a painful migration. Names can change, while the codes are static.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Names can change, while the codes are static.

I mentioned this above already, but if the idea is that names can change why are we using text for multiaddrs? I'm not insisting we have to be consistent here and use text for both or codes for both protocol identifiers and multiaddrs, but this will come up again and for those in support of the change I'd like the reasoning documented which it currently is not.

- For protocols without a dedicated multicodec, it is recommended to use a globally unique enough identifier to prevent collisions.
- `Schema` denotes the schema to use for encoding/decoding the record
- This is separate from the `Protocol` to allow this HTTP API to evolve independently of the transfer protocol
- Implementations should switch on this when parsing records, not on `Protocol`
- Protocols without an explicit schema or multiple versions, can use the same opaque string as in `Protocol`
- `...` denotes opaque JSON, which may contain information specific to the transfer protocol

Specifications for some transfer protocols are provided in the "Transfer Protocols" section.
Expand Down Expand Up @@ -127,7 +130,7 @@ This section contains a non-exhaustive list of known transfer protocols (by name

### Bitswap

Multicodec name: `transport-bitswap`
Protocol: `transport-bitswap` multicodec name or code in hex `0x0900`
Schema: `bitswap`
Specification: [ipfs/specs/BITSWAP.md](https://github.com/ipfs/specs/blob/main/BITSWAP.md)

Expand All @@ -150,7 +153,7 @@ The server should respect a passed `transport` query parameter by filtering agai

### Filecoin Graphsync

Multicodec name: `transport-graphsync-filecoinv1`
Protocol: `transport-graphsync-filecoinv1` multicodec name or code in hex `0x0910`
Schema: `graphsync-filecoinv1`
Specification: [ipfs/go-graphsync/blob/main/docs/architecture.md](https://github.com/ipfs/go-graphsync/blob/main/docs/architecture.md)

Expand Down