Add radicle namespace #404
Since the hash that we intend to encode in the CID is of a Git object, I wonder if it makes sense for us to use Line 48 in 45c88b8
I am not sure how far semantics matter in this case. Sure, both things seem to capture SHA-1 and SHA-256 hashes, with the intention of identifying a Git object. But what about the intent/semantics of wanting to address a repository? This differs from the intent to address an object. Maybe someone can weigh in on how to balance such semantic differences. For example, I have relatively good knowledge about what it would mean to address a Radicle repository (this currently happens, just using different syntax). But TBH I have little clue about how the
This PR reminded me of the discussion at #203 about adding a custom code, although things are really Git. With regard to what IPLD codecs are, see my comment at #204 (comment).
Thanks @vmx for the pointers. The consensus, based on my reading of #203 and #204, is that the multicodec describes how to decode bytes, not what they semantically address. A Radicle Repository Identifier is currently base58btc(Git commit object ID of the identity payload). The bytes that get hashed are a standard Git blob. The JSON inside carries the Radicle-specific structure. No new binary format. Alas,
silverpine, multiaddr, 0x3f42, draft, Experimental QUIC over yggdrasil and ironwood routing protocol
sm3-256, multihash, 0x534d, draft,
sha256a, hash, 0x7012, draft, The sum of multiple sha2-256 hashes; as specified by Ceramic CIP-124.
radicle-git, namespace, 0x8776, draft, Radicle Git Repository Identifier
This PR feels like a good opportunity to maybe write down policy in README?
I agree with the prior art: for the bytes a RID addresses (a Git blob from git hash-object), git-raw (0x78, permanent) already decodes them. Per #204, that is what the CID codec field describes, so a Radicle-specific data codec seems unnecessary(?) as it duplicates git-raw.
iiuc what Radicle attempts to add here is a namespace, which feels fine and less controversial if it's framed as "namespace where git and other identifiers are used".
ipfs, ipns, swarm, already work this way: the namespace tags the system, the inner CID picks its own codec (for example, ipfs namespace uses CIDs with dag-pb, raw, dag-cbor and other byte codecs internally).
/radicle/<cid> is the same shape, with the inner CID being cid(git-raw, multihash(<algo>, <hash>)). The Git 3 transition then needs no new codec: RIDs become cid(git-raw, multihash(sha2-256, ...)) and the radicle namespace stays put. The hash algorithm is signaled by the inner multihash, not the namespace.
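To make the cid(git-raw, multihash(<algo>, <hash>)) shape concrete, here is a minimal Python sketch, not Radicle's implementation. It assumes the hashed object is a Git blob (as `git hash-object` would compute it) and uses the table codes git-raw 0x78, sha1 0x11 and sha2-256 0x12; the document bytes and all function names are made up for illustration.

```python
import hashlib

CID_V1 = 0x01
GIT_RAW = 0x78    # git-raw codec from the multicodec table
SHA1 = 0x11       # multihash code for sha1
SHA2_256 = 0x12   # multihash code for sha2-256


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def git_blob_id(content: bytes, algo: str) -> bytes:
    """Object ID of a Git blob: hash over the 'blob <len>\\0' header plus content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.new(algo, header + content).digest()


def multihash(code: int, digest: bytes) -> bytes:
    """<hash code><digest length><digest>, both integers as varints."""
    return uvarint(code) + uvarint(len(digest)) + digest


def inner_cid(hash_code: int, digest: bytes) -> bytes:
    """CIDv1 whose codec is git-raw; the hash algorithm lives in the multihash."""
    return uvarint(CID_V1) + uvarint(GIT_RAW) + multihash(hash_code, digest)


# Hypothetical placeholder; a real identity document has Radicle-specific fields.
doc = b'{"example": "identity document"}'

cid_sha1 = inner_cid(SHA1, git_blob_id(doc, "sha1"))
cid_sha256 = inner_cid(SHA2_256, git_blob_id(doc, "sha256"))

print(cid_sha1[:4].hex())    # 01781114
print(cid_sha256[:4].hex())  # 01781220
```

Only the multihash header differs between the two CIDs (0x11 0x14 versus 0x12 0x20); the version and codec bytes stay the same, which is the point about the hash algorithm being signaled by the inner multihash.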
So maybe just clarify that radicle namespace proposed in this PR is the RIP-2? (or similar, having some spec document referenced removes ambiguity):
radicle-git, namespace, 0x8776, draft, Radicle Git Repository Identifier
radicle, namespace, 0x8776, draft, Radicle namespace; payload is a Git object (git-raw 0x78), see RIP-2
ps. Would be good to link to RIP-2 in the PR body, but other than that it feels sensible + Radicle has been around for a long time.
ps2. for the unconvinced, prior art: ipfs, ipns, swarm, streamid, lbry, plus recent merges adnl (#402), Massa (#396), shelter-* (#369).
Sorry about the naive question, but what are namespaces actually for? Feel free to link to relevant resources
The Git 3 transition then needs no new codec: RIDs become cid(git-raw, multihash(sha2-256, ...))
This is understood, yes.
So maybe just clarify that radicle namespace proposed in this PR is the RIP-2? (or similar, [...]
Similar. We would likely have a new RIP that amends RIP 2. RIP 2 talks about the "20 plain bytes" encoding. The question is what comes first: The RIP or the entry in this table? Since this table seems to allow "draft" entries, I figured that it might be acceptable to add to the table first, then write the RIP which can safely mention the codec that is allocated, and in another PR update the line to refer to that shiny new RIP. If Radicle does not deliver the RIP and the codec sees no adoption, since the entry is not "permanent", it could also be removed.
Regarding your suggestion: Radicle might in the future work on other VCSes, not only Git. With the concept of a namespace, it seems that we would have more control. In that case I would drop the talk about Git and git-raw from the listing, and instead define the rest in the RIP. Or go the other way and name the namespace radicle-git.
You can see that I do not really know what these namespaces mean, so I would also appreciate guidance here.
No worries, I am also tainted by too much history, so my understanding could be outdated, but here it is: a namespace tags which system an identifier belongs to, used either as a path prefix or as a binary prefix in wire formats. IIRC there was a time at Protocol Labs where we wanted to have a Plan9-like future, where all systems/protocols lived under one /, so the meta-namespace is path (0x2f, the / byte) at table.csv:29; and then specific system namespaces (ipfs, ipns etc) nest under it.
Concrete usage:
- Paths (mostly IPFS, mostly historical): /ipfs/<cid>/foo, /ipns/k51.../foo, /swarm/<addr>/.... Intention was to unify gateway URLs, routing keys, provider records etc.
- ENS contenthash (iiuc the main user of "namespaces" today?) (ERC-1577, ENSIP-7): binary form <protoCode uvarint><CIDv1>, where protoCode is ipfs-ns (0xe3), ipns-ns (0xe5), or swarm-ns (0xe4). For example, an ENS record that resolves to IPFS encodes as 0xe3 || 0x01 0x70 0x12 0x20 <32-byte hash> (ipfs namespace + CIDv1 of dag-pb + sha2-256). The namespace byte tells the resolver which network to dispatch to; the inner CID picks its own codec.
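A rough Python sketch of the ENS contenthash layout described in the second bullet, for illustration only. The function names and the all-zero digest are placeholders, not taken from any ENS library. One detail worth seeing in bytes: the namespace code is written as an unsigned varint, so ipfs-ns (0xe3) takes two bytes on the wire, 0xe3 0x01.

```python
IPFS_NS = 0xE3    # ipfs-ns namespace code
CID_V1 = 0x01
DAG_PB = 0x70     # dag-pb codec
SHA2_256 = 0x12   # multihash code for sha2-256


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def contenthash(namespace: int, codec: int, hash_code: int, digest: bytes) -> bytes:
    """<namespace uvarint><CIDv1>, where CIDv1 = version, codec, multihash."""
    cid = (
        uvarint(CID_V1)
        + uvarint(codec)
        + uvarint(hash_code)
        + uvarint(len(digest))
        + digest
    )
    return uvarint(namespace) + cid


placeholder_digest = bytes(32)  # stand-in for a real 32-byte sha2-256 digest
record = contenthash(IPFS_NS, DAG_PB, SHA2_256, placeholder_digest)

print(record[:6].hex())  # e30101701220
```

A resolver only needs to read the leading varint to decide which network to hand the rest of the bytes to.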
As for more historical references, there is old multiformats/multiformats#55, which aimed to formalize <namespace-varint><body> as the wire pattern.
Loosely quoting Stebalien on this exact point in #204 (comment):
Within [system], use CIDs. Outside of [system], use namespaced paths, or namespaced CIDs if you need something shorter. E.g., an ENS record might refer to <ipfs-codec><cidv1><...> and/or /ipfs/CIDv1/...
So the radicle namespace slots into the same machinery: ENS-style records, HTTP gateway URLs, and multipath routers can dispatch on the prefix without parsing the inner CID.
(But in practice, I have only seen namespace codes being used in ENS/Ethereum and others using contenthash contracts)
Okay, so instead of using
- <cid(radicle-git, multihash(sha2-256, ...))>
- <cid(radicle-git, multihash(sha1, ...))>
- <cid(radicle-hg, multihash(sha1, ...))> (this one is fictional)
we would use
- <radicle-codec><cid(git-raw, multihash(sha2-256, ...))>
- <radicle-codec><cid(git-raw, multihash(sha1, ...))>
- <radicle-codec><cid(hg-raw, multihash(sha1, ...))> (this one is fictional)
That sounds fine to me. What would that entry look like then?
radicle-git, namespace, 0x8776, draft, Radicle Git Repository Identifier
radicle, namespace, 0x8776, draft, Radicle namespace
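As a sanity check of the <radicle-codec><cid(...)> shape, here is a hedged Python sketch that writes and reads such an identifier. It assumes the draft code 0x8776 proposed in this PR stays as is; the helper names and the all-zero digests are invented for illustration.

```python
from typing import NamedTuple

RADICLE_NS = 0x8776  # draft namespace code proposed in this PR (assumption)
GIT_RAW = 0x78       # git-raw codec
SHA1 = 0x11
SHA2_256 = 0x12


class NamespacedCid(NamedTuple):
    namespace: int
    version: int
    codec: int
    hash_code: int
    digest: bytes


def write_uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def read_uvarint(buf: bytes, pos: int):
    """Decode one unsigned varint at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def encode(namespace: int, codec: int, hash_code: int, digest: bytes) -> bytes:
    """<namespace><cid version 1><codec><hash code><digest length><digest>."""
    fields = (namespace, 1, codec, hash_code, len(digest))
    return b"".join(write_uvarint(f) for f in fields) + digest


def parse(buf: bytes) -> NamespacedCid:
    namespace, pos = read_uvarint(buf, 0)
    version, pos = read_uvarint(buf, pos)
    codec, pos = read_uvarint(buf, pos)
    hash_code, pos = read_uvarint(buf, pos)
    length, pos = read_uvarint(buf, pos)
    digest = buf[pos:]
    if len(digest) != length:
        raise ValueError("digest length does not match the multihash header")
    return NamespacedCid(namespace, version, codec, hash_code, digest)


# Placeholder digests; real ones would be Git object IDs.
wire_sha1 = encode(RADICLE_NS, GIT_RAW, SHA1, bytes(20))
wire_sha256 = encode(RADICLE_NS, GIT_RAW, SHA2_256, bytes(32))

print(wire_sha1[:3].hex())           # f68e02
print(parse(wire_sha256).hash_code)  # 18
```

Both identifiers start with the same three namespace bytes; only the inner multihash tells SHA-1 and SHA-256 apart.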
(But in practice, I have only seen namespace codes being used in ENS/Ethereum and others using contenthash contracts)
I have no specific objection to adding a radicle namespace, although from this discussion it doesn't seem like there's an intended use for the namespace at the moment.
Currently we see namespace requests exclusively for the use case documented above (which IIUC isn't something Radicle is looking for at the moment). However, it certainly seems reasonable to use <namespace><system-address> (or in Radicle's case <namespace><cid>) like ENS does to allow compact representation of addresses across multiple systems.
If there's an interest in getting the namespace code now, then SGTM. If, however, it's not known how Radicle would use the namespace yet (or when it would be used instead of just a CID), it might make sense for the authors to park this until the need arises.
We would use <radicle-codec><cid(git-raw, multihash(sha1, ...))> ASAP for repository identifiers.
E.g. https://radicle.network/nodes/seed.radicle.dev/rad%3Az3gqcJUoA1n9HaHKufZs5FCSGazv5/ would also be reachable via https://radicle.network/nodes/seed.radicle.dev/rad%3A<radicle-codec><cid(git-raw, multihash(sha1, 039fb5bb70789bed3e1f026ddb0c417ded37c3365882))>.
Is that somehow unreasonable?
We would likely have a new RIP that amends RIP 2. [..]
The question is what comes first: The RIP or the entry in this table?
Sounds reasonable, no objection to the wire shape. Older entries like lbry (#332) were merged with a one-liner linking to the project's spec, so the bar is low; we are just trying to raise it a bit :) Goal: someone reading table.csv's git history in five years can trace radicle 0x8776 back to a specifying document.
Since the bar is low: just link to the new RIP once you have a draft, and we can merge this.
Radicle <https://radicle.dev> is an open source, peer-to-peer code collaboration stack built on Git. Repositories are addressed by hashing their initial "Repository Identity Document". In order to allow addressing repositories by prefixing, reserve a new code.
Updated to the effect of my suggestion in #404 (comment) after that got a thumbs up from @lidel.

Radicle is an open source, peer-to-peer code collaboration stack built on Git.
Repositories are addressed by hashing their initial "Repository Identity Document".
In order to allow addressing repositories by CIDs, reserve a new code.
For example, we currently use rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5 to address the repository which you can view on the web at https://radicle.network/nodes/seed.radicle.dev/rad%3Az3gqcJUoA1n9HaHKufZs5FCSGazv5.
The identifier here is what we call a "Repository Identifier" and it is a plain base58btc encoding of a special Git object in that repository.
With the imminent introduction of SHA-256 as the default object format/hash function in Git 3, we plan to introduce more flexibility into our repository identifier scheme. CIDs with this multicodec code look very promising.
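To illustrate the current identifier format described above, here is a small Python sketch that decodes a `rad:` identifier back to raw bytes. It assumes the leading `z` is the multibase prefix for base58btc and that the rest is a plain base58btc string; the helper names are hypothetical and this is not Radicle's own code.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a base58btc string; each leading '1' stands for one zero byte."""
    num = 0
    for ch in text:
        num = num * 58 + ALPHABET.index(ch)  # ValueError on invalid characters
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * zeros + body


def b58encode(data: bytes) -> str:
    """Encode bytes as base58btc, the inverse of b58decode."""
    num = int.from_bytes(data, "big")
    out = ""
    while num:
        num, rem = divmod(num, 58)
        out = ALPHABET[rem] + out
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + out


def decode_rid(rid: str) -> bytes:
    """Strip the 'rad:' scheme and the assumed multibase 'z' prefix, then decode."""
    prefix = "rad:z"
    if not rid.startswith(prefix):
        raise ValueError("expected a rad: identifier with multibase prefix 'z'")
    return b58decode(rid[len(prefix):])


raw = decode_rid("rad:z3gqcJUoA1n9HaHKufZs5FCSGazv5")
print(len(raw), raw.hex())
```

The decoded length should match the size of a SHA-1 object ID, which is why a multihash header would be needed to tell SHA-1 and SHA-256 identifiers apart once both exist.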