[Proposal] ENSIP-17: DataURI Format in Contenthash #165

Closed

sshmatrix

This proposed extension to ENSIP-7 introduces the data:uri format in the ENS Contenthash field, allowing dynamic data streaming using EIP-3668 (CCIP-Read) and ENSIP-10 (Wildcard Resolution). Please feel free to go through the draft and ask questions, seek clarifications, give suggestions or propose edits.

@Arachnid
Member

Thanks for submitting this! It's nice to see expanded standardisation of this, and I like the novel combination of CCIP read and contenthash.

However, contenthash is a binary field, as defined here: https://docs.ens.domains/ens-improvement-proposals/ensip-7-contenthash-field

> The value returned by contenthash MUST be represented as a machine-readable multicodec.

This spec will need to be updated to either use an existing multicodec type to encode the data, or to use a newly defined one.

@sshmatrix
Author

sshmatrix commented Nov 1, 2023

> This spec will need to be updated to either use an existing multicodec type to encode the data, or to use a newly defined one.

We were initially looking to use plaintextv2 as hex("pla") = 0x706c61 from multicodec.

522 plaintextv2 multiaddr 0x706c61 draft

But we're eventually deviating from multicodec on this to feature a default/fallback profile implemented in some JS code e.g. ens/contenthash.js

 default: {
    encode: encodes.utf8,
    decode: decodes.utf8,
  },

We can request to add stringTohex("data:") = 0x646174613a in multicodec too. But we prefer reusing NFT DataURIs directly without any extra prefix for namespace, version & multiaddr/version.
For instance, contenthash for 1234.hello-nft.eth can resolve bytes(tokenURI(1234)) directly as data:application/json,{...metadata}.

@Arachnid
Member

Arachnid commented Nov 1, 2023

> But we're eventually deviating from multicodec on this to feature a default/fallback profile implemented in some JS code e.g. ens/contenthash.js

The problem with this is that it directly contradicts ENSIP-7, as quoted above. Contenthashes must be valid multicodec values. We're not going to standardise something that contradicts an earlier standard like this.

@sshmatrix
Author

sshmatrix commented Nov 4, 2023

This is a copy of the response posted in the ENS Forum.


Thanks for the feedback! We have looked into possible ways to make this draft ENSIP compatible with multicodec. Our findings are presented below as different implementations, with and without multicodec. We are open to either implementation in the end and will update this draft ENSIP as required.

A) Bypass Multicodec:

First, we'd like to point to the current state of ens/content-hash.js. When hex("data:") = 0x646174613a is used as the prefix, encoding doesn't work, and the decoder nearly works but drops the first byte in the process. Please see the example below:

import {encode, decode } from "@ensdomains/content-hash";
console.log(encode("data:text/plain;base64,SGVsbG8gV29ybGQ"))
//> 00000000000000000000

console.log(decode("646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751"))
//> ata:text/plain;base64,SGVsbG8gV29ybGQ

Adding an extra 0x00 prefix byte as a spacer (a pseudo-namespace) lets the decoder return the full string and could prevent any future collision with multicodec.

console.log(decode("00646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751"))
//> data:text/plain;base64,SGVsbG8gV29ybGQ

Quoting @Raffy here,

> Although for simplicity, I’m a fan of bypassing the multicodec stuff (as long as the first uvarint decodes correctly) and just embedding raw utf-8 data

This will bypass multicodec for DataURIs in Contenthash. ens/content-hash.js and any gateways/clients can easily implement this with a basic check: test whether the prefix is 0x00646174613a before encoding and decoding. ENS clients and gateways can then use DataURIs directly without leaving any room for current or future collisions with multicodec formats. This approach will be ENS-specific, and we can change our ENSIP draft to reflect it.

✅ This is our preferred implementation but we are not married to it.
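
The prefix check described in option A could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `encodeDataUri` and `decodeDataUri` are hypothetical and are not part of ens/content-hash.js, and the only layout assumed is the one discussed above (a 0x00 spacer byte followed by the UTF-8 bytes of the data URI).

```javascript
// Hypothetical helpers for illustration; not the @ensdomains/content-hash API.
// Assumed layout: 0x00 spacer byte + UTF-8 bytes of a "data:" URI.
const DATA_URI_PREFIX = "00646174613a"; // 0x00 + hex("data:")

function encodeDataUri(uri) {
  if (!uri.startsWith("data:")) {
    throw new Error("not a data URI");
  }
  const bytes = new TextEncoder().encode(uri);
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
  return "00" + hex; // prepend the spacer byte
}

function decodeDataUri(hex) {
  const clean = hex.toLowerCase().replace(/^0x/, "");
  if (clean.length % 2 !== 0 || !clean.startsWith(DATA_URI_PREFIX)) {
    return null; // not this format; caller falls back to multicodec handling
  }
  const body = clean.slice(2); // drop the 0x00 spacer
  const bytes = new Uint8Array(body.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(body.slice(i * 2, i * 2 + 2), 16);
  }
  return new TextDecoder().decode(bytes);
}
```

Returning `null` for anything without the `0x00646174613a` prefix keeps existing multicodec contenthashes on their current code path.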

B) Multicodec Compatible Formats

If multicodec must be used, then we'd like to propose the following options:

1) raw data type with IPFS namespace:

The IPFS namespace is compatible with DataURIs when the raw data type is used.

import { CID } from 'multiformats/cid'
import { identity } from 'multiformats/hashes/identity'
import * as raw from "multiformats/codecs/raw";
const utf8 = new TextEncoder();

let data = utf8.encode('data:text/plain;base64,SGVsbG8gV29ybGQ')
let cid = CID.create(1, raw.code, await identity.digest(data))

IPFS Format :

base32: bafkqajtemf2gcotumv4hil3qnrqws3r3mjqxgzjwgqwfgr2wonreoodhkyzds6lci5iq
base16: f01550026646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751
Contenthash: 0xe30101550026646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751

|Namespace|Version|Multiaddr|Multihash|Length|Data|
|---|---|---|---|---|---|
|ipfs|1|raw|identity|38|data:text/plain;base64,SGVsbG8gV29ybGQ|
|0xe301|0x01|0x55|0x00|0x26|0x646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751|

CID Inspector : https://cid.ipfs.tech/#bafkqajtemf2gcotumv4hil3qnrqws3r3mjqxgzjwgqwfgr2wonreoodhkyzds6lci5iq

Public Gateway : https://ipfs.io/ipfs/bafkqajtemf2gcotumv4hil3qnrqws3r3mjqxgzjwgqwfgr2wonreoodhkyzds6lci5iq

Since this method uses the IPFS namespace, ens/content-hash.js and any compatible gateway or client must check whether the encoded payload uses raw as the multicodec with identity (blank) as the multihash; the shorthand prefix for this is 0xe301015500. Clients can decode the remaining raw data (after the length) as a UTF-8 string. If this data is not data:uri formatted, it should be auto-rendered as plaintext; for correctly formatted data:uri, clients and gateways can render according to the MIME type included in the DataURI payload.
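
A client-side check along these lines might look like the sketch below. It is not taken from ens/content-hash.js: `decodeInlineContenthash`, `hexToBytes` and `readVarint` are hypothetical names, and the sketch assumes the layout given above (prefix 0xe301015500, then a varint length, then the UTF-8 payload).

```javascript
// Hypothetical helpers; a sketch of the prefix check described above.
const RAW_IDENTITY_PREFIX = "e301015500"; // ipfs namespace, CIDv1, raw, identity

function hexToBytes(hex) {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Reads an unsigned varint: 7 bits per byte, least significant group first.
function readVarint(bytes, offset) {
  let value = 0;
  let shift = 0;
  let pos = offset;
  while (pos < bytes.length) {
    const b = bytes[pos++];
    value += (b & 0x7f) * 2 ** shift;
    if ((b & 0x80) === 0) return { value, next: pos };
    shift += 7;
  }
  throw new Error("truncated varint");
}

function decodeInlineContenthash(contenthash) {
  const hex = contenthash.toLowerCase().replace(/^0x/, "");
  if (!hex.startsWith(RAW_IDENTITY_PREFIX)) return null; // some other format
  const rest = hexToBytes(hex.slice(RAW_IDENTITY_PREFIX.length));
  const { value: length, next } = readVarint(rest, 0);
  const payload = rest.slice(next, next + length);
  if (payload.length !== length) throw new Error("length mismatch");
  const text = new TextDecoder().decode(payload);
  // isDataUri tells the caller whether to render by MIME type or as plaintext.
  return { text, isDataUri: text.startsWith("data:") };
}
```

Applied to the example contenthash above, this returns the original `data:text/plain;base64,SGVsbG8gV29ybGQ` string with `isDataUri` set to `true`; a regular dag-pb/sha256 contenthash returns `null`.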

2) plaintextv2 data with IPFS or IPLD namespace:

This is similar to the previous option but using plaintextv2 instead of raw as multicodec.

import { CID } from 'multiformats/cid'
import { identity } from 'multiformats/hashes/identity'
const utf8 = new TextEncoder();

let data = utf8.encode('data:text/plain;base64,SGVsbG8gV29ybGQ');
let cid = CID.create(1, 0x706c61, await identity.digest(data))

plaintextv2 format using IPFS namespace

Base32 : bahq5rqidaatgiylume5hizlyoqxxa3dbnfxdwytbonstmnbmkndvm43ci44govrshf4wer2r
Base16 : f01e1d8c1030026646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751
Contenthash : 0xe30101e1d8c1030026646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751

|Namespace|Version|Multiaddr|Multihash|Length|Data|
|---|---|---|---|---|---|
|ipfs|1|plaintextv2|identity|38|data:text/plain;base64,SGVsbG8gV29ybGQ|
|0xe301|0x01|0xe1d8c103|0x00|0x26|0x646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751|

CID Inspector : https://cid.ipfs.tech/#bahq5rqidaatgiylume5hizlyoqxxa3dbnfxdwytbonstmnbmkndvm43ci44govrshf4wer2r

Public Gateway : https://ipfs.io/ipfs/bahq5rqidaatgiylume5hizlyoqxxa3dbnfxdwytbonstmnbmkndvm43ci44govrshf4wer2r

plaintextv2 format using IPLD namespace

Base32 : bahq5rqidaatgiylume5hizlyoqxxa3dbnfxdwytbonstmnbmkndvm43ci44govrshf4wer2r
Base16 : f01e1d8c1030026646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751
Contenthash : 0xe20101e1d8c1030026646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751

|Namespace|Version|Multiaddr|Multihash|Length|Data|
|---|---|---|---|---|---|
|ipld|1|plaintextv2|identity|38|data:text/plain;base64,SGVsbG8gV29ybGQ|
|0xe201|0x01|0xe1d8c103|0x00|0x26|0x646174613a746578742f706c61696e3b6261736536342c534756736247386756323979624751|

CID Inspector : https://cid.ipfs.tech/#bahq5rqidaatgiylume5hizlyoqxxa3dbnfxdwytbonstmnbmkndvm43ci44govrshf4wer2r

Public Gateway : https://dweb.link/api/v0/dag/get?arg=bahq5rqidaatgiylume5hizlyoqxxa3dbnfxdwytbonstmnbmkndvm43ci44govrshf4wer2r

NOTE: plaintextv2 is still in draft, and IPFS gateways CANNOT yet render it properly, resulting in a 500 error. Using it with the IPLD namespace might require changing the encoding process too. We do not prefer this method.

C) CARv1 strings

CARv1 files as strings can represent IPFS data or IPLD files and directories, but this implementation is more complex than the previous options, so we only mention it as a footnote. We don't have the bandwidth to implement this. If ENS devs are happy to explore this for future implementation, it would be one of the best options for fully on- or off-chain generators and IPFS data storage.

Based on this, we are happy to get more feedback and then make changes to the draft ENSIP!

@Arachnid
Member

Arachnid commented Nov 4, 2023

I'm confused why IPFS is involved at all. Why can't you either use an existing multicodec identifier or define a new one for URIs?

@sshmatrix
Author

sshmatrix commented Nov 6, 2023

> I'm confused why IPFS is involved at all. Why can't you either use an existing multicodec identifier or define a new one for URIs?

We had considered a new namespace in our draft but skipped it, since the IPFS/IPLD namespace with a plaintext or raw encoded payload contained within the CID is sufficiently unique and backwards compatible with IPFS gateways returning plaintext data.

Proposed IPFS (raw-ipld/plaintext) : 0xe301015500 + <data.length> + <data>
  Normal IPFS      (dag-pb/sha256) : 0xe301017012 + <hash.length> + <hash of data or dag>

However, we're open to requesting a new ENS-specific namespace for this ENSIP only. In this regard, please suggest a short code (>=2 bytes) for this and we'll PR that in the multicodec table soon. Something like

  • hex('ens') = 0x656e73 sounds like a good option to us; equivalent namespace is VARINT(0x656e73) = 0xf3dc9503
  • Non-ASCII option 0xda7a is also good, which will lead to a VARINT(0xda7a) = 0xfab403 namespace
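
As a worked check of the VARINT values quoted in this thread, here is a minimal sketch of unsigned varint encoding as used by multiformats. The function name `varintHex` is made up for illustration and is not a library API.

```javascript
// Unsigned varint: 7 bits per byte, least significant group first,
// high bit set on every byte except the last.
// varintHex is a hypothetical helper name, not a multiformats API.
function varintHex(code) {
  let n = BigInt(code);
  const out = [];
  do {
    let byte = Number(n & 0x7fn);
    n >>= 7n;
    if (n > 0n) byte |= 0x80; // more bytes follow
    out.push(byte.toString(16).padStart(2, "0"));
  } while (n > 0n);
  return "0x" + out.join("");
}

console.log(varintHex(0x656e73)); // 0xf3dc9503  (hex('ens'))
console.log(varintHex(0xda7a));   // 0xfab403
console.log(varintHex(0x706c61)); // 0xe1d8c103  (plaintextv2)
console.log(varintHex(0xe3));     // 0xe301      (ipfs namespace)
```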

With the raw multiaddr, the above two options would be encoded as follows:

|Code|Namespace|Version|Multiaddr|Multihash|Length|Example|
|---|---|---|---|---|---|---|
| | |1|raw|identity|38|data:text/plain;base64,SGVsbG8gV29ybGQ|
|0xda7a|0xfab403|0x01|0x55|0x00|0x26|0x646...751|
|hex('ens')|0xf3dc9503|0x01|0x55|0x00|0x26|0x646...751|

Please suggest any other options besides these two!

There are currently no DataURI-related multiaddr or namespaces, and we do not want to introduce one in this context because we lack the manpower and funding to follow up on sidequests. The DataURI class is too broad, and it would also require mime/type codecs, which have been pending in issues or PRs for a very long time. See below:

multiformats/multicodec#159

multiformats/multicodec#4
