-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposed IPLD changes #111
Conversation
This patch fixes some of the inconsistencies in the IPLD spec, as well as resolving some existing issues, namely: - ipfs/kubo#2053 - allow raw data in leaf nodes - ipfs/kubo#1582 - use EJSON for encoding (non-UTF8) binary data - mandate CBOR Strict Mode to avoid having to deal with duplicated keys - change `link` to `content` (although we could also `data` or something similar), as file content needn't be merkle-links, and can in fact be directly embedded into small files - removed the extraneous `subfiles` wrapper, as file content can be either (a merkle-link to) a bytestring, a unicode string, or a list thereof
@Kubuxu pointed out that this depends on whether the multicodec header is appended to data prior to hashing and storage. I'm assuming this will be the case in general (@jbenet please correct me if I've misunderstood), but I'd like to argue for an exemption for raw binary data ( Specifically, I propose that only the raw data (without the multicodec header) be stored in the repo (for the reasons discussed in ipfs/kubo#2053), with the I'd also like to propose that the hash for raw binary objects only be based on the raw data (I'm assuming the hash includes the multicodec header in general). This is mainly to promote interoperability with non-ipfs content addressing systems (cc @btrask). I think (?) this should be safe, as it wouldn't be possible for a peer to alter the codec without altering the hash anyway. (Alternatively multihash could reserve a bit to differentiate between raw hashes and multicodec-tagged-data hashes, perhaps?) |
Sounds promising. Thanks @davidar! |
I noticed the use of EJSON which is very similar to the links we now have. To make it simple, I saw in the documentation that EJSON could represent dates and binary data this way:
We do represent links this way:
Byt then, why not be compatible with EJSON and implement it this way?
Or:
Also, don't forget that escape is needed in case there is a legitimate key in JSON that is names "$date" for example. the EJSON way of escaping is:
For the format, see this page where I found more information. Mildred Ki'Lya mildred@mildred.fr |
Having native formats without custom formatting is bound to simplify implementations a lot. Basically, we would just have either CBOR with extra tags (for dates and links) or EJSON with an extra type for links. We do not invent our own syntax over JSON, we reuse EJSON. |
On Thu, 16 Jun 2016 03:08:12 -0700
See this code: https://github.com/primus/ejson/blob/master/index.js#L282-331 And this README: https://github.com/sixolet/ejson/blob/master/README.md I don't really like that EJSON does not seem to be formally described Mildred Ki'Lya mildred@mildred.fr |
Please note that in my previous example I prefer the first syntax because it doesn't introduce a difference with native types that are represented like I'd say: let's stick to EJSON as it is implemented (even though I would prefer a slightly different syntax), but make a formal specification (because EJSON has none). |
At the risk of bikeshedding over the issue of how to represent merkle links in JSON, I'm inclined to agree with @mildred's suggestion that we use an existing convention like EJSON rather than inventing our own. I'll also note that { "content": {"$type": "merkle-link", "$value": "QmCCC...222"} } maps quite naturally into YAML content: !merkle-link QmCCC...222 |
ping @jbenet |
how important is to be compatible with EJSON? |
@nicola not sure, I just used it because it had already been mentioned in the original spec Edit: We need some way of differentiating binary data from Unicode text to fix ipfs/kubo#1582 and EJSON seems sensible enough. Regarding merkle links, I'm inclined to agree with @mildred's argument, but I'm not overly fussed either way. |
@davidar, to me ejson is just another encoding, that of course we should support. The way I see it is that a JSON with |
Yes, definitely. |
Closing the PR here in favour of moving the conversation to ipld/specs. Created a issue to make sure this is handled there - ipld/specs#39 |
This patch fixes some of the inconsistencies in the IPLD spec, as well as resolving some existing issues, namely:
Also some possible changes to the examples, but these are more to do with suggested convention rather than IPLD itself:
link
tocontent
(although we could also usedata
or something similar), as file content needn't be merkle-links, and can in fact be directly embedded into small filessubfiles
wrapper, as file content can be either (a merkle-link to) a bytestring, a unicode string, or a list thereofI can split these into separate PRs if that would be more helpful.