New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Intiial exploration report for IPLD URL Scheme #195
Conversation
| The `Accept` header may be optionally provided to specify the codec that should be used in the response. | ||
| This follows the same conventions as in the [IPFS gateway](https://github.com/ipfs/go-ipfs/pull/8758). | ||
| Other codecs that may be targetted are DAG-CBOR. | ||
| TODO: `application/vnd.ipld.dag-json` or just `application/json`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To maximize interop with other tools both could be supported in requests (dag-json is a strict subset of json), but when it comes to responses, it is better to go with more specific application/vnd.ipld.dag-json.
fysa my plan is to register it as part of ipfs/kubo#8823 (ETA go-ipfs 0.14), would be cool if I could link to this spec along with https://ipld.io/docs/codecs/known/dag-json/
This comment has been minimized.
This comment has been minimized.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only worry with regards to using application/vnd.ipld.dag-json is that it wouldn't get recognized by common browser tools like syntax highlighting extensions and the such. I think it could still work with await (await fetch(url)).json() in browsers, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RangerMauve fair point! so.. application/json everywhere + also support application/vnd.ipld.dag-json in requests? that should maximize reuse of things that have nice UX/support for JSON.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm into that. I'll add it to the doc.
|
|
||
| The `{CID}` should be the "base" CID to extend data from. | ||
| Sadly if we are to use URLs (TODO: why should we use URLs?), we must have some sort of CID in the hostname. | ||
| TODO: Document some inline CIDs to use. IPFS has this empty directory CID `ipfs://bafyaabakaieac` which would be cool to have analogs for for empty `{}` object and `[]` array nodes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we now have the concept of "unit" type in schema-land:
ipld/specs/schemas/schema-schema.ipldsch
Lines 761 to 789 in cfe4b49
| ## UnitRepresentation is an enum for describing how a TypeDefnUnit should be represented in the data model. | |
| ## | |
| ## Unit types are commonly seen represented in several ways. | |
| ## A null token is a common one. | |
| ## A true token is sometimes seen (especially, when people encode "sets" in json: | |
| ## often this will be seen as a map where the values are keys and the map values are 'true'). | |
| ## Also, an empty map can be a useful unit value; | |
| ## an empty map accurately communicates a lack of data. | |
| ## (The emptymap strategy can be a particularly interesting choice if you want to | |
| ## have a schema that is evolvable in the future to start using a struct or map | |
| ## in the same place as the unit type currently stands, while having older documents | |
| ## continue to be parsable by the evolved schema.) | |
| ## | |
| ## Unlike many of the other representation information types seen in the schema-schema, | |
| ## this one is just an enum, rather than being a union. | |
| ## That's because there's no possibility of every needing to annotate more customization | |
| ## onto values in the unit type... because there are no possible values in the unit type. | |
| ## | |
| ## Note that there is no discernible logical difference between | |
| ## `type Foo struct {}` and `type Foo unit representation emptymap`; | |
| ## only that the latter can be said to be a more explicit description of intent. | |
| ## Both will result in identical representations, and both have identical cardinality (which is 1). | |
| ## | |
| type UnitRepresentation enum { | |
| | Null ("null") | |
| | True ("true") | |
| | False ("false") | |
| | Emptymap ("emptymap") | |
| } |
And we have identity CIDs, we could just pick one to standardise on as a base.
https://cid.ipfs.io/#baeaaaapw might be a good choice, null is a commonly understood unit value and might be a good "there's nothing here to start from".
Or are you wanting to convey codec information with the "base"? So, Content-Type tells the receiver what format the data is being provided in, but the base CID says what codec should be used for the encoded block? Do we want that kind of flexibility? And if so, what about base CIDs that have some data because they're CIDs that reference some existing block or have some data in them as identity CIDs?
Yet another possibility here is to just use identity CIDs to send the data and Content-Type to say what codec to encode into—what limitations do we run into with hostnames here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, maybe the semantics of a PUT could all fall under IPLD Patch since it already has all needed modification primitives (and more)? Otherwise this would be introducing a mutation operation that is limited to only adding or changing data from existing blocks, whereas Patch gives us full modification potential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or are you wanting to convey codec information with the "base"?
I think keeping the codec information in the Content-Type header would be best to keep things URL-y. ipld://baeaaaapw/ seems like a great place to start.
what about base CIDs that have some data because they're CIDs that reference some existing block or have some data in them as identity CIDs?
I think this would be useful. e.g. It'd be cool to have some base CIDs with examples like an empty array or an empty map. Is there an existing list I could reference somewhere? Should I just generate these myself with the ipld CLI?
'
maybe the semantics of a PUT could all fall under IPLD Patch since it already has all needed modification primitives
I was actually thinking that the PATCH HTTP method could be used there instead. I think it'd be cool to add support for it as a first-class feature, but since Patch is still a bit of a WIP I didn't want to block this to wait on it.
One thing to consider here is that we just need to have something in the hostname portion of the URL since in my experience most parsers (like browsers) will freak out if it's only sometimes there.
Alternately, in IPNS land we were looking at using localhost as a "no-op" name for when we want to do something with data that isn't already tied to a particular CID. @lidel might have more to say on that front.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
I think my idea was that if we want to invent some name to indicate "empty" state, the
localhostis not a bad choice because it is something devs understand, resolves safely to loopback device if DNS is ever involved (either on purpose, or as a bug), and also hints that data is added to local IPLD node, and not some magical cloud:PUT ipld://localhost→ get CID
-
In case a drive-by reader is confused by "JSON Patch", it is different thing from "HTTP PATCH" method:
- The "JSON Patch" @rvagg mentioned is a convention (RFC 6902 from the IETF) for representing JSON diffs (http://jsonpatch.com) – see relevant example in ipfs dag diff ipfs/kubo#4801 (comment) (and the discussion that follows).
- Wherever possible, we should reuse RFCs like this, maximizing interop with existing tools, and/or promoting existing standards (even if they are not popular, it is still better than inventing something new).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm into having ipld://localhost/ be a special thing for when there's no base CID to build off of. @rvagg @warpfork Any strong feelings on that matter?
Regarding PATCH, I've added a link to the GH issue where the implementation is being tracked: ipld/go-ipld-prime#350
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems reasonable to me; it is slightly arbitrary (e.g. why not :: to be uber-modern?) but I guess anything we choose is!
| The `Content-Type` may be optionally provided to specify the encoding of the `Body`. | ||
| By default it will be assumed to be `DAG-JSON`. | ||
|
|
||
| If a `Path` is specified, intermediate Nodes will be created and added to the Node specified by the CID. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this strictly stick to the integer components of a path, where if it encounters a 0 it'll create an array for you? And what if the path has a 1 (or later) in it? Do we make an array that has that many elements that are ... empty? Because we'd have to know what "empty" is (back to the unit discussion).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think to be safe it would be good to generate intermediate Map Nodes if a node doesn't already exist at that path otherwise it might be too magical.
Also, should we error out if somebody tries to add a named key to a node which is an array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could also be augmented down the line with Schemas so you can give better type hints for individual segments of the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think erroring out when there isn't something obvious and having fancier behavior in the future would be best for this initial version. I'll add some notes about how lists should be handled in the meantime though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking more on this, maybe we should error out when there's no intermediate node? E.g., users would need to manually add intermediate nodes with separate requests as they're building up the DAG.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rvagg Mind weighing in on whether we should error out when there's no intermediate node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmm, I don't have a strong opinion. There's a nice convenience to automatically handling intermediate nodes (mkdir -p, rm -r - but notice they require special flags!). It might be tedious to have to do them all, one-by-one—round-trips get expensive in content-addressed land. But on the other hand, does it indicate a possible error if you're encountering missing intermediates? Perhaps so.
Your call if you're able to form a firmer opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been chewing on it and I think the convenience outweighs the downsides.
I've got auto-generation (-p) for ipfs/ipns/hyper in Agregore at the moment and not having it made a lot of things more annoying. Ultimately, I think developer UX is really important for this stuff and streamlining is usually worth it.
|
This steps on the That may not be a conflict, however, Either way, it might be worth looking at that and making sure we're not doubling up on something that's not quite the same and may end up being confusing to users. If there's a path to harmonisation then that'd be awesome. |
|
@lidel Would you be able to comment on @rvag's comment regarding conflicts with what is already in the IPFS gateway? My gut feeling is that the |
|
@RangerMauve : do you want to incorporate any more feedback here, or are you good if we merge now? |
|
@BigLep Gonna go over this weekend to see if there are more notes and TODOs to add and then I think I'll be ready to merge next week. |
|
Based on the discussion we had at the end of the last IPLD weekly meeting, I've added a section on escaping special characters and reserved some future syntax for Lenses based on the suggestions from @aschmahmann One thing I'd like resolved before we merge is to get some more example unit CIDs to base off of. Ideally it'd be nice to have the following:
Would somebody mind linking me to info about how I may generate these or find an existing set? The encoding doesn't matter too much, but DAG-JSON might be easiest to mess with manually. |
|
Here's how I make these things, I have the codecs and js-multiformats repos locally and built because I need to |
|
@RangerMauve would be useful to add any missing edge cases to https://github.com/ipld/codec-fixtures (if not present already) – we point more and more people at fixtures like these, so it is good return of time invested. |
|
I think this is good to merge |
Starting a PR so that folks can start bikeshedding early on.
Not ready for review until I fill out a bunch of TODOs.