Propose more explicit expectations of serialization of UnixFS data #271

Closed
wants to merge 1 commit into from
3 changes: 2 additions & 1 deletion UNIXFS.md
@@ -78,7 +78,8 @@ This `Data` object is used for all non-leaf nodes in Unixfs.

For files that are comprised of more than a single block, the 'Type' field will be set to 'File', the 'filesize' field will be set to the total number of bytes in the file (not the graph structure) represented by this node, and 'blocksizes' will contain a list of the filesizes of each child node.

-This data is serialized and placed inside the 'Data' field of the outer merkledag protobuf, which also contains the actual links to the child nodes of this object.
+This data is serialized and placed inside the 'Data' field of the outer structure, which also contains the actual links to the child nodes of this object.
+The final structure is most commonly serialized as a merkledag protobuf, but serialized data from other codecs may also be interpreted by the Unixfs ADL.
Contributor

"serialized data from other codecs" is too broad...

An implicit constraint of the merkledag protobuf encoding is that Links is a single-level list of structs with a well defined single pointer in each.

By saying "you can replace the unixfs merkledag container with anything else" you lose the implicit constraints and need to redefine them somewhere (here?).

Stepping back once again, however: what is the real purpose of this change request to a strongly ossified protocol? If it is just for ease of test fixtures, this is overkill. If there are other (future?) use cases, please share them.
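
To make the implicit constraint above concrete, here is a minimal sketch of the shape check that a codec-agnostic definition would have to state explicitly. It assumes the node has already been decoded into a plain Python dict, uses the merkledag protobuf field names (`Links`, `Hash`, `Name`, `Tsize`), and stands in placeholder strings for real CIDs; `check_links_shape` is a hypothetical helper, not part of any existing library.

```python
from typing import Any, Dict, List

# Fields a merkledag protobuf link may carry; Hash is the single pointer.
ALLOWED_LINK_FIELDS = {"Hash", "Name", "Tsize"}


def check_links_shape(node: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means Links is a flat list of link structs."""
    links = node.get("Links")
    if not isinstance(links, list):
        return ["Links must be a list"]

    problems: List[str] = []
    for i, link in enumerate(links):
        # Single-level only: each entry must be a struct, never a nested list.
        if not isinstance(link, dict):
            problems.append(f"Links[{i}] must be a struct (dict here)")
            continue
        # Exactly one well-defined pointer per entry.
        if "Hash" not in link:
            problems.append(f"Links[{i}] has no Hash pointer")
        extra = set(link) - ALLOWED_LINK_FIELDS
        if extra:
            problems.append(f"Links[{i}] has unexpected fields: {sorted(extra)}")
    return problems
```

A node decoded from any codec could be run through a check like this before being interpreted as UnixFS; whether the spec should carry such a rule is the open question in this thread.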

Member

I don't understand any of this comment. Why is that phrasing too broad?

Member

@warpfork warpfork Mar 8, 2022

Checking in with the sheer passage of time here: the debated phrasing is from >3 years ago.

To contextualize that: we didn't have an IPLD Data Model spec back then (or at best it was still nascent and hotly contested). We didn't have a concept of ADL back then. I'm not the original author, but ISTM it's very likely that this was defined in terms of "merkledag" and "protobuf" not because those were critically important but simply because the terminology for that was accessible at the time and seemed sufficiently clear.

We have clear language for the Data Model now, and that means clear language for nested data structures that are agnostic of codec. Defining unixfsv1 in terms of that language is now easy and clear (and implementing it that way is also easy, as demonstrated by the PR that spawned this discussion, which is incidentally tiny -- very strong evidence of the actual simplicity of this).

It's true that unixfs is a long standing piece of history, and for most of its time it's been protobuf only. But I don't really see a reason to treat that as a sacred cow.

(And so what if the first given reason is test fixtures? That's an excellent reason. If we need a second one, I humbly submit "that it makes sense". As a third: "freedom from codecs is one of the founding principles of IPLD".)


For files comprised of a single block, the 'Type' field will be set to 'File', 'filesize' will be set to the total number of bytes in the file and the file data will be stored in the 'Data' field.
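
As a rough illustration of the two cases described in this part of UNIXFS.md, the sketch below models the `Data` object as a plain Python dataclass rather than the real protobuf message. The class and helper names are hypothetical, and it assumes a multi-block node carries no inline file data of its own, so that 'filesize' equals the sum of 'blocksizes'.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UnixFSData:
    """Simplified stand-in for the UnixFS Data object (not the real protobuf)."""
    type: str
    filesize: int
    data: bytes = b""
    blocksizes: List[int] = field(default_factory=list)


def single_block_file(content: bytes) -> UnixFSData:
    # Single block: the file bytes live directly in the Data field.
    return UnixFSData(type="File", filesize=len(content), data=content)


def multi_block_file(child_filesizes: List[int]) -> UnixFSData:
    # Multiple blocks: blocksizes lists each child's filesize, and filesize
    # is the total number of file bytes (assumes no inline data on this node).
    return UnixFSData(
        type="File",
        filesize=sum(child_filesizes),
        blocksizes=list(child_filesizes),
    )
```

For example, `multi_block_file([262144, 262144, 100])` yields a node whose `filesize` is 524388, while `single_block_file(b"hello")` has `filesize` 5 and an empty `blocksizes` list.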
