Making eth an application rather than a format #29
Comments
This is incorrect. We don't need to make it format specific, because a valid IPLD format has to offer a way for the IPLD Resolver to resolve through that format (aka a partial resolver or block-scope resolver), and that is where interface-ipld-format comes in. See: ipld/js-ipld#60. Essentially, we just need a .resolve and a .tree function to make any data format work.
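To make the ".resolve and .tree" point concrete, here is a minimal sketch of what such a pair of functions could look like for one block. It is not the interface-ipld-format API: the function signatures, the JSON block encoding, and the sample field names are all assumptions made for illustration. The only convention borrowed from IPLD is representing a link as an object with a single "/" key.

```python
import json
from typing import Any, Dict, List, Tuple


def decode(block: bytes) -> Dict[str, Any]:
    """Hypothetical decoder: blocks are JSON here purely for illustration."""
    return json.loads(block.decode("utf-8"))


def is_link(node: Any) -> bool:
    """A link is modelled as a dict whose only key is "/"."""
    return isinstance(node, dict) and set(node) == {"/"}


def resolve(block: bytes, path: str) -> Tuple[Any, str]:
    """Walk `path` inside a single block.

    Returns (value, remaining_path). When a link is reached, resolution
    stops at the block boundary and the unconsumed path is handed back.
    """
    node = decode(block)
    parts = [p for p in path.split("/") if p]
    for i, part in enumerate(parts):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"no such path segment: {part}")
        node = node[part]
        if is_link(node):
            return node, "/".join(parts[i + 1:])
    return node, ""


def tree(block: bytes) -> List[str]:
    """List every path that can be resolved within this block."""
    def walk(node: Any, prefix: str):
        if isinstance(node, dict) and not is_link(node):
            for key, value in node.items():
                path = f"{prefix}/{key}" if prefix else key
                yield path
                yield from walk(value, path)
    return list(walk(decode(block), ""))
```

With these two functions, a generic resolver never needs to know what the block means; it only follows paths until a link tells it to fetch the next block.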
Again, not 100% correct. IPFS files will resolve under /ipfs. This is an application on top of IPLD, the unixfs application. IPFS files don't need to reserve a multicodec, because multicodecs are for IPLD formats. unixfs uses an IPLD format (currently it uses dag-pb).
Yes, that is called the unixfs-engine, but it is not an IPLD Format, it is a usage of underlying IPLD formats :)
Sorry, s/IPFS/unixfs, the argument is still the same. Think of it as "ethfs": instead of serving binary (IPLD binary built from IPLD objects), it serves IPLD objects (from IPLD binary, which is Eth binary). Again, my argument is not really "let's make it this way", but to understand why this way is not better than the current one. I fear thousands of new data formats will pop up that are not really data formats but application-specific formats.
The key difference is that Ethereum already exists, it has its own Merkle Data Structure, it doesn't ride the IPLD objects like unixfs does.
Answer to @diasdavid: Ethfs will just be a way to encode eth binary data into IPLD (not Ethereum). If we have IPLD-binary, then eth objects will be IPLD by default, and you just need to pass them through /eth to make them (structured) IPLD objects. The parallel is more or less the following:
Instead of having eth-block be a data format, /eth/HASHofBlock is much easier and more intuitive than /eth/CIDPREFIX+HashOfBlock (unless what you mean by data formats is what I call transformations, which have a .resolve function).
So, I may be missing some motivation here but this is how I see it (I apologize if this is a bit disorganized).
Why CIDs
Format or Application Level
This is really a more general problem than just eth; it also applies to applications like git, etc. The trade-off is:
Why Application Level
Why Format Level
Can name objects across datastores. That is, one can name an arbitrary Eth object in an IPLD object without importing it into an IPLD datastore because it's already an IPLD object.
IMO, it only makes sense to do this when there is a way to resolve such names. That is, if you would have to download and import an entire dataset anyway (to hunt through it to find your data), you might as well just import it all into your blobstore and convert it to IPLD+CBOR along the way (or some other IPLD format).
For example, I think git should be implemented at the application level because there's no way to resolve a git hash into a git object (unless you have a copy of the git repo). When working with a git repo, users should just import the entire repo into IPLD.
Application Architecture Sketch
I'd like to suggest a general application architecture that works on top of IPLD.
Import/Export
Applications built on top of IPLD must support a way to quickly import/export objects to/from an IPLD service. This would allow one to import git/eth/etc objects into IPLD, work with them, and then export them back out (if necessary). This is basically how IPFS's tar, unixfs, etc support works today.
Naming/Indexing
Applications need to support mapping IPLD names to/from names used outside of IPLD (e.g., CID <-> git object hash). This could either be supported in the blobstore (make it a multi-key/value store that can map multiple keys, namespaced by the application, to a single value) or at the application level (store an index).
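As a rough illustration of the multi-key blobstore idea, the sketch below stores each value once under a content hash and lets applications register extra, namespaced names for it. The class name, its methods, and the use of a hex SHA-256 digest as a stand-in for a CID are all hypothetical; the only external fact relied on is that git names a blob by the SHA-1 of a "blob <size>\0" header followed by the content.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


class MultiKeyStore:
    """Hypothetical blobstore mapping several namespaced names to one value."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}
        self._aliases: Dict[Tuple[str, str], str] = {}

    def put(self, data: bytes) -> str:
        """Store data under its primary key (hex SHA-256, standing in for a CID)."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def alias(self, namespace: str, name: str, key: str) -> None:
        """Register an application-specific name for an already stored value."""
        if key not in self._blobs:
            raise KeyError(f"unknown primary key: {key}")
        self._aliases[(namespace, name)] = key

    def get(self, key: str) -> Optional[bytes]:
        return self._blobs.get(key)

    def get_by_alias(self, namespace: str, name: str) -> Optional[bytes]:
        key = self._aliases.get((namespace, name))
        return None if key is None else self._blobs[key]

    def aliases_for(self, key: str) -> List[Tuple[str, str]]:
        """Reverse lookup: every (namespace, name) pointing at this key."""
        return sorted(alias for alias, k in self._aliases.items() if k == key)


def git_blob_hash(content: bytes) -> str:
    """Name a blob the way git does: SHA-1 over a 'blob <size>\\0' header plus content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()
```

The same mapping could live in an application-level index instead; the store-level variant simply saves each application from maintaining its own.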
Another way to think about this (summary of the above): Given an IPLD object that points to a git object, what could you do with this information? The only thing I can think of is to use some contextual information to find and download the git repository and then import it into your IPLD blob store. However, if you're going to do that, you might as well deterministically (based on the format declared in the CID) re-encode the git objects as IPLD+(CID format) objects. The only time I wouldn't do this is when there exists some way to ask some non-IPLD server for an object by name.
Not true for most cases: double hashing and transformations have a cost, and when data has a huge churn, this cost adds up very quickly. This is why it is important to be able to reference the data in its 'native' form.
I have little time to respond in full here, sorry. the gist is this:
@nicola @diasdavid and @jbenet discussed in person recently. this should no if you need more explanation, i can try again later. On Tue, Oct 18, 2016 at 12:27 PM, Steven Allen notifications@github.com
A logical follow up to this is: ipld/ipld#16
So, when I wrote up my opinion above I claimed "IMO, it only makes sense to do this when there is a way to resolve such names." While thinking about how to use IPFS as a git caching proxy, I realized this objection was pointless: the IPFS daemon/bitswap protocol will provide this service. For example, an IPLD aware git proxy would:
So, while I do believe we should encourage new applications to use existing formats, making git and friends first-class formats makes sense. Sorry for the confusion.
Closing due to staleness as per team agreement to clean up the issue tracker a bit (ipld/team-mgmt#28). This doesn't mean this issue is off the table entirely, it's just not on the current active stack but may be revisited in the near future. If you feel there is something pertinent here, please speak up, reopen, or open a new issue.
This is the argument for why eth-blocks should be a namespace and an application, rather than a format with its own multicodec.
This answers #27 (and https://github.com/unixfs/notes/issues/173) in a different way than the current proposal: application vs data format
Current state: Eth-block as a data format
Eth-block as a data format for IPLD:
eth-blocks will resolve in /ipld
eth-blocks will need to reserve a multicodec number that will be prefixed to their hash
eth-blocks will need ipld-parser-eth
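To show what "a multicodec number prefixed to the hash" amounts to in bytes, here is a small sketch that assembles a CIDv1-style identifier as version, codec, then multihash. The helper names are made up for this example. The codes 0x90 (eth-block) and 0x1b (keccak-256) are what I believe the multicodec table currently lists, so treat them as assumptions, and the digest is passed in rather than computed because Python's hashlib has no original Keccak-256.

```python
ETH_BLOCK_CODEC = 0x90   # assumed multicodec code for eth-block
KECCAK_256_CODE = 0x1B   # assumed multihash code for keccak-256


def varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    if n < 0:
        raise ValueError("varint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def multihash(hash_code: int, digest: bytes) -> bytes:
    """<hash function code><digest length><digest>"""
    return varint(hash_code) + varint(len(digest)) + digest


def cid_v1(codec: int, hash_code: int, digest: bytes) -> bytes:
    """<version=1><codec><multihash>: the prefix tells a resolver how to decode."""
    return varint(1) + varint(codec) + multihash(hash_code, digest)
```

For a 32-byte block hash this adds five bytes of prefix, which is the "CIDPREFIX" part that the application-level proposal would avoid by using the /eth namespace instead.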
Process to transform eth-block into IPLD:
eth-block hash
By making eth-block a data format, we are overspecializing a format to work with only one particular application. What if there are 100 new cryptocurrencies? Will we create a new format for each?
Proposed: Eth-block as an application
Parallel with unixfs
Let me start with this parallel to unixfs on IPLD
tl;dr:
unixfs as a data format
Say that we treat unixfs as a data format for IPLD, then:
/ipld
ipld-parser-unixfs
Process to transform unixfs into IPLD
read hash, spot multicodec, transform IPLD objects into IPLD binary (= unixfs objects)
unixfs as an application
Instead, for simplicity, we made unixfs an application on top of IPLD, not a data format.
unixfs as an application:
/unixfs
/unixfs will transform IPLD objects into IPLD binary (= unixfs objects) as shown before
/unixfs will serve IPLD binary
Process to transform IPLD to unixfs
Eth-block as an application
Eth-block as an application:
/eth namespace
/eth will transform IPLD binary (which is Eth binary block) into IPLD objects
/eth will serve IPLD objects (traversable & so on)
Process to transform Eth-block (= IPLD binary) into IPLD object
End of the story
At the end of the day, if you look at the process, it is essentially the same.
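A toy sketch of that observation: whether the decoder is selected by a codec carried in the identifier (format level) or by the path namespace (application level), the same fetch-and-decode step runs. Everything here is hypothetical, including the path syntax "/ipld/<codec>:<hash>", the registries, and the stand-in decoder, which only reports the block length instead of parsing a real Ethereum block.

```python
from typing import Callable, Dict


def decode_eth_block(raw: bytes) -> dict:
    """Stand-in decoder; a real one would parse the Ethereum block binary."""
    return {"type": "eth-block", "size": len(raw)}


# Format level: the decoder is keyed by the codec named in the identifier.
FORMATS: Dict[str, Callable[[bytes], dict]] = {"eth-block": decode_eth_block}

# Application level: the decoder is keyed by the path namespace.
NAMESPACES: Dict[str, Callable[[bytes], dict]] = {"eth": decode_eth_block}


def resolve_format_level(path: str, store: Dict[str, bytes]) -> dict:
    """Resolve a path shaped like /ipld/<codec>:<hash>."""
    _, root, ident = path.split("/", 2)
    if root != "ipld":
        raise ValueError(f"expected /ipld, got /{root}")
    codec, block_hash = ident.split(":", 1)
    return FORMATS[codec](store[block_hash])


def resolve_application_level(path: str, store: Dict[str, bytes]) -> dict:
    """Resolve a path shaped like /<namespace>/<hash>."""
    _, namespace, block_hash = path.split("/", 2)
    return NAMESPACES[namespace](store[block_hash])
```

The two functions differ only in where the hint about how to decode comes from, which is the trade-off the thread is debating.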
Differences
Other questions
cc @diasdavid @jbenet @dignifiedquire @Stebalien