(WIP) records + merkledag specs #7
Conversation
### Serialized Format

(TODO remove this? use only protobuf?)
I think we are okay using only protobuf. since we version tag the repos and the clients and protocols, we can write in logic to handle multicodec OR protobuf later if for some reason we have to switch away from protobuf due to it being proven to be stealing money from the poor, or some other terrible thing.
i'm increasingly dissatisfied with protobuf's shortcomings. we have to trick it into doing things like self-description or streams.
@greglook not yet! but it's made by @richhickey so it's likely exactly what i want. thanks for the pointer!
@greglook ah no, this is a text format. it's not optimized for binary rep. (unless i'm missing a page)
@jbenet see Datomic/fressian and cognitect/transit-format for binary representations. The latter is not strictly EDN, but includes the same core ideas. Also this thread for a comparison among them.
great, thanks @greglook! though the farther this moves away from strict 1:1 mapping to JSON, the more of an adoption hurdle it is
@jbenet is there anything on telling which of two records is the 'most valid'?
var ProofOfWork = "proof-of-work"

// ProofOfStorage proves certain data is possessed by prover.
var ProofOfStorage = "proof-of-storage"
The distinction between Proof of Storage and Proof of Retrievability wasn't immediately obvious to me. After searching a bit I feel like I now have a better idea for how Proof of Storage works, but linking out to whatever you consider canonical docs or useful review articles for each proof type would be nice.
I also wonder where these proofs are used
> but linking out to whatever you consider canonical docs or useful review articles for each proof type would be nice.

Yeah this is super raw and early. almost removed this section.

> I also wonder where these proofs are used

- this part (the proof types) is very very WIP. just a thought.
- the idea is to have "proof objects" that we can point to, and have their type tell us how to process them (see the sketch below).
- a proof of storage could potentially be used in a provider record, to prevent spam. and they would be used in other things, like filecoin.
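To make that dispatch idea concrete, here is a rough sketch with hypothetical Go types (none of these names come from the spec): a proof object's type string tells a consumer which verification routine to run.

```go
package proofs

import "fmt"

// Proof is a hypothetical "proof object" that other objects can link to.
// Its Type tells consumers how to interpret and verify the Data payload.
type Proof struct {
	Type string // e.g. "proof-of-work", "proof-of-storage"
	Data []byte // opaque bytes, interpreted according to Type
}

// Verify dispatches on the proof type. The verifiers are placeholders.
func Verify(p Proof) error {
	switch p.Type {
	case "proof-of-work":
		return verifyWork(p.Data)
	case "proof-of-storage":
		return verifyStorage(p.Data)
	default:
		return fmt.Errorf("unknown proof type: %q", p.Type)
	}
}

func verifyWork(data []byte) error    { return nil } // placeholder
func verifyStorage(data []byte) error { return nil } // placeholder
```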
// Order is a function that sorts two records based on validity.
// This means that one record should be preferred over the other.
// There must be a total order. If the return value is 0, then a == b.
// Return value is -1, 0, or 1.
Why restrict the return values? I'd recommend consumers just use `> 0` instead of `== 1` when acting on this information.
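To illustrate the suggestion, here is a minimal sketch (the type and function names are made up for this example, not taken from the spec) where the comparator returns a sign-significant int and callers only check the sign:

```go
package records

import "bytes"

// Record is a stand-in for the record type under discussion.
type Record struct {
	Value []byte
}

// Order imposes a total order on records: negative if a sorts before b,
// zero if they are equivalent, positive otherwise. Callers should rely
// only on the sign, not on exact -1/0/1 values.
func Order(a, b Record) int {
	return bytes.Compare(a.Value, b.Value)
}

// MostValid picks the preferred record by acting on the sign of Order.
func MostValid(a, b Record) Record {
	if Order(a, b) > 0 {
		return a
	}
	return b
}
```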
(yay diagrams. if anyone knows a good web-based programmatic + drag-and-drop diagram tool, lmk. most of them suck)
@jbenet for the records, you're thinking this, right: https://gist.github.com/whyrusleeping/8f2c206ac2fbc952fea2
@jbenet re: a drawing tool, i use http://draw.io, it's pretty nice
type Record struct {
	Scheme    Link // link to the validity scheme
	Signature Link // link to a cryptographic signature over the rest of record
	Value     Data // an opaque value
Here value is data, and above it is a link. Above we have data for the validity information, but here there is no place for that to go. Something is wonky here. If I had my way, it would look like:

{
  "Data": "validity data, to be interpreted after parsing the scheme link",
  "Links": [
    {
      "Name": "Scheme",
      "Hash": "pointer to schema definition object (really just a placeholder)"
    },
    {
      "Name": "Signature",
      "Hash": "pointer to cryptographic signature"
    },
    {
      "Name": "Value",
      "Hash": "pointer to record's actual content"
    }
  ]
}
And below, where we have the date/time thing, that info would just be another link.
I was mistaken; all data associated with the validity should be put in the Data segment of the record node (at least in my view of actually implementing this thing).
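For reference, here is roughly how that layout could map onto a generic merkledag node, with the validity information carried in the Data segment as described above; the Go types below are illustrative only, not the actual go-ipfs API.

```go
package merkledag

// Link is a named pointer to another object by multihash (illustrative).
type Link struct {
	Name string
	Hash []byte // multihash of the target object
}

// Node is an illustrative merkledag node: opaque data plus named links.
type Node struct {
	Data  []byte
	Links []Link
}

// recordNode mirrors the JSON sketch above: validity data goes in Data,
// and Scheme / Signature / Value are named links.
func recordNode(validity, scheme, sig, value []byte) Node {
	return Node{
		Data: validity,
		Links: []Link{
			{Name: "Scheme", Hash: scheme},
			{Name: "Signature", Hash: sig},
			{Name: "Value", Hash: value},
		},
	}
}
```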
Another thing, is the signature separate from the validity data? I'm okay with that, just want to make the distinction clear.
On Wed, Jun 03, 2015 at 06:46:24PM -0700, Jeromy Johnson wrote:
+type Record struct {
+	Scheme Link // link to the validity scheme
+	Signature Link // link to a cryptographic signature over the rest of record
+	Value Data // an opaque value
Another thing, is the signature separate from the validity data? I'm
okay with that, just want to make the distinction clear.
Just to keep the vocabulary consistent, @jbenet was putting the
signature under “correctness” not “validity” (since it's a
done-right-at-craft-time issue 1), although they both fall under the
“validity scheme” 2.
I agree with 3 that it makes sense to allow a given validity scheme
to place correctness/validity information in additional links and/or
the data block as it sees fit, so long as it doesn't clobber the
base-Record "Scheme" or "Value" names (and do those have to be
title-cased?).
As it's currently written up, the linking for a signature is going to
be weird, since what you really want signed is the record itself.
The whole “signable part” business reminds me of OpenPGP with its
“unhashed subpacket data” 4. I played around with an OpenPGP
implementation while trying to rotate my main GnuPG key 5, and the
whole “unhashed data” business is a pain in the neck. Can we just
make signature objects first-class citizens and allow (require?) folks
to push signatures instead of records? Then the signature could link
to the record, which would carry the validity information and a link
to the record payload.
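A rough sketch of the chain proposed here, with illustrative Go types (not the spec's actual definitions): the signature is the first-class object that gets pushed, and it links down to the record, which in turn carries the validity information and links to the payload.

```go
package sigrecords

// Link is a named pointer to another merkledag object (illustrative).
type Link struct {
	Name string
	Hash []byte // multihash of the target
}

// Signature is the hypothetical first-class object that gets pushed.
type Signature struct {
	Key    Link   // link to the signer's public key
	Signee Link   // link to the Record being signed
	Sig    []byte // signature bytes over the signee
}

// Record carries the validity information and points at the payload.
type Record struct {
	Scheme Link   // link to the validity scheme
	Value  Link   // link to the record's actual content
	Data   []byte // scheme-specific validity data (e.g. an expiration date)
}
```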
question 3, is there actually going to be anything in the 'schema object'? will it even be an object? what should it contain?
…blockstore

Copy the AllKeys() method from the blockstore [1] to the datastore [2]. You can't implement it efficiently using the existing datastore interface, so I don't know how you'd add it on top of a generic datastore that lacked such a method. We don't want to encourage drilling down through layers and using what should be internal implementation details.

The previous paragraph explained why we need an AllKeys() method in the datastore. We also need to expose AllKeys() in the blockstore interface, so we can build garbage collection and similar logic on top of the blockstore, without having to drill down to the datastore layer to write those tools (see, for example, the IRC discussion from [3] through [4]).

Also remove the Key argument from the blockstore's Put(). The backing datastore need not be content-addressable, but I think we want to require content-addressability for the block store. However, multihash gives us some choices for the hash function and digest size, so the blockstore's Put does accept those (and then it computes the hash internally).

Besides requiring content-addressability, I'd also require the blockstore to only store serialized Merkle objects. That makes deserializing the content easier, and we've worked hard to make Merkle objects sufficiently general that they should suffice for any data we want to put into the blockstore.

I've also tried to clarify that the exchange-server doesn't have the potentially expensive AllKeys() method by explicitly listing the methods it does have.

We probably also want to extend the Get(Key) response with optional "will send" and cancel information. See the optimistic transmission graphic in [5] for more on this.

[1]: https://gist.github.com/jbenet/d1fedddfef85f0c4efd5#file-modules-go-L162
[2]: https://gist.github.com/jbenet/d1fedddfef85f0c4efd5#file-modules-go-L122
[3]: https://botbot.me/freenode/ipfs/2015-06-23/?msg=42683298&page=4
[4]: https://botbot.me/freenode/ipfs/2015-06-23/?msg=42688156&page=4
[5]: ipfs#7 (comment)

License: MIT
Signed-off-by: W. Trevor King <wking@tremily.us>
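For illustration, here is a sketch of the interfaces this commit message describes, with hypothetical Go signatures (the real go-ipfs interfaces differ in detail): AllKeys() appears on both layers, the blockstore's Put() computes the key itself because the store is content-addressed, and the exchange-server omits AllKeys() entirely.

```go
package stores

// Key is a content address (e.g. a multihash) for a stored block.
type Key string

// Datastore is a generic key/value store that can enumerate its keys,
// so higher layers never need to drill down into its internals.
type Datastore interface {
	Get(k Key) ([]byte, error)
	Put(k Key, value []byte) error
	AllKeys() ([]Key, error)
}

// Blockstore only stores serialized Merkle objects and is strictly
// content-addressed: Put computes the key from the data, so callers
// never supply one. They may still choose the hash function and size.
type Blockstore interface {
	Get(k Key) ([]byte, error)
	Put(data []byte, hashFunc string, digestSize int) (Key, error)
	AllKeys() ([]Key, error) // e.g. for garbage collection
}

// ExchangeServer deliberately omits AllKeys(), since enumerating every
// key is potentially too expensive to expose to remote peers.
type ExchangeServer interface {
	Get(k Key) ([]byte, error)
}
```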
Currently go-ipfs has (in routing/dht/providers.go) in-memory handling for the provider listing. That works well enough, but it seems like we'd want to store this sort of thing in the generic record store to avoid duplicate record-store-like code.

The problem with records like this is that they're keyed off the multihash for the provided object, but the records themselves will be created and signed by multiple providing nodes. That means we can't store a single signature as the record-store entry (which provider would sign it?).

This commit adds a record-list object that addresses this case. The record-list object has Merkle-links to signatures where the link names are the IDs for the publishing nodes (e.g. the providers for the provider-list case, or wanters in wantlists, etc.). Each linked signature would have a signed payload with data containing the claim. For example:

I, <publisher-ID>, can provide <multihash-of-provided-object>.

or

I, <publisher-ID>, would like to hear about changes to <multihash-of-wanted-object>.

or some more easily machine-parsable version of similar claims. The Merkle chain of the record would be:

record
  <publisher-ID>: <signature-ID>
    ↓
signature
  Key: <provider-ID>
  Signee: <providing-claim-ID>
  …
    ↓
<providing-claim-id>
  Data: <claim-payload>

Nodes that had reason to trust each other (e.g. to not forge providing claims or to properly curate a provider-list) wouldn't have to fetch and verify the signed data. Nodes that had no reason to trust each other (e.g. fetching a provider list from an untrusted node or receiving a new providing-claim from an untrusted node) should acquire and check the signature before using the providing-claim for lookup or adding it to a provider-list. Since most nodes won't trust each other, I expect the signature, providing-claim, public-key, etc., packets would be passed around in one optimistic-transmit block [1], and probably be optimistically hosted on the same nodes that host the provider-list itself for that purpose.

[1]: ipfs#7 (comment)

License: MIT
Signed-off-by: W. Trevor King <wking@tremily.us>
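A sketch of the record-list shape described above, with illustrative Go types (placeholders, not the actual implementation): link names are publisher IDs, and each link points at a signature over that publisher's claim.

```go
package recordlist

// Link is a named pointer to another merkledag object (illustrative).
type Link struct {
	Name string // here: the publishing node's ID
	Hash []byte // multihash of the target object
}

// RecordList collects per-publisher signatures for one topic, e.g. the
// providers of a given object or the wanters of a wanted object.
type RecordList struct {
	Links []Link // one link per publishing node
}

// SignatureObject is the target of each record-list link.
type SignatureObject struct {
	Key    Link   // link to the publisher's public key
	Signee Link   // link to the providing/wanting claim
	Sig    []byte // signature over the claim
}

// Claim is the signed payload, e.g.
// "I, <publisher-ID>, can provide <multihash-of-provided-object>."
type Claim struct {
	Data []byte
}
```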
I'm going to merge this and continue improving.
(WIP) records + merkledag specs
also includes keychain types.
this is all very much WIP