Threads v2 (discussion) #566
Comments
Oh wow, I'm sorry, it was not my intention to trigger a major rewrite. But since we're discussing this subject, I would offer a few thoughts and pointers that I think might be relevant here.
Also /cc @pvh, he's far more qualified on the hypermerge side of things
I should point out I don't have a great solution for the feed identifier yet. My hope is that (pubsub-based) IPNS will be good enough. But if it isn't, my backup plan is to generate a keypair, store the public key mapped to the most recent head on the DHT, and use PubSub to announce updates.
No need to be sorry! :) I've been meaning to jot down the next iteration. You just gave me the kick. This wouldn't amount to a full rewrite... it's smaller than it sounds, actually. What you've described is pretty close to threads. You can see the block message here: https://github.com/textileio/go-textile/blob/master/pb/protos/model.proto#L89 Thread structure looks like this: https://github.com/textileio/go-textile/blob/master/pb/protos/model.proto#L35 Thanks for the links. I will take a look for sure!
Also worth noting, when blocks are sent over the wire, they get wrapped in an https://github.com/textileio/go-textile/blob/master/pb/protos/message.proto#L51 Which is signed by the peer and encrypted with the thread key.
Since the blocks themselves are protos and not IPLD objects, the links between them are also encrypted. However, off-block data, like a file or directory, is part of an IPLD object.
Nice, I did not realize it was that similar. A few remarks / questions:
Oh, that is interesting. In my mind, content stored in the feed was encrypted already. I wonder what led you to choose doing this instead. Is it to avoid decryption on local use, or something else?
Two answers (sander might have better answers here as well)
Also, not an instead situation: content stored in the feed _is_ already encrypted... but we also want to encrypt the block updates so that links between updates etc. aren't visible in plain text. So there are essentially multiple layers of encryption.
Yep, file / content encryption is dictated by the thread schema, which essentially describes what the IPLD file structure looks like. By default, encryption is on. Schema https://github.com/textileio/go-textile/blob/master/pb/protos/model.proto#L159 For example, this is a directory that contains files generated by one of the "built-in" schemas (https://github.com/textileio/go-textile/blob/master/schema/textile/media.go) https://ipfs.io/ipfs/QmcbVc6mu9eDnukB2HC2WJ7Ug4zLuADydo2yYtxfFRKzLc The block that points to this directory uses its top-level hash as
Thanks for the elaboration. I understand the intent now, which is to conceal pointers to the feed itself, that is, its size and the encrypted content that it points to. That way you could have two layers of participation:
This leads me to think that it would be really nice to have IPLD composition (as in function composition) so you could pipeline an IPLD encryption resolver with an IPLD protobuf resolver. Unless I'm overlooking something, that would simplify things a bit.
Another relevant thought: IPLD should probably have native support for encryption built in, so that GraphSync queries could cut through the encryption layer.
One more piece of context here. Below is an example
I sketched out more or less what I'm aiming for & would love feedback
Where can I see what you've sketched out, @Gozala?
Oh, I forgot to paste a link 🤦♂️ https://github.com/gozala/ipdf/ In terms of other files, the only one worth looking at right now is this one: https://github.com/Gozala/ipdf/blob/master/src/format.js
It dawned on me over the weekend that I've gone down the single-writer piece of the hybrid single/multi-peer trees solution proposed above in the past and ran into a wall: of course you can't simply compose one tree out of the others, because they're immutable chains (the hashed data of a block includes its parent(s)). You could compose an additional chain with blocks that contain the other blocks, but then you'd have to add them as new blocks to IPFS. We could try a normal IPLD chain for the composed chain, which would not require new blocks. The links would be exposed, but the chain would "leak" much less information since nodes simply point to another encrypted proto message blob. IPLD node for a single peer's record of all other immutable thread member trees:
Yes to using IPLD if/where possible. This would likely also increase interoperability with other tools significantly. I think the exposed links are probably ok (the best someone could determine is the length and frequency of chain updates), but we could perhaps still encrypt the top-level IPLD object as we did previously? Or even encrypt the links before adding them, creating something like an Encrypted-IPLD object?
Obviously this would require a special IPLD resolver... but maybe this would be something others could find useful without too much additional work?
@Gozala very nice! I took a long weekend and have a bunch of catch up today. Will jump in this evening or tomorrow.
@sanderpick Chances are I'm misunderstanding what you're saying above, so please bear with me. @mikeal suggested something relevant here which I incorporated into my draft: https://github.com/Gozala/ipdf/blob/master/src/format.js#L25-L34

    export type Head<a> = {
      author: AuthorPublicKey,
      signature: Signature,
      block: Encoded<Block<a>, ReplicatorKey>
    }
    export type Block<a> = {
      links: CID<Head<a>>[],
      message: Encoded<Message<a>, FollowerKey>
    }

Idea here is that
As per the following threads: It seems that there is interest in making it possible to evolve IPLD to allow recursive message unpacking of some sort. My plan for implementation now is to implement a kind of polyfill for that, or maybe a utility that can take a "decoder" and an IPLD resolver and return a new IPLD resolver that will use the decoder to decode a block before continuing with resolution. I also find the idea of encoding the decoder config into the block itself really compelling.
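To make the "decoder + resolver in, resolver out" utility a bit more concrete, here is a rough sketch. Everything in it is hypothetical: `Resolver`, `withDecoder`, `jsonResolver` and the XOR helpers are made-up names for illustration and are not the actual IPLD resolver interface, and XOR is only a toy stand-in for real decryption.

```typescript
// Hypothetical resolver shape (NOT the real IPLD format interface):
// given a block and a path, return the value found at that path.
interface Resolver<B> {
  resolve(block: B, path: string): unknown;
}

// Wrap a resolver so that every block is decoded before resolution continues.
function withDecoder<A, B>(decode: (block: A) => B, inner: Resolver<B>): Resolver<A> {
  return { resolve: (block, path) => inner.resolve(decode(block), path) };
}

// Toy inner resolver: blocks are JSON strings, paths are "/"-separated keys.
const jsonResolver: Resolver<string> = {
  resolve(block, path) {
    let value: unknown = JSON.parse(block);
    for (const part of path.split("/").filter((p) => p !== "")) {
      value = (value as Record<string, unknown>)[part];
    }
    return value;
  },
};

// Toy "cipher" for illustration only: XOR each character code with a key.
const xorEncode = (key: number) => (text: string): number[] =>
  Array.from(text, (ch) => ch.charCodeAt(0) ^ key);
const xorDecode = (key: number) => (bytes: number[]): string =>
  String.fromCharCode(...bytes.map((b) => b ^ key));

// Compose: decode first, then resolve with the plain JSON resolver.
const encryptedResolver = withDecoder(xorDecode(42), jsonResolver);
const sealedBlock = xorEncode(42)(JSON.stringify({ message: { body: "hi" } }));
const body = encryptedResolver.resolve(sealedBlock, "message/body");
```

Since `withDecoder` returns a plain `Resolver`, wrappers can be nested, which is roughly what recursive unpacking would need.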
After talking w/ @Gozala about DAT and having a few other concerns pile up in my head, I want to start an epic around the next version of threads.
The current thread implementation has a couple of major downsides, one related to the append-only log and the other to over-the-wire orchestration:
MERGE blocks. Conflicts arise with almost every write since members all write to the same tree. As a rule of thumb, the fewer MERGE blocks the better, or in other words, the tree should be as narrow as possible.
Proposed solutions:
Thread members already keep track of who's in a thread via the thread_peers table. Similar to DAT, we can track HEAD for each peer by simply adding a column to this table. This would drastically reduce the number of MERGE blocks in each chain because only one peer would be writing to each.
Now, the problem remains: how does each member efficiently snapshot the thread for backup? Each node can write its own additional chain that essentially joins all the others (including its own single-writer chain). This joined tree doesn't need to be shared/synced with other peers (they write their own). So, merges are just not needed. The HEAD of this tree is what is written to backup snapshots. During a recovery from the snapshot, a peer can walk it backwards, building a single tree for each peer.
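A rough sketch of the per-peer HEAD tracking and the local joined chain, to check that the recovery walk works out. This is an illustration under assumptions, not the go-textile implementation: `LocalThread`, `PeerBlock`, `JoinBlock` and `recover` are hypothetical names, block IDs are plain strings rather than content hashes, and `peerHeads` stands in for the extra column on the thread_peers table.

```typescript
type PeerID = string;
type BlockID = string;

// A block in one peer's single-writer chain.
interface PeerBlock { id: BlockID; author: PeerID; parent: BlockID | null }
// A block in the local joined chain; it only points at a peer block.
interface JoinBlock { id: BlockID; parent: BlockID | null; target: BlockID }

class LocalThread {
  readonly peerBlocks = new Map<BlockID, PeerBlock>();
  readonly joinBlocks = new Map<BlockID, JoinBlock>();
  readonly peerHeads = new Map<PeerID, BlockID>(); // stand-in for the new column
  joinHead: BlockID | null = null; // this is what a backup snapshot would store

  // Record a block on its author's chain, then join it into the local chain.
  append(author: PeerID, id: BlockID): void {
    const parent = this.peerHeads.get(author) ?? null;
    this.peerBlocks.set(id, { id, author, parent });
    this.peerHeads.set(author, id);
    const joinId = `join:${id}`;
    this.joinBlocks.set(joinId, { id: joinId, parent: this.joinHead, target: id });
    this.joinHead = joinId;
  }
}

// Walk the joined chain backwards and rebuild one chain per peer (oldest first).
function recover(
  joinHead: BlockID | null,
  joinBlocks: Map<BlockID, JoinBlock>,
  peerBlocks: Map<BlockID, PeerBlock>,
): Map<PeerID, BlockID[]> {
  const chains = new Map<PeerID, BlockID[]>();
  let cursor = joinHead;
  while (cursor !== null) {
    const join = joinBlocks.get(cursor);
    if (!join) throw new Error(`missing join block ${cursor}`);
    const block = peerBlocks.get(join.target);
    if (!block) throw new Error(`missing peer block ${join.target}`);
    const chain = chains.get(block.author) ?? [];
    chain.unshift(block.id); // we walk newest to oldest, so prepend
    chains.set(block.author, chain);
    cursor = join.parent;
  }
  return chains;
}

const thread = new LocalThread();
thread.append("alice", "a1");
thread.append("bob", "b1");
thread.append("alice", "a2");
const recovered = recover(thread.joinHead, thread.joinBlocks, thread.peerBlocks);
```

No MERGE block shows up anywhere here, because each peer chain has one writer and the joined chain is only ever written locally.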
So far, we've had good luck with gossipsub in search. Non-server based nodes are able to participate in search queries. We can replace direct P2P comms between normal peers with gossipsub + an ACK mechanism to hopefully increase peer discoverability. So, similar to how search currently works where the Cafe Service subs each peer to its own peerID, the Thread Service can sub each peer to its own account address (related to #565).
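For the ACK mechanism, something like the following is what I have in mind, as a minimal sketch only. `Bus` is an in-memory stand-in for gossipsub (it is not the libp2p API), and `ThreadPeer`, `WireMessage`, `send` and `resendPending` are hypothetical names. Each peer subscribes to a topic named after its own account address, keeps unacknowledged messages around, and can republish them later.

```typescript
type WireMessage =
  | { kind: "data"; id: string; from: string; payload: string }
  | { kind: "ack"; id: string; from: string };
type Handler = (msg: WireMessage) => void;

// In-memory stand-in for a pubsub layer; delivery is synchronous here.
class Bus {
  private subs = new Map<string, Handler[]>();
  subscribe(topic: string, handler: Handler): void {
    const handlers = this.subs.get(topic) ?? [];
    handlers.push(handler);
    this.subs.set(topic, handlers);
  }
  publish(topic: string, msg: WireMessage): void {
    for (const handler of this.subs.get(topic) ?? []) handler(msg);
  }
}

class ThreadPeer {
  readonly address: string;
  readonly pending = new Map<string, { to: string; payload: string }>();
  readonly inbox: string[] = [];
  private readonly seen = new Set<string>();
  private readonly bus: Bus;

  constructor(address: string, bus: Bus) {
    this.address = address;
    this.bus = bus;
    // Each peer listens on a topic named after its own account address.
    bus.subscribe(address, (msg) => this.onMessage(msg));
  }

  send(to: string, id: string, payload: string): void {
    this.pending.set(id, { to, payload }); // kept until an ACK arrives
    this.bus.publish(to, { kind: "data", id, from: this.address, payload });
  }

  // Republish everything that has not been acknowledged yet.
  resendPending(): void {
    for (const [id, out] of Array.from(this.pending.entries())) {
      this.bus.publish(out.to, { kind: "data", id, from: this.address, payload: out.payload });
    }
  }

  private onMessage(msg: WireMessage): void {
    if (msg.kind === "ack") {
      this.pending.delete(msg.id);
      return;
    }
    if (!this.seen.has(msg.id)) { // drop duplicates caused by resends
      this.seen.add(msg.id);
      this.inbox.push(msg.payload);
    }
    this.bus.publish(msg.from, { kind: "ack", id: msg.id, from: this.address });
  }
}

const bus = new Bus();
const alice = new ThreadPeer("alice-address", bus);
alice.send("bob-address", "msg-1", "hello"); // bob is not subscribed yet, so no ACK
const pendingBeforeRetry = alice.pending.size;
const bob = new ThreadPeer("bob-address", bus);
alice.resendPending(); // bob is reachable now: delivered and acknowledged
```

A real version would need timeouts and backoff around `resendPending`, plus signing and encryption of the payload, none of which is shown here.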
Thoughts?
cc @Gozala @jimpick