Draft: Multi-Writer DEP #10

Merged: 11 commits into datprotocol:master from bnewbold:dep-multiwriter, Jul 6, 2018
@bnewbold
Contributor

bnewbold commented Mar 5, 2018

Current status: needs proof-reading, and there are some unresolved issues, but ready for review

Rendered pre-merge

Big TODOs:

  • write "Semantics and Usage" section
  • re-structure and re-write implementation details
  • examples (with valid messages)
  • security and privacy concerns

@joehand referenced this pull request on Mar 6, 2018: Data redundancy #191 (Closed)

hypercore feed, and it is broadly considered best practice not to distribute
secret keys between multiple users or multiple devices. In fact, the current
hypercore implementation has no mechanism to resolve disputes or recover if
multiple agents used the same secret key to append to the same feed.

@mafintosh

mafintosh May 2, 2018

Perhaps note that this is by design (for simplicity, scalability, etc.)

- secure key distribution and authentication (eg, if a friend should be given
write access to a hyperdb database, how is that friend's feed key found and
verified?)
- merge conflict resolution, potentially using application-layer semantics

@mafintosh

mafintosh May 2, 2018

the last part could be misunderstood. we detect conflicts and provide apis for the user to deal with those multiple values (if they want)

replicated data type" ([CRDT][crdt]).
[vc]: https://en.wikipedia.org/wiki/Vector_clock
[crdt]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type

@mafintosh

mafintosh May 2, 2018

great section

[scaling]: #scaling
TODO: brief note on scaling properties (eg, reasonable numbers of writers per
database)

@mafintosh

mafintosh May 2, 2018

up to 1000 writers in the first impl will prob scale fine. noting that we have ways of scaling to millions of writers, but that's for later versions :)

is written to `./local/`. All other feeds are written under directories
prefixed `./peers/<feed-discovery-key>/`.
TODO: the above disk format is incorrect?

@bnewbold

bnewbold Jun 10, 2018

Contributor

What is the actual on-disk format? I don't have any hyperdbs with more than `source` and `local` on my laptop.

- when this node was written, the largest seq # in the third feed I have is 5
For example, Bob's vector clock for Alice's seq=3 entry above would be `[0, 3]`
since he knows of her latest entry (seq=3) and his own (seq=0).

@mafintosh

mafintosh May 2, 2018

worth noting here that the order of the clock is not consistent across writers. it's only consistent within the inflated feed message of the same feed
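
To illustrate that caveat, here's a tiny sketch (entry shapes and feed orderings are made up for illustration, not hyperdb's actual wire encoding):

```js
// Bob's feed list is [bob, alice], so in his feed the clock [0, 3] reads
// "my own latest entry is seq=0, and I've seen alice up to seq=3".
const bobsEntry = { clock: [0, 3] }

// Alice orders her feed list [alice, bob], so the same causal knowledge
// written from her side would swap the positions:
const alicesEntry = { clock: [3, 0] }

// A clock is therefore only meaningful relative to the feed list of the
// feed it was written to.
console.log(bobsEntry.clock, alicesEntry.clock)
```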

TODO:
- Why is this design the best in the space of possible designs?

@mafintosh

mafintosh May 2, 2018

maps almost perfectly to append-only logs, proven tech (a HAMT is nothing new)

TODO:
- Why is this design the best in the space of possible designs?
- What other designs have been considered and what is the rationale for not choosing them?

@mafintosh

mafintosh May 2, 2018

we looked at a bunch of different data structures. it basically boils down to this: we need a kv store that is sparse friendly (ie no external indexing needed), without too much space overhead, roundtrip friendly, and that can be mapped onto an append-only log.

not many approaches exist that fit this other than a trie. a normal prefix trie (lex ordered instead of hash ordered) might end up being better, but it's close to trivial to support both.

you mention complexity here. it's actually the same complexity to implement this data structure in single writer mode as it is to impl the current hyperdrive one (same scheme, just using hashes). the multiwriter one has more potential race conditions but is a continuation of the same idea.
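
To make the trie idea concrete, here's a toy sketch of hash-ordered lookup paths (sha256 and the 2-bit split are illustrative stand-ins, not hyperdb's actual hash function or encoding):

```js
const crypto = require('crypto')

// Toy hash-ordered trie path: hash the key, then read the hash as a
// sequence of 2-bit symbols. Each symbol selects one of 4 children at
// the next trie level, so every entry's pointer table stays small and
// can be written inline into an append-only log (no external index).
function toyTriePath (key) {
  const hash = crypto.createHash('sha256').update(key).digest()
  const path = []
  for (const byte of hash.subarray(0, 8)) {
    path.push((byte >> 6) & 3, (byte >> 4) & 3, (byte >> 2) & 3, byte & 3)
  }
  return path
}

console.log(toyTriePath('/hello/world')) // e.g. [2, 0, 3, 1, ...]
```

A lex-ordered prefix trie would instead derive the path from the key bytes themselves, preserving sorted iteration; as noted above, supporting both is close to trivial.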

@bnewbold

bnewbold Jun 10, 2018

Contributor

Thanks for these notes! TBC, these questions came straight from the DEP template.

- Why is this design the best in the space of possible designs?
- What other designs have been considered and what is the rationale for not choosing them?
- What is the impact of not doing this?

@mafintosh

mafintosh May 2, 2018

leaving it up to the ecosystem to figure out multiwriter?

i think the only other valid approach would have been to try and solve multiwriter at the hypercore layer, but it becomes just as tricky there as it has been getting this right

"Operation-based"?
If there are conflicts resulting in ambiguity of whether a key has been deleted
or has a new value, does `db.get(key)` return an array of `[new_value, None]`?

@mafintosh

mafintosh May 2, 2018

db.get always returns an array of nodes and not values. this is to provide more context. in the scenario you describe, one of the nodes would have value === null (we are considering adding a .deleted flag instead)
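
A minimal sketch of what that looks like from the application side, assuming the hyperdb callback API discussed in this thread (the key and conflict scenario are hypothetical):

```js
const hyperdb = require('hyperdb')

const db = hyperdb('./my.db', { valueEncoding: 'utf-8' })

db.get('/profile/name', (err, nodes) => {
  if (err) throw err
  if (!nodes) return console.log('no visible value for this key')
  // With concurrent writers there can be multiple head nodes for one
  // key; a concurrent delete shows up as a node with value === null.
  for (const node of nodes) {
    console.log(node.key, '=>', node.value)
  }
})
```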

# Changelog
[changelog]: #changelog

@mafintosh

mafintosh May 2, 2018

@jimpick has been instrumental in getting this production ready as well

@aral

aral commented Jun 6, 2018

I’m sure you guys have already looked into the current academic research around CRDTs, but I just came out of a deep dive where I reached the same conclusions (DAG with public-key auth woven into the graph). On the conflict resolution side, have you evaluated the causal tree/Logoot/LSEQTree approaches at all?

Links to all three from the quick summary of my own research:
https://indienet.info/other/spikes/crdt/

PS. Given that you’ve basically built what I was planning to, I believe we are going to be using DAT for Indie Site going forward which makes me extremely happy :)

@bnewbold force-pushed the bnewbold:dep-multiwriter branch from 7c8f44f to 89c4174 on Jun 10, 2018

@bnewbold

Contributor

bnewbold commented Jun 10, 2018

@mafintosh needs proof-reading, and there are some unresolved issues, but I think this is mostly ready for review. I left some questions in "unresolved".

I had been procrastinating on this for a long time because I thought it was going to be really gnarly, but it's actually super simple compared to all the hyperdb trie nitty-gritties.

Looks like the hyperdb DEP will need a small update with the deleted and Header changes.

@bnewbold changed the title from WIP: Multi-Writer DEP to Draft: Multi-Writer DEP on Jun 17, 2018

@bnewbold referenced this pull request on Jun 17, 2018: Upcoming Meeting Agenda - 20th June 2018 #24 (Closed)

multiple nodes will be returned. If a key has been deleted from the database, a
node with the `deleted` flag will be returned; note that such "tombstone" nodes
can still have a `value` field, which may contain application-specific metadata
(such as a timestamp).

@mafintosh

mafintosh Jun 20, 2018

if all nodes to be returned are deleted, null will be returned instead (similar to looking up a non-written key)

Couldn't the local feed's sequence number be skipped from vector clocks,
because it's implied by the sequence number of the hypercore entry itself? Same
with the key in the feed list (for inflated entries).

@mafintosh

mafintosh Jun 20, 2018

yep, just added for now for simplicity

"Operation-based" or "State-based"?
What is the actual on-disk layout (folder structure), if not what is documented
here?

@mafintosh

mafintosh Jun 20, 2018

```
content/<disc-key>   <-- any content feed
peers/<disc-key>     <-- writers
source               <-- original writer
local                <-- local writer, if not original
```

@pfrazee

Member

pfrazee commented Jun 21, 2018

Great work, @bnewbold.

Do we need to take some time to discuss the permissions scheme? I thought we were going to have an owner / writer distinction, where owners can authorize and writers can only write.
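
For context, a minimal sketch of the current flat scheme as I understand it, using hyperdb's `authorize` API (the key exchange is hypothetical and happens out of band):

```js
const hyperdb = require('hyperdb')

const db = hyperdb('./my.db', { valueEncoding: 'utf-8' })

db.ready(() => {
  // A friend sends us the public key of their local writer feed over
  // some out-of-band channel (placeholder hex string below).
  const friendKey = Buffer.from('<friend-local-feed-key-hex>', 'hex')

  // In the flat scheme, any authorized writer can grant write access;
  // there is no separate owner role gating authorization.
  db.authorize(friendKey, (err) => {
    if (err) throw err
    console.log('friend can now write to this database')
  })
})
```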

[unresolved]: #unresolved-questions
What is the technical term for the specific CRDT we are using?
"Operation-based" or "State-based"?

@pfrazee

pfrazee Jun 21, 2018

Member

This is tough to answer because there's only one op currently (put), but I'm fairly sure it's op-based.

Op-based vs State-based

@pfrazee referenced this pull request on Jul 6, 2018: Upcoming Meeting Agenda - 4th July 2018 #25 (Closed)

@bnewbold

Contributor

bnewbold commented Jul 6, 2018

I integrated most comments.

@pfrazee I think this is the first I've heard of an owner/writer permissions distinction; is that in current hyperdb? If it wasn't in the 3.0 release, I think at this point I'd say we should ship this as draft and update as things change.

@pfrazee

Member

pfrazee commented Jul 6, 2018

@bnewbold sounds good

@bnewbold merged commit 3ad7ed8 into datprotocol:master on Jul 6, 2018

@bnewbold deleted the bnewbold:dep-multiwriter branch on Jul 6, 2018
