orders
gritzko committed Jul 9, 2016
1 parent 6124a56 commit b6653c3
Showing 3 changed files with 65 additions and 8 deletions.
14 changes: 9 additions & 5 deletions SUMMARY.md
@@ -1,11 +1,15 @@
# Table of Contents

* [Table of Contents](SUMMARY.md) - this document
* [Base64x64 numbers](64x64.md) - our sacred serialization format
* [Stamps](stamp.md) - event/object ids for a distributed system
* [Specifiers](spec.md) - compound event... descriptors
* [Operations](op.md) - immutable ops are Swarm's blood cells
* [Replicas](replica.md) - database replicas, full and partial
* [Introduction](README.md) - what is the Swarm protocol
* Data replication model
* [Replicas](replica.md) - database replicas, full and partial
* [Order](order.md) - op order (partial, local linear)
* Protocol primitives
* [Base64x64 numbers](64x64.md) - our sacred serialization format
* [Stamps](stamp.md) - event/object ids for a distributed system
* [Specifiers](spec.md) - compound event... descriptors
* [Operations](op.md) - immutable ops are Swarm's blood cells
* [Handshakes](handshake.md) - how sync sessions start and end
* [Peer-to-peer handshakes](peer_handshake.md) - for full database replicas
* [Client handshakes](client_handshake.md) - for clients, to connect to a database
6 changes: 3 additions & 3 deletions matrix.md
@@ -7,7 +7,7 @@ There is no way to alter or censor the op stream in transit.

Still, there is another group of attacks, most notably the famous double-spending attack, that depend on the attacker's ability to broadcast different versions of reality to different peers, i.e. to *lie*.
Once the attacker sends out contradictory ops, that creates a swarm split-brain as on the picture `(I)`.
If the swarm is physically permanently separated, the attacker (`A`) can lie to both parts of the network (`O`, `P` peers) regarding its own actions.
If the swarm is physically permanently separated, the attacker `A` can lie to both parts of the network (`O`, `P` peers) regarding its own actions.
Note that the attacker cannot misrepresent or censor other peers' actions, as those are signed and entangled.
Once `P` peers entangle `A`'s lies into their op streams, `A` can no longer relay `P`'s ops to the `O` side, because they are entangled to his own `P`-side lies.
Similarly, it can no longer relay `O` ops to the `P` side as they get entangled with `O`-side lies.
@@ -23,9 +23,9 @@ Both `O` and `P` peers see the other side going offline.
Suppose the attacker does not control the bottleneck link, as in picture `(II)`.
Then, the split-brain becomes transitory.
The lie will be detected as soon as both versions of `A`'s actions are known to all peers.
In the general case, that should happen at the [RTT timescale][rtt]).
In the general case, that should happen at the [RTT timescale][rtt].
For example, `R` peers will get the `R`-side lie first, `Q`-side lie second.
The lie will be seen as a *fork* of the `A`'s [*home* op log](crypto.md): a certain op will be followed by two distinct versions of the consequent op.
The lie will be seen as a *fork* of the `A`'s [*home op log*](crypto.md): a certain op will be followed by two distinct versions of the consequent op.

So, the options for the attacking peer are quite limited.
Still, there is a window of opportunity for the duration of the split-brain.
53 changes: 53 additions & 0 deletions order.md
@@ -0,0 +1,53 @@
## Storage and relay orders

At its base, Swarm is a log replication protocol.
It relays immutable ops while preserving the causal order.
Formally, the protocol's guarantees are:

* every op is delivered to every peer replica,
* it is delivered exactly once and
* in accordance with the [happened-before order][morelamport].

The general op relay rule is that all new ops are stored first, relayed second, in the same order as they were received.
In case two replicas were temporarily disconnected, on reconnection the ops must be replayed in the same order (replay order is the relay order).
Only concurrent ops can go in different orders at different replicas.
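
The relay rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Replica` class, its `receive` and `replay_to` methods, and the string op ids are hypothetical names chosen for the example and are not part of the Swarm protocol or any Swarm library.

```python
class Replica:
    """Hypothetical replica: stores ops first, relays second."""

    def __init__(self, name):
        self.name = name
        self.log = []      # ops in the order they were received
        self.seen = set()  # op ids already stored (exactly-once delivery)
        self.links = []    # currently connected replicas

    def receive(self, op):
        """Store a new op, then relay it; ignore ops already stored."""
        if op in self.seen:
            return False
        self.seen.add(op)
        self.log.append(op)        # stored first
        for peer in self.links:    # relayed second, in receive order
            peer.receive(op)
        return True

    def replay_to(self, peer):
        """On reconnection, replay the log in the same order it was relayed."""
        for op in self.log:
            peer.receive(op)


a, b, c = Replica("a"), Replica("b"), Replica("c")
a.links = [b]            # c is temporarily disconnected
a.receive("a1")
a.receive("a2")
a.replay_to(c)           # c reconnects and gets the replay
duplicate_accepted = a.receive("a1")
```

In this sketch `b` (connected throughout) and `c` (caught up by replay) end with the same log as `a`, and the repeated `a1` is dropped.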

Swarm guarantees are not that much different from TCP guarantees: exactly-once in-order delivery, in a certain scope.
Of course, Swarm's scope is higher in the stack than TCP's.
From the practical standpoint, 80% of the protocol's effort goes into gluing together segments of continuous TCP-like transmissions and continuous log-structured append-only storage writes.
The objective is to give it all an appearance of a single continuous session. That way, Swarm implements the abstraction of a distributed partially-ordered op log.

Swarm is partially-ordered, so there is no single total op order.
For a given database, op orders vary at different replicas.
Still, there are some useful and important linear orders too.

*Replica order* is the order of [ops](op.md) generated by a single *origin* [replica](replica.md). A replica is considered a sequential process, so its op ids are monotonic (later ops have higher ids). This order is global (same-origin ops go in exactly the same order at every replica).
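
As a rough illustration of replica order, the check below verifies that same-origin ops appear with increasing ids in a given log. The `(timestamp, origin)` tuples and the function name `respects_replica_order` are assumptions made for the example; actual Swarm [stamps](stamp.md) are serialized differently.

```python
def respects_replica_order(log):
    """Return True if ops of each origin appear with strictly increasing ids.

    `log` is a list of hypothetical (timestamp, origin) pairs in the order
    they were received at some replica.
    """
    last_seen = {}
    for timestamp, origin in log:
        if origin in last_seen and timestamp <= last_seen[origin]:
            return False
        last_seen[origin] = timestamp
    return True


# Two replicas may interleave concurrent ops differently...
log_at_x = [(1, "a"), (1, "b"), (2, "a"), (2, "b")]
log_at_y = [(1, "b"), (2, "b"), (1, "a"), (2, "a")]
# ...but neither may reorder ops of the same origin.
broken_log = [(2, "a"), (1, "a")]
```

Both `log_at_x` and `log_at_y` pass the check although they differ from each other; `broken_log` does not.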

*Home peer order* is the order of [client](replica.md) ops as they arrive on their [home peer](replica.md). Essentially, a home peer is used to create a de-facto linear order for all the ops its clients generate. The peer's own ops belong to that order too. This order is consistent at every replica, except client replicas themselves (well, clients don't get the full op log anyway).

*Arrival order* is the de-facto order of ops as they arrive on a certain replica. When peers sync, they talk in terms of each other's arrival orders. This is the most variable order of all: it is replica-specific. Another term for this is [*delivery order*][crdt]. It was also addressed earlier as relay and replay order -- they are all the same. A single Swarm replica can be seen as an arrival-order op log.

A Swarm network consists of peers connected by an arbitrary graph of peering connections.
Clients only connect to their respective home peers.
Peer connections should form a connected graph, at least most of the time.
That graph is not necessarily a tree.
Hence, there is some redundancy in op relay.
Normally, a peer gets every new op from each of its connected peers.
For example, the next picture shows three peers connected in a ring (`a-b-c-a`) and an op propagation diagram for a new op created by `a`.
Every peer receives the op twice, relays once.

*(Diagram: peers `a`, `b`, `c` connected in a ring; a time-ordered chart, with time running downwards, marks where the op is stored and where it is relayed.)*

Even a peer's own ops are echoed back by its connected peers; that serves as an acknowledgement.
Hence, there is an incentive not to make the graph too dense.
That redundancy does not affect the client side: as a client is only connected to its home peer, it gets every operation once.
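
The ring example can be reproduced with a small simulation. It is a sketch under simplifying assumptions: the `simulate` function and the adjacency-dict graph are hypothetical, messages are delivered through a single FIFO queue, and "relays once" is counted as one broadcast to all connected peers (including the echo back to the sender).

```python
from collections import deque


def simulate(graph, origin, op="op1"):
    """Propagate one op through `graph`, counting receipts and relays per peer."""
    stored = {peer: [] for peer in graph}
    received = {peer: 0 for peer in graph}
    relayed = {peer: 0 for peer in graph}
    queue = deque()  # in-flight (sender, receiver) messages

    def store_and_relay(peer):
        stored[peer].append(op)          # store first
        relayed[peer] += 1               # one relay = broadcast to all links
        for neighbour in graph[peer]:
            queue.append((peer, neighbour))

    store_and_relay(origin)              # the origin creates the op
    while queue:
        _sender, receiver = queue.popleft()
        received[receiver] += 1
        if op not in stored[receiver]:   # only new ops are stored and relayed
            store_and_relay(receiver)
    return stored, received, relayed


ring = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
stored, received, relayed = simulate(ring, "a")
```

For the three-peer ring, every peer stores the op once, receives it twice (for `a`, both receipts are echoes) and relays it once. In a denser graph the receive count grows with the number of connections per peer.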

Op delivery guarantees can be further hardened by [crypto](crypto.md).
[Cryptographic entanglement](matrix.md) ensures that no op was corrupted, omitted or injected in transit; it further allows all peers to cross-sign all the data and to ensure that every peer sees exactly the same data.

[morelamport]: http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf
[crdt]: http://hal.upmc.fr/inria-00555588/document
