Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Newer
Older
100644 124 lines (92 sloc) 4.554 kB
7d56bc8 @etrepum readme
etrepum authored
1 statebox - state "monad" for automated conflict resolution
2 ==========================================================
3
4 <bob@redivi.com>
5
6 Overview:
7 ---------
8
9 statebox is a data structure you can use with an eventually consistent
10 system such as riak to resolve conflicts between siblings in a deterministic
11 manner.
12
13 Status:
14 -------
15
84ea939 @etrepum update README to reflect statebox being in production
etrepum authored
16 Used in production at Mochi Media for multiple backend services.
7d56bc8 @etrepum readme
etrepum authored
17
18 Theory:
19 -------
20
21 A statebox wraps a current value and an event queue. The event queue is
22 an ordered list of `{timestamp(), op()}`. When two or more statebox
23 are merged with `statebox:merge/1`, the event queues are merged with
24 `lists:umerge/1` and the operations are performed again over the current
25 value of the newest statebox, producing a new statebox with conflicts
26 resolved in a deterministic manner.
27
28 An `op()` is a `{fun(), [term()]}`, with all but the last argument specified
29 in the term list. For example `{ordsets:add_element/2, [a]}`. To evaluate
30 this op, `ordsets:add_element(a, value(Statebox))` will be called. It is also
c4f8083 @etrepum add batch operation support
etrepum authored
31 possible to specify an `op()` as a `{module(), atom(), [term()]}` tuple, or
32 as a list of `op()` when performing several operations at the same timestamp.
7d56bc8 @etrepum readme
etrepum authored
33
34 There are several important limitations on the kinds of `op()` that are safe
35 to use (`{F, [Arg]}` is the example `op()` used below):
36
37 * An `op()` must be repeatable: `F(Arg, F(Arg, Value)) =:= F(Arg, Value)`
38 * If the `{fun(), [term()]}` form is used, the `fun()` should be a reference
39 to an exported function.
40 * `F(Arg, Value)` should return the same type as `Value`.
41
42 Some examples of safe to use `op()` that ship with Erlang:
43
44 * `{fun ordsets:add_element/2, [SomeElem]}` and
45 `{fun ordsets:del_element/2, [SomeElem]}`
46 * `{fun ordsets:union/2, [SomeOrdset]}` and
47 `{fun ordsets:subtract/2, [SomeOrdset]}`
48 * `{fun orddict:store/3, [Key, Value]}`
49
50 Some examples of functions you can not use as `op()`:
51
52 * `{fun orddict:update_counter, [Key, Inc]}` - it is not repeatable.
53 `F(a, 1, [{a, 0}]) =/= F(a, 1, F(a, 1, [{a, 0}]))`
54
55 Optimizations:
56 --------------
57
58 There are two functions that modify a statebox that can be used to
d67ee3e @etrepum fix README
etrepum authored
59 reduce its size. One or both of these should be done every time before
7d56bc8 @etrepum readme
etrepum authored
60 serializing the statebox.
61
62 * `truncate(N, Statebox)` return Statebox with no more than `N` events in its
63 queue.
64 * `expire(Age, Statebox)` return Statebox with no events older than
65 `last_modified(Statebox) - Age`. If using `new/1` and `modify/2`, then this
66 is in milliseconds.
67
68 Usage:
69 ------
70
71 Simple `ordsets()` example:
72
73 New = statebox:new(fun () -> [] end),
d67ee3e @etrepum fix README
etrepum authored
74 ChildA = statebox:modify({fun ordsets:add_element/2, [a]}, New),
75 ChildB = statebox:modify({fun ordsets:add_element/2, [b]}, New),
7d56bc8 @etrepum readme
etrepum authored
76 Resolved = statebox:merge([ChildA, ChildB]),
d67ee3e @etrepum fix README
etrepum authored
77 statebox:value(Resolved) =:= [a, b].
7d56bc8 @etrepum readme
etrepum authored
78
79 With manual control over timestamps:
80
81 New = statebox:new(0, fun () -> [] end),
d67ee3e @etrepum fix README
etrepum authored
82 ChildA = statebox:modify(1, {fun ordsets:add_element/2, [a]}, New),
83 ChildB = statebox:modify(2, {fun ordsets:add_element/2, [b]}, New),
7d56bc8 @etrepum readme
etrepum authored
84 Resolved = statebox:merge([ChildA, ChildB]),
d67ee3e @etrepum fix README
etrepum authored
85 statebox:value(Resolved) =:= [a, b].
33766d4 @etrepum add statebox_orddict convenience wrapper
etrepum authored
86
87 Using the `statebox_orddict` convenience wrapper:
88
89 New = statebox_orddict:from_values([]),
90 ChildA = statebox:modify([statebox_orddict:f_store(a, 1),
91 statebox_orddict:f_union(c, [a, aa])],
92 New),
93 ChildB = statebox:modify([statebox_orddict:f_store(b, 1),
94 statebox_orddict:f_union(c, [b, bb])],
95 New),
96 Resovled = statebox_orddict:from_values([ChildA, ChildB]),
84ea939 @etrepum update README to reflect statebox being in production
etrepum authored
97 statebox:value(Resolved) =:= [{a, 1}, {b, 1}, {c, [a, aa, b, bb]}].
e12c5a1 @etrepum update docs, statebox_orddict:orddict_from_proplist/1
etrepum authored
98
99 Resources
100 ---------
101
102 On Mochi Labs
103 =============
104
105 [statebox, an eventually consistent data model for Erlang (and Riak)][labs0]
106 on the Mochi Labs blog describes the rationale for statebox and shows how it
107 works.
108
109 [labs0]: http://labs.mochimedia.com/archive/2011/05/08/statebox/
110
111 Convergent / Commutative Replicated Data Types
112 ==============================================
113
114 The technique used to implement this is similar to what is described in
115 this paper:
116 [A comprehensive study of Convergent and Commutative Replicated Data Types][CRDT].
117 statebox was developed without knowledge of the paper, so the terminology and
118 implementation details differ.
119
120 I think the technique used by statebox would be best described as a
121 state-based object, although the merge algorithm and event queue
122 is similar to how op-based objects are described.
123
124 [CRDT]: http://hal.archives-ouvertes.fr/inria-00555588/
Something went wrong with that request. Please try again.