Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Newer
Older
100644 124 lines (92 sloc) 4.554 kb
7d56bc8 Bob Ippolito readme
etrepum authored
1 statebox - state "monad" for automated conflict resolution
2 ==========================================================
3
4 <bob@redivi.com>
5
6 Overview:
7 ---------
8
9 statebox is a data structure you can use with an eventually consistent
10 system such as riak to resolve conflicts between siblings in a deterministic
11 manner.
12
13 Status:
14 -------
15
84ea939 Bob Ippolito update README to reflect statebox being in production
etrepum authored
16 Used in production at Mochi Media for multiple backend services.
7d56bc8 Bob Ippolito readme
etrepum authored
17
18 Theory:
19 -------
20
21 A statebox wraps a current value and an event queue. The event queue is
22 an ordered list of `{timestamp(), op()}`. When two or more statebox
23 are merged with `statebox:merge/1`, the event queues are merged with
24 `lists:umerge/1` and the operations are performed again over the current
25 value of the newest statebox, producing a new statebox with conflicts
26 resolved in a deterministic manner.
27
28 An `op()` is a `{fun(), [term()]}`, with all but the last argument specified
29 in the term list. For example `{ordsets:add_element/2, [a]}`. To evaluate
30 this op, `ordsets:add_element(a, value(Statebox))` will be called. It is also
c4f8083 Bob Ippolito add batch operation support
etrepum authored
31 possible to specify an `op()` as a `{module(), atom(), [term()]}` tuple, or
32 as a list of `op()` when performing several operations at the same timestamp.
7d56bc8 Bob Ippolito readme
etrepum authored
33
34 There are several important limitations on the kinds of `op()` that are safe
35 to use (`{F, [Arg]}` is the example `op()` used below):
36
37 * An `op()` must be repeatable: `F(Arg, F(Arg, Value)) =:= F(Arg, Value)`
38 * If the `{fun(), [term()]}` form is used, the `fun()` should be a reference
39 to an exported function.
40 * `F(Arg, Value)` should return the same type as `Value`.
41
42 Some examples of safe to use `op()` that ship with Erlang:
43
44 * `{fun ordsets:add_element/2, [SomeElem]}` and
45 `{fun ordsets:del_element/2, [SomeElem]}`
46 * `{fun ordsets:union/2, [SomeOrdset]}` and
47 `{fun ordsets:subtract/2, [SomeOrdset]}`
48 * `{fun orddict:store/3, [Key, Value]}`
49
50 Some examples of functions you can not use as `op()`:
51
52 * `{fun orddict:update_counter, [Key, Inc]}` - it is not repeatable.
53 `F(a, 1, [{a, 0}]) =/= F(a, 1, F(a, 1, [{a, 0}]))`
54
55 Optimizations:
56 --------------
57
58 There are two functions that modify a statebox that can be used to
d67ee3e Bob Ippolito fix README
etrepum authored
59 reduce its size. One or both of these should be done every time before
7d56bc8 Bob Ippolito readme
etrepum authored
60 serializing the statebox.
61
62 * `truncate(N, Statebox)` return Statebox with no more than `N` events in its
63 queue.
64 * `expire(Age, Statebox)` return Statebox with no events older than
65 `last_modified(Statebox) - Age`. If using `new/1` and `modify/2`, then this
66 is in milliseconds.
67
68 Usage:
69 ------
70
71 Simple `ordsets()` example:
72
73 New = statebox:new(fun () -> [] end),
d67ee3e Bob Ippolito fix README
etrepum authored
74 ChildA = statebox:modify({fun ordsets:add_element/2, [a]}, New),
75 ChildB = statebox:modify({fun ordsets:add_element/2, [b]}, New),
7d56bc8 Bob Ippolito readme
etrepum authored
76 Resolved = statebox:merge([ChildA, ChildB]),
d67ee3e Bob Ippolito fix README
etrepum authored
77 statebox:value(Resolved) =:= [a, b].
7d56bc8 Bob Ippolito readme
etrepum authored
78
79 With manual control over timestamps:
80
81 New = statebox:new(0, fun () -> [] end),
d67ee3e Bob Ippolito fix README
etrepum authored
82 ChildA = statebox:modify(1, {fun ordsets:add_element/2, [a]}, New),
83 ChildB = statebox:modify(2, {fun ordsets:add_element/2, [b]}, New),
7d56bc8 Bob Ippolito readme
etrepum authored
84 Resolved = statebox:merge([ChildA, ChildB]),
d67ee3e Bob Ippolito fix README
etrepum authored
85 statebox:value(Resolved) =:= [a, b].
33766d4 Bob Ippolito add statebox_orddict convenience wrapper
etrepum authored
86
87 Using the `statebox_orddict` convenience wrapper:
88
89 New = statebox_orddict:from_values([]),
90 ChildA = statebox:modify([statebox_orddict:f_store(a, 1),
91 statebox_orddict:f_union(c, [a, aa])],
92 New),
93 ChildB = statebox:modify([statebox_orddict:f_store(b, 1),
94 statebox_orddict:f_union(c, [b, bb])],
95 New),
96 Resovled = statebox_orddict:from_values([ChildA, ChildB]),
84ea939 Bob Ippolito update README to reflect statebox being in production
etrepum authored
97 statebox:value(Resolved) =:= [{a, 1}, {b, 1}, {c, [a, aa, b, bb]}].
e12c5a1 Bob Ippolito update docs, statebox_orddict:orddict_from_proplist/1
etrepum authored
98
99 Resources
100 ---------
101
102 On Mochi Labs
103 =============
104
105 [statebox, an eventually consistent data model for Erlang (and Riak)][labs0]
106 on the Mochi Labs blog describes the rationale for statebox and shows how it
107 works.
108
109 [labs0]: http://labs.mochimedia.com/archive/2011/05/08/statebox/
110
111 Convergent / Commutative Replicated Data Types
112 ==============================================
113
114 The technique used to implement this is similar to what is described in
115 this paper:
116 [A comprehensive study of Convergent and Commutative Replicated Data Types][CRDT].
117 statebox was developed without knowledge of the paper, so the terminology and
118 implementation details differ.
119
120 I think the technique used by statebox would be best described as a
121 state-based object, although the merge algorithm and event queue
122 is similar to how op-based objects are described.
123
124 [CRDT]: http://hal.archives-ouvertes.fr/inria-00555588/
Something went wrong with that request. Please try again.