Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Newer
Older
100644 264 lines (229 sloc) 10.822 kb
fccc685 Initial open-source release
MLstate authored
1 (*
2 Copyright © 2011 MLstate
3
6fac5ce @Aqua-Ye [cleanup] ocamllib: typo on Opa
Aqua-Ye authored
4 This file is part of Opa.
fccc685 Initial open-source release
MLstate authored
5
6fac5ce @Aqua-Ye [cleanup] ocamllib: typo on Opa
Aqua-Ye authored
6 Opa is free software: you can redistribute it and/or modify it under the
fccc685 Initial open-source release
MLstate authored
7 terms of the GNU Affero General Public License, version 3, as published by
8 the Free Software Foundation.
9
6fac5ce @Aqua-Ye [cleanup] ocamllib: typo on Opa
Aqua-Ye authored
10 Opa is distributed in the hope that it will be useful, but WITHOUT ANY
fccc685 Initial open-source release
MLstate authored
11 WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
12 FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for
13 more details.
14
15 You should have received a copy of the GNU Affero General Public License
6fac5ce @Aqua-Ye [cleanup] ocamllib: typo on Opa
Aqua-Ye authored
16 along with Opa. If not, see <http://www.gnu.org/licenses/>.
fccc685 Initial open-source release
MLstate authored
17 *)
18 (*
19 @author Louis Gesbert
20 **)
21
22 open Cps.Ops
23 module L = Badop_locator
24 module Node_property = Badop_structure.Node_property
25
26 (* In here we introduce parallelism by dispatching requests among several
27 servers, using the Locator module to select a host given a path (we say
28 servers even though they may just be a local instance of Badop: that
29 behaviour is mostly for testing).
30
31 Several difficulties have to be taken care of:
32 1) Have a way to recover all children of a given node
33 2) Handling links (and copies) correctly
34 3) Time-consistency
35
36 (1) is needed for requests like Children or Search. The proposed solution is
37 to keep the full list of children for any given node at some place ; an
38 optimisation could be to just keep ranges and the servers they belong to, and
39 have the Children query span among these servers.
40 - write operation should return a status telling if the node was created or
41 updated. If created, then (recursively) add it to the parent (if hosted on a
42 different server). Children run on the parent has then no difficulty.
43 - in the meantime we can first check for existence before any write, but
44 that's a huge performance loss (at least until queries are grouped).
45 - a very first prototype could query all servers for Children and Search;
46 another one could always write to the parents (... and then until the root)
47 for every node write.
48 - when we have the Node_properties (see (2) below), we could do it just for
49 variable-number-of-children nodes (Multi and Hidden nodes from the high-level
50 schema).
51
52 (2) will be handled by means of Node_properties, pushed to the low-level
53 database at initialisation. Then we can check any path for parent nodes that
54 could be links (eg, in /mylist/0/tl/0/tl/0/hd, we know that the 'tl'
55 intermediate nodes should be links). Do a (read Status) on each of these in
56 order to get their real location until we get the final resolved path ; then
57 that path can be used to locate the real host.
58
59 (3) if we start a meta-transaction M by a read on server A, a transaction trA
60 is started on A inside M. If sometime later we want to read on server B,
61 still inside M, we need to extend M with trB on server B. Problem is, in some
62 cases it could be inconsistent to start trB at the latest possible
63 revision. That would happen if B has any dependency to a revision on A
64 posterior to the start of trA.
65 - this is not critical and won't be handled in the near future.
66 - a simple solution might be to push the max of all local revisions + 1 as
67 the new revision number to all local dbs. It's an over-approximation but
68 should work
69 - a better solution could be to remember the start-revision of local
70 transactions on each server and propagate them at commit-time. Then each
71 server knows its dependencies towards each other's revision; and when
72 propagating a transaction we can ask for "last revision that doesn't depend
73 on anything further than this list of revisions my meta-transaction is
74 already started on"
75 - doing this using a special part of the db (map other_server => last_revision)
76 that we write to at each commit, and using the history on that needs to be
77 investigated, it sounds like a neat solution. (of course, for that we would
78 need to be able to set the merge-policy of such nodes to TakeMax (or actually
79 TakeSup and relying on our backend's revisions to form a lattice)).
80 *)
81
82 module F (Bk: Badop.S) = struct
83
84 type local_transaction =
3321f32 [enhance] database: fatal database errors now trigger the fail-transacti...
Louis Gesbert authored
85 | Tr_notstarted of Bk.database * Bk.revision option * (exn -> unit)
fccc685 Initial open-source release
MLstate authored
86 | Tr_started of Bk.transaction
87 | Tr_changed of Bk.transaction
88
89 type local_revision =
90 | Rev_now
91 | Rev_fixed of Bk.revision
92
93 type database = Bk.database L.t
94 type revision = local_revision L.t
95 type transaction_status = Fresh | Changed | Prepared | Committed | Aborted
96 type transaction = { (* meta-transaction *)
97 loc: local_transaction L.t;
98 status: transaction_status ref;
99 }
100
101 (* general todo for below: make asynchronous (need to implement multi-cont) *)
102 let open_database options k = match options with
103 | Badop.Options_Dispatcher (flat_replication, bk_options) ->
104 let loc = L.create ~flat_replication bk_options in
105 L.map loc
106 (fun o k -> Bk.open_database o @> k)
107 @> k
108 | _ -> assert false
109
110 let close_database xdb =
111 L.sequential_iter xdb Bk.close_database;
112 fun _ -> ()
113
114 let status xdb k =
115 L.map xdb Bk.status
116 @> fun stloc ->
117 Badop.Layer_multi("Dispatcher", L.to_list stloc) |> k
118
119 module Tr = struct
3321f32 [enhance] database: fatal database errors now trigger the fail-transacti...
Louis Gesbert authored
120 let start xdb errk k =
121 L.map xdb (fun db k -> Tr_notstarted (db,None,errk) |> k)
fccc685 Initial open-source release
MLstate authored
122 @> fun loc -> {
123 loc = loc;
124 status = ref Fresh;
125 } |> k
3321f32 [enhance] database: fatal database errors now trigger the fail-transacti...
Louis Gesbert authored
126 let start_at_revision xdb rev errk k =
fccc685 Initial open-source release
MLstate authored
127 L.mapi xdb
128 (fun key db k ->
129 let rev_opt = match L.get_key rev key with Rev_now -> None | Rev_fixed r -> Some r in
3321f32 [enhance] database: fatal database errors now trigger the fail-transacti...
Louis Gesbert authored
130 Tr_notstarted (db, rev_opt, errk) |> k)
fccc685 Initial open-source release
MLstate authored
131 @> fun loc -> {
132 loc = loc;
133 status = ref Fresh;
134 } |> k
135
136 let on_started f default loc k = match loc with
137 | Tr_started tr | Tr_changed tr -> f tr @> k
138 | Tr_notstarted _ -> default |> k
139
140 let on_changed f default loc k = match loc with
141 | Tr_changed tr -> f tr @> k
142 | Tr_started _ | Tr_notstarted _ -> default |> k
143
144 let abort xtr k =
145 xtr.status := Aborted;
146 L.iter xtr.loc (on_started Bk.Tr.abort ()) @> k
147
148 let prepare xtr k = match !(xtr.status) with
149 | Fresh | Prepared -> (xtr, true) |> k
150 | Committed | Aborted -> (xtr, false) |> k
151 | Changed ->
152 L.map_reduce xtr.loc
153 (fun acc (loc,ok) -> loc, acc && ok) true
154 (fun loc k -> match loc with
155 | Tr_changed tr ->
156 Bk.Tr.prepare tr
157 @> fun (tr,ok) -> (Tr_changed tr, ok) |> k
158 | loc -> (loc, true) |> k)
159 @> fun (loc,ok) ->
160 if ok then
161 ({ loc = loc; status = ref Prepared }, true) |> k
162 else
163 abort xtr @> fun () -> (xtr, false) |> k
164
165 let commit xtr k = match !(xtr.status) with (* two-phase commit *)
166 | Fresh -> xtr.status := Committed; true |> k
167 | Committed | Aborted -> false |> k
168 | Changed | Prepared ->
169 prepare xtr
170 @> function
171 | xtr, true ->
172 L.iter xtr.loc
173 (on_started (fun tr k -> Bk.Tr.commit tr @> fun r -> assert(r); () |> k) ())
174 @> fun () -> xtr.status := Committed; true |> k
175 | xtr, false -> abort xtr @> fun () -> false |> k
176 end
177
178 type 'which read_op = ('which,revision) Badop.generic_read_op
179 type 'which write_op = ('which,transaction,revision) Badop.generic_write_op
180
181 let get_tr (push: local_transaction -> unit) ltr k = match ltr with
182 | Tr_started tr | Tr_changed tr -> tr |> k
3321f32 [enhance] database: fatal database errors now trigger the fail-transacti...
Louis Gesbert authored
183 | Tr_notstarted (db,None,errk) ->
fccc685 Initial open-source release
MLstate authored
184 (* FIXME: start at a revision guaranteed consistent with the transactions that xtr already contains *)
3321f32 [enhance] database: fatal database errors now trigger the fail-transacti...
Louis Gesbert authored
185 Bk.Tr.start db errk
fccc685 Initial open-source release
MLstate authored
186 @> fun tr -> push (Tr_started tr); tr |> k
3321f32 [enhance] database: fatal database errors now trigger the fail-transacti...
Louis Gesbert authored
187 | Tr_notstarted (db,Some rev,errk) ->
188 Bk.Tr.start_at_revision db rev errk
fccc685 Initial open-source release
MLstate authored
189 @> fun tr -> push (Tr_started tr); tr |> k
190
191 let get_local_rev key rev k = match L.get_key rev key with
192 | Rev_now -> assert false
193 | Rev_fixed r -> r |> k
194
195 let set_local_rev loc key bkrev k =
196 L.mapi loc
197 (fun key' _ k -> if key' = key then Rev_fixed bkrev |> k else Rev_now |> k)
198 @> k
199
200 let read xtr path read_op k =
201 L.at_path xtr.loc path
202 (fun key ltr k ->
203 get_tr (L.push_key xtr.loc key) ltr
204 @> fun tr -> Badop.Aux.map_read_op read_op ~revision:(get_local_rev key)
205 @> fun bk_read_op -> Bk.read tr path bk_read_op
206 @> function
207 | `Answer resp ->
208 Badop.Aux.map_read_op resp ~revision:(set_local_rev xtr.loc key)
209 @> fun ans -> `Answer ans |> k
210 | `Absent -> `Absent |> k
211 | `Linkto p -> `Linkto p |> k)
212 @> k
213
214 let raw_write loc path ?(loc_path=path) write_op k =
215 L.mapi_path loc loc_path
216 (fun key ltr k ->
217 get_tr (L.push_key loc key) ltr
218 @> fun tr ->
219 Badop.Aux.map_write_op write_op
220 ~revision:(get_local_rev key)
221 ~transaction:(fun _xtr k -> (assert false: Bk.transaction) |> k)
222 @> fun bk_write_op -> Bk.write tr path bk_write_op
223 @> fun resp -> Tr_changed (Badop.Aux.result_transaction resp) |> k)
224 @> k
225
226 let write xtr path write_op k =
227 match !(xtr.status) with
228 | Prepared | Committed | Aborted ->
229 (* we're not allowed to change that transaction anymore, mark it as
230 aborted (but with a new ref, to keep the parent sane) *)
231 Badop.Aux.respond_set_transaction write_op { xtr with status = ref Aborted } |> k
232 | Fresh | Changed ->
233 (fun k -> match Path.pop_last path with
234 | None -> xtr.loc |> k
235 | Some (_,parent) ->
236 (* register the existence of the node to its parent: write Unit /
237 remove to the same path but on the server hosting the parent *)
238 let query = match write_op with
239 | Badop.Clear _ -> Badop.Clear (Badop_lib.Dialog.query ())
240 | _ -> Badop.Set (Badop_lib.Dialog.query (DataImpl.Unit))
241 in raw_write xtr.loc path ~loc_path:parent query @> k)
242 @> fun loc -> raw_write loc path write_op
243 @> fun loc ->
244 Badop.Aux.respond_set_transaction write_op { loc = loc; status = ref Changed } |> k
245
246
247 let write_list xtr path_write_op_list k =
248 let wr xtr (path, op) k =
249 write xtr path op @> fun resp -> Badop.Aux.result_transaction resp |> k
250 in
251 Cps.List.fold wr xtr path_write_op_list k
252
253 let node_properties xtr config k =
254 (* TODO xtr.config <- config; *)
255 L.iter xtr (fun db -> Bk.node_properties db config) @> k
256
257 module Debug = struct
258 let revision_to_string rev =
259 Base.String.concat_map ~left:"[" ~right:"]" ","
260 (function Rev_fixed r -> Bk.Debug.revision_to_string r | Rev_now -> "x")
261 (L.to_list rev)
262 end
263 end
Something went wrong with that request. Please try again.