forked from juju/juju
-
Notifications
You must be signed in to change notification settings - Fork 0
/
doc.go
117 lines (87 loc) · 5.39 KB
/
doc.go
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
// Copyright 2015 Canonical Ltd.
// Licensed under the AGPLv3, see LICENCE file for details.
/*
The lease package exists to implement distributed lease management on top of
mgo/txn, and to expose assert operations that allow us to gate other mgo/txn
transactions on lease state. This necessity has affected the package; but,
apart from leaking assertion operations, it functions as a distributed lease-
management system with various useful properties.
These properties of course rest upon assumptions; ensuring the validity of the
following statements is the job of the client.
* The lease package has exclusive access to any collection it's configured
to use. (Please don't store anything else in there.)
* Given any (collection,namespace) pair, any client Id will be unique at any
given point in time. (Run no more than one per namespace per server, and
identify them according to where they're running).
* Time passes at approximately the same rate for all clients. (Note that
the clients do *not* have to agree what time it is, or what time zone
anyone is in: just that 1s == 1s. This is likely to be true already if
you use lease.SystemClock{}.)
So long as the above holds true, the following statements will too:
* A successful ClaimLease guarantees lease ownership until *at least* the
requested duration after the start of the call. (It does *not* guaranntee
any sort of timely expiry.)
* A successful ExtendLease makes the same guarantees. (In particular, note
that this cannot cause a lease to be shortened; but that success may
indicate ownership is guaranteed for longer than requested.)
* ExpireLease will only succeed when the most recent writer of the lease is
known to believe the time is after the expiry time it wrote.
Remarks on clock skew
---------------------
When expiring a lease (or determining whether it needs to be extended) we only
need to care about the writer, because everybody else is determining their own
skew relative only to the writer. That is, assuming 3 clients:
A) knows "the real time"; wrote a lease at 01:00:00, expiring at 01:00:30A
B) is 20 seconds ahead; read the lease between 01:00:23B and 01:00:24B
C) is 5 seconds behind; read the lease between 00:59:57C and 00:59:58C
...then B cannot infer an expiry time earlier than 01:00:54L (=01:00:34A) and
C cannot infer an expiry time earlier than 01:00:28C (=01:00:33A). If A fails
to expire its lease, then C will trigger first and try to expire it, and most
likely succeed; and when C succeeds, B's subsequent attempt to expire the
lease will certainly fail, because C has updated both the clock document and
the lease document and invalidated B's assertions.
So B can and does then Refresh; and sees the lease document written by C, and
now needs only to consider its offset relative to C in order to Do The Right
Thing.
Schema design
-------------
For each namespace, we store a single clock document; and one additional
document per lease. The lease document holds the name, holder, expiry, and
writer of the lease; the clock document contains the most recent time
acknowledged by each client that has written to the namespace.
Every transaction that the lease package makes is gated on a write to the
clock document (which *must* precede any lease operations) which acks a
recent time and fails if it appears to be going backward in time (this could
happen if we crashed at the wrong moment and left a transaction queued but
unprepared for some time: we definitely don't want to accept those operations).
The fact that the clock document is involved in every transaction renders it a
per-namespace bottleneck, but the ability to discard outdated transactions is
valuable; and the centralised record of acknowledged times mitigates the impact
of client failure.
That is to say: assuming client C wrote lease L at time T, and wrote lease M
at time U (later than T); and then failed; then a fresh client D will be able
to expire lease L earlier (by U-T) than it could infer with the information in
lease L alone.
(We could ofc still calculate that by storing a written time in each lease
document, but it'd be more hassle to collate the data, harder to inspect the
database, and would only be able to make much weaker anti-time-travel promises
than we can manage with the clock doc.)
Client usage considerations
---------------------------
* Client operates at a relatively low level of abstraction. Claiming a held
lease will fail, even on behalf of the holder; expiring an expired lease
will fail; but at least we can allow lease extensions to race benignly,
because they don't involve ownership change and thus can't break promises
(so long as our skew logic is correct).
* ErrInvalid is normal and expected; you should never pass that on to your
own clients, because it indicates that you tried to manipulate the client
in an impossible way. You can and should inspect Leases() and figure out
what to do instead; that may well be "return an error", but please be sure
to return your own error, suitable for your own level of abstraction.
* You *probably* shouldn't ever need to actually call Refresh. It's perfectly
safe to let state drift arbitrarily far out of sync; when you try to run
operations, you will either succeed by luck despite your aged cache... or,
if you fail, you'll get ErrInvalid and a fresh cache to inspect to find out
recent state.
*/
package lease