A working guide to kestrel
==========================

Kestrel is a very simple message queue that runs on the JVM and uses the
memcache protocol (with some extensions) to talk to clients.

A single kestrel server has a set of queues identified by a name, which is
also the filename of that queue's journal file (usually in
`/var/spool/kestrel`). Each queue is a strictly-ordered FIFO of "items" of
binary data. Usually this data is in some serialized format like JSON or
Ruby's Marshal format.

Generally queue names should be limited to alphanumerics `[A-Za-z0-9]`, dash
(`-`) and underscore (`_`). In practice, kestrel doesn't enforce any
restrictions other than that the name can't contain slash (`/`) because that
can't be used in filenames, tilde (`~`) because it's used for temporary files,
plus (`+`) because it's used for fanout queues, and dot (`.`) because it's
reserved for future use. Queue names are case-sensitive, but if you're running
kestrel on OS X or Windows, you will want to refrain from taking advantage of
this, since the journal filenames on those two platforms are *not*
case-sensitive.
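
The naming rules above can be checked on the client side before a queue is ever referenced. The following is a minimal sketch, not part of kestrel; the `QueueNames` object and its method names are hypothetical helpers introduced only for illustration.

```scala
object QueueNames {
  // Characters the guide recommends: alphanumerics, dash, and underscore.
  private val Recommended = "[A-Za-z0-9_-]+".r

  // Characters the guide says are rejected or reserved in a queue name.
  private val Reserved = Set('/', '~', '+', '.')

  /** True if the name uses only the recommended characters. */
  def isRecommended(name: String): Boolean =
    Recommended.pattern.matcher(name).matches()

  /** True if the name is non-empty and avoids the reserved characters. */
  def isAllowed(name: String): Boolean =
    name.nonEmpty && !name.exists(Reserved.contains)

  /** Groups of names that differ only by case, which would share a journal
    * file on a case-insensitive filesystem. */
  def caseCollisions(names: Seq[String]): Map[String, Seq[String]] =
    names.groupBy(_.toLowerCase).filter { case (_, group) => group.distinct.size > 1 }
}
```

Running `caseCollisions` over the queue names an application uses is a cheap way to catch the OS X and Windows problem described above before deployment.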

A cluster of kestrel servers is like a memcache cluster: the servers don't
know about each other, and don't do any cross-communication, so you can add as
many as you like. Clients have a list of all servers in the cluster, and pick
one at random for each operation. In this way, each queue appears to be spread
out across every server, with items in a loose ordering.
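
Because the servers are independent, the client-side selection is simple. This sketch shows one way a client might pick a server per operation; `ClusterClient` and `pickServer` are hypothetical names, and real client libraries may add health checks or weighting that are not shown here.

```scala
import scala.util.Random

object ClusterClient {
  /** Pick one server uniformly at random for a single operation. */
  def pickServer(servers: IndexedSeq[String], rng: Random = new Random): String = {
    require(servers.nonEmpty, "server list must not be empty")
    servers(rng.nextInt(servers.length))
  }
}
```

Since each operation may land on a different server, two items added in sequence can be fetched in the opposite order, which is the "loose ordering" mentioned above.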

When kestrel starts up, it scans the journal folder and creates queues based
on any journal files it finds there, to restore state to the way it was when
it last shut down (or was killed or died). New queues are created by referring
to them (for example, adding or trying to remove an item). A queue can be
deleted with the "delete" command.


Configuration
-------------

The config files for kestrel are Scala expressions loaded at runtime, usually
from `production.scala`, although you can use `development.scala` by passing
`-Dstage=development` to the java command line.

The config file evaluates to a `KestrelConfig` object that's used to configure
the server as a whole, a default queue, and any overrides for specific named
queues. The fields on `KestrelConfig` are documented here with their default
values:
http://robey.github.com/kestrel/doc/main/api/net/lag/kestrel/config/KestrelConfig.html

To confirm the current configuration of each queue, send "dump_config" to
a server (which can be done over telnet).

To reload the config file on a running server, send "reload" the same way.
You should immediately see the changes in "dump_config", to confirm. Reloading
will only affect queue configuration, not global server configuration. To
change the server configuration, restart the server.
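
The same check can be scripted instead of typed into telnet. The sketch below sends `dump_config` over a plain TCP socket and collects the response up to the terminating `END` line. The `KestrelAdmin` object is hypothetical, and the host and port are supplied by the caller (22133 is assumed here as the memcache listen port; use whatever your config sets).

```scala
import java.io.{BufferedReader, InputStreamReader, PrintWriter}
import java.net.Socket
import scala.collection.mutable.ListBuffer

object KestrelAdmin {
  /** Read response lines until a line containing only END, or end of stream. */
  def readUntilEnd(reader: BufferedReader): List[String] = {
    val lines = ListBuffer.empty[String]
    var line = reader.readLine()
    while (line != null && line.trim != "END") {
      lines += line
      line = reader.readLine()
    }
    lines.toList
  }

  /** Send dump_config to one server and return the lines before END. */
  def dumpConfig(host: String, port: Int): List[String] = {
    val socket = new Socket(host, port)
    try {
      val out = new PrintWriter(socket.getOutputStream, false)
      out.print("dump_config\r\n")
      out.flush()
      readUntilEnd(new BufferedReader(new InputStreamReader(socket.getInputStream, "UTF-8")))
    } finally socket.close()
  }
}
```

For example, `KestrelAdmin.dumpConfig("localhost", 22133).foreach(println)` would print each queue's settings, which can be compared before and after a "reload".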

Logging is configured according to `util-logging`. The logging configuration
syntax is described here:
https://github.com/twitter/util/blob/master/util-logging/README.markdown

Per-queue configuration is documented here:
http://robey.github.com/kestrel/doc/main/api/net/lag/kestrel/config/QueueBuilder.html


The journal file
----------------

The journal file is the only on-disk storage of a queue's contents, and it's
just a sequential record of each add or remove operation that's happened on
that queue. When kestrel starts up, it replays each queue's journal to build
up the in-memory queue that it uses for client queries.

The journal file is rotated under one of two conditions:

1. the queue is empty and the journal is larger than `defaultJournalSize`

2. the journal is larger than `maxJournalSize`

For example, if `defaultJournalSize` is 16MB (the default), then whenever the
queue is empty and the journal is larger than 16MB, the journal is truncated
into a new (empty) file. If the journal is larger than `maxJournalSize` (1GB
by default), the journal will be rewritten periodically to contain just the
live items.
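
The two rotation conditions can be written as a small decision function. This is a model of the rules as stated above, not kestrel's actual code; `JournalPolicy` and its members are hypothetical names, and the sizes are the defaults quoted in the text.

```scala
object JournalPolicy {
  sealed trait Action
  case object Keep extends Action      // leave the journal alone
  case object Truncate extends Action  // queue is empty: start a new, empty journal
  case object Rewrite extends Action   // journal too large: rewrite with only live items

  val DefaultJournalSize: Long = 16L * 1024 * 1024    // 16MB
  val MaxJournalSize: Long = 1024L * 1024 * 1024      // 1GB

  /** Decide what to do with a journal, checking the empty-queue rule first. */
  def decide(queueIsEmpty: Boolean,
             journalBytes: Long,
             defaultJournalSize: Long = DefaultJournalSize,
             maxJournalSize: Long = MaxJournalSize): Action =
    if (queueIsEmpty && journalBytes > defaultJournalSize) Truncate
    else if (journalBytes > maxJournalSize) Rewrite
    else Keep
}
```

Note that a busy queue that never drains only hits the second rule, so its journal can grow up to `maxJournalSize` before any space is reclaimed.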

You can turn the journal off for a queue (`keepJournal` = false) and the queue
will exist only in memory. If the server restarts, all enqueued items are
lost. You can also force a queue's journal to be synced to disk periodically,
or even after every write operation, at a performance cost, using
`syncJournal`.

If a queue grows past `maxMemorySize` bytes (128MB by default), only the
first 128MB is kept in memory. The journal is used to track later items, and
as items are removed, the journal is played forward to keep 128MB in memory.
This is usually known as "read-behind" mode, but Twitter engineers sometimes
refer to it as the "square snake" because of the diagram used to brainstorm
the implementation. When a queue is in read-behind mode, removing an item will
often cause 2 disk operations instead of one: one to record the remove, and
one to read an item in from disk to keep 128MB in memory. This is the
trade-off to avoid filling memory and crashing the JVM.


Item expiration
---------------

When they come from a client, expiration times are handled in the same way as
memcache: if the number is small (less than one million), it's interpreted as
a relative number of seconds from now. Otherwise it's interpreted as an
absolute unix epoch time, in seconds since the beginning of 1 January 1970
GMT.

Expiration times are immediately translated into an absolute time, in
*milliseconds*, and if it's further in the future than the queue's `maxAge`,
the `maxAge` is used instead. An expiration of 0, which is usually the
default, means an item never expires.
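
The translation described above can be sketched as a pure function. This is an illustration of the stated rules rather than kestrel's own code: `Expiration.normalize` is a hypothetical name, and the handling of an item with no expiration on a queue that has a `maxAge` (capping it at `maxAge`) is an assumption the text does not spell out.

```scala
object Expiration {
  // Values below this are relative seconds; the guide gives one million as the cutoff.
  val RelativeCutoff: Long = 1000000L

  /** Convert a client-supplied expiration (seconds) into an absolute time in
    * milliseconds. Returns None when the item never expires. */
  def normalize(clientExpiry: Long, nowMillis: Long, maxAgeMillis: Option[Long]): Option[Long] = {
    val requested: Option[Long] =
      if (clientExpiry == 0) None                                        // never expires
      else if (clientExpiry < RelativeCutoff) Some(nowMillis + clientExpiry * 1000)
      else Some(clientExpiry * 1000)                                     // absolute epoch seconds

    // Assumption: a queue-level maxAge also caps items that asked for no expiration.
    val cap: Option[Long] = maxAgeMillis.map(nowMillis + _)
    (requested, cap) match {
      case (Some(r), Some(c)) => Some(math.min(r, c))
      case (None, c)          => c
      case (r, None)          => r
    }
  }
}
```

For instance, with no `maxAge`, an expiration of `60` sent at time `t` becomes `t + 60000` milliseconds, while `2000000000` is taken as an absolute epoch time in seconds.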

Expired items are flushed from a queue whenever a new item is added or
removed. An idle queue won't have any items expired, but you can trigger a
check by doing a "peek" on it.

The global config option `expirationTimerFrequency` can be used to
start a background thread that periodically removes expired items from the
head of each queue. See the `README.md` file for more.


Fanout Queues
-------------

If a queue name has a `+` in it (like "`orders+audit`"), it's treated as a
fanout queue, using the format `<parent>+<child>`. These queues belong to a
parent queue -- in this example, the "orders" queue. Every item written into
a parent queue will also be written into each of its children.

Fanout queues each have their own journal file (if the parent queue has a
journal file) and otherwise behave exactly like any other queue. You can get
and peek and even add items directly to a child queue if you want. A fanout
queue uses its parent queue's configuration; it has no independent
configuration block of its own.

When a fanout queue is first referenced by a client, the journal file (if any)
is created, and it will start receiving new items written to the parent queue.
Existing items are not copied over. A fanout queue can be deleted to stop it
from receiving new items.
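
The naming convention makes the parent/child relationship easy to compute. The sketch below models it with two hypothetical helpers: `parse` splits a fanout name, and `recipients` lists which of the currently existing queues would receive an item written to a given queue. It is an illustration of the rules above, not kestrel's implementation.

```scala
object Fanout {
  /** Split "parent+child" into its two parts; None for an ordinary queue name. */
  def parse(name: String): Option[(String, String)] =
    name.indexOf('+') match {
      case -1 => None
      case i  => Some((name.substring(0, i), name.substring(i + 1)))
    }

  /** Queues that receive an item written to `target`: the target itself plus
    * any existing fanout children whose parent is `target`. */
  def recipients(target: String, existing: Set[String]): Set[String] =
    Set(target) ++ existing.filter(q => parse(q).exists(_._1 == target))
}
```

Writing to `orders` reaches `orders`, `orders+audit`, and any other `orders+...` queue that has already been referenced, while writing directly to `orders+audit` reaches only that child.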


Memcache commands
-----------------

- `SET <queue-name> <flags (ignored)> <expiration> <# bytes>`

  Add an item to a queue. It may fail if the queue has a size or item limit
  and it's full.

- `GET <queue-name>[options]`

  Remove an item from a queue. It will return an empty response immediately if
  the queue is empty. The queue name may be followed by options separated
  by `/`:

    - `/t=<milliseconds>`

      Wait up to a given time limit for a new item to arrive. If an item arrives
      on the queue within this timeout, it's returned as normal. Otherwise,
      after that timeout, an empty response is returned.

    - `/open`

      Tentatively remove an item from the queue. The item is returned as usual
      but is also set aside in case the client disappears before sending a
      "close" request. (See "Reliable Reads" below.)

    - `/close`

      Close any existing open read. (See "Reliable Reads" below.)

    - `/abort`

      Cancel any existing open read, returning that item to the head of the
      queue. It will be the next item fetched. (See "Reliable Reads" below.)

    - `/peek`

      Return the first available item from the queue, if there is one, but don't
      remove it. You can't combine this with any of the reliable read options.

  For example, to open a new read, waiting up to 500msec for an item:

      GET work/t=500/open

  Or to close an existing read and open a new one:

      GET work/close/open

- `DELETE <queue-name>`

  Drop a queue, discarding any items in it, and deleting any associated
  journal files.

- `FLUSH <queue-name>`

  Discard all items remaining in this queue. The queue remains live and new
  items can be added. The time it takes to flush will be linear in the current
  queue size, and any other activity on this queue will block while it's being
  flushed.

- `FLUSH_ALL`

  Discard all items remaining in all queues. The queues are flushed one at a
  time, as if kestrel received a `FLUSH` command for each queue.

- `VERSION`

  Display the kestrel version in a way compatible with memcache.

- `SHUTDOWN`

  Cleanly shut down the server and exit.

- `RELOAD`

  Reload the config file and reconfigure all queues. This should have no
  noticeable effect on the server's responsiveness.

- `DUMP_CONFIG`

  Dump a list of each queue currently known to the server, and list the config
  values for each queue. The format is:

      queue 'master' {
        max_items=2147483647
        max_size=9223372036854775807
        max_age=0
        max_journal_size=16277216
        max_memory_size=134217728
        max_journal_overflow=10
        max_journal_size_absolute=9223372036854775807
        discard_old_when_full=false
        journal=true
        sync_journal=false
      }

  The last queue will be followed by `END` on a line by itself.

- `STATS`

  Display server stats in memcache style. They're described below.

- `DUMP_STATS`

  Display server stats in a more readable style, grouped by queue. They're
  described below.

- `MONITOR <queue-name> <seconds>`

  Monitor a queue for a time, fetching any new items that arrive. Clients
  are queued in a fair fashion, per-item, so many clients may monitor a
  queue at once. After the given timeout, a separate `END` response will
  signal the end of the monitor period. Any fetched items are open
  transactions (see "Reliable Reads" below), and should be closed with
  `CONFIRM`.

- `CONFIRM <queue-name> <count>`

  Confirm receipt of `count` items from a queue. Usually this is the response
  to a `MONITOR` command, to confirm the items that arrived during the monitor
  period.
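
For clients that speak the protocol directly, the `SET` and `GET` lines above can be built with a couple of small helpers. This is a sketch assuming standard memcache text framing (a command line ending in CRLF, with the `SET` payload following on its own CRLF-terminated line); the `Commands` object and its method names are hypothetical.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object Commands {
  /** Frame a SET: the command line, then the payload, each terminated by CRLF.
    * Flags are sent as 0 because kestrel ignores them. */
  def set(queue: String, payload: Array[Byte], expiration: Long = 0): Array[Byte] = {
    val header = s"SET $queue 0 $expiration ${payload.length}\r\n".getBytes(UTF_8)
    header ++ payload ++ "\r\n".getBytes(UTF_8)
  }

  /** Build a GET line with options appended in order, separated by `/`. */
  def get(queue: String, options: String*): String =
    (queue +: options).mkString("GET ", "/", "\r\n")
}
```

With these, `Commands.get("work", "t=500", "open")` produces the first example line shown under `GET`, and `Commands.get("work", "close", "open")` produces the second.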


Reliable reads
--------------

Normally when a client removes an item from the queue, kestrel immediately
discards the item and assumes the client has taken ownership. This isn't
always safe, because a client could crash or lose the network connection
before it gets the item. So kestrel also supports a "reliable read" that
happens in two stages, using the `/open` and `/close` options to `GET`.

When `/open` is used, and an item is available, kestrel will remove it from
the queue and send it to the client as usual. But it will also set the item
aside. If a client disconnects while it has an open read, the item is put back
into the queue, at the head, so it will be the next item fetched. Only one
item can be "open" per client connection.

A previous open request is closed with `/close`. The server will reject any
attempt to open another read when one is already open, but it will ignore
`/close` if there's no open request, so that you can add `/close` to every
`GET` request for convenience.

If for some reason you want to abort a read without disconnecting, you can use
`/abort`. But because aborted items are placed back at the head of the queue,
this isn't a good way to deal with client errors. Since the error-causing item
will always be the next one available, you'll end up bouncing the same item
around between clients instead of making progress.

There's always a trade-off: either potentially lose items or potentially
receive the same item multiple times. Reliable reads choose the latter option.
To use this tactic successfully, work items should be idempotent, meaning the
work could be done 2 or 3 times and have the same effect as if it had been
done only once (except wasting some resources).

Example:

    GET dirty_jobs/close/open
    (receives job 1)
    GET dirty_jobs/close/open
    (closes job 1, receives job 2)
    ...etc...
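
To make the open/close/abort rules concrete, here is a small in-memory model of a single queue. It is only an illustration of the semantics described in this section, not kestrel's implementation; the `ReliableReadModel` class and the use of a string client id to stand in for a connection are assumptions.

```scala
import scala.collection.mutable

/** In-memory model of one queue's reliable-read rules. */
class ReliableReadModel[A] {
  private var items = Vector.empty[A]
  private val open = mutable.Map.empty[String, A]   // client id -> item set aside

  def add(item: A): Unit = items = items :+ item

  /** GET queue/open: hand out the head item and set it aside for this client. */
  def openRead(client: String): Option[A] = {
    require(!open.contains(client), "only one open read per client connection")
    val head = items.headOption
    head.foreach { item =>
      items = items.tail
      open(client) = item
    }
    head
  }

  /** GET queue/close: confirm the open read, if any; a no-op otherwise. */
  def close(client: String): Unit = { open.remove(client); () }

  /** GET queue/abort, or a disconnect: put the open item back at the head. */
  def abort(client: String): Unit =
    open.remove(client).foreach(item => items = item +: items)
}
```

If client `c1` opens a read and then aborts or disconnects, the next `openRead` from any client gets the same item back, which is why work items need to be idempotent.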


Server stats
------------

Global stats reported by kestrel are:

- `uptime` - seconds the server has been online
- `time` - current time in unix epoch
- `version` - version string, like "1.2"
- `curr_items` - total of items waiting in all queues
- `total_items` - total of items that have ever been added in this server's
  lifetime
- `bytes` - total byte size of items waiting in all queues
- `curr_connections` - current open connections from clients
- `total_connections` - total connections that have been opened in this
  server's lifetime
- `cmd_get` - total `GET` requests
- `cmd_set` - total `SET` requests
- `cmd_peek` - total `GET/peek` requests
- `get_hits` - total `GET` requests that received an item
- `get_misses` - total `GET` requests on an empty queue
- `bytes_read` - total bytes read from clients
- `bytes_written` - total bytes written to clients

For each queue, the following stats are also reported:

- `items` - items waiting in this queue
- `bytes` - total byte size of items waiting in this queue
- `total_items` - total items that have been added to this queue in this
  server's lifetime
- `logsize` - byte size of the queue's journal file
- `expired_items` - total items that have been expired from this queue in this
  server's lifetime
- `mem_items` - items in this queue that are currently in memory
- `mem_bytes` - total byte size of items in this queue that are currently in
  memory (will always be less than or equal to `max_memory_size` config for
  the queue)
- `age` - time, in milliseconds, that the last item to be fetched from this
  queue had been waiting; that is, the time between `SET` and `GET`; if the
  queue is empty, this will always be zero
- `discarded` - number of items discarded because the queue was too full
- `waiters` - number of clients waiting for an item from this queue (using
  `GET/t`)
- `open_transactions` - items read with `/open` but not yet confirmed
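
A monitoring script will usually want these values as a map. The sketch below assumes the `STATS` response follows the usual memcache layout of `STAT <name> <value>` lines terminated by `END`; that layout, and the `Stats.parse` helper itself, are assumptions for illustration rather than something this guide specifies.

```scala
object Stats {
  /** Parse memcache-style "STAT <name> <value>" lines into a map, stopping at END.
    * Values are kept as strings because some (like `version`) are not numeric. */
  def parse(lines: Iterable[String]): Map[String, String] =
    lines.iterator
      .map(_.trim)
      .takeWhile(_ != "END")
      .flatMap { line =>
        line.split(" ", 3) match {
          case Array("STAT", name, value) => Some(name -> value)
          case _                          => None   // skip anything unexpected
        }
      }
      .toMap
}
```

Numeric stats can then be read with, for example, `stats.get("curr_items").map(_.toLong)`.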


Kestrel as a library
--------------------

You can use kestrel as a library by just sticking the jar on your classpath.
It's a cheap way to get a durable work queue for inter-process or inter-thread
communication. Each queue is represented by a `PersistentQueue` object:

    class PersistentQueue(val name: String, persistencePath: String,
                          @volatile var config: QueueConfig, timer: Timer,
                          queueLookup: Option[(String => Option[PersistentQueue])]) {

The constructor takes the name of the queue, the path for the journal files
(if the queue will be journaled), a `QueueConfig` object (derived from
`QueueBuilder`), a timer for handling timeout reads, and optionally a way to
find other named queues (for `expireToQueue` support).

A queue must be initialized before use:

    def setup(): Unit

To add an item to a queue:

    def add(value: Array[Byte], expiry: Option[Time]): Boolean

It will return `false` if the item was rejected because the queue was full.

Queue items are represented by a case class:

    case class QItem(addTime: Time, expiry: Option[Time], data: Array[Byte], var xid: Int)

and several operations exist to remove or peek at the head item:

    def peek(): Option[QItem]
    def remove(): Option[QItem]

To open a reliable read, set `transaction` true, and later confirm or unremove
the item by its `xid`:

    def remove(transaction: Boolean): Option[QItem]
    def unremove(xid: Int)
    def confirmRemove(xid: Int)

You can also asynchronously remove or peek at items using futures.

    def waitRemove(deadline: Option[Time], transaction: Boolean): Future[Option[QItem]]
    def waitPeek(deadline: Option[Time]): Future[Option[QItem]]

When done, you should close the queue:

    def close(): Unit
    def isClosed: Boolean

Here's a short example:

    // needed for `500.milliseconds`
    import com.twitter.conversions.time._

    // `config` (a QueueConfig) and `timer` are assumed to be built elsewhere.
    val queue = new PersistentQueue("work", "/var/spool/kestrel", config, timer, None)
    queue.setup()

    // add an item with no expiration:
    queue.add("hello".getBytes, None)

    // start to remove it, then back out (remove returns an Option):
    queue.remove(true).foreach { item => queue.unremove(item.xid) }

    // remove an item with a 500msec timeout, and confirm it:
    queue.waitRemove(Some(500.milliseconds.fromNow), true)() match {
      case None =>
        println("nothing. :(")
      case Some(item) =>
        println("got: " + new String(item.data))
        queue.confirmRemove(item.xid)
    }

    queue.close()