This repository has been archived by the owner on Jan 13, 2022. It is now read-only.

Scribe Overview

BertPC edited this page Dec 7, 2012 · 3 revisions

Reliability

The scribe system is designed to be robust to failure of the network or of any specific machine, but it does not provide transactional guarantees. If a scribe instance on a client machine (we’ll call it a resender for the moment) is unable to send messages to the central scribe server, it saves them on local disk and sends them when the central server or network recovers. To avoid overloading the central server upon a restart, the resender waits a random time between reconnect attempts. If the central server is near capacity, it returns TRY_LATER, which tells the resender not to attempt another send for several minutes.
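The resender behavior described above can be sketched roughly as follows. This is a toy model, not Scribe's actual code: the `Resender` class, the `send` callable, the retry bounds, and the 180-second TRY_LATER delay are all assumptions made for illustration, and an in-memory list stands in for the local disk buffer.

```python
import random

OK, TRY_LATER = "OK", "TRY_LATER"


class Resender:
    """Toy model of a client-side scribe instance (hypothetical, not Scribe's API)."""

    def __init__(self, send, min_retry=1.0, max_retry=5.0,
                 try_later_delay=180.0, rng=random.random):
        self.send = send                  # callable(batch) -> OK or TRY_LATER; may raise ConnectionError
        self.pending = []                 # stands in for messages saved on local disk
        self.min_retry = min_retry        # assumed lower bound on the random wait, in seconds
        self.max_retry = max_retry        # assumed upper bound on the random wait, in seconds
        self.try_later_delay = try_later_delay  # "several minutes"; 180 s is a guess
        self.rng = rng
        self.next_attempt = 0.0

    def log(self, message, now):
        """Queue a message, then try to forward everything that is pending."""
        self.pending.append(message)
        return self.flush(now)

    def flush(self, now):
        """Send pending messages in order; return True only if they were accepted."""
        if not self.pending or now < self.next_attempt:
            return False
        try:
            result = self.send(list(self.pending))
        except ConnectionError:
            # Wait a random time before reconnecting so that restarted
            # clients do not all hit the central server at once.
            wait = self.min_retry + self.rng() * (self.max_retry - self.min_retry)
            self.next_attempt = now + wait
            return False
        if result == TRY_LATER:
            # The server is near capacity: back off for much longer.
            self.next_attempt = now + self.try_later_delay
            return False
        self.pending.clear()
        return True
```

Passing the current time in as `now` keeps the sketch deterministic; a real client would read a clock and persist `pending` to disk instead of holding it in memory.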

The central server has similar behavior (the same code, in fact) for handling failure of the NFS filer or distributed filesystem it’s writing to. If the filesystem goes down, the scribe server writes to local disk until the filesystem recovers, then sends the data from local disk to the remote filesystem. The order of the messages is preserved in both this case and the resender case.

These error cases can lead to loss of data or, in the last case, to duplicate messages:

  • If a client can’t connect to either the local or the central scribe server, the message will be lost.
  • If a scribe server crashes, it could lose a small amount of data that’s in memory but not yet on disk.
  • Some multiple-component failures can lose data, such as when a resender can’t connect to any central server and its local disk fills up.
  • Some rare timeout conditions can lead to duplicate messages.

Running Scribe
Logging Messages

Configuration

The scribe server is configured by the file specified in the -c command line option, or the file /usr/local/scribe/scribe.conf if none is specified on the command line. The basic idea of the configuration is that a particular category of messages is sent to one or more “stores” of various types. Some types of stores can contain other stores; for example, a bucket store contains many file stores and distributes messages to them based on a hash.

The configuration file consists of a global section and a section for each store. The global section includes the listening port number and the maximum number of messages that the server can handle in a second. Each store section must include a category and a type. There is no restriction on the number of categories or the number of stores per category. The remaining items in the store configuration depend on the store type, and include such things as file location, maximum file size, how often to rotate files, and where a resender should send its data. A store can also contain another store configuration, the name of which is specific to the type of store. For example, a store of type buffer contains <primary> and <secondary> stores, and a store of type bucket contains a store called <bucket>.
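To make the structural rules concrete, here is a minimal sketch that models a configuration as nested Python dictionaries and checks the two rules stated above: every store section needs a category and a type, and container stores need their nested stores. It does not parse Scribe's real file format. The key names (`port`, `max_msg_per_second`), the port value, the `clicks` category, and the nested store names (`primary`, `secondary`, `bucket`) are assumptions for illustration and should be checked against the Scribe configuration documentation.

```python
# Hypothetical in-memory form of a scribe configuration; key names are illustrative.
config = {
    "global": {"port": 1463, "max_msg_per_second": 100000},
    "stores": [
        {"category": "default", "type": "buffer",
         "primary": {"type": "network"},
         "secondary": {"type": "file"}},
        {"category": "clicks", "type": "bucket",
         "bucket": {"type": "file"}},
    ],
}

# Nested stores each container type is assumed to require.
REQUIRED_CHILDREN = {"buffer": ("primary", "secondary"), "bucket": ("bucket",)}


def validate(config):
    """Return a list of human-readable problems; an empty list means the config passed."""
    errors = []
    for i, store in enumerate(config.get("stores", [])):
        # Every top-level store section must name a category and a type.
        for key in ("category", "type"):
            if key not in store:
                errors.append(f"store {i}: missing {key}")
        # Container stores must also carry their nested store sections.
        for child in REQUIRED_CHILDREN.get(store.get("type"), ()):
            if child not in store:
                errors.append(f"store {i}: missing nested store '{child}'")
            elif "type" not in store[child]:
                errors.append(f"store {i}: nested store '{child}' missing type")
    return errors
```

Calling `validate(config)` on the example returns an empty list; dropping the `category` key or the `secondary` entry from the first store would produce one message per problem.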

The types of stores currently available are:

  • file – writes to a file, either local or on NFS.
  • network – sends messages to another scribe server.
  • buffer – contains a primary and a secondary store. Messages are sent to the primary store if possible, and otherwise the secondary. When the primary store becomes available the messages are read from the secondary store and sent to the primary. Ordering of the messages is preserved. The secondary store has the restriction that it must be readable, which at the moment means it has to be a file store.
  • bucket – contains a large number of other stores, and decides which messages to send to which stores based on a hash.
  • null – discards all messages.
  • thriftfile – similar to a file store but writes messages into a Thrift TFileTransport file.
  • multi – a store that forwards messages to multiple stores.
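The buffer and bucket semantics in the list above can be illustrated with a small model. This is a sketch under stated assumptions, not Scribe's implementation: `ListStore`, `BufferStore`, and `BucketStore` are hypothetical classes, an in-memory list stands in for a file or network store, and hashing the whole message with CRC32 is just one possible choice of hash.

```python
import zlib


class ListStore:
    """Stand-in for a file or network store; keeps messages in a readable list."""

    def __init__(self):
        self.messages = []
        self.available = True

    def write(self, message):
        if not self.available:
            return False
        self.messages.append(message)
        return True


class BufferStore:
    """Writes to the primary store when possible, otherwise to the secondary."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary  # must be readable so it can be replayed

    def replay(self):
        """Move buffered messages to the primary, oldest first."""
        while self.secondary.messages:
            if not self.primary.write(self.secondary.messages[0]):
                return False
            self.secondary.messages.pop(0)
        return True

    def write(self, message):
        # Drain the backlog before sending anything new, so ordering is preserved.
        if self.secondary.messages:
            self.replay()
        if not self.secondary.messages and self.primary.write(message):
            return True
        return self.secondary.write(message)


class BucketStore:
    """Distributes messages across child stores based on a hash."""

    def __init__(self, stores):
        self.stores = stores

    def write(self, message):
        # Assumption: hash the whole message; CRC32 is stable across runs.
        index = zlib.crc32(message.encode("utf-8")) % len(self.stores)
        return self.stores[index].write(message)
```

In this model, messages written while the primary is unavailable accumulate in the secondary, and the next write after recovery drains them to the primary before the new message is sent.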

Scribe Configuration