Skip to content

CAIDA/dbats

Repository files navigation

DBATS - DataBase of Aggregated Time Series
Version 0.1

DBATS is a high performance time series database engine optimized for
inserting/updating values for many series simultaneously. DBATS can easily
sustain write speeds of more than 1.6 million values per second, and CAIDA has
used DBATS databases with more than 40 million distinct series.

Since DBATS is a low-level time series database engine, it has limited tools for
inserting and querying data. As such, DBATS should be used in conjuction with
mod_dbats (included in this release) to provide a Graphite-compatible HTTP read
interface, and with libtimeseries
(http://www.caida.org/tools/utilities/libtimeseries) to provide a high(er)-level
write API.

A DBATS time series is a sequence of data entries measured at regular time
intervals. A set of similar time series with the same period and entry size is a
"bundle". Each time series in a bundle is identified by a user-defined string
key and an automatically assigned integer key id. The primary bundle stores the
original raw data. Additional "aggregate" bundles can be defined that merge
sub-sequences of data points for a key in the primary bundle into single data
points for the same key in the aggregate bundle.

Conceptually, DBATS stores a bundle in a logical table where rows are time and
columns are metric keys, and each cell entry contains an array of one or more
64-bit values. Each row of an aggregate bundle table corresponds to a set of
rows in the primary data bundle table.

Internally, DBATS uses a number of BDB databases (tables):

 - key_name -> key_id
 - key_id -> key_name
 - bundle_id -> { agg_func, steps, period, time_range }
 - { bundle_id, time, frag_id } -> is_set_fragment
 - { bundle_id, time, frag_id } -> data_fragment

Each metric key is assigned an id when it is created. Keys can not be deleted
and key ids can never change. Data fragments are blocks of 10000 data entries,
covering sequential key_ids with the same time and bundle_id. Data fragments are
stored compressed in BDB. Is_set_fragments are corresponding blocks of flags
indicating whether each entry is actually set. Aggregate values are calculated
immediately (in the same transaction) when primary values are set.

The design is optimized for inserting many values in the same timeslice before
moving to another timeslice.

DBATS is based on TSDB (http://luca.ntop.org/tsdb.pdf), with the following key
differences:
 - aggregation series (calculated at insert time)
 - option to truncate old series values
 - functions to iterate over list of keys
 - faster reading with key id
 - ACID transactions
 - 64 bit values

----------------------------------------------------------------------
Documentation for the C API is in the "html" directory of the release tarball.

----------------------------------------------------------------------
Requirements

  - Berkeley DB 5.x (preferably 5.3 or later)

The mod_dbats apache httpd module additionally requires:

  - Apache httpd 2.2 or later
  - apxs (should be included with Apache)

----------------------------------------------------------------------
Building

To build and install dbats:
    ./configure [options]
    make
    make check ;#(optional)
    make install

To install the optional mod_dbats apache module:
    As root (or other user with permission to restart apache), run
    "make install-apache" in the dbats build tree.  This will do the
    following:
    1. if an old dbats module is already loaded, disable it and restart server
    2. install and enable the new dbats module
    3. restart apache server

Most of the GNU standard configure options and make targets are also
available.

If your libdb is installed in a nonstandard location $dir/lib and
the corresponding db.h header is in $dir/include, you should use the
./configure option "--with-libdb-prefix=$dir".  This should work whether
libdb is static or shared.

----------------------------------------------------------------------
Programs

Run any program with the "-?" option for a list of options.

* dbats_init
  Create a DBATS database and optionally define aggregates and keys.

* dbats_info
  Display information about a DBATS database.

* dbats_recover
  Recover a damaged DBATS database.  This is useful when any app reports
  "dbats_open: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery",
  which can occur after an app is killed without cleanly closing the db.

* dbats_series_limit
  Set the maximum number of data points to store in a timeseries.

* dbats_best_bundle
  Find the bundle that best covers the specified time range.

* dbats_delete_key
  Delete a set of keys from a DBATS database.

* report-in
* report-out
  Example programs that write to and read from a DBATS database.

About

DataBase of Aggregated Time Series

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published