

Read more at the official site and the documentation.

DalmatinerDB

DalmatinerDB is a metric database written in pure Erlang. It takes advantage of some special properties of metrics to make some tradeoffs. The goal is to make a store for metric data (time, value of a metric) that is fast, has a low overhead, and is easy to query and manage.

Tradeoffs

I try here to be explicit about the tradeoffs we made, so people can decide if they are happy with them (costs vs gains). The acceptable tradeoffs differ from case to case, but I hope the choices we made fit a metric store quite well. If you are comparing DalmatinerDB with X, please don't assume that just because X does not list the tradeoffs they made, they have none; be inquisitive and make a decision based on facts, not marketing.

A detailed comparison between databases can be found here:

https://docs.google.com/spreadsheets/d/1sMQe9oOKhMhIVw9WmuCEWdPtAoccJ4a-IuZv4fXDHxM/edit#gid=0

Let the Filesystem handle it

A lot of work is handed down to the file system. ZFS is exceptionally smart and handles things like checksums, compression, and caching very well. Handing these tasks down to the filesystem simplifies the codebase and builds on very well tested, highly performant code instead of trying to reimplement it.

Prioritise the overall writes over individual ones

DalmatinerDB offers a 'best effort' on storing the metrics: there is no log for writes (although, if enabled in ZFS, the ZIL (ZFS Intent Log) can log write operations) and no forced sync after each write. This means that if your network fails, packets can get lost, and if your server crashes, unwritten data can be lost.

The point is that losing one or two metric points in a huge series is a non-problem. The importance of a metric is often seen in aggregates, and DalmatinerDB fills in the blanks with the last written value. However, there is explicitly no guarantee that data is written, so this can be an issue if every single metric point is of importance!
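The "fill in the blanks" behaviour described above can be illustrated with a minimal sketch. This is not DalmatinerDB's code: the function name `fill_blanks` and the use of `None` to mark a point that was never written are assumptions made purely for illustration.

```python
def fill_blanks(points):
    """Replace missing points (None) with the last written value.

    `points` is a list of values in time order. None marks a point that
    was never written (a hypothetical representation, for illustration).
    """
    filled = []
    last = None  # no value has been written yet
    for value in points:
        if value is not None:
            last = value  # remember the most recent written value
        filled.append(last)
    return filled
```

In this sketch, `fill_blanks([1, None, None, 4, None])` returns `[1, 1, 1, 4, 4]`. A gap at the very start of a series stays `None`, because there is no earlier value to repeat.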

Flat files

Data is stored in a flat binary format. This means that reads and writes can be calculated to a filename+offset by simple math; there is no need for traversing data structures. However, it also means that if a metric stops, unwritten points can 'linger' around for a while, depending on how the file size was picked.

As an example: if metrics are stored with a precision down to the second, and 1 week of data is stored per file, up to one week of unused data can be stored. It should be taken into account, though, that this data will compress quite well.
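The filename+offset arithmetic can be sketched as follows, using the example of one-second precision and one week per file. The constants, the 8-byte point size, the `.bin` file naming, and both function names are assumptions for illustration; they are not taken from DalmatinerDB's actual on-disk layout.

```python
# Hypothetical layout parameters, not taken from DalmatinerDB itself.
POINT_SIZE = 8                      # assumed bytes per stored point
RESOLUTION = 1                      # seconds between points (1 s precision)
POINTS_PER_FILE = 7 * 24 * 60 * 60  # one week of 1 s points per file


def locate(timestamp):
    """Map a Unix timestamp to (file name, byte offset) by arithmetic alone."""
    index = timestamp // RESOLUTION                # global index of the point
    file_start = index - index % POINTS_PER_FILE   # index of the file's first point
    offset = (index - file_start) * POINT_SIZE     # byte position inside that file
    return f"{file_start}.bin", offset


def lingering_bytes(last_timestamp):
    """Uncompressed bytes of unused slots left in the file after the last write."""
    index = last_timestamp // RESOLUTION
    used = index % POINTS_PER_FILE + 1             # slots up to and including the last write
    return (POINTS_PER_FILE - used) * POINT_SIZE
```

Under these assumptions, `locate(604801)` returns `("604800.bin", 8)`: the second point of the second weekly file. A metric that stops right after the first point of a file leaves `lingering_bytes(0) == 4838392` bytes of unused slots before compression, which is the 'lingering' cost of the flat layout.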
