Beringei is a high performance, in-memory storage engine for time series data.
Hieu Pham and facebook-github-bot Remove lock contention in multimaster read
Summary:
We changed the behavior of multi-master reads: the lock is no longer needed, and the merging step moves to finalization.
Before this change, for every response from each host:
* Lock
* Deserialize
* Merge
* Unlock
* Set `oneComplete` if all keys are read.

During finalization:
* Report

After this change, for every response from each host:
* Check status code, count number of keys
* Add raw response to an `MPMCQueue`
* Set `oneComplete` if one region contains all read keys.

During finalization:
* Merge
* Report

Reviewed By: jeffdshen

Differential Revision: D8776533

fbshipit-source-id: 43ab43f49ed9fa5f939a6655aaa55bcac6b4959f
Latest commit 75c3002 Jul 11, 2018
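
The collect-then-merge pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the Beringei client code: `RawResponse` and `ReadCollector` are hypothetical names, one host stands in for one region, and a pre-sized vector with one slot per host replaces the `MPMCQueue` so the example needs only the standard library.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one host's raw response.
struct RawResponse {
  bool ok = false;                        // result of the status code check
  std::map<std::string, int64_t> values;  // toy payload: key -> value
};

// Hypothetical collector. Each host owns exactly one slot, so concurrent
// callbacks never write to the same element and no lock is needed.
class ReadCollector {
 public:
  ReadCollector(std::size_t numHosts, std::size_t numKeys)
      : slots_(numHosts), numKeys_(numKeys) {}

  // Called once per host, possibly from different threads.
  // Only checks the status and counts keys; no merging happens here.
  void onResponse(std::size_t host, RawResponse response) {
    const bool complete = response.ok && response.values.size() == numKeys_;
    slots_[host] = std::move(response);
    if (complete) {
      oneComplete_.store(true, std::memory_order_release);
    }
  }

  bool oneComplete() const {
    return oneComplete_.load(std::memory_order_acquire);
  }

  // Called by a single thread after all callbacks have finished.
  // The merge keeps the first value seen for each key.
  std::map<std::string, int64_t> finalize() const {
    std::map<std::string, int64_t> merged;
    for (const auto& slot : slots_) {
      if (!slot.ok) {
        continue;  // skip failed or missing responses
      }
      merged.insert(slot.values.begin(), slot.values.end());
    }
    return merged;
  }

 private:
  std::vector<RawResponse> slots_;
  std::size_t numKeys_;
  std::atomic<bool> oneComplete_{false};
};
```

The per-response path only stores data and flips an atomic flag, so the expensive merge runs once, on one thread, during finalization.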


** THIS REPO HAS BEEN ARCHIVED AND IS NO LONGER BEING ACTIVELY MAINTAINED **

Beringei

A high performance, in-memory time series storage engine

In the fall of 2015, we published the paper “Gorilla: A Fast, Scalable, In-Memory Time Series Database” at VLDB 2015. Beringei is the open source representation of the ideas presented in this paper.

Beringei is a high performance time series storage engine. Time series are commonly used as a representation of statistics, gauges, and counters for monitoring performance and health of a system.

Features

Beringei has the following features:

  • Support for very fast, in-memory storage, backed by disk for persistence. Queries are always served from memory for extremely fast query performance, while data is persisted to disk so the process can be restarted or migrated with very little downtime and no data loss.
  • Extremely efficient streaming compression algorithm. Our streaming compression algorithm compresses real-world time series data by over 90%. The delta-of-delta compression algorithm used by Beringei is also fast: a single machine can compress more than 1.5 million data points per second.
  • Reference sharded service implementation, including a client implementation.
  • Reference HTTP service implementation that enables direct Grafana integration.
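
To illustrate the delta-of-delta idea mentioned in the compression feature, here is a minimal sketch of the transform applied to timestamps. It is not Beringei's encoder: the function names are hypothetical, and the variable-length bit packing that turns these small integers into a compact stream is left out. The point is only that regularly spaced timestamps collapse to values at or near zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: stores the first timestamp verbatim, then the first
// delta, then the difference between each delta and the previous delta.
std::vector<int64_t> encodeDeltaOfDelta(const std::vector<int64_t>& ts) {
  std::vector<int64_t> out;
  out.reserve(ts.size());
  int64_t prev = 0;
  int64_t prevDelta = 0;
  for (std::size_t i = 0; i < ts.size(); ++i) {
    if (i == 0) {
      out.push_back(ts[0]);
    } else {
      const int64_t delta = ts[i] - prev;
      out.push_back(delta - prevDelta);
      prevDelta = delta;
    }
    prev = ts[i];
  }
  return out;
}

// Hypothetical helper: inverts encodeDeltaOfDelta by accumulating the
// delta-of-deltas back into deltas and the deltas back into timestamps.
std::vector<int64_t> decodeDeltaOfDelta(const std::vector<int64_t>& enc) {
  std::vector<int64_t> ts;
  ts.reserve(enc.size());
  int64_t prev = 0;
  int64_t prevDelta = 0;
  for (std::size_t i = 0; i < enc.size(); ++i) {
    if (i == 0) {
      prev = enc[0];
    } else {
      prevDelta += enc[i];
      prev += prevDelta;
    }
    ts.push_back(prev);
  }
  return ts;
}
```

For example, timestamps `{1000, 1060, 1120, 1180, 1241}` encode to `{1000, 60, 0, 0, 1}`: once the 60-second interval is established, only the one-second jitter on the last point needs a non-zero value.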

How can I use Beringei?

Beringei can be used in one of two ways.

  1. Run the simple sharded service and reference client implementation we provide, which together store time series and serve query requests.
  2. Use Beringei as an embedded library to handle the low-level details of efficiently storing time series data. Using Beringei in this way is similar to using RocksDB: the Beringei library can be the high performance storage system underlying your performance monitoring solution.

Requirements

Beringei is tested and working on:

  • Ubuntu 16.10

We also depend on these open source projects:

Building Beringei

Our instructions are for Ubuntu 16.10, but you will probably be able to modify the install scripts and directions to work with other Linux distributions.

  • Run sudo ./setup_ubuntu.sh.

  • Build beringei.

mkdir build && cd build && cmake .. && make
  • Generate a beringei configuration file.
./beringei/tools/beringei_configuration_generator --host_names $(hostname) --file_path /tmp/beringei.json
  • Start beringei.
./beringei/service/beringei_main \
    -beringei_configuration_path /tmp/beringei.json \
    -create_directories \
    -sleep_between_bucket_finalization_secs 60 \
    -allowed_timestamp_behind 300 \
    -bucket_size 600 \
    -buckets $((86400/600)) \
    -logtostderr \
    -v=2
  • Send data.
while [[ 1 ]]; do
    ./beringei/tools/beringei_put \
        -beringei_configuration_path /tmp/beringei.json \
        testkey ${RANDOM} \
        -logtostderr -v 3
    sleep 30
done
  • Read the data back.
./beringei/tools/beringei_get \
    -beringei_configuration_path /tmp/beringei.json \
    testkey \
    -logtostderr -v 3
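
The arithmetic behind the `-bucket_size` and `-buckets` flags in the start command can be checked with a small sketch. The interpretation here is an assumption read from the flag names, not from the Beringei source: `-bucket_size 600` is taken to mean 600-second buckets, `-buckets $((86400/600))` to mean enough buckets to cover one day, and `-allowed_timestamp_behind 300` to mean that points more than 300 seconds old are rejected. The helper functions are hypothetical.

```cpp
#include <cstdint>

constexpr int64_t kSecondsPerDay = 86400;
constexpr int64_t kBucketSizeSecs = 600;  // value passed to -bucket_size
constexpr int64_t kBuckets =
    kSecondsPerDay / kBucketSizeSecs;     // value passed to -buckets

// 86400 / 600 = 144 ten-minute buckets, which together span 24 hours.
static_assert(kBuckets == 144, "one day of 600-second buckets");

// Hypothetical helper: index of the bucket a Unix timestamp falls into,
// assuming buckets are aligned to multiples of the bucket size.
constexpr int64_t bucketIndex(int64_t unixTime) {
  return unixTime / kBucketSizeSecs;
}

// Hypothetical helper: whether a point is recent enough to be accepted,
// assuming the limit is inclusive.
constexpr bool withinAllowedLag(int64_t now, int64_t timestamp,
                                int64_t allowedBehindSecs = 300) {
  return now - timestamp <= allowedBehindSecs;
}
```

Under these assumptions, changing the bucket size means recomputing the bucket count to keep the same retention window, which is why the command derives it with shell arithmetic instead of hard-coding 144.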

License

Beringei is BSD-licensed. We also provide an additional patent grant.