🛠️ Realistic benchmark for key value stores
Switch branches/tags
Clone or download
goncalotomas use faster key generation function in some drivers (#214)
* add ct targets to each database to ease testing of a single db

* refactor some code, use faster key generation function

* refactor some code, use faster key generation function
Latest commit 02261d4 Dec 2, 2018

README.md

FMKe

Erlang Version Build Status Coverage Status Dialyzer Enabled

FMKe is an extendable real world benchmark for distributed key-value stores.
This repository contains code for the application server and a set of scripts for orchestrating deployment and local execution of micro-benchmarks.

Why?

Here is a comparison of available benchmark specifications that we analyzed, with FMKe for comparison:

Benchmark Target Systems Workload type
TPC-C SQL-Based databases **realistic ✔️
TPC-E SQL-Based databases **realistic ✔️
YCSB Key-value stores ✔️ synthetic
FMKe Key-value stores ✔️ **realistic ✔️

** Emulates real application patterns

Backing the realistic claims

FMKe was one of the final contributions of the SyncFree European research project. It was designed to benchmark its reference platform, AntidoteDB, by closely emulating a realistic application. One of the industrial partners of the project, Trifork, provided statistical data about Fælles Medicinkort (FMK), a sub-system relative to the Danish National Joint Medicine Card. The real system is backed by a distributed key value store to ensure high availability, which validates the decision to use it as a benchmark (originally) for AntidoteDB.

System description

The real world FMK system, and FMKe alike are designed to store patient health data, mostly revolving around medical prescriptions. Here is the ER diagram:

Build Status

There are 4 core entities: treatment facilities, patients, and pharmacies. Other records appear as relations between these entities, but it will become apparent that the workload focuses heavily on prescription records. More information about the system operations and data model can be found in this document.

Architecture

Build Status
Consider FMKe as a general application server that contains the logic mimicking the real FMK system. We decided not to release FMKe as a single monolithic application, since there are multiple benefits in separating it in these 3 components.
Firstly, separating the application server from the workload generation component doesn't require us to reinvent the wheel, since many good workload generation tools already exist. On the other hand, making the application logic independent of the database allows for collaboration in supporting a broader set of data stores.
We have a generic interface for key-value stores (implemented as an Erlang behaviour) that is well specified, which makes supporting a new database as simple as writing a driver for it. Furthermore, pull requests with new drivers or optimizations for existing ones are accepted and welcomed.

Supported data stores

  • AntidoteDB
  • Cassandra
  • Redis
  • Riak KV

How the benchmark is deployed

By default FMKe keeps a connection pool to a single database node, and the workload generation is performed by Lasp Bench.
To benchmark clustered databases with n nodes, n FMKe instances can be deployed, or alternatively one FMKe node can connect to multiple nodes (the exact number is dependent on the connection pool size).
To avoid network and CPU bottlenecks that could impact the result of the benchmark, it is advised to use different servers for each one of the components. Having said that, a number of scripts are available for development that enable local execution of micro benchmarks.

Use case: AntidoteDB evaluation

FMKe was used in January 2017 to evaluate the performance of AntidoteDB. The evaluation took place in Amazon Web Services using m3.xlarge instances which have 4 vCPUs, 15GB RAM and 2x40GB SSD storage.
The biggest test case used 36 AntidoteDB instances spread across 3 data centers (Germany, Ireland and United States), 9 instances of FMKe and 18 instances of (former Basho Bench) Lasp Bench that simulated 1024 concurrent clients performing operations as quickly as possible.
Before the benchmark, AntidoteDB was populated with over 1 million patient keys, 50 hospitals, 10.000 doctors and 300 pharmacies.

Testing out FMKe locally

FMKe requires Erlang/OTP and rebar3. You need at least Erlang 20, FMKe will not compile in previous versions.
You can test out FMKe locally by cloning the repository:

git clone https://github.com/goncalotomas/FMKe.git

Once you have a local copy of the repository, the first step is to choose your target data store:

make select-TARGET_DB

Where TARGET_DB should be one of the supported databases. From now on let's assume that we chose riak.
You don't need to have any databases installed, since local benchmarks use Docker images.
Finally, you can run a micro-benchmark by using the following command:

make bench-riak

Alternatively, you can also validate that your FMKe copy is functional by running unit tests with your desired database as backend:

make eunit-riak

This command will run a battery of unit tests that ensure that all functionality related to the benchmark is able to performed in the database you have previously selected.