FMKe is an extendable real world benchmark for distributed key-value stores.
This repository contains code for the application server and a set of scripts for orchestrating deployment and local execution of micro-benchmarks.
Here is a comparison of available benchmark specifications that we analyzed, with FMKe for comparison:
|Benchmark||Target Systems||Workload type|
** Emulates real application patterns
Backing the realistic claims
FMKe was one of the final contributions of the SyncFree European research project. It was designed to benchmark its reference platform, AntidoteDB, by closely emulating a realistic application. One of the industrial partners of the project, Trifork, provided statistical data about Fælles Medicinkort (FMK), a sub-system relative to the Danish National Joint Medicine Card. The real system is backed by a distributed key value store to ensure high availability, which validates the decision to use it as a benchmark (originally) for AntidoteDB.
The real world FMK system, and FMKe alike are designed to store patient health data, mostly revolving around medical prescriptions. Here is the ER diagram:
There are 4 core entities: treatment facilities, patients, and pharmacies. Other records appear as relations between these entities, but it will become apparent that the workload focuses heavily on prescription records. More information about the system operations and data model can be found in this document.
Consider FMKe as a general application server that contains the logic mimicking the real FMK system. We decided not to release FMKe as a single monolithic application, since there are multiple benefits in separating it in these 3 components.
Firstly, separating the application server from the workload generation component doesn't require us to reinvent the wheel, since many good workload generation tools already exist. On the other hand, making the application logic independent of the database allows for collaboration in supporting a broader set of data stores.
We have a generic interface for key-value stores (implemented as an Erlang behaviour) that is well specified, which makes supporting a new database as simple as writing a driver for it. Furthermore, pull requests with new drivers or optimizations for existing ones are accepted and welcomed.
Supported data stores
- Riak KV
How the benchmark is deployed
By default FMKe keeps a connection pool to a single database node, and the workload generation is performed by Lasp Bench.
To benchmark clustered databases with n nodes, n FMKe instances can be deployed, or alternatively one FMKe node can connect to multiple nodes (the exact number is dependent on the connection pool size).
To avoid network and CPU bottlenecks that could impact the result of the benchmark, it is advised to use different servers for each one of the components. Having said that, a number of scripts are available for development that enable local execution of micro benchmarks.
Use case: AntidoteDB evaluation
FMKe was used in January 2017 to evaluate the performance of AntidoteDB. The evaluation took place in Amazon Web Services using
m3.xlarge instances which have 4 vCPUs, 15GB RAM and 2x40GB SSD storage.
The biggest test case used 36 AntidoteDB instances spread across 3 data centers (Germany, Ireland and United States), 9 instances of FMKe and 18 instances of (former Basho Bench) Lasp Bench that simulated 1024 concurrent clients performing operations as quickly as possible.
Before the benchmark, AntidoteDB was populated with over 1 million patient keys, 50 hospitals, 10.000 doctors and 300 pharmacies.
Testing out FMKe locally
git clone https://github.com/goncalotomas/FMKe.git
Once you have a local copy of the repository, the first step is to choose your target data store:
Where TARGET_DB should be one of the supported databases. From now on let's assume that we chose
You don't need to have any databases installed, since local benchmarks use Docker images.
Finally, you can run a micro-benchmark by using the following command:
Alternatively, you can also validate that your FMKe copy is functional by running unit tests with your desired database as backend:
This command will run a battery of unit tests that ensure that all functionality related to the benchmark is able to performed in the database you have previously selected.