danielealbano/cachegrand


cachegrand logo

What is cachegrand?

cachegrand is an open-source, fast, scalable, and modular key-value store designed from the ground up to take advantage of today's hardware. It is compatible with the well-known Redis protocol and able to provide better performance than many other platforms.

The benchmarks below were carried out on an AMD EPYC 7502P with 2 x 25Gbit network links running Ubuntu 22.04, with the default configuration, using memtier_benchmark with 10M different keys, 64-byte payloads, and 100 clients per thread (1 thread 100 clients, 64 threads 6400 clients). Two different machines with the same hardware were used to generate the load.
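For reference, a run like the one described above could be reproduced with an invocation along these lines. The host name, port, and exact flag values are assumptions reconstructed from the description, not the actual command used for the published numbers:

```shell
# Hypothetical memtier_benchmark invocation mirroring the described setup:
# 10M distinct keys, 64-byte payloads, 64 threads x 100 clients = 6400 clients.
memtier_benchmark \
  --server=cachegrand-host --port=6379 --protocol=redis \
  --threads=64 --clients=100 \
  --key-maximum=10000000 --key-pattern=R:R \
  --data-size=64 \
  --ratio=1:1
```

For the pipelined comparison, `--pipeline=256` with `--clients=10` would be the analogous flags.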

cachegrand is able to scale almost linearly if enough CPU power is left to the operating system to process the network data.

GET Operations/s   SET Operations/s

Below is a comparison done using batching (pipelines) with 256 commands and 64-byte payloads; because of the extra load generated by the batching, the number of clients has been reduced to 10 per thread, for a total of 640 clients.

GET Operations/s   SET Operations/s

Latencies are great as well, especially taking into account that with 6400 clients over 64 cores the operating system doesn't have much room left to handle the network traffic.

Latency with 1 threads and 100 clients   Latency with 64 threads and 6400 clients

Key features:

  • Modular architecture to support widely used protocols, e.g. Redis, Prometheus, etc.
  • Time-series database for fast data writes and retrieval, with primitives built to handle different data types (e.g. small strings, large blobs, JSON documents, etc.) - work in progress;
  • Hashtable with lock-free and wait-free GET operations; SET and DELETE use localized spinlocks. The implementation is capable of digesting 2.1 billion records per second on a 1x AMD EPYC 7502 (see benches);
  • An extremely fast ad-hoc memory allocator for fixed-size allocations, the Fast Fixed Memory Allocator (FFMA), capable of allocating and freeing memory in O(1);
  • Linear vertical scalability when using the in-memory database: 2x CPUs means 2x requests (see benches);
  • Built for flash memories, to efficiently saturate the available IOPS of modern datacenter NVMe drives and SSDs - proof-of-concept support;

Planned Key Features:

  • More modules for additional platform compatibility, e.g. Memcache, AWS S3, etc., or to add support for monitoring, e.g. DataDog, etc.;
  • An ad-hoc network stack based on DPDK / Linux XDP (eXpress Data Path) and the FreeBSD network stack;
  • WebAssembly to provide AOT-compiled User Defined Functions, event hooks, and modules: you can use your preferred language to perform operations server side;
  • Replication groups and replica tags: tag data client side, or use server-side events to tag the data and determine how it will be replicated;
  • Active-Active last-write-wins data replication: it's a cache, so write to any node of a replication group to which the replication tags are assigned, with no need to worry about it;

More information is available in the docs folder.

The platform is written in C and validated via unit tests, Valgrind, and integration tests; it is also built with a set of compiler options to fortify the builds (#85).

Currently, it runs only on Linux, on Intel or AMD CPUs, and requires kernel v5.7 or newer; it will be ported to other platforms once it becomes more feature complete.

Please be aware that

cachegrand is not production ready and not feature complete: plenty of basic functionalities are still being implemented, and the documentation is lacking as it's being re-written, so please don't open issues for missing documentation.

The status of the project is tracked via GitHub using the project management board.

Issues & contributions

If you find any bug, malfunction, or regression, please feel free to open an issue or to fork the repository and submit your PRs! If you open an issue for a crash, if possible please enable sentry.io in the configuration file and try to reproduce the crash: a minidump will be automatically uploaded to sentry.io. Also, if you have built cachegrand from source, please attach the compiled binary to the issue, as sentry.io knows nothing of your own compiled binaries.

Performances

The platform is regularly benchmarked as part of the development process to ensure that no regressions slip through; more details can be found in the documentation.

How to install

Distro packages

Packages are currently not available; they are planned for the v0.3 milestone.

Build from source

Instructions on how to build cachegrand from the sources are available in the documentation.

Configuration

cachegrand comes with a default configuration, but for production use please review the documentation to ensure an optimal deployment.

Running cachegrand

cachegrand doesn't need to run as root, but please review the configuration section to ensure that enough lockable memory is allowed, enough files can be opened, the slab allocator is enabled, and enough huge pages are provided.
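For example, on a typical Linux host these limits can be raised along the following lines before launching the server. The exact values here are illustrative assumptions; take the real recommendations from the configuration documentation:

```shell
# Allow enough lockable memory and open files for the launching shell
# (illustrative values, not official recommendations):
ulimit -l unlimited        # lockable memory
ulimit -n 262144           # max open file descriptors

# Reserve huge pages (2 MiB pages on most x86-64 systems); requires root:
sudo sysctl -w vm.nr_hugepages=1024

# Verify the resulting limits:
ulimit -l
ulimit -n
grep HugePages_Total /proc/meminfo
```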

Before trying to start cachegrand, take a look at the performance tips available in the docs section, as they might provide valuable help!

Help

$ ./cachegrand-server --help
Usage: cachegrand-server [OPTION...]

  -c, --config-file=FILE     Config file (default config file
                             /usr/local/etc/cachegrand/cachegrand.conf )
  -l, --log-level=LOG LEVEL  log level (error, warning, info, verbose, debug)
  -?, --help                 Give this help list
      --usage                Give a short usage message

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

Start it locally

/path/to/cachegrand-server -c /path/to/cachegrand.yaml
[2022-06-05T10:26:08Z][INFO       ][program] cachegrand-server version v0.1.0 (built on 2022-07-05T10:26:07Z)
[2022-06-05T10:26:08Z][INFO       ][program] > Release build, compiled using GCC v10.3.0
[2022-06-05T10:26:08Z][INFO       ][program] > Hashing algorithm in use t1ha2
[2022-06-05T10:26:08Z][INFO       ][config] Loading the configuration from ../../etc/cachegrand.yaml
[2022-06-05T10:26:08Z][INFO       ][program] Ready to accept connections
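Since cachegrand speaks the Redis protocol, once the server logs "Ready to accept connections" a quick smoke test can be done with a stock redis-cli (the host and port here assume the defaults shown in the example above):

```shell
# Basic smoke test against a running cachegrand instance using redis-cli:
redis-cli -h 127.0.0.1 -p 6379 PING
redis-cli -h 127.0.0.1 -p 6379 SET greeting "hello"
redis-cli -h 127.0.0.1 -p 6379 GET greeting
```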

Docker

Download the example config file

curl https://raw.githubusercontent.com/danielealbano/cachegrand/main/etc/cachegrand.yaml.skel -o /path/to/cachegrand.yaml

Edit it with your preferred editor and then start cachegrand using the following command:

docker run \
  -v /path/to/cachegrand.yaml:/etc/cachegrand/cachegrand.yaml \
  --ulimit memlock=-1:-1 \
  --ulimit nofile=262144:262144 \
  -p 6379:6379 \
  --rm \
  cachegrand/cachegrand-server:latest