DXLog: Fast object logging, reorganization and recovery.

DXLog is developed by the operating systems group of the department of computer science of the Heinrich-Heine-University Düsseldorf. DXLog is stand-alone and can be used with existing Java applications but is also part of the distributed in-memory key-value store DXRAM.

DXLog logs many small (and also large) objects to disk very efficiently. It is a local component that receives objects to log in one of two ways: as a ByteBuffer storing one or more objects, or as a DXNet message which is deserialized according to its message type (for more information, see DXNet). DXLog is not responsible for the replication scheme, object affiliation or sending/receiving of objects; if you are interested in these, have a look at DXRAM. DXLog automatically assigns version numbers to objects for validation purposes, as logs are reorganized periodically or on demand. Furthermore, DXLog provides very fast object recovery: it reads, validates and error-checks millions of objects per second from logs on disk.
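To make the idea of "a ByteBuffer storing one or more objects" concrete, here is a minimal sketch of packing several objects into a single buffer and reading them back. The record layout (8-byte id, 4-byte length, payload) and the class name ObjectBatchSketch are assumptions made for illustration only; they are not DXLog's actual wire or log format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DXLog.
final class ObjectBatchSketch {

    // Assumed layout per object: [8-byte id][4-byte payload length][payload bytes].
    static ByteBuffer pack(long[] ids, byte[][] payloads) {
        int size = 0;
        for (byte[] payload : payloads) {
            size += Long.BYTES + Integer.BYTES + payload.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        for (int i = 0; i < ids.length; i++) {
            buffer.putLong(ids[i]).putInt(payloads[i].length).put(payloads[i]);
        }
        buffer.flip(); // switch from writing to reading
        return buffer;
    }

    // Walks the buffer and returns one "id:payload" string per object.
    static List<String> unpack(ByteBuffer buffer) {
        List<String> result = new ArrayList<>();
        ByteBuffer view = buffer.duplicate(); // leave the caller's position untouched
        while (view.hasRemaining()) {
            long id = view.getLong();
            byte[] payload = new byte[view.getInt()];
            view.get(payload);
            result.add(id + ":" + new String(payload, StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) {
        byte[][] payloads = {
            "small".getBytes(StandardCharsets.UTF_8),
            "another object".getBytes(StandardCharsets.UTF_8)
        };
        ByteBuffer batch = pack(new long[] {1L, 2L}, payloads);
        System.out.println(batch.remaining() + " bytes: " + unpack(batch));
    }
}
```

Batching several small objects into one buffer amortizes the per-call overhead, which is the motivation the paragraph above hints at.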

Important

DXLog is a research project that's under development. We do not recommend using the system in a production environment or with production data without having an external backup. Expect to encounter bugs. However, we are looking forward to bug reports and code contributions.

Features

A novel two-stage logging approach enabling fast recovery and providing high throughput while being memory efficient
A backup-side version control based on epochs to reduce memory consumption without impairing lookup performance
A highly concurrent log cleaning concept designed for handling many small data objects
A fast parallel recovery of servers storing hundreds of millions of small data objects
Optimized for SSDs
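The interplay of version numbers and log cleaning can be illustrated with a small sketch: during cleaning, only the newest version of each object survives. Everything below is an assumption for illustration (the class VersionCleaningSketch, the Entry fields, and the rule that a higher epoch always wins before versions are compared); the publications listed under Architecture describe what DXLog really does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not DXLog code.
final class VersionCleaningSketch {

    static final class Entry {
        final long id;
        final int epoch;
        final int version;

        Entry(long id, int epoch, int version) {
            this.id = id;
            this.epoch = epoch;
            this.version = version;
        }

        // Assumption: epochs are compared first, versions only within one epoch.
        boolean isNewerThan(Entry other) {
            return epoch != other.epoch ? epoch > other.epoch : version > other.version;
        }
    }

    // Returns the entries that are still valid, in their original log order.
    static List<Entry> clean(List<Entry> log) {
        Map<Long, Entry> latest = new HashMap<>();
        for (Entry entry : log) {
            Entry known = latest.get(entry.id);
            if (known == null || entry.isNewerThan(known)) {
                latest.put(entry.id, entry);
            }
        }
        List<Entry> survivors = new ArrayList<>();
        for (Entry entry : log) {
            if (latest.get(entry.id) == entry) { // identity check: only the newest instance stays
                survivors.add(entry);
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        List<Entry> log = Arrays.asList(
            new Entry(1, 0, 0), new Entry(2, 0, 0), new Entry(1, 0, 1), new Entry(1, 1, 0));
        for (Entry entry : clean(log)) {
            System.out.println("id=" + entry.id + " epoch=" + entry.epoch + " version=" + entry.version);
        }
    }
}
```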

Architecture

The architecture of DXLog is thoroughly described in the following publications:

Logging: High Throughput Log-based Replication for Many Small In-memory Objects
Recovery: Fast Parallel Recovery of Many Small In-memory Objects
More details: DXRAM's Fault-Tolerance Mechanisms Meet High Speed I/O Devices

For an overview of all threads and their dependencies refer to Threads.

Special classes:

BackupRangeCatalog: This class collects all backup ranges and enables adding/removing backup ranges. A backup range contains a secondary log buffer and a version buffer, each of which controls access to its log (secondary log or version log). For instance, to access a secondary log, the BackupRangeCatalog is used to get the corresponding secondary log buffer, which provides access to the secondary log.
Scheduler: This class allows communication between threads of different packages (e.g., the WriterThread triggers the ReorganizationThread to clean a secondary log).
DirectByteBufferWrapper: Depending on the hard drive access we need different ByteBuffers (heap for RandomAccessFile, direct otherwise) and access information (the array or address). This class wraps a ByteBuffer and the access information. Furthermore, the ByteBuffer is created when a DirectByteBufferWrapper is instantiated. Created direct ByteBuffers are always page-aligned.
WriteBufferTests: This class provides tests for the WriteBuffer. To enable the tests, execute DXLog with assertions ("-ea").
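The distinction between heap and direct ByteBuffers described for DirectByteBufferWrapper can be sketched with plain JDK calls. This is not the DirectByteBufferWrapper implementation: the class name BufferKindsSketch, the 4096-byte page size and the use of ByteBuffer.alignedSlice are assumptions, and alignedSlice needs Java 9 or newer, whereas DXLog itself targets Java 1.8.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, not DXLog code.
final class BufferKindsSketch {

    static final int PAGE_SIZE = 4096; // assumed page size

    // Heap buffers expose a backing array, which RandomAccessFile.write can consume.
    static ByteBuffer allocateHeap(int size) {
        return ByteBuffer.allocate(size);
    }

    // Over-allocates by one page and slices at the first page-aligned address (Java 9+).
    // The size is expected to be a multiple of PAGE_SIZE.
    static ByteBuffer allocateAlignedDirect(int size) {
        ByteBuffer raw = ByteBuffer.allocateDirect(size + PAGE_SIZE);
        ByteBuffer aligned = raw.alignedSlice(PAGE_SIZE);
        aligned.limit(size);
        return aligned;
    }

    // Writes the readable part of a heap buffer through its backing array.
    static void writeWithRandomAccessFile(Path file, ByteBuffer heapBuffer) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.write(heapBuffer.array(),
                      heapBuffer.arrayOffset() + heapBuffer.position(),
                      heapBuffer.remaining());
        }
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer heap = allocateHeap(PAGE_SIZE);
        heap.putInt(42).flip();

        Path file = Files.createTempFile("dxlog-sketch", ".log");
        try {
            writeWithRandomAccessFile(file, heap);
            System.out.println("bytes written: " + Files.size(file));
        } finally {
            Files.delete(file);
        }

        ByteBuffer direct = allocateAlignedDirect(2 * PAGE_SIZE);
        System.out.println("direct: " + direct.isDirect()
            + ", offset from page boundary: " + direct.alignmentOffset(0, PAGE_SIZE));
    }
}
```

Page alignment matters for the O_DIRECT and raw device modes, where the operating system typically requires aligned buffers; the heap buffer path has no such requirement.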

How to Build and Run

Requirements

DXLog requires Java 1.8 and runs on a Linux distribution of your choice (macOS might work as well but is not officially supported).

Building

The script build.sh bootstraps our build system, which uses Gradle to build DXLog. The build output is located in build/dist, either as a directory (dxlog) or as a zip package (dxlog.zip).

LogThroughputTest: Logging and Recovery Benchmark

The dxlog jar file contains a built-in benchmark that can be run to evaluate the performance of DXLog locally on a single node.

Deploy the build output to your cluster and run DXLog by executing the script dxlog in the bin subfolder:

./bin/dxlog ./config/dxlog.json

If there is no configuration file, DXLog creates one with default values before starting the benchmark.

The hard drive access can be configured in one of three ways: a RandomAccessFile accessing files in your file system (directory configurable, see "Usage information"), O_DIRECT, which bypasses the kernel's page cache but still uses the file system, or a raw device that DXLog writes to and reads from directly. Using a raw device requires several preparation steps:

Use an empty partition
If executed in nspawn container: add "--capability=CAP_SYS_MODULE --bind-ro=/lib/modules" to systemd-nspawn command in boot script
Get root access
mkdir /dev/raw
cd /dev/raw/
mknod raw1 c 162 1
modprobe raw
If /dev/raw/rawctl was not created: mknod /dev/raw/rawctl c 162 0
raw /dev/raw/raw1 /dev/<empty partition>
Execute DXLog as root user ("sudo -P" for nfs)

Usage information:

Args: <config_file> <log directory> <backup range size> <chunk count> <chunk size> <batch size> <workload> <number of updates> <enable recovery> <use recovery dummy>
  config_file: Path to the config file to use (e.g. ./config/dxlog.json). Creates new config with default value if file does not exist
  log directory: Path of directory to store logs in
  backup range size: The size of a backup range (half the size of a secondary log)
  chunk count: The number of chunks to log
  chunk size: The size of the chunks
  batch size: The number of chunks logged in a batch (minor impact on logging)
  workload: Workload to execute
     none: Finish after logging phase
     sequential: Update chunks in sequential order
     random: Update chunks randomly
     zipf: Update chunks according to zipf distribution
     hotncold: Update chunks according to hot-and-cold distribution
  number of updates: The number of updates
  enable recovery: True if all logged chunks should be recovered from disk after the update phase
  use recovery dummy: False to store all recovered chunks in DXMem, True to avoid writing to memory
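To give an intuition for the workload options above, the following sketch shows one common way to pick the next chunk index for each access pattern. It is not the benchmark's generator: the class WorkloadSketch, the Zipf exponent and the hot-and-cold fractions (share of hot chunks, share of accesses hitting them) are all assumed parameters chosen for illustration.

```java
import java.util.Random;

// Hypothetical illustration, not DXLog code.
final class WorkloadSketch {

    // sequential: walk through the chunks in order and wrap around.
    static int sequential(long updateNumber, int chunkCount) {
        return (int) (updateNumber % chunkCount);
    }

    // random: every chunk is equally likely.
    static int uniform(Random rnd, int chunkCount) {
        return rnd.nextInt(chunkCount);
    }

    // zipf: cumulative distribution with P(rank k) proportional to 1 / k^exponent.
    static double[] zipfCdf(int chunkCount, double exponent) {
        double[] cdf = new double[chunkCount];
        double sum = 0;
        for (int k = 1; k <= chunkCount; k++) {
            sum += 1.0 / Math.pow(k, exponent);
            cdf[k - 1] = sum;
        }
        for (int i = 0; i < chunkCount; i++) {
            cdf[i] /= sum; // normalize so the last value is 1.0
        }
        return cdf;
    }

    // Inverse transform sampling: binary search for the first rank whose CDF value is >= u.
    static int zipf(Random rnd, double[] cdf) {
        double u = rnd.nextDouble();
        int lo = 0;
        int hi = cdf.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // hotncold: a share of the accesses goes to a small "hot" prefix of the chunks.
    static int hotAndCold(Random rnd, int chunkCount, double hotDataFraction, double hotAccessFraction) {
        int hotCount = Math.max(1, (int) (chunkCount * hotDataFraction));
        if (hotCount >= chunkCount || rnd.nextDouble() < hotAccessFraction) {
            return rnd.nextInt(Math.min(hotCount, chunkCount));
        }
        return hotCount + rnd.nextInt(chunkCount - hotCount);
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        int chunkCount = 100_000;
        double[] cdf = zipfCdf(chunkCount, 1.0);
        System.out.println("sequential: " + sequential(100_001, chunkCount));
        System.out.println("random:     " + uniform(rnd, chunkCount));
        System.out.println("zipf:       " + zipf(rnd, cdf));
        System.out.println("hotncold:   " + hotAndCold(rnd, chunkCount, 0.1, 0.9));
    }
}
```

Skewed patterns such as zipf and hotncold repeatedly invalidate the same few chunks, which stresses log cleaning differently than sequential or uniform updates.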

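For example, to log 100000 chunks of 32 bytes each in batches of 10, then apply 1000000 updates with the random workload and recover all chunks into DXMem afterwards (backup range size 268435456 bytes), run the following command: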

./bin/dxlog ./config/dxlog.json /media/ssd/dxram_log/ 268435456 100000 32 10 random 1000000 true false
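The sizes implied by this command can be worked out with a few lines of arithmetic. The sketch below relies only on the statement from the usage information that a backup range is half the size of a secondary log; the class BenchmarkSizesSketch is hypothetical, and the payload figures ignore any per-entry log headers, whose size is not stated here.

```java
// Hypothetical illustration, not DXLog code.
final class BenchmarkSizesSketch {

    static final long MIB = 1024L * 1024L;

    // The usage information states: backup range size = half the size of a secondary log.
    static long secondaryLogSize(long backupRangeSize) {
        return 2 * backupRangeSize;
    }

    // Raw payload only; log entry headers are not included.
    static long payloadBytes(long chunkCount, long chunkSize) {
        return chunkCount * chunkSize;
    }

    public static void main(String[] args) {
        long backupRangeSize = 268_435_456L;
        long chunkCount = 100_000L;
        long chunkSize = 32L;
        long updates = 1_000_000L;

        // 256 MiB backup range, hence a 512 MiB secondary log
        System.out.println("backup range:    " + backupRangeSize / MIB + " MiB");
        System.out.println("secondary log:   " + secondaryLogSize(backupRangeSize) / MIB + " MiB");
        // 3,200,000 bytes logged initially, 32,000,000 bytes written by updates
        System.out.println("initial payload: " + payloadBytes(chunkCount, chunkSize) + " bytes");
        System.out.println("update payload:  " + payloadBytes(updates, chunkSize) + " bytes");
    }
}
```

With these parameters the updates write ten times the initial payload, so every chunk is overwritten ten times on average during the update phase.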

When using this benchmark for evaluation, make sure that logs from previous runs are removed from disk and that the space is freed:

rm /media/ssd/dxram_log/* && sudo fstrim -v /media/ssd/ && sleep 2 && ./bin/dxlog ./config/dxlog.json /media/ssd/dxram_log/ ...

