Benchmark Configurations

Jung-Sang Ahn edited this page May 3, 2015 · 1 revision

All options for running ForestDB-Benchmark are configured in the bench_config.ini file, which must exist in the same directory as the benchmark executable file.

Like an ordinary INI file, it consists of several sections, each containing a number of property-value pairs.

The example below is the default configuration setting:

[document]
ndocs = 1000000

[log]
filename = logs/ops_log

[db_config]
cache_size_MB = 2048
compaction_mode = auto
wbs_init_MB = 256
wbs_bench_MB = 4
bloom_bits_per_key = 0
compaction_style = level
fdb_wal = 4096
wt_type = b-tree
compression = false

[db_file]
filename = data/dummy
nfiles = 1

[population]
nthreads = 8
batchsize = 4096

[threads]
readers = 1
iterators = 0
writers = 1
reader_ops = 0
writer_ops = 0

[key_length]
distribution = normal
median = 32
standard_deviation = 2

[prefix]
level = 0
nprefixes = 100
distribution = uniform
lower_bound = 4
upper_bound = 12

[body_length]
distribution = normal
median = 512
standard_deviation = 32
compressibility = 100

[operation]
warmingup = 0
duration = 60
#nops = 1000000

batch_distribution = zipfian
batch_parameter1 = 0.0
batch_parameter2 = 8

batchsize_distribution = normal

read_batchsize_median = 5
read_batchsize_standard_deviation = 1

iterate_batchsize_median = 1000
iterate_batchsize_standard_deviation = 100

write_batchsize_median = 16
write_batchsize_standard_deviation = 2

write_ratio_percent = 1000
write_type = sync

[compaction]
threshold = 50
period = 15

[latency_monitor]
rate = 100
max_samples = 1000000

[document]

  • ndocs: The total number of documents (i.e., the working set size).

[log]

  • filename: The prefix of the path names of log files. A 32-bit Unix time value and the .txt extension are appended to form the actual log file name (e.g. ops_log_1412565169.txt).

[db_config]

This section defines DB module-specific configurations.

  • cache_size_MB: The size of the buffer cache that each DB module manages by itself, in the unit of MiB. Note that Couchstore does not have its own cache, so this property is ignored in couch_bench.

  • compaction_mode: The compaction mode of ForestDB.

    compaction_mode = auto: ForestDB will enable auto-compaction daemon.

    compaction_mode = manual: The auto-compaction daemon in ForestDB is disabled, but the benchmark program will periodically invoke compaction.
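
    For example, a sketch of a manual-compaction setup, combined with the [compaction] settings described later on this page, might look like:

    ```ini
    [db_config]
    ; disable ForestDB's auto-compaction daemon;
    ; the benchmark program invokes compaction instead
    compaction_mode = manual

    [compaction]
    ; trigger compaction when the stale-data ratio exceeds 50%
    threshold = 50
    ; check the threshold periodically (period = 15 in the default config)
    period = 15
    ```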

  • wbs_init_MB: The size of write buffer for LevelDB and RocksDB in the unit of MiB, during the initial bulk load.

  • wbs_bench_MB: The size of write buffer for LevelDB and RocksDB in the unit of MiB, during the benchmark.

  • bloom_bits_per_key: The average bits per key for bloom filters in LevelDB and RocksDB. Bloom filters are disabled if this value is set to zero.

  • compaction_style: The compaction style of RocksDB.

    compaction_style = level: Level-style compaction (default).

    compaction_style = universal: Universal-style compaction.

  • fdb_wal: The size of WAL in ForestDB, in the unit of the number of documents.

  • wt_type: The type of indexing scheme in WiredTiger.

    wt_type = b-tree: Use B+tree-based indexing.

    wt_type = lsm-tree: Use LSM-tree-based indexing.

  • compression: Enable/disable document compression using Snappy.

[db_file]

  • filename: The prefix of the path names of DB instances.

  • nfiles: The number of DB instances that will be concurrently accessed and modified. If there are m DB instances, the n documents are evenly divided and inserted into them, so that each DB instance accommodates n/m documents.

    • LevelDB and RocksDB open up to 1,000 files simultaneously per instance, so the maximum open-file limit needs to be raised accordingly: ulimit -n [max_file_open_value]
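
    As an illustration, splitting the default working set of 1,000,000 documents across four DB instances gives each instance 250,000 documents:

    ```ini
    [document]
    ndocs = 1000000

    [db_file]
    filename = data/dummy
    ; 1000000 / 4 = 250000 documents per instance
    nfiles = 4
    ```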

[population]

  • nthreads: The number of concurrent threads that populate each DB instance during the initial load. If this value is larger than the number of DB instances (i.e., [db_file]:nfiles), it is automatically re-adjusted to nfiles.

  • batchsize: The write batch size during the initial load.

[threads]

  • readers: The number of reader threads for the benchmark.

  • iterators: The number of iterator (range scan) threads for the benchmark.

  • writers: The number of writer threads for the benchmark.

  • reader_ops: The throughput of reader (including iterator) threads in the unit of operations per second.

    reader_ops = 0: Reader threads run at their maximum capacity.

    reader_ops = 1000: Reader threads run at a constant throughput of 1,000 ops/sec.

  • writer_ops: The throughput of writer threads in the unit of operations per second.

    writer_ops = 0: Writer threads run at their maximum capacity.

    writer_ops = 1000: Writer threads run at a constant throughput of 1,000 ops/sec.
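
    For instance, a read-heavy workload with unthrottled readers and rate-limited writers could be sketched as follows (whether the ops/sec limit applies per thread or in aggregate is not specified here):

    ```ini
    [threads]
    readers = 4
    iterators = 0
    writers = 1
    ; readers run at maximum capacity
    reader_ops = 0
    ; writers are throttled to 500 ops/sec
    writer_ops = 500
    ```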

[key_length]

This section specifies the range of document key length.

  • distribution: The distribution of key length.

    • distribution = normal: The key length will follow a normal distribution.

      median: Median (mean) value of key length.

      standard_deviation: Standard deviation of the key length distribution.

      For example, if median=32 and standard_deviation=2, the average key length will be 32 bytes, and 68.2%, 95.4%, and 99.7% of key lengths will be ranged between 30--34, 28--36, and 26--38 bytes, respectively.

    • distribution = uniform: The key length generation will follow a uniform distribution.

      lower_bound: Lower boundary value of key length.

      upper_bound: Upper boundary value of key length.

      For example, if lower_bound=28 and upper_bound=34, the key lengths will be uniformly distributed in the range of 28--34 bytes.

[prefix]

ForestDB-Benchmark supports nested-prefix key generation, so that groups of keys share common prefix strings.

  • level: The depth of nested prefixes.

    level = 0: There is no common prefix between keys (e.g. aaa, bbb, ccc, and ddd).

    level = 1: One level of common prefixes between keys (e.g. aaa, aab, bba, and bbb).

    level = 2: Two levels of common prefixes between keys (e.g. aaa, aab, aba, abb, baa, bab, bba, and bbb).

  • nprefixes: The number of common prefixes for each level.

  • distribution: The distribution of prefix lengths for each level. These options are configured in the same way as in the [key_length] section.

[body_length]

This section specifies the range of the document body (i.e., value) length. The details are the same as in the [key_length] section.

  • compressibility: The proportion of compressible strings in the document body, as a percentage of the entire body size. If this value is set to 100, the entire document body consists of compressible data; by contrast, if it is set to zero, the document body is not compressible at all.
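
    For example, to evaluate Snappy compression (enabled via [db_config]) on documents whose bodies are roughly half compressible:

    ```ini
    [db_config]
    ; enable document compression using Snappy
    compression = true

    [body_length]
    distribution = normal
    median = 512
    standard_deviation = 32
    ; about half of each document body is compressible data
    compressibility = 50
    ```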

[operation]

  • warmingup: The duration of the warming-up phase, in the unit of seconds. Benchmark statistics gathered during the warming-up phase are not reflected in the benchmark results.

  • duration: The total running time of the benchmark in the unit of second.

  • nops: The total number of operations to be performed during the benchmark. It can be used together with duration.

    duration = 60: Run the benchmark for 60 seconds.

    nops = 1000000: Perform 1M operations.

    duration = 60
    nops = 1000000: The benchmark terminates either 1) after 60 seconds of running, or 2) after 1M operations have been performed, whichever comes first.

  • batch_distribution: The distribution of operations to be performed.

    • batch_distribution = zipfian: Operations will follow a Zipfian (Zipf's) distribution.

      batch_parameter1: The locality parameter that characterizes the distribution. If this value is set to 0, the distribution is exactly the same as a uniform random distribution. As the parameter gets larger, the locality of the distribution becomes higher. If this value is larger than 2.0, the locality becomes extremely high, so that only a few documents are mainly accessed.

      batch_parameter2: The size of a group. A group consists of several documents that share the same locality.

    • batch_distribution = uniform: Operations will be performed uniformly at random over all documents.

  • batchsize_distribution: The distribution of read/write batch size.

    • batchsize_distribution = normal: Each operation batch size will follow a normal distribution.

      read_batchsize_median: The median value of the read batch size.

      read_batchsize_standard_deviation: The standard deviation of the read batch size.

      iterate_batchsize_median: The median value of the range scan batch size.

      iterate_batchsize_standard_deviation: The standard deviation of the range scan batch size.

      write_batchsize_median: The median value of the write batch size.

      write_batchsize_standard_deviation: The standard deviation of the write batch size.

    • batchsize_distribution = uniform: Each operation batch size will follow a uniform distribution.

      read_batchsize_lower_bound: The lower boundary value of the read batch size.

      read_batchsize_upper_bound: The upper boundary value of the read batch size.

      iterate_batchsize_lower_bound: The lower boundary value of the range scan batch size.

      iterate_batchsize_upper_bound: The upper boundary value of the range scan batch size.

      write_batchsize_lower_bound: The lower boundary value of the write batch size.

      write_batchsize_upper_bound: The upper boundary value of the write batch size.

  • write_ratio_percent: The ratio of write operations to the total number of operations, in the unit of percent. If this value is set larger than 0 and smaller than 100, reader and writer threads wait for each other to strictly maintain the given ratio, and their configured throughput is ignored. If this value is set larger than 100, the ratio becomes unrestricted, so that reader and writer threads run at their configured throughput.

    write_ratio_percent = 0: Only read operations will be performed.

    write_ratio_percent = 100: Only write operations will be performed.

    write_ratio_percent = 1000: Unrestricted ratio (maximum capacity).
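
    As an illustration, a strict 80% read / 20% write mix would be configured as:

    ```ini
    [operation]
    ; 20% of all operations are writes; reader and writer threads
    ; synchronize with each other to maintain this ratio exactly,
    ; so the reader_ops/writer_ops throughput settings are ignored
    write_ratio_percent = 20
    ```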

  • write_type

    write_type = sync: DB modules will perform synchronous writes.

    write_type = async: DB modules will perform asynchronous writes.

[compaction]

  • threshold: The compaction threshold used in both ForestDB and Couchstore. If this value is set to zero, compaction will be disabled.

  • period: The period at which compaction is triggered. This option is used for both ForestDB (compaction) and WiredTiger (checkpoint).

[latency_monitor]

  • rate: The sampling rate for latency monitoring, in the unit of Hertz.

  • max_samples: The maximum number of samples that reside in memory. If the number of samples exceeds this limit, old samples are automatically removed in a circular manner.
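
    With the default settings, a little arithmetic shows how long the sample buffer lasts before it wraps around: at 100 samples per second, 1,000,000 samples cover 10,000 seconds (about 2.8 hours) of monitoring before old entries start being overwritten.

    ```ini
    [latency_monitor]
    ; 100 samples per second
    rate = 100
    ; 1000000 / 100 = 10000 seconds (~2.8 hours) of samples
    ; before the circular buffer overwrites the oldest entries
    max_samples = 1000000
    ```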