
YCSB on 9.1G Movie Data

rockeet edited this page Aug 23, 2018 · 15 revisions


Table of Contents

  • 1.Introduction
  • 2.Test Method
    • 2.1.Hardware
  • 3.Write Performance
  • 4.Read Performance
    • 4.1.Data Much Smaller than Memory (64GB)
    • 4.2.Data Slightly Smaller than Memory (8GB)
    • 4.3.Data Slightly Larger than Memory (4GB)
    • 4.4.Data Much Larger than Memory (2GB)

1.Introduction

We've embedded TerarkDB into MongoDB's community distribution as a storage engine, and we will keep publishing benchmark reports in the future.

  • TerarkDB is a storage engine built on RocksDB with our own SST (Static Sorted Table) implementation
  • Mongo-Rocks is Facebook's official adapter that lets MongoDB use RocksDB as a storage engine
  • Mongo-Terocks is a modified Mongo-Rocks that uses TerarkDB
  • The MongoDB version is r3.5.4

2.Test Method

  • Test Tool
  • Test Dataset
  • Test Dataset Details
    • About 9.1GB
    • About 8 million records
    • About 1KB per record
  • Storage Engines
  • Read tests are run under both a Uniform Distribution and a Zipf Distribution
  • We measured the 95/99 percentile read latency
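The two access patterns and the latency metric above can be sketched in a few lines of Python. This is my own illustration, not part of the benchmark harness: the nearest-rank percentile and the 1/rank Zipf weighting (exponent 1, restricted to the hottest 1,000 keys for speed) are assumptions, and the latency samples are synthetic.

```python
import random

def percentile(samples, p):
    """Nearest-rank percentile: the value below which p% of samples fall."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, int(p / 100.0 * len(ordered)) - 1))
    return ordered[rank]

NUM_RECORDS = 8_000_000  # dataset size from this report

def uniform_key():
    """Uniform distribution: every record is equally likely to be read."""
    return random.randrange(NUM_RECORDS)

# Zipf-like distribution: probability of rank r is proportional to 1/r,
# so a few hot keys absorb most reads (exponent 1 assumed).
ZIPF_N = 1000
weights = [1.0 / r for r in range(1, ZIPF_N + 1)]

def zipf_key():
    return random.choices(range(ZIPF_N), weights=weights, k=1)[0]

# Synthetic latency samples (microseconds), just to exercise percentile().
latencies_us = [random.expovariate(1 / 200.0) for _ in range(10_000)]
p95 = percentile(latencies_us, 95)
p99 = percentile(latencies_us, 99)
```

In a real YCSB run the request distribution is chosen by the workload configuration rather than by client code like this; the sketch only shows what "uniform vs. Zipf" and "95/99 percentile latency" mean.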

2.1.Hardware

  • Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
  • Kingston 16G @ 2133 MHz x4 (64G Total)
  • SanDisk SD8SBAT256G1122 (256G SSD)
  • Ubuntu 16.04.2 LTS

3.Write Performance

  • The graphs below show the write speed and the 95/99 percentile write latency.
    [Figures: Insert OPS, Insert Latency]

4.Read Performance

Before starting the read performance tests, we write all data into the database, then restart the test server and begin the test.

Note that all tests use both the Uniform Distribution and the Zipf Distribution, except the 2GB-memory case.

  • The memory limit is achieved by running inside a virtual machine.
  • RocksDB has the allow_mmap_reads option enabled, with a block size of 4KB.
  • WiredTiger and TerarkDB use their default configurations.
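For reference, the RocksDB settings above could be written in a RocksDB OPTIONS file roughly like this. This is a hedged sketch: only allow_mmap_reads and the 4KB block size come from this report; the section layout follows RocksDB's OPTIONS-file conventions, and the report does not say how the options were actually passed in.

```ini
[DBOptions]
  allow_mmap_reads=true

[TableOptions/BlockBasedTable "default"]
  block_size=4096
```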

4.1.Data Much Smaller than Memory (64GB)

  • These graphs show the compressed data size and memory usage.
    [Figures: Storage Size, Memory Usage]
  • Since the compression ratio does not depend on the memory limit used in the read tests, we do not repeat the Storage Size chart in the rest of this article.
  • All subsequent tests use the same dataset.
  • The YCSB client used more than 240% CPU throughout the test.
    [Figures: Read QPS Unlimited, Read Latency Unlimited]

4.2.Data Slightly Smaller than Memory (8GB)

  • In this scenario the data is slightly smaller than memory, so we set the database cache (which holds decompressed data) to 4GB.
    • (WiredTiger and RocksDB both recommend using half of physical memory as the cache.)
  • TerarkDB needs only 2.84GB of memory, much less than 8GB, so its read performance is unaffected.
  • The 95/99 percentile read latencies are from the Uniform Distribution test.
    [Figures: Read QPS 8G, Read Latency_8G]

4.3.Data Slightly Larger than Memory (4GB)

  • The data is slightly larger than memory, so we set the database cache to 2GB.
  • TerarkDB needs only 2.84GB of memory, much less than 4GB, so it is unaffected.
  • The 95/99 percentile read latencies are from the Uniform Distribution test.
    [Figures: Read QPS 4G, Read Latency_4G]

4.4.Data Much Larger than Memory (2GB)

  • No engine has enough memory, so all of them are affected.
  • The bottleneck is file system IO, and every engine's performance drops sharply.
  • The 95/99 percentile read latencies are from the Uniform Distribution test.
    [Figures: Read QPS 2G, Read Latency_2G]
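The memory-vs-data effect running through sections 4.1 to 4.4 can be illustrated with a toy cache model. This is my own sketch, not part of the original benchmark: under uniformly distributed reads, an LRU cache's steady-state hit rate is roughly cache_capacity / num_keys, so once the dataset far exceeds memory most reads fall through to the file system.

```python
from collections import OrderedDict
import random

def lru_hit_rate(num_keys, cache_capacity, num_accesses, seed=42):
    """Simulate uniform random reads against an LRU cache; return the hit rate."""
    rng = random.Random(seed)
    cache = OrderedDict()  # keys kept in LRU order, oldest first
    hits = 0
    for _ in range(num_accesses):
        key = rng.randrange(num_keys)
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > cache_capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / num_accesses

# Data slightly larger than memory vs. data much larger than memory:
near_fit = lru_hit_rate(num_keys=1000, cache_capacity=500, num_accesses=50_000)
overflow = lru_hit_rate(num_keys=1000, cache_capacity=100, num_accesses=50_000)
```

With half the keys cached the hit rate settles near 0.5; with a tenth cached it settles near 0.1, which is why every engine slows down sharply in the 2GB case, while TerarkDB's smaller working set keeps it unaffected in the 8GB and 4GB cases.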


