Igor Canadi edited this page Feb 12, 2018 · 7 revisions

Welcome to the MongoRocks wiki

MongoRocks is MongoDB with RocksDB as a storage engine. It plugs into MongoDB through the storage engine API, which was released as part of MongoDB 3.0: http://docs.mongodb.org/manual/faq/storage/

To report bugs, ask questions or leave feedback, please use the MongoRocks Google Group: https://groups.google.com/forum/#!forum/mongo-rocks

Repositories

In version 3.0, the RocksDB code is part of the main Mongo repository. In 3.2 and going forward, the code for MongoRocks will be a separate module. There are currently two repositories:

  • https://github.com/mongodb-partners/mongo-rocks -- This is the repository for the MongoRocks module for versions 3.2 and going forward. It's still in active development, as is MongoDB version 3.2.
  • https://github.com/mongodb-partners/mongo -- This is the fork of the MongoDB repository, used for developing version 3.0. There is one development branch and many release branches:
      • v3.0-fb -- Development branch. This is where all the new commits and fixes go.
      • v3.0.8-mongorocks -- MongoRocks 3.0.8 release. We will keep releasing new versions in the format 'v3.0.x-mongorocks'.

This all matters to you if you wish to compile from source. If not, we will publish binaries to our Google Group regularly.

Features

For the most part, running MongoDB with the RocksDB storage engine should be transparent to the user. However, there are some cool features and configuration options that can make your experience even better.

Backups

RocksDB's files are immutable. This means that backups are easy and fast: first find the list of live files, then hard link them into a different directory (or copy them if the destination is on a different file system). To issue a backup, you can call:

db.adminCommand({setParameter:1, rocksdbBackup: "/var/lib/mongodb/backup/1"})

(Yes, we're aware that it's a bit silly to use setParameter API to issue backups. We're planning to move to MongoDB's command API in 3.2)

This will create the directory /var/lib/mongodb/backup/1 (it must not exist beforehand) and hard link all the relevant files into it. You can then copy those files to S3 or HDFS in the background. We're building a tool that will incrementally back up MongoRocks to S3. Keep an eye on the MongoRocks Google Group for the announcement.
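Because the backup directory must not already exist, each backup needs a fresh path. Here is a minimal sketch of one way to generate it, assuming a UTC timestamp is unique enough for your backup schedule. The helper name buildBackupCommand is hypothetical and not part of MongoRocks; only the rocksdbBackup parameter comes from this page.

```javascript
// Hypothetical helper (not part of MongoRocks): builds the backup command
// document for a new, timestamped directory under baseDir.
function buildBackupCommand(baseDir, date) {
    function pad(n) { return (n < 10 ? "0" : "") + n; }
    // UTC timestamp of the form YYYYMMDD-HHMMSS.
    var stamp = date.getUTCFullYear() + pad(date.getUTCMonth() + 1) + pad(date.getUTCDate()) +
        "-" + pad(date.getUTCHours()) + pad(date.getUTCMinutes()) + pad(date.getUTCSeconds());
    // Strip trailing slashes so the resulting path has no "//" in it.
    var dir = baseDir.replace(/\/+$/, "") + "/" + stamp;
    return { setParameter: 1, rocksdbBackup: dir };
}

var backupCommand = buildBackupCommand("/var/lib/mongodb/backup", new Date());
// In the mongo shell, issue it with: db.adminCommand(backupCommand)
```

The function only builds the command document, so it can be checked without a running server. The actual backup still happens when the document is passed to db.adminCommand.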

Compact the database instance

RocksDB's writes are very fast because the bulk of the work is done in the background in a process called compaction. Compactions are triggered automatically when the state of the LSM tree becomes non-ideal. However, you can also trigger a compaction manually. After the compaction is done, your reads will be faster and the space used on disk a bit smaller (approximately 10%). To schedule a manual compaction, you can call:

db.adminCommand({setParameter:1, rocksdbCompact: 1})

Configuration

Configuring RocksDB is a bit of an art. We hope that the default configuration will be good for most cases, but you can always get better performance by tuning, especially if your workload is special in some way.

There are a couple of parameters you can configure:

  • --rocksdbCacheSizeGB or storage.rocksdb.cacheSizeGB -- size of RocksDB's block cache. By default 30% of RAM. We keep uncompressed pages in the block cache and compressed pages in the kernel's page cache. You can also configure block cache size dynamically by calling: db.adminCommand({setParameter:1, rocksdbRuntimeConfigCacheSizeGB: 10})
  • --rocksdbCompression or storage.rocksdb.compression -- compression. By default this is snappy. Other available options are none and zlib. If your binary doesn't support the requested compression, opening the database will fail.
  • --rocksdbConfigString or storage.rocksdb.configString -- through this parameter you can configure all other RocksDB options. The format is the same as option string described here: https://github.com/facebook/rocksdb/wiki/Option-String-and-Option-Map#option-string
  • --rocksdbMaxWriteMBPerSec or storage.rocksdb.maxWriteMBPerSec -- default is 1024. RocksDB compactions can create spiky writes to IO, which can cause higher P99 storage read latency. You can use this option to smooth out the writes. For example, if you set this to 100, RocksDB will make sure never to write more than 100MB/s to storage. That way writes will be smoother and there will be storage bandwidth available for reads to go through. You can also change this value dynamically by calling db.adminCommand({setParameter:1, rocksdbRuntimeConfigMaxWriteMBPerSec:30})
  • --rocksdbCrashSafeCounters or storage.rocksdb.crashSafeCounters -- false by default. This means that if your database performs an unclean shutdown, the counters for the number of records in a collection might be wrong. You can correct them with MongoDB's validate call. This is similar to WiredTiger behavior. If you set this option to true, then we'll make sure that counters are correct even after a crash. Write performance might suffer a bit, of course.
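Two of the options above can be changed on a running server. As a rough sketch, the two setParameter documents can be built together and sanity-checked before being sent. The helper name buildRuntimeTuning and its positive-number check are assumptions made for illustration; only the two rocksdbRuntimeConfig parameter names come from the list above.

```javascript
// Hypothetical helper (not part of MongoRocks): returns the setParameter
// documents for the two runtime-tunable options named in the list above.
function buildRuntimeTuning(cacheSizeGB, maxWriteMBPerSec) {
    // Assumed sanity check: reject zero, negative, or non-numeric values.
    if (!(cacheSizeGB > 0) || !(maxWriteMBPerSec > 0)) {
        throw new Error("both values must be positive numbers");
    }
    return [
        { setParameter: 1, rocksdbRuntimeConfigCacheSizeGB: cacheSizeGB },
        { setParameter: 1, rocksdbRuntimeConfigMaxWriteMBPerSec: maxWriteMBPerSec }
    ];
}

var tuningCommands = buildRuntimeTuning(10, 100);
// In the mongo shell, apply them with:
//   tuningCommands.forEach(function (c) { printjson(db.adminCommand(c)); });
```

The other options (compression, config string, crash-safe counters) are only described here as startup options, so this sketch leaves them out.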

Things to be careful about

In this section, we'll write about any issues that might happen when running MongoRocks and how to fix them. Currently, we're aware of one thing to be careful about.

Number of file descriptors

By default, the Linux kernel allows each process to use only 1024 file descriptors. MongoRocks is configured in such a way that it uses 32MB files, so if your database is 1TB in size, you'll need 32K files. Before running MongoRocks, please increase the number of file descriptors that the MongoDB process can use. Here's the recommended ulimit setting from MongoDB's docs: http://docs.mongodb.org/manual/reference/ulimit/#recommended-ulimit-settings
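The arithmetic behind the 32K figure can be sketched as follows. This is a simplification that assumes every file is exactly 32MB and that all files are open at once. The function name estimateFileCount is hypothetical.

```javascript
// Hypothetical helper: rough estimate of how many data files (and so file
// descriptors) a database of the given size needs, assuming uniform file size.
function estimateFileCount(dbSizeBytes, fileSizeBytes) {
    return Math.ceil(dbSizeBytes / fileSizeBytes);
}

var MB = 1024 * 1024;
var TB = 1024 * 1024 * MB;

// 1TB of data in 32MB files: 2^40 / 2^25 = 2^15 = 32768 files (the "32K" above).
var filesNeeded = estimateFileCount(1 * TB, 32 * MB);
```

The result is far above the default limit of 1024, and the process also needs descriptors for client connections and other files on top of this estimate.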

Monitoring and debugging

Run db.serverStatus()["rocksdb"] and enjoy. We'll write a separate wiki page explaining what all of this means. In the meantime, you can start here: https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide#compaction-stats

Blog posts

These blog posts describe Parse's experiences with MongoDB with RocksDB:
