Issue 2289: Tune RocksDB settings #2763

RaulGracia
Contributor

Change log description
This PR exposes three RocksDB parameters as tuning knobs in the Pravega configuration file (writeBufferSizeMB, readCacheSizeMB, and cacheBlockSizeKB). They take effect when RocksDB cache instances are instantiated (RocksDBCache.java). We expect these parameters to be enough for a user to set an adequate RocksDB configuration, in terms of memory usage and performance, for a specific deployment.

Purpose of the change
Fixes #2289.

What the code does
This PR exposes parameters for tuning RocksDB in the Pravega configuration file. To keep configuration complexity low, we decided to expose three main tuning knobs:

  • writeBufferSizeMB: RocksDB buffers writes in memory (memtables) to improve write performance and flushes them to disk asynchronously. This parameter bounds the maximum amount of memory devoted to absorbing writes.
  • readCacheSizeMB: RocksDB caches (uncompressed) data blocks in memory so that read requests that hit the cache are served quickly. This parameter bounds the maximum amount of memory devoted to caching uncompressed data blocks.
  • cacheBlockSizeKB: RocksDB keeps internal indexes in memory (they may account for between 5% and 30% of the total memory consumption, depending on the configuration and the data at hand). The size of these indexes mainly depends on the size of the cached data blocks. If you increase this parameter, the number of blocks decreases, so the index size also shrinks linearly, at the cost of higher read amplification.
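
To make the unit conversions and the block-count argument above concrete, here is a minimal, self-contained sketch. It is not the code in RocksDBCache.java or RocksDBConfig.java: the class `RocksDbKnobs` and its methods are hypothetical, the `rocksdb.readCacheSizeMB` key is assumed by analogy with `rocksdb.writeBufferSizeMB`, and the read cache and block sizes are placeholder values rather than Pravega defaults.

```java
import java.util.Properties;

/**
 * Illustrative helper only. This class is hypothetical and is not part of Pravega.
 */
final class RocksDbKnobs {

    /** Converts a size given in megabytes to bytes. */
    static long mbToBytes(int megabytes) {
        return megabytes * 1024L * 1024L;
    }

    /** Converts a size given in kilobytes to bytes. */
    static long kbToBytes(int kilobytes) {
        return kilobytes * 1024L;
    }

    /**
     * Upper bound on the number of data blocks that fit in the read cache.
     * Doubling the block size halves this count, which is why the index
     * is expected to shrink as cacheBlockSizeKB grows.
     */
    static long blocksInCache(int readCacheSizeMB, int cacheBlockSizeKB) {
        if (cacheBlockSizeKB <= 0) {
            throw new IllegalArgumentException("cacheBlockSizeKB must be positive");
        }
        return mbToBytes(readCacheSizeMB) / kbToBytes(cacheBlockSizeKB);
    }

    /** Reads an integer property, falling back to a default when the key is absent. */
    static int readInt(Properties props, String key, int fallback) {
        String raw = props.getProperty(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // 64 is the value shown (commented out) in the configuration file.
        props.setProperty("rocksdb.writeBufferSizeMB", "64");
        // Assumed key name and placeholder value, for illustration only.
        props.setProperty("rocksdb.readCacheSizeMB", "8");

        int writeBufferMB = readInt(props, "rocksdb.writeBufferSizeMB", 64);
        int readCacheMB = readInt(props, "rocksdb.readCacheSizeMB", 8);

        System.out.println("write buffer bytes: " + mbToBytes(writeBufferMB));
        // Placeholder block sizes: each doubling halves the number of cached blocks.
        for (int blockKB : new int[] {32, 64, 128}) {
            System.out.println(blockKB + " KB blocks -> at most "
                    + blocksInCache(readCacheMB, blockKB) + " blocks in cache");
        }
    }
}
```

With an 8 MB cache, moving from 32 KB to 64 KB blocks halves the block count (256 to 128), so the index has half as many blocks to cover. The trade-off is that a read for a small value then loads a larger block.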

Note that this PR only provides the means for configuring RocksDB. While some local tests have been done, there is still a need for a thorough analysis to understand the impact of RocksDB configuration on the performance of a Pravega deployment.

How to verify it
All system tests should pass as before. With the default settings, longevity runs should exhibit behavior similar to that described in #2760.

@RaulGracia RaulGracia force-pushed the issue-2289-rocksdb-performance-and-memory-limit branch from 407607a to 131b2b6 Compare July 27, 2018 17:13
codecov bot commented Jul 27, 2018

Codecov Report

Merging #2763 into master will increase coverage by 0.08%.
The diff coverage is 100%.

@@             Coverage Diff              @@
##             master    #2763      +/-   ##
============================================
+ Coverage     78.18%   78.27%   +0.08%     
- Complexity     7112     7156      +44     
============================================
  Files           547      548       +1     
  Lines         27924    28102     +178     
  Branches       2619     2634      +15     
============================================
+ Hits          21833    21997     +164     
+ Misses         4554     4549       -5     
- Partials       1537     1556      +19

Impacted Files                                         Coverage Δ             Complexity Δ
...egmentstore/storage/impl/rocksdb/RocksDBCache.java  68.8% <100%> (+4.22%)  14 <1> (ø) ⬇️
...gmentstore/storage/impl/rocksdb/RocksDBConfig.java  100% <100%> (ø)        7 <3> (+3) ⬆️
...o/pravega/client/stream/impl/SegmentWithRange.java  84.61% <0%> (-7.06%)   8% <0%> (ø)
...oller/server/rpc/auth/StrongPasswordProcessor.java  87.87% <0%> (-6.07%)   9% <0%> (-1%)
...a/io/pravega/controller/store/stream/ZKStream.java  96.12% <0%> (-2.89%)   138% <0%> (+6%)
...tore/server/reading/RedirectedReadResultEntry.java  86.88% <0%> (-1.64%)   25% <0%> (-1%)
...ga/controller/task/Stream/StreamMetadataTasks.java  83.44% <0%> (-1.02%)   133% <0%> (-2%)
.../server/logs/SegmentMetadataUpdateTransaction.java  88.39% <0%> (-0.9%)    82% <0%> (-1%)
.../io/pravega/client/stream/impl/ControllerImpl.java  82.71% <0%> (-0.28%)   129% <0%> (ø)
...a/segmentstore/server/logs/OperationProcessor.java  84.92% <0%> (ø)        40% <0%> (+1%) ⬆️
... and 14 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

…cksDB and added statistics tracking.

Signed-off-by: Raúl Gracia <raul.gracia@emc.com>
… of RocksDB. Closing statistics object.

Signed-off-by: Raúl Gracia <raul.gracia@emc.com>
…in RocksDB that makes the memory used for index and filters bounded by block cache size.

Signed-off-by: Raúl Gracia <raul.gracia@emc.com>
…age.

Signed-off-by: Raúl Gracia <raul.gracia@emc.com>
…ache.

Signed-off-by: Raúl Gracia <raul.gracia@emc.com>
@RaulGracia RaulGracia force-pushed the issue-2289-rocksdb-performance-and-memory-limit branch from 0f17023 to 3cddce9 Compare July 28, 2018 08:53
@@ -55,6 +66,9 @@
private final String dbDir;
private final String logId;
private final Consumer<String> closeCallback;
private final Integer writeBufferSizeMB;
Member

"int", not "Integer".

fix below too

Contributor Author

Changed in RocksDBCache.java and RocksDBConfig.java.

# number of blocks will decrease, so the index size will also reduce linearly. For an explanation in depth of these
# parameters, we refer to: https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB.

#rocksdb.writeBufferSizeMB=64
Member

It would be really helpful to a sys admin if each of these config settings has its own individual description. What I've done throughout this file was to simply copy the Javadoc from each of these here, and add a Recommended value to each.

This is in addition to the blob of text you added just above this.

Contributor Author

@RaulGracia RaulGracia Aug 2, 2018

Thanks, I added the description of each parameter with the comment available in RocksDBConfig.java. Moreover, I think that a "recommended value" for these parameters is somewhat complex in this context as it may depend on various aspects (e.g., resources, requirements). For this reason, I will create a Wiki page in the Pravega repository (https://github.com/pravega/pravega/wiki/RocksDB-Configuration) including experiments and guidelines for users to configure RocksDB depending on their needs. Does this sound reasonable?

Signed-off-by: Raúl Gracia <raul.gracia@emc.com>
@andreipaduroiu andreipaduroiu merged commit 0cc47c3 into pravega:master Aug 2, 2018
3 participants