MySQLOnRocksDB/mysql-5.6
forked from facebook/mysql-5.6

MyRocks data size is greater than InnoDB #80
Excellent, can you also send me the values of any rocksdb config options set in my.cnf? Will take me a few hours to respond.
There is no rocksdb configuration in my.cnf, so everything is at its defaults.
Can you tell me what is in your RocksDB LOG file (named "LOG") for "Compression algorithms supported"? Mine shows:
2015/06/04-04:33:23.366528 7faff86d38c0 Compression algorithms supported:
2015/06/04-04:33:23.366530 7faff86d38c0 Snappy supported: 1
2015/06/04-04:33:23.366531 7faff86d38c0 Zlib supported: 1
2015/06/04-04:33:23.366540 7faff86d38c0 Bzip supported: 1
2015/06/04-04:33:23.366542 7faff86d38c0 LZ4 supported: 1
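If it helps, the relevant lines can be pulled straight out of the LOG file from the shell; a minimal sketch, with the path assumed from the data directory layout shown later in this thread:
$ grep -A 4 "Compression algorithms supported" /path/to/data/.rocksdb/LOG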
Could you try the following my.cnf settings and share results?
This makes MyRocks use zlib level 2 compression for most levels (compression_per_level and compression_opts) and place files across levels more efficiently (level_compaction_dynamic_level_bytes). By default the RocksDB block size is 4KB; increasing it to 16KB will reduce space usage somewhat.
rocksdb_block_size=16384
rocksdb_max_total_wal_size=4096000000
rocksdb_block_cache_size=12G
rocksdb_default_cf_options=write_buffer_size=128m;target_file_size_base=32m;max_bytes_for_level_base=512m;level0_file_num_compaction_trigger=4;level0_slowdown_writes_trigger=10;level0_stop_writes_trigger=15;max_write_buffer_number=4;compression_per_level=kNoCompression:kNoCompression:kSnappyCompression:kZlibCompression:kZlibCompression:kZlibCompression:kZlibCompression;compression_opts=-14:2:0;block_based_table_factory={cache_index_and_filter_blocks=1;filter_policy=bloomfilter:10:false;whole_key_filtering=0;};prefix_extractor=capped:20;level_compaction_dynamic_level_bytes=true;optimize_filters_for_hits=true
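For reference, a rough per-level reading of the compression settings above (this is the straightforward interpretation; compression_opts follows RocksDB's window_bits:level:strategy format):
# compression_per_level has one entry per level (num_levels is 7 by default):
#   L0, L1 : kNoCompression     - newest data, skip the compression overhead
#   L2     : kSnappyCompression - cheap compression for the middle of the tree
#   L3-L6  : kZlibCompression   - zlib level 2, taken from compression_opts=-14:2:0
# compression_opts=-14:2:0 means window_bits=-14, compression level=2, strategy=0.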
Yoshi - before trying to tune we need to confirm that compression was enabled in his RocksDB build; then we can tune. MyRocks has a lousy default RocksDB configuration, and this issue can be kept open for that. On my test server the defaults are:
Started an instance locally with default my.cnf:
Options.max_open_files: 5000
Options.max_background_compactions: 1
Options.max_background_flushes: 1
--> max_open_files should be larger, and max_background_compactions and max_background_flushes should be >= 4 for many systems (see the consolidated sketch after this review)
Compression algorithms supported:
Snappy supported: 1
Zlib supported: 1
Bzip supported: 1
LZ4 supported: 1
cache_index_and_filter_blocks: 1
index_type: 0
hash_index_allow_collision: 1
checksum: 1
no_block_cache: 0
block_cache: 0x7faff4c88078
block_cache_size: 8388608
block_cache_compressed: (nil)
block_size: 4096
block_size_deviation: 10
block_restart_interval: 16
filter_policy: nullptr
format_version: 2
--> block_size should be larger; since many blocks will be compressed, a 4KB block that compresses to much less than 4KB can waste IO when the file system page size is 4KB
Options.write_buffer_size: 4194304
Options.max_write_buffer_number: 2
Options.compression: Snappy
Options.num_levels: 7
--> should use a larger value for write_buffer_size, default is 4M, maybe 64M
Options.min_write_buffer_number_to_merge: 1
--> probably OK
Options.level0_file_num_compaction_trigger: 4
Options.level0_slowdown_writes_trigger: 20
Options.level0_stop_writes_trigger: 24
--> probably OK
Options.target_file_size_base: 2097152
Options.max_bytes_for_level_base: 10485760
--> ugh, maybe 32MB for target_file_size_base and 512MB for max_bytes_for_level_base. The default here means that the target size of L1 is only 10MB
Options.level_compaction_dynamic_level_bytes: 0
--> we want this to be 1
Options.soft_rate_limit: 0.00
Options.hard_rate_limit: 0.00
--> want these to be set, maybe 2.5 for soft and 3.0 for hard
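Pulling the annotations above together, here is a sketch of what the suggested non-default settings might look like in my.cnf. The rocksdb_* variable names are assumptions on my part (they mirror the RocksDB option names), so confirm them against show global variables like 'rocksdb%' before relying on this:
rocksdb_max_open_files=-1
# -1 tells RocksDB to keep all sst files open; any sufficiently large value works too
rocksdb_max_background_compactions=4
rocksdb_max_background_flushes=4
rocksdb_block_size=16384
rocksdb_default_cf_options=write_buffer_size=64m;target_file_size_base=32m;max_bytes_for_level_base=512m;level_compaction_dynamic_level_bytes=true
# soft_rate_limit=2.5 and hard_rate_limit=3.0 would go into the CF options string
# as well, if the RocksDB version in use accepts them there.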
This issue is due to *.sst files not being properly cleaned up on DROP DATABASE.
I cleaned the .rocksdb directory, re-installed the database, and ran the same benchmark with yoshinorim's configuration; it looks OK to me now:
data size: 19GB (snappy)
a dump of one 33MB sst file:
$./sst_dump --show_properties --file=../../myrocks_mysql/data/.rocksdb/002120.sst
from [] to []
Process ../../myrocks_mysql/data/.rocksdb/002120.sst
Sst file format: block-based
Table Properties:
------------------------------
# data blocks: 3840
# entries: 310960
raw key size: 4975360
raw average key size: 16.000000
raw value size: 57838560
raw average value size: 186.000000
data block size: 33559362
index block size: 126490
filter block size: 0
(estimated) table size: 33685852
filter policy name: rocksdb.BuiltinBloomFilter
# deleted keys: 0
(4975360 + 57838560) / 33559362 ≈ 1.87, i.e. roughly a 1.8X compression ratio
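To estimate the same ratio across all sst files rather than a single 33MB one, something like this shell sketch should work (the path and sst_dump location are assumptions; the awk patterns rely on the property names shown above):
$ for f in /path/to/data/.rocksdb/*.sst; do ./sst_dump --show_properties --file="$f"; done |
    awk '/raw key size/ {k+=$NF} /raw value size/ {v+=$NF} /data block size/ {d+=$NF} END {printf "raw/stored ratio: %.2fX\n", (k+v)/d}'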
Another question: how to find the mapping between table and *.sst files?
We have not started implementing mappings between table and *.sst files yet. I'll file another task to track this.
We can also dump some of the rocksdb configuration options through the information schema, rather than having to look through the rocksdb LOG file:
select * from information_schema.rocksdb_cf_options;
The db options are mostly available through:
show global variables like 'rocksdb%';
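For example, to narrow that down to the compression-related settings (the column names are what I would expect from rocksdb_cf_options; adjust if they differ in your build):
select cf_name, option_type, value
  from information_schema.rocksdb_cf_options
 where option_type like '%compression%';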
@BohuTANG : BTW you can get the size of each MyRocks table via the usual MySQL commands (SHOW TABLE STATUS or SELECT FROM information_schema.tables). Use these commands to compare compression ratios between tables. MyRocks recalculates statistics every 600 seconds; the interval can be configured via the rocksdb_stats_dump_period_sec global variable. Also note that SHOW TABLE STATUS / I_S do not include the size in the Memstore (including the size from the Memstore, not only from *.sst files, is a work in progress).
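A concrete version of that check using standard information_schema.tables columns (sizes will lag behind writes by up to the statistics interval mentioned above):
select table_schema, table_name,
       round(data_length/1024/1024) as data_mb,
       round(index_length/1024/1024) as index_mb
  from information_schema.tables
 where engine = 'ROCKSDB'
 order by data_length desc;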
BohuTANG, optimize table t1; will run manual compaction for the table. However, if you have already dropped the table, I can't think of an easy way to trigger compaction. One thing you can do is stop mysql and then use the ldb tool to run compaction. If space is more important than deletion speed, maybe you can do truncate table t1; optimize table t1; drop table t1; after https://reviews.facebook.net/D39579 is pushed.
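A rough sketch of the stop-mysql-and-run-ldb approach mentioned above (the --db path is an example based on the layout shown earlier in this thread; ldb should come from the same or a compatible RocksDB build):
$ mysqladmin shutdown
$ ldb --db=/path/to/data/.rocksdb compact
$ mysqld_safe &    # restart mysqld afterwards, however it is normally started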
There is no correspondence between .sst files and tables or databases. The data is spread out among sst files in the order of insertion and then intermixed through the compaction process.
Yoshi, I have this task: #55 to expose what is stored in each sst file through the information schema.
From our benchmarks on the same datasets for MyRocks/InnoDB/TokuDB, the data sizes are:
All MyRocks configuration is at the defaults; the output of 'show engine rocksdb status' is as follows:
and