Avoid trivial move if SST file is far smaller than the target size #8306
One possibility is that the small files are trivially moved from L0 -> L6 (since they're already sorted, compaction is skipped); notably, in your log, cumulative compaction write is 0.
I have 928 lines with
The only possible way is manual compaction, where the files are finally merged/compacted together, but that takes a long time, and my biggest problem is that the whole DB is not accessible during that time. It would be great if this happened during normal operation.
One suggestion we have is to set `bottommost_compression` (rocksdb/include/rocksdb/options.h, Line 216 in a639c02) to ZSTD.
Because it differs from the compression on your other levels (Snappy, from your log), it will force compaction on the last level, reducing the file count and also your storage footprint (ZSTD compresses much better than Snappy).
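For reference, a minimal sketch of that configuration (the Snappy/ZSTD pairing mirrors the setup described above; the fields are the real `rocksdb::Options` ones):

```cpp
#include <rocksdb/options.h>

rocksdb::Options MakeOptions() {
  rocksdb::Options options;
  // Snappy on the non-bottommost levels, as in the reported setup.
  options.compression = rocksdb::kSnappyCompression;
  // ZSTD only on the bottommost level (Level 6 here); since it differs
  // from the other levels' compression, last-level files get rewritten
  // (recompressed) by compaction instead of only trivially moved.
  options.bottommost_compression = rocksdb::kZSTD;
  return options;
}
```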
Just FYI, you may need to set the manual compaction option `CompactRangeOptions::bottommost_level_compaction` (rocksdb/include/rocksdb/options.h, Line 1578 in a639c02).
In general, RocksDB tries to reduce write amplification by avoiding any unnecessary compaction. The trade-off is more small files.
Manual compaction is triggered by the user and runs in the background; the DB is still accessible during that time.
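As a sketch, a manual compaction over the whole key range might look like this; `bottommost_level_compaction = kForce` is the option referenced above, and `exclusive_manual_compaction = false` lets automatic compactions keep running alongside it:

```cpp
#include <rocksdb/db.h>

// Assumes `db` is an open rocksdb::DB*.
void CompactEverything(rocksdb::DB* db) {
  rocksdb::CompactRangeOptions cro;
  // Rewrite the bottommost level even when a trivial move would suffice.
  cro.bottommost_level_compaction =
      rocksdb::BottommostLevelCompaction::kForce;
  // Allow automatic compactions to run alongside the manual one.
  cro.exclusive_manual_compaction = false;
  // nullptr begin/end = the whole key range; the work happens on
  // background threads, so the DB stays readable and writable.
  rocksdb::Status s = db->CompactRange(cro, nullptr, nullptr);
  if (!s.ok()) {
    // handle/log the error
  }
}
```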
I guess you have `level_compaction_dynamic_level_bytes` enabled, which again tries to reduce write amplification by moving the SST files to a higher level faster (rocksdb/include/rocksdb/advanced_options.h, Lines 472 to 473 in a4919d6).
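If you wanted to experiment with turning that off, the toggle is just an options field; whether that is advisable for this workload is a separate question:

```cpp
#include <rocksdb/options.h>

void DisableDynamicLevelBytes(rocksdb::Options& options) {
  // With dynamic level bytes, per-level size targets are derived from the
  // last level, so data is pushed toward the bottommost level sooner.
  // Turning it off restores fixed per-level size targets.
  options.level_compaction_dynamic_level_bytes = false;
}
```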
I think your use case is special:
So I think change
I need to reopen this issue; sadly, your suggestions did not help. You can see here that an SST file has been created and immediately trivially moved from level 1 to level 6 without any compaction:
I know that the data size is small, but the result is simply CRAZY:
The files NEED to be merged together, but the DB did not do that, and I don't know why not.
Your `bottommost_compression` is still disabled:
Okay, but I don't need that. Right now all levels (except 0) are compressed with Snappy, and I'm fine with that. Besides, compression should have nothing to do with COMPACTION. There are two questions:
FYI, in RocksDB the bottommost level is the highest-numbered level, Level 6 in your case.
RocksDB tries to reduce write amplification; merging small files into a large one requires IO, which increases write amplification. `target_file_size` is only a target: to reduce IO, files may not always reach the target size.
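For context, these are the knobs that define that target (128 MB matches the value from the original report); note the comment about trivial moves:

```cpp
#include <rocksdb/options.h>

void SetFileSizeTargets(rocksdb::Options& options) {
  // 128 MB target for compaction output files at L1.
  options.target_file_size_base = 128 * 1024 * 1024;
  // Multiplier 1 keeps the same 128 MB target on deeper levels.
  options.target_file_size_multiplier = 1;
  // Both are targets for files *written by compaction*; a trivially
  // moved file is not rewritten, so it keeps the size it already had.
}
```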
Yeah, I have a much bigger problem when I have 60k files in my directory. FYI, after manual compaction I will have about 15k files and about 20-30 GB less data, which matters a lot; you know, lots of small files greatly reduce performance.
Maybe we should change RocksDB to not do a trivial move on input files whose size is way different from the configured target output file size. I can think of two cases where this condition would be expected: (1) compacting from L0 to the base level; or (2) compacting from base level+ to the next level when
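A hypothetical sketch of that rule (the names and the 0.5 threshold are made up for illustration, and the second expected case is truncated above, so it is passed in as an opaque flag):

```cpp
#include <cstdint>

// Hypothetical sketch of the proposed rule, not actual RocksDB code.
constexpr double kSmallFileRatio = 0.5;  // made-up threshold

bool AllowTrivialMove(uint64_t input_file_size,
                      uint64_t target_output_size,
                      bool is_expected_small_file_case) {
  // Case (1) L0 -> base level, and case (2) from the comment above, are
  // situations where undersized inputs are normal and a trivial move
  // should still be allowed.
  if (is_expected_small_file_case) return true;
  // Otherwise refuse the trivial move for far-undersized files, so they
  // get merged with neighbors instead of accumulating at the bottom.
  return input_file_size >= kSmallFileRatio * target_output_size;
}
```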
Hey! Sadly the
It only does non-trivial-move compaction for the bottommost level, not the other levels. On the other hand, the bottommost level holds the majority of your data, so it should compact most of the files together. You may need to run a manual compaction after changing the compression option. You can check whether there are still small files compressed with LZ4, or bottommost files that are not compressed with LZ4.
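One way to do part of that check programmatically is the public metadata API; a sketch (it reports sizes per level rather than compression type, which still makes undersized bottommost files easy to spot):

```cpp
#include <cstdio>
#include <rocksdb/db.h>
#include <rocksdb/metadata.h>

// Assumes `db` is an open rocksdb::DB*. Prints file count and average
// file size per non-empty level.
void PrintLevelFileStats(rocksdb::DB* db) {
  rocksdb::ColumnFamilyMetaData meta;
  db->GetColumnFamilyMetaData(&meta);
  for (const auto& level : meta.levels) {
    if (level.files.empty()) continue;
    std::printf("L%d: %zu files, avg %.1f MB\n", level.level,
                level.files.size(),
                static_cast<double>(level.size) / level.files.size() /
                    (1024.0 * 1024.0));
  }
}
```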
@ajkr how should we improve the compaction picker to do that? If a small SST is selected for compaction and has no overlap in the next level, should it add (or wait for) another file on the same input level before doing the compaction? Or force a compaction with other files on the output level that have no overlap?
Hey, sorry for the late response; I have just resynced my database with the latest options. A long LOG file, options, and stats: my DB is now 616 GB with 69754 files. After compaction this would be about 24k files and 610 GB.
DB sync almost done.
My DB has been synced and uses:
It'd be easier to expand inputs to include one more file on the same input level. |
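A hypothetical sketch of that expansion (`FileMeta` and the overlap helper are made-up stand-ins, not RocksDB's actual picker types):

```cpp
#include <cstdint>
#include <vector>

// Made-up stand-in for a picked input file.
struct FileMeta {
  uint64_t size;
  FileMeta* next_on_level;  // adjacent file to the right on this level
};

// Stand-in for a real key-range check against the output level.
bool WouldWidenNextLevelOverlap(const FileMeta* candidate) {
  (void)candidate;
  return false;  // placeholder
}

// Pull adjacent same-level files into the compaction while the input is
// still far below the target output size and the overlap doesn't grow.
void ExpandInputs(std::vector<FileMeta*>& inputs,
                  uint64_t target_output_size) {
  uint64_t total = 0;
  for (const FileMeta* f : inputs) total += f->size;
  while (!inputs.empty() && total < target_output_size) {
    FileMeta* next = inputs.back()->next_on_level;
    if (next == nullptr || WouldWidenNextLevelOverlap(next)) break;
    inputs.push_back(next);
    total += next->size;
  }
}
```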
Greetings everyone!
I'm using RocksDB version 6.8.1 with many column families. I would like to pick just one CF out to examine, because I cannot figure out why it doesn't do the compaction I expected. This CF gets only INSERTs without any DELETE ops, and the DB has had just a few restarts during its life. LOG stats about this CF:
Options for this CF:
So the target file size should be 128 MB, but when I calculate the level 6 file sizes, the average is 2.5 MB. Optimally, there should be only about 8 files in level 6.
Thanks for the help!