Summary: Cleanable objects perform their registered cleanups when they
are destructed. Sometimes, however, we would rather delay this cleanup,
for example while gathering the merge operands. The current approach is
to create the Cleanable object on the heap (instead of on the stack) and
delay deleting it. By allowing a Cleanable to delegate its cleanups to
another Cleanable object, we can delay the cleanup without having to
create the Cleanable object on the heap and keep it around. This patch
applies this technique to the cleanups of BlockIter and shows improved
performance for some in-memory benchmarks:
+1.8% for the merge workload, and +6.4% for the non-merge workload when
a merge operator is specified.
https://our.intern.facebook.com/intern/tasks?t=15168163
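A minimal sketch of the delegation idea described in the summary, not the
actual RocksDB code: SimpleCleanable, PendingCleanups and DemoDelegation are
hypothetical names used only for illustration, and the real Cleanable may
store its cleanups differently. It shows a short-lived stack object handing
its cleanups to a longer-lived one, so nothing has to be allocated on the
heap to postpone the cleanup.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a Cleanable-style class.
class SimpleCleanable {
 public:
  SimpleCleanable() = default;
  SimpleCleanable(const SimpleCleanable&) = delete;
  SimpleCleanable& operator=(const SimpleCleanable&) = delete;

  // Any cleanups still owned by this object run when it is destructed.
  ~SimpleCleanable() {
    for (auto& fn : cleanups_) fn();
  }

  void RegisterCleanup(std::function<void()> fn) {
    cleanups_.push_back(std::move(fn));
  }

  // Hands every pending cleanup to `other`. Afterwards this object owns
  // none, so its destructor does nothing and `other` runs them instead.
  void DelegateCleanupsTo(SimpleCleanable* other) {
    assert(other != nullptr && other != this);
    for (auto& fn : cleanups_) other->cleanups_.push_back(std::move(fn));
    cleanups_.clear();
  }

  std::size_t PendingCleanups() const { return cleanups_.size(); }

 private:
  std::vector<std::function<void()>> cleanups_;
};

// Returns {cleanups run when the short-lived object died,
//          cleanups run once the long-lived collector died}.
std::pair<int, int> DemoDelegation() {
  int released = 0;
  int released_at_inner_exit = 0;
  {
    SimpleCleanable collector;  // long-lived, lives on the stack
    {
      SimpleCleanable block_iter_like;  // short-lived, also on the stack
      block_iter_like.RegisterCleanup([&released] { ++released; });
      block_iter_like.DelegateCleanupsTo(&collector);
    }  // block_iter_like is destructed here, but its cleanup was delegated
    released_at_inner_exit = released;
  }  // collector is destructed here and runs the delegated cleanup
  return std::make_pair(released_at_inner_exit, released);
}
```

In this sketch the cleanup does not run when the inner object goes out of
scope; it runs only when the collector is destructed, which is the delay the
summary describes.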
Non-merge benchmark:
TEST_TMPDIR=/dev/shm/v100nocomp/ ./db_bench --benchmarks=fillrandom
--num=1000000 -value_size=100 -compression_type=none
Reading random with no merge operator specified:
TEST_TMPDIR=/dev/shm/v100nocomp/ ./db_bench
--benchmarks="readseq,readrandom[X5]" --use_existing_db --num=1000000
--reads=10000000 --cache_size=10000000000 -threads=32
-compression_type=none 2>&1
Before patch:
readrandom [AVG 5 runs] : 2959194 ops/sec; 207.4 MB/sec
readrandom [MEDIAN 5 runs] : 2945102 ops/sec; 206.4 MB/sec
After patch:
readrandom [AVG 5 runs] : 2954630 ops/sec; 207.0 MB/sec
readrandom [MEDIAN 5 runs] : 2949443 ops/sec; 206.7 MB/sec
Reading random with a dummy merge operator specified:
TEST_TMPDIR=/dev/shm/v100nocomp/ ./db_bench
--benchmarks="readseq,readrandom[X5]" --use_existing_db --num=1000000
--reads=10000000 --cache_size=10000000000 -threads=32
-compression_type=none -merge_operator=put
Before patch:
readrandom [AVG 5 runs] : 2801713 ops/sec; 196.3 MB/sec
readrandom [MEDIAN 5 runs] : 2798286 ops/sec; 196.1 MB/sec
After patch:
readrandom [AVG 5 runs] : 2981616 ops/sec; 208.9 MB/sec
readrandom [MEDIAN 5 runs] : 2989652 ops/sec; 209.5 MB/sec
Merge benchmark:
TEST_TMPDIR=/dev/shm/v100nocomp-merge/ ./db_bench
--benchmarks=mergerandom --num=1000000 -value_size=100
-compression_type=none --merge_keys=100000 -merge_operator=max
TEST_TMPDIR=/dev/shm/v100nocomp-merge/ ./db_bench
--benchmarks="readseq,readrandom[X5]" --use_existing_db --num=1000000
--reads=10000000 --cache_size=10000000000 -threads=32
-compression_type=none -merge_operator=max
Before patch:
readrandom [AVG 5 runs] : 942688 ops/sec; 10.4 MB/sec
readrandom [MEDIAN 5 runs] : 941847 ops/sec; 10.4 MB/sec
After patch:
readrandom [AVG 5 runs] : 960135 ops/sec; 10.6 MB/sec
readrandom [MEDIAN 5 runs] : 959413 ops/sec; 10.6 MB/sec