Delegate Cleanables #1711

maysamyabandeh · 2016-12-22T02:28:22Z

Summary: Cleanable objects perform the registered cleanups when
they are destructed. However, we would sometimes rather delay this cleanup, for
example when we are gathering the merge operands. The current approach is to
create the Cleanable object on the heap (instead of on the stack) and delay deleting it.

By allowing Cleanables to delegate their cleanups to another Cleanable
object, we can delay the cleanup without needing to create the
Cleanable object on the heap and keep it around. This patch applies this
technique to the cleanups of BlockIter and shows improved performance
for some in-memory benchmarks:
+1.8% for the merge workload, +6.4% for the non-merge workload when the merge
operator is specified.
https://our.intern.facebook.com/intern/tasks?t=15168163
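To make the idea in the summary concrete, here is a minimal sketch of delegation between two cleanable objects. It is not the RocksDB class: SimpleCleanable, DemoResult and RunDelegationDemo are hypothetical names, and a std::vector of std::function is assumed only to keep the example short.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Simplified stand-in for a Cleanable: runs registered callbacks on destruction.
// Illustrative only; the storage scheme here is an assumption of this sketch.
class SimpleCleanable {
 public:
  SimpleCleanable() = default;
  SimpleCleanable(const SimpleCleanable&) = delete;
  SimpleCleanable& operator=(const SimpleCleanable&) = delete;
  ~SimpleCleanable() { RunCleanups(); }

  void RegisterCleanup(std::function<void()> fn) {
    cleanups_.push_back(std::move(fn));
  }

  // Hands all pending cleanups to `other`; this object then has nothing left
  // to do when it is destructed.
  void DelegateCleanupsTo(SimpleCleanable* other) {
    assert(other != nullptr && other != this);
    for (auto& fn : cleanups_) {
      other->cleanups_.push_back(std::move(fn));
    }
    cleanups_.clear();
  }

 private:
  void RunCleanups() {
    for (auto& fn : cleanups_) fn();
    cleanups_.clear();
  }

  std::vector<std::function<void()>> cleanups_;
};

// Number of cleanups that had run at three points in time.
struct DemoResult {
  int after_delegate;
  int after_short_lived_destroyed;
  int after_long_lived_destroyed;
};

DemoResult RunDelegationDemo() {
  int released = 0;
  DemoResult r{};
  {
    SimpleCleanable long_lived;  // e.g. something that outlives one lookup
    {
      SimpleCleanable short_lived;  // lives on the stack, like an iterator
      short_lived.RegisterCleanup([&released] { ++released; });
      short_lived.DelegateCleanupsTo(&long_lived);
      r.after_delegate = released;
    }  // short_lived is destructed here with nothing left to clean
    r.after_short_lived_destroyed = released;
  }  // long_lived is destructed here and runs the delegated cleanup
  r.after_long_lived_destroyed = released;
  return r;
}
```

The short-lived object stays on the stack, and the cleanup only runs once the longer-lived object is destructed.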

Non-merge benchmark:
TEST_TMPDIR=/dev/shm/v100nocomp/ ./db_bench --benchmarks=fillrandom
--num=1000000 -value_size=100 -compression_type=none

Reading random with no merge operator specified:
TEST_TMPDIR=/dev/shm/v100nocomp/ ./db_bench
--benchmarks="readseq,readrandom[X5]" --use_existing_db --num=1000000
--reads=10000000 --cache_size=10000000000 -threads=32
-compression_type=none 2>&1

Before patch:
readrandom [AVG 5 runs] : 2959194 ops/sec; 207.4 MB/sec
readrandom [MEDIAN 5 runs] : 2945102 ops/sec; 206.4 MB/sec
After patch:
readrandom [AVG 5 runs] : 2954630 ops/sec; 207.0 MB/sec
readrandom [MEDIAN 5 runs] : 2949443 ops/sec; 206.7 MB/sec

Reading random with a dummy merge operator specified:
TEST_TMPDIR=/dev/shm/v100nocomp/ ./db_bench
--benchmarks="readseq,readrandom[X5]" --use_existing_db --num=1000000
--reads=10000000 --cache_size=10000000000 -threads=32
-compression_type=none -merge_operator=put

Before patch:
readrandom [AVG 5 runs] : 2801713 ops/sec; 196.3 MB/sec
readrandom [MEDIAN 5 runs] : 2798286 ops/sec; 196.1 MB/sec
After patch:
readrandom [AVG 5 runs] : 2981616 ops/sec; 208.9 MB/sec
readrandom [MEDIAN 5 runs] : 2989652 ops/sec; 209.5 MB/sec

Merge benchmark:
TEST_TMPDIR=/dev/shm/v100nocomp-merge/ ./db_bench
--benchmarks=mergerandom --num=1000000 -value_size=100
-compression_type=none --merge_keys=100000 -merge_operator=max

TEST_TMPDIR=/dev/shm/v100nocomp-merge/ ./db_bench
--benchmarks="readseq,readrandom[X5]" --use_existing_db --num=1000000
--reads=10000000 --cache_size=10000000000 -threads=32
-compression_type=none -merge_operator=max

Before patch:
readrandom [AVG 5 runs] : 942688 ops/sec; 10.4 MB/sec
readrandom [MEDIAN 5 runs] : 941847 ops/sec; 10.4 MB/sec
After patch:
readrandom [AVG 5 runs] : 960135 ops/sec; 10.6 MB/sec
readrandom [MEDIAN 5 runs] : 959413 ops/sec; 10.6 MB/sec
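As a quick check of the percentages quoted in the summary, the relative change can be computed from the AVG ops/sec figures above. This is only a worked calculation; PercentChange and PrintReadRandomChanges are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdio>

// Relative change in percent from `before` to `after` (both in ops/sec).
double PercentChange(double before, double after) {
  assert(before > 0.0);
  return (after - before) / before * 100.0;
}

// Prints the change for the three readrandom runs, using the AVG ops/sec
// figures quoted in the results above.
void PrintReadRandomChanges() {
  std::printf("no merge operator:    %+.2f%%\n",
              PercentChange(2959194.0, 2954630.0));
  std::printf("dummy merge operator: %+.2f%%\n",
              PercentChange(2801713.0, 2981616.0));
  std::printf("merge workload:       %+.2f%%\n",
              PercentChange(942688.0, 960135.0));
}
```

With these inputs the three values come out at roughly -0.15%, +6.42% and +1.85%, which is in line with the numbers in the summary and suggests the no-merge-operator case is within noise.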

facebook-github-bot · 2016-12-22T02:29:58Z

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


facebook-github-bot · 2016-12-22T19:16:55Z

@maysamyabandeh updated the pull request - view changes - changes since last import

facebook-github-bot · 2016-12-22T19:19:45Z

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

maysamyabandeh · 2016-12-22T20:26:00Z

The sandcastle job fails with this error:

make: *** No rule to make target `table/cleanable_test.o', needed by `cleanable_test'.  Stop.

Not sure what the problem is, since it compiles fine on my dev machine and finds the implicit rule, as it does for other objects in the Makefile. Here is the change to the Makefile:

diff --git a/Makefile b/Makefile
index d768647..11220ff 100644
--- a/Makefile
+++ b/Makefile
@@ -324,6 +324,7 @@ TESTS = \
        db_properties_test \
        db_table_properties_test \
        autovector_test \
+       cleanable_test \
        column_family_test \
        table_properties_collector_test \
        arena_test \
@@ -1147,6 +1148,9 @@ full_filter_block_test: table/full_filter_block_test.o $(LIBOBJECTS) $(TESTHARNE
 log_test: db/log_test.o $(LIBOBJECTS) $(TESTHARNESS)
        $(AM_LINK)
 
+cleanable_test: table/cleanable_test.o $(LIBOBJECTS) $(TESTHARNESS)
+       $(AM_LINK)
+
 table_test: table/table_test.o $(LIBOBJECTS) $(TESTHARNESS)
        $(AM_LINK)

@IslamAbdelRahman any idea?

IslamAbdelRahman

I like this a lot

IslamAbdelRahman · 2016-12-22T20:22:53Z

Makefile

@@ -1147,6 +1148,9 @@ full_filter_block_test: table/full_filter_block_test.o $(LIBOBJECTS) $(TESTHARNE
 log_test: db/log_test.o $(LIBOBJECTS) $(TESTHARNESS)
 	$(AM_LINK)

+cleanable_test: table/cleanable_test.o $(LIBOBJECTS) $(TESTHARNESS)


It looks to me like this is not needed.

IslamAbdelRahman · 2016-12-22T20:24:27Z

db/pinned_iterators_manager.h

@@ -16,7 +16,7 @@ namespace rocksdb {
 // PinnedIteratorsManager will be notified whenever we need to pin an Iterator
 // and it will be responsible for deleting pinned Iterators when they are
 // not needed anymore.
-class PinnedIteratorsManager {
+class PinnedIteratorsManager : public Cleanable {


Now the cleanup code will be called in the destructor. I think we must change that so that it's called in ReleasePinnedData().

facebook-github-bot · 2016-12-22T21:42:26Z

@maysamyabandeh updated the pull request - view changes - changes since last import

facebook-github-bot · 2016-12-22T21:45:16Z

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

IslamAbdelRahman · 2016-12-23T01:00:33Z

table/iterator.cc

+  cleanup_.next = nullptr;
+}
+
+// TODO(myabandeh): if the list is too long we should maintain a tail pointer


Let's add the possible optimizations/bottlenecks we discussed offline in the comments here

maysamyabandeh · 2016-12-23T02:38:23Z

The sandcastle test failures seem to be false positives. The core dump shows the following:

Core was generated by `./external_sst_file_test --gtest_filter=ExternalSSTFileTest.PickedLevelDynamic'.

(gdb) bt
#0  0x00002ab31ceb7cfa in  ()
#1  0x00000000005283a5 in __gnu_cxx::__exchange_and_add_single (__val=-1, __mem=0x7ffc1ce68714)
    at /mnt/gvfs/third-party2/gcc/cf7d14c625ce30bae1a4661c2319c5a283e4dd22/4.9.x/centos6-native/108cf83/include/c++/4.9.x-google/ext/atomicity.h:68
#2  0x00000000005283a5 in __gnu_cxx::__exchange_and_add_dispatch (__val=-1, __mem=0x7ffc1ce68714)
    at /mnt/gvfs/third-party2/gcc/cf7d14c625ce30bae1a4661c2319c5a283e4dd22/4.9.x/centos6-native/108cf83/include/c++/4.9.x-google/ext/atomicity.h:84
#3  0x00000000005283a5 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (this=0x7ffc1ce68708)
    at /mnt/gvfs/third-party2/gcc/cf7d14c625ce30bae1a4661c2319c5a283e4dd22/4.9.x/centos6-native/108cf83/include/c++/4.9.x-google/bits/shared_ptr_base.h:163
#4  0x00000000005283a5 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() (this=0x7ffc1ce68288, __in_chrg=<optimized out>)
    at /mnt/gvfs/third-party2/gcc/cf7d14c625ce30bae1a4661c2319c5a283e4dd22/4.9.x/centos6-native/108cf83/include/c++/4.9.x-google/bits/shared_ptr_base.h:666
#5  0x00000000005283a5 in std::__shared_ptr<rocksdb::FilterPolicy const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() (this=0x7ffc1ce68280, __in_chrg=<optimized out>)
    at /mnt/gvfs/third-party2/gcc/cf7d14c625ce30bae1a4661c2319c5a283e4dd22/4.9.x/centos6-native/108cf83/include/c++/4.9.x-google/bits/shared_ptr_base.h:914
#6  0x00000000005283a5 in std::shared_ptr<rocksdb::FilterPolicy const>::~shared_ptr() (this=0x7ffc1ce68280, __in_chrg=<optimized out>)
    at /mnt/gvfs/third-party2/gcc/cf7d14c625ce30bae1a4661c2319c5a283e4dd22/4.9.x/centos6-native/108cf83/include/c++/4.9.x-google/bits/shared_ptr.h:93
#7  0x00000000005283a5 in rocksdb::anon::OptionsOverride::~OptionsOverride() (this=0x7ffc1ce68280, __in_chrg=<optimized out>) at ./db/db_test_util.h:114
#8  0x00000000005283a5 in rocksdb::DBTestBase::OptionsForLogIterTest() (this=<optimized out>) at db/db_test_util.cc:951
#9  0x00000000005285ac in rocksdb::DBTestBase::ChangeCompactOptions() (this=0x7ffc1ce68700) at db/db_test_util.cc:166
#10 0x0000000000000000 in  ()

which starts at this line

166         auto options = CurrentOptions();

This seems unrelated to the changes made by this patch. Moreover, it passed when run locally:

[?0] myabandeh@devvm17154:~/rocksdb[cleanabledelagate)]$ ./external_sst_file_test --gtest_filter=ExternalSSTFileTest.PickedLevelDynamic
Note: Google Test filter = ExternalSSTFileTest.PickedLevelDynamic
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from ExternalSSTFileTest
[ RUN      ] ExternalSSTFileTest.PickedLevelDynamic
[       OK ] ExternalSSTFileTest.PickedLevelDynamic (10 ms)
[----------] 1 test from ExternalSSTFileTest (10 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (10 ms total)
[  PASSED  ] 1 test.

The failure from the universal compaction test also seems unrelated:

[----------] 1 test from UniversalCompactionNumLevels/DBTestUniversalCompaction
[ RUN      ] UniversalCompactionNumLevels/DBTestUniversalCompaction.UniversalCompactionTrivialMoveTest2/1
db/db_universal_compaction_test.cc:1068: Failure
Value of: 0
Expected: non_trivial_move
Which is: 1

and it passes when run locally.

facebook-github-bot · 2016-12-23T03:07:37Z

@maysamyabandeh updated the pull request - view changes - changes since last import

facebook-github-bot · 2016-12-23T03:08:17Z

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

IslamAbdelRahman

LGTM, I have minor comments

IslamAbdelRahman · 2016-12-27T19:26:27Z

table/cleanable_test.cc

+//  LICENSE file in the root directory of this source tree. An additional grant
+//  of patent rights can be found in the PATENTS file in the same directory.
+//
+// Copyright (c) 2011 The LevelDB Authors. All rights reserved.


I don't think we need to add this LevelDB section for new files.

IslamAbdelRahman · 2016-12-28T23:46:13Z

table/block_based_table_reader.cc

-          biter = &stack_biter;
-          NewDataBlockIterator(rep_, read_options, iiter.value(), biter);
-        }
+        biter = &stack_biter;


Let's remove BlockIter* biter on line 1542
and use BlockIter biter; here only. I think that's how it used to be before I changed it.

IslamAbdelRahman · 2016-12-28T23:52:51Z

table/iterator.cc

-Cleanable::~Cleanable() {
+Cleanable::~Cleanable() { DoCleanup(); }
+
+void Cleanable::Reset() {


Let's just have one function Reset(), instead of having 2

Reset is more expensive than DoCleanup as it also does two value assignments. We do not need those when the object is being destructed. I was not sure about the penalty, but since cleanup is heavily used my preference was not to add any cost to its destruction.

I don't think the 2 extra assignments should be a concern (except if benchmarks showed that they are of course)
This was just a suggestion, final decision is up to you

You are right, but since Cleanable is inherited by many objects, any small increase in construction/destruction cost could potentially be observable. I am more inclined to keep the cost the same as it was before.
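A rough sketch of the distinction being discussed: the destructor only runs the cleanups, while Reset() additionally re-initializes two fields so the object can be reused. MiniCleanable, CountCall and RunResetDemo are hypothetical names, and the layout (one embedded entry plus heap nodes) is assumed from the diff context in this review rather than copied from the patch.

```cpp
#include <cassert>

class MiniCleanable {
 public:
  using CleanupFunction = void (*)(void* arg1, void* arg2);

  MiniCleanable() = default;
  MiniCleanable(const MiniCleanable&) = delete;
  MiniCleanable& operator=(const MiniCleanable&) = delete;

  // Destruction only runs the cleanups; the members are about to disappear,
  // so there is no point in re-initializing them.
  ~MiniCleanable() { DoCleanup(); }

  // Runs the cleanups and returns the object to its empty state so it can be
  // reused. This costs two extra assignments compared to DoCleanup().
  void Reset() {
    DoCleanup();
    cleanup_.function = nullptr;
    cleanup_.next = nullptr;
  }

  void RegisterCleanup(CleanupFunction func, void* arg1, void* arg2) {
    assert(func != nullptr);
    Cleanup* c;
    if (cleanup_.function == nullptr) {
      c = &cleanup_;  // the first cleanup uses the embedded slot
    } else {
      c = new Cleanup;  // later cleanups live on the heap
      c->next = cleanup_.next;
      cleanup_.next = c;
    }
    c->function = func;
    c->arg1 = arg1;
    c->arg2 = arg2;
  }

 private:
  struct Cleanup {
    CleanupFunction function = nullptr;
    void* arg1 = nullptr;
    void* arg2 = nullptr;
    Cleanup* next = nullptr;
  };

  // Invokes every registered cleanup and frees the heap nodes.
  void DoCleanup() {
    if (cleanup_.function == nullptr) return;
    (*cleanup_.function)(cleanup_.arg1, cleanup_.arg2);
    for (Cleanup* c = cleanup_.next; c != nullptr;) {
      (*c->function)(c->arg1, c->arg2);
      Cleanup* next = c->next;
      delete c;
      c = next;
    }
  }

  Cleanup cleanup_;
};

// Demo helper: increments the int that arg1 points to.
void CountCall(void* arg1, void* /*arg2*/) { ++*static_cast<int*>(arg1); }

// Returns the total number of cleanup calls after a Reset() and a reuse.
int RunResetDemo() {
  int calls = 0;
  {
    MiniCleanable c;
    c.RegisterCleanup(&CountCall, &calls, nullptr);
    c.RegisterCleanup(&CountCall, &calls, nullptr);
    c.Reset();  // both cleanups run here
    c.RegisterCleanup(&CountCall, &calls, nullptr);  // reuse after Reset()
  }  // the destructor runs the one remaining cleanup
  return calls;
}
```

Without the two assignments in Reset(), the reused object would still point at the already-freed heap node, which is why reuse needs Reset() while plain destruction does not.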

IslamAbdelRahman · 2016-12-29T00:00:58Z

table/iterator.cc

+    cleanup_.function = c->function;
+    cleanup_.arg1 = c->arg1;
+    cleanup_.arg2 = c->arg2;
+    delete c;


I think the caller should be responsible for cleanup. What do you think?
It's also confusing that we clean up in one branch and not in the other.

I went back and forth between these two approaches and finally went with doing the delete inside RegisterCleanup. My argument was that there are two cases:

RegisterCleanup takes ownership and adds the object to its linked list

RegisterCleanup does not need the object and wants it deleted

I found it more consistent if RegisterCleanup takes full ownership and takes care of both cases. Otherwise it would take ownership when it needs the object and would not when it wants the object deleted, which is inconsistent behavior.

Sounds good to me

IslamAbdelRahman · 2016-12-29T00:04:00Z

table/iterator.cc

+  }
+  Cleanup* c = &cleanup_;
+  other->RegisterCleanup(c->function, c->arg1, c->arg2);
+  c = c->next;


I think it's better to remove lines 64 and 65 and just enter the loop directly.
This is also related to my comment about not doing cleanup inside RegisterCleanup.

But the process for delegating the first Cleanup is different from the rest: the first is an embedded object and its values need to be copied, while the rest are on the heap and can simply be added to the destination linked list.

yes, my bad
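The two threads above (ownership inside RegisterCleanup, and the special handling of the first entry when delegating) can be sketched together as follows. This is a reconstruction under assumptions, built from the diff fragments quoted in this review; MiniCleanable, CountCall and CountBeforeTargetDies are hypothetical names and the real patch may differ in detail.

```cpp
#include <cassert>

class MiniCleanable {
 public:
  using CleanupFunction = void (*)(void* arg1, void* arg2);

  MiniCleanable() = default;
  MiniCleanable(const MiniCleanable&) = delete;
  MiniCleanable& operator=(const MiniCleanable&) = delete;
  ~MiniCleanable() { DoCleanup(); }

  void RegisterCleanup(CleanupFunction func, void* arg1, void* arg2) {
    assert(func != nullptr);
    Cleanup* c;
    if (cleanup_.function == nullptr) {
      c = &cleanup_;  // embedded slot
    } else {
      c = new Cleanup;  // heap node linked after the embedded slot
      c->next = cleanup_.next;
      cleanup_.next = c;
    }
    c->function = func;
    c->arg1 = arg1;
    c->arg2 = arg2;
  }

  // Moves every cleanup from this object to `other` and leaves this one empty.
  void DelegateCleanupsTo(MiniCleanable* other) {
    assert(other != nullptr && other != this);
    if (cleanup_.function == nullptr) return;  // nothing registered
    // The first entry is embedded in *this, so its values must be copied.
    other->RegisterCleanup(cleanup_.function, cleanup_.arg1, cleanup_.arg2);
    // The remaining entries are heap nodes and can be handed over as they are.
    Cleanup* c = cleanup_.next;
    while (c != nullptr) {
      Cleanup* next = c->next;  // saved first: RegisterCleanup rewrites c->next
      other->RegisterCleanup(c);
      c = next;
    }
    cleanup_.function = nullptr;
    cleanup_.next = nullptr;
  }

 private:
  struct Cleanup {
    CleanupFunction function = nullptr;
    void* arg1 = nullptr;
    void* arg2 = nullptr;
    Cleanup* next = nullptr;
  };

  // Always takes ownership of the heap node `c`: either its values are copied
  // into the embedded slot and the node is deleted, or the node is linked in.
  // (DelegateCleanupsTo above never reaches the copy-and-delete branch, since
  // it fills the embedded slot first; the branch keeps ownership consistent.)
  void RegisterCleanup(Cleanup* c) {
    assert(c != nullptr);
    if (cleanup_.function == nullptr) {
      cleanup_.function = c->function;
      cleanup_.arg1 = c->arg1;
      cleanup_.arg2 = c->arg2;
      delete c;
    } else {
      c->next = cleanup_.next;
      cleanup_.next = c;
    }
  }

  void DoCleanup() {
    if (cleanup_.function == nullptr) return;
    (*cleanup_.function)(cleanup_.arg1, cleanup_.arg2);
    for (Cleanup* c = cleanup_.next; c != nullptr;) {
      (*c->function)(c->arg1, c->arg2);
      Cleanup* next = c->next;
      delete c;
      c = next;
    }
  }

  Cleanup cleanup_;
};

// Demo helper: increments the int that arg1 points to.
void CountCall(void* arg1, void* /*arg2*/) { ++*static_cast<int*>(arg1); }

// Delegates three cleanups from a stack object to `target`. Returns the count
// seen while `target` is still alive; *final_count is the count afterwards.
int CountBeforeTargetDies(int* final_count) {
  int calls = 0;
  int before = 0;
  {
    MiniCleanable target;
    {
      MiniCleanable source;
      for (int i = 0; i < 3; ++i) {
        source.RegisterCleanup(&CountCall, &calls, nullptr);
      }
      source.DelegateCleanupsTo(&target);
    }  // source is destructed with an empty list
    before = calls;
  }  // target is destructed and runs the three delegated cleanups
  *final_count = calls;
  return before;
}
```

Because the pointer overload owns the node in both branches, the caller never has to know whether the node was linked in or copied and freed.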

IslamAbdelRahman · 2016-12-29T00:05:00Z

db/pinned_iterators_manager.h

@@ -16,7 +16,7 @@ namespace rocksdb {
 // PinnedIteratorsManager will be notified whenever we need to pin an Iterator
 // and it will be responsible for deleting pinned Iterators when they are
 // not needed anymore.
-class PinnedIteratorsManager {
+class PinnedIteratorsManager : public Cleanable {


I feel it's better to have a Cleanable data member

We can go either way but it would be great if we discuss it and have pros and cons listed. The pros of inheritance are:

Simplicity: no need to add the functions and glue them to the Cleanable data member.

What would be the cons of this approach, or what would be the pros of the data member approach?

I just felt it's better to have it as a data member because everywhere in the code Cleanable was doing the cleanup in the destructor. Now we support doing the cleanup and then reusing the Cleanable object. It just felt to me that making this behavior more explicit (having a data member instead of inheritance) is better.
but final decision is up to you

I do not have a strong opinion here. Let's go with your instinct and make it a data member.

On second thought, it seems easier to keep the current inheritance scheme. The reason is that currently we can easily pass PinnedIteratorsManager as a Cleanable to the DelegateCleanupsTo function:

biter.DelegateCleanupsTo(pinned_iters_mgr);

void Cleanable::DelegateCleanupsTo(Cleanable* other)

Without PinnedIteratorsManager inheriting from Cleanable, we would have two choices:

Cleanable::DelegateCleanupsTo(PinnedIteratorsManager* other), which could cause a circular dependency and would need more code restructuring

PinnedIteratorsManager::DelegateCleanupsFrom(Cleanable*), which would require making PinnedIteratorsManager a friend class of Cleanable so that it can access its internals, which does not look clean to me.
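For illustration, the inheritance variant argued for above might look like the following. All names here (SketchCleanable, SketchPinnedManager, RunManagerDemo) are hypothetical, the vector-based storage is a simplification, and ReleasePinnedData() calling Reset() reflects the earlier review request rather than the exact patch.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class SketchCleanable {
 public:
  SketchCleanable() = default;
  SketchCleanable(const SketchCleanable&) = delete;
  SketchCleanable& operator=(const SketchCleanable&) = delete;
  ~SketchCleanable() { Reset(); }

  void RegisterCleanup(std::function<void()> fn) {
    cleanups_.push_back(std::move(fn));
  }

  // Takes the base type, so it needs to know nothing about the manager class.
  void DelegateCleanupsTo(SketchCleanable* other) {
    assert(other != nullptr && other != this);
    for (auto& fn : cleanups_) {
      other->cleanups_.push_back(std::move(fn));
    }
    cleanups_.clear();
  }

  // Runs the pending cleanups and leaves the object reusable.
  void Reset() {
    for (auto& fn : cleanups_) fn();
    cleanups_.clear();
  }

 private:
  std::vector<std::function<void()>> cleanups_;
};

// Inheritance variant: the manager is itself a cleanable, so a pointer to it
// converts implicitly to SketchCleanable*.
class SketchPinnedManager : public SketchCleanable {
 public:
  void StartPinningData() { pinning_enabled_ = true; }
  bool PinningEnabled() const { return pinning_enabled_; }

  // Cleanups are triggered explicitly here instead of waiting for the
  // destructor, and the manager can be used again afterwards.
  void ReleasePinnedData() {
    pinning_enabled_ = false;
    Reset();
  }

 private:
  bool pinning_enabled_ = false;
};

// Returns the number of cleanups that have run after ReleasePinnedData().
int RunManagerDemo() {
  int released = 0;
  SketchPinnedManager mgr;
  mgr.StartPinningData();
  {
    SketchCleanable iter;  // stands in for a stack-allocated block iterator
    iter.RegisterCleanup([&released] { ++released; });
    iter.DelegateCleanupsTo(&mgr);  // derived-to-base pointer conversion
  }
  assert(released == 0);  // nothing has run yet; the manager holds the cleanup
  mgr.ReleasePinnedData();
  return released;
}
```

With a data member instead, the manager would need forwarding functions or friend access to receive the delegated cleanups, which is the extra glue the comment above is trying to avoid.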

facebook-github-bot · 2016-12-29T01:29:52Z

@maysamyabandeh updated the pull request - view changes - changes since last import

IslamAbdelRahman

Thanks a lot @maysamyabandeh

IslamAbdelRahman · 2016-12-29T20:12:25Z

table/iterator.cc

+  }
+  Cleanup* c = &cleanup_;
+  other->RegisterCleanup(c->function, c->arg1, c->arg2);
+  c = c->next;


yes, my bad

IslamAbdelRahman · 2016-12-29T20:18:59Z

table/iterator.cc

+    cleanup_.function = c->function;
+    cleanup_.arg1 = c->arg1;
+    cleanup_.arg2 = c->arg2;
+    delete c;


Sounds good to me

IslamAbdelRahman · 2016-12-29T20:20:56Z

table/iterator.cc

-Cleanable::~Cleanable() {
+Cleanable::~Cleanable() { DoCleanup(); }
+
+void Cleanable::Reset() {


I don't think the 2 extra assignments should be a concern (except if benchmarks showed that they are of course)
This was just a suggestion, final decision is up to you

IslamAbdelRahman · 2016-12-29T20:24:56Z

db/pinned_iterators_manager.h

@@ -16,7 +16,7 @@ namespace rocksdb {
 // PinnedIteratorsManager will be notified whenever we need to pin an Iterator
 // and it will be responsible for deleting pinned Iterators when they are
 // not needed anymore.
-class PinnedIteratorsManager {
+class PinnedIteratorsManager : public Cleanable {


I just felt it's better to have it as a data member because everywhere in the code Cleanable was doing the cleanup in the destructor. Now we support doing the cleanup and then reusing the Cleanable object. It just felt to me that making this behavior more explicit (having a data member instead of inheritance) is better.
but final decision is up to you

facebook-github-bot · 2016-12-29T23:44:40Z

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

maysamyabandeh requested a review from IslamAbdelRahman December 22, 2016 02:29

maysamyabandeh force-pushed the cleanabledelagate branch from 117ab32 to c43e0bc Compare December 22, 2016 19:16

IslamAbdelRahman reviewed Dec 22, 2016

maysamyabandeh added 3 commits December 22, 2016 13:09

add the forgotten table/cleanable_test.cc

7255c30

Separate cleanup from constructor in Cleanable

3c72b36

Reset the base Cleanable upon ReleasePinnedData invocation

c495ea9

IslamAbdelRahman reviewed Dec 23, 2016

Explain the case for long list of cleanups

8d1ef37

and minor refactoring

IslamAbdelRahman reviewed Dec 29, 2016

maysamyabandeh mentioned this pull request Dec 29, 2016

PinnableSlice #1732

Closed

apply comments

056784e

facebook-github-bot added the CLA Signed label Dec 29, 2016

IslamAbdelRahman approved these changes Dec 29, 2016

facebook-github-bot closed this in 0712d54 Dec 29, 2016
