Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delegate Cleanables #1711

Closed
Closed
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
4 changes: 4 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -324,6 +324,7 @@ TESTS = \
db_properties_test \
db_table_properties_test \
autovector_test \
cleanable_test \
column_family_test \
table_properties_collector_test \
arena_test \
Expand Down Expand Up @@ -1147,6 +1148,9 @@ full_filter_block_test: table/full_filter_block_test.o $(LIBOBJECTS) $(TESTHARNE
log_test: db/log_test.o $(LIBOBJECTS) $(TESTHARNESS)
$(AM_LINK)

cleanable_test: table/cleanable_test.o $(LIBOBJECTS) $(TESTHARNESS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks to me this is not needed

$(AM_LINK)

table_test: table/table_test.o $(LIBOBJECTS) $(TESTHARNESS)
$(AM_LINK)

Expand Down
4 changes: 3 additions & 1 deletion db/pinned_iterators_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ namespace rocksdb {
// PinnedIteratorsManager will be notified whenever we need to pin an Iterator
// and it will be responsible for deleting pinned Iterators when they are
// not needed anymore.
class PinnedIteratorsManager {
class PinnedIteratorsManager : public Cleanable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now the cleanup code will be called in the destructor, I think we must change that so that it's called in ReleasePinnedData()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's better to have a Cleanable data member

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can go either way but it would be great if we discuss it and have pros and cons listed. The pros of inheritance are:

  1. Simplicity: no need to add the functions and the glue them to the Cleanable data member.

What would be the cons of this approach, or what would the pros of data member approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just felt it's better to have it as a data member because everywhere in the code Cleanable was doing Cleanup in the destructor. now we support to do the cleanup and reuse the Cleanable object. It just felt to me that having this behavior more explicit (having a data member instead of inheritance) is better.
but final decision is up to you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not have a strong opinion here. Let's go with your instinct and make it a data member.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought it seems easier if we keep the current inheritance scheme. The reason is that currently we can easily pass PinnedIteratorManager as a Cleanable to DelegateCleanupsTo function:

biter.DelegateCleanupsTo(pinned_iters_mgr);
void Cleanable::DelegateCleanupsTo(Cleanable* other)

Without having PinnedIteratorManager manager we would have two choices:

  • Cleanable::DelegateCleanupsTo(PinnedIteratorManager* other), which could cause circular dependency and would need more code restructuring
  • PinnedIteratorManager::DelegateCleanupsFrom(Cleanable*) which would require making PinnedIteratorManager a friend class of Cleanable to be able to access its internals, which does not look clean to me.

public:
PinnedIteratorsManager() : pinning_enabled(false) {}
~PinnedIteratorsManager() {
Expand Down Expand Up @@ -69,6 +69,8 @@ class PinnedIteratorsManager {
release_func(ptr);
}
pinned_ptrs_.clear();
// Also do cleanups from the base Cleanable
Cleanable::Reset();
}

private:
Expand Down
10 changes: 10 additions & 0 deletions include/rocksdb/iterator.h
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,9 @@ class Cleanable {
// not abstract and therefore clients should not override it.
typedef void (*CleanupFunction)(void* arg1, void* arg2);
void RegisterCleanup(CleanupFunction function, void* arg1, void* arg2);
void DelegateCleanupsTo(Cleanable* other);
// DoCleanup and also resets the pointers for reuse
void Reset();

protected:
struct Cleanup {
Expand All @@ -45,6 +48,13 @@ class Cleanable {
Cleanup* next;
};
Cleanup cleanup_;
// It also becomes the owner of c
void RegisterCleanup(Cleanup* c);

private:
// Performs all the cleanups. It does not reset the pointers. Making it
// private to prevent misuse
inline void DoCleanup();
};

class Iterator : public Cleanable {
Expand Down
27 changes: 5 additions & 22 deletions table/block_based_table_reader.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1559,15 +1559,8 @@ Status BlockBasedTable::Get(const ReadOptions& read_options, const Slice& key,
break;
} else {
BlockIter stack_biter;
if (pin_blocks) {
// We need to create the BlockIter on heap because we may need to
// pin it if we encounterd merge operands
biter = static_cast<BlockIter*>(
NewDataBlockIterator(rep_, read_options, iiter.value()));
} else {
biter = &stack_biter;
NewDataBlockIterator(rep_, read_options, iiter.value(), biter);
}
biter = &stack_biter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use remove BlockIter* biter in line 1542
and use BlockIter biter; here only. I think that how it used to be before I changed it

NewDataBlockIterator(rep_, read_options, iiter.value(), biter);

if (read_options.read_tier == kBlockCacheTier &&
biter->status().IsIncomplete()) {
Expand Down Expand Up @@ -1596,22 +1589,12 @@ Status BlockBasedTable::Get(const ReadOptions& read_options, const Slice& key,
}
s = biter->status();

if (pin_blocks) {
if (get_context->State() == GetContext::kMerge) {
// Pin blocks as long as we are merging
pinned_iters_mgr->PinIterator(biter);
} else {
delete biter;
}
biter = nullptr;
} else {
// biter is on stack, Nothing to clean
if (pin_blocks && get_context->State() == GetContext::kMerge) {
// Pin blocks as long as we are merging
biter->DelegateCleanupsTo(pinned_iters_mgr);
}
}
}
if (pin_blocks && biter != nullptr) {
delete biter;
}
if (s.ok()) {
s = iiter.status();
}
Expand Down
187 changes: 187 additions & 0 deletions table/cleanable_test.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,187 @@
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree. An additional grant
// of patent rights can be found in the PATENTS file in the same directory.
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think we need to add this LevelDB section for new files

// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#include <functional>

#include "port/port.h"
#include "port/stack_trace.h"
#include "rocksdb/iostats_context.h"
#include "rocksdb/perf_context.h"
#include "util/testharness.h"
#include "util/testutil.h"

namespace rocksdb {

class CleanableTest : public testing::Test {};

// Use this to keep track of the cleanups that were actually performed
void Multiplier(void* arg1, void* arg2) {
int* res = reinterpret_cast<int*>(arg1);
int* num = reinterpret_cast<int*>(arg2);
*res *= *num;
}

// the first Cleanup is on stack and the rest on heap, so test with both cases
TEST_F(CleanableTest, Register) {
int n2 = 2, n3 = 3;
int res = 1;
{ Cleanable c1; }
// ~Cleanable
ASSERT_EQ(1, res);

res = 1;
{
Cleanable c1;
c1.RegisterCleanup(Multiplier, &res, &n2); // res = 2;
}
// ~Cleanable
ASSERT_EQ(2, res);

res = 1;
{
Cleanable c1;
c1.RegisterCleanup(Multiplier, &res, &n2); // res = 2;
c1.RegisterCleanup(Multiplier, &res, &n3); // res = 2 * 3;
}
// ~Cleanable
ASSERT_EQ(6, res);
}

// the first Cleanup is on stack and the rest on heap,
// so test all the combinations of them
TEST_F(CleanableTest, Delegation) {
int n2 = 2, n3 = 3, n5 = 5, n7 = 7;
int res = 1;
{
Cleanable c2;
{
Cleanable c1;
c1.RegisterCleanup(Multiplier, &res, &n2); // res = 2;
c1.DelegateCleanupsTo(&c2);
}
// ~Cleanable
ASSERT_EQ(1, res);
}
// ~Cleanable
ASSERT_EQ(2, res);

res = 1;
{
Cleanable c2;
{
Cleanable c1;
c1.DelegateCleanupsTo(&c2);
}
// ~Cleanable
ASSERT_EQ(1, res);
}
// ~Cleanable
ASSERT_EQ(1, res);

res = 1;
{
Cleanable c2;
{
Cleanable c1;
c1.RegisterCleanup(Multiplier, &res, &n2); // res = 2;
c1.RegisterCleanup(Multiplier, &res, &n3); // res = 2 * 3;
c1.DelegateCleanupsTo(&c2);
}
// ~Cleanable
ASSERT_EQ(1, res);
}
// ~Cleanable
ASSERT_EQ(6, res);

res = 1;
{
Cleanable c2;
c2.RegisterCleanup(Multiplier, &res, &n5); // res = 5;
{
Cleanable c1;
c1.RegisterCleanup(Multiplier, &res, &n2); // res = 2;
c1.RegisterCleanup(Multiplier, &res, &n3); // res = 2 * 3;
c1.DelegateCleanupsTo(&c2); // res = 2 * 3 * 5;
}
// ~Cleanable
ASSERT_EQ(1, res);
}
// ~Cleanable
ASSERT_EQ(30, res);

res = 1;
{
Cleanable c2;
c2.RegisterCleanup(Multiplier, &res, &n5); // res = 5;
c2.RegisterCleanup(Multiplier, &res, &n7); // res = 5 * 7;
{
Cleanable c1;
c1.RegisterCleanup(Multiplier, &res, &n2); // res = 2;
c1.RegisterCleanup(Multiplier, &res, &n3); // res = 2 * 3;
c1.DelegateCleanupsTo(&c2); // res = 2 * 3 * 5 * 7;
}
// ~Cleanable
ASSERT_EQ(1, res);
}
// ~Cleanable
ASSERT_EQ(210, res);

res = 1;
{
Cleanable c2;
c2.RegisterCleanup(Multiplier, &res, &n5); // res = 5;
c2.RegisterCleanup(Multiplier, &res, &n7); // res = 5 * 7;
{
Cleanable c1;
c1.RegisterCleanup(Multiplier, &res, &n2); // res = 2;
c1.DelegateCleanupsTo(&c2); // res = 2 * 5 * 7;
}
// ~Cleanable
ASSERT_EQ(1, res);
}
// ~Cleanable
ASSERT_EQ(70, res);

res = 1;
{
Cleanable c2;
c2.RegisterCleanup(Multiplier, &res, &n5); // res = 5;
c2.RegisterCleanup(Multiplier, &res, &n7); // res = 5 * 7;
{
Cleanable c1;
c1.DelegateCleanupsTo(&c2); // res = 5 * 7;
}
// ~Cleanable
ASSERT_EQ(1, res);
}
// ~Cleanable
ASSERT_EQ(35, res);

res = 1;
{
Cleanable c2;
c2.RegisterCleanup(Multiplier, &res, &n5); // res = 5;
{
Cleanable c1;
c1.DelegateCleanupsTo(&c2); // res = 5;
}
// ~Cleanable
ASSERT_EQ(1, res);
}
// ~Cleanable
ASSERT_EQ(5, res);
}

} // namespace rocksdb

int main(int argc, char** argv) {
rocksdb::port::InstallStackTraceHandler();
::testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}
58 changes: 56 additions & 2 deletions table/iterator.cc
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,18 @@ Cleanable::Cleanable() {
cleanup_.next = nullptr;
}

Cleanable::~Cleanable() {
Cleanable::~Cleanable() { DoCleanup(); }

void Cleanable::Reset() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just have one function Reset(), instead of having 2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reset is more expensive than DoCleanup as it does two value assignment too. We do not need that when the object is being destructed. I was not sure about penalty but since cleanup is heavily used my preference was not to add any cost to its destruction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the 2 extra assignments should be a concern (except if benchmarks showed that they are of course)
This was just a suggestion, final decision is up to you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right but since Cleanable is inherited in many objects, any small increase in construction/deconstruction cost could potentially be observable. I am more inclined to keep the cost the same as it was before.

DoCleanup();
cleanup_.function = nullptr;
cleanup_.next = nullptr;
}

void Cleanable::DoCleanup() {
if (cleanup_.function != nullptr) {
(*cleanup_.function)(cleanup_.arg1, cleanup_.arg2);
for (Cleanup* c = cleanup_.next; c != nullptr; ) {
for (Cleanup* c = cleanup_.next; c != nullptr;) {
(*c->function)(c->arg1, c->arg2);
Cleanup* next = c->next;
delete c;
Expand All @@ -31,6 +39,52 @@ Cleanable::~Cleanable() {
}
}

// If the entire linked list was on heap we could have simply add attach one
// link list to another. However the head is an embeded object to avoid the cost
// of creating objects for most of the use cases when the Cleanable has only one
// Cleanup to do. We could put evernything on heap if benchmarks show no
// negative impact on performance.
// Also we need to iterate on the linked list since there is no pointer to the
// tail. We can add the tail pointer but maintainin it might negatively impact
// the perforamnce for the common case of one cleanup where tail pointer is not
// needed. Again benchmarks could clarify that.
// Even without a tail pointer we could iterate on the list, find the tail, and
// have only that node updated without the need to insert the Cleanups one by
// one. This however would be redundant when the source Cleanable has one or a
// few Cleanups which is the case most of the time.
// TODO(myabandeh): if the list is too long we should maintain a tail pointer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add the possible optimizations/bottlenecks we discussed offline in the comments here

// and have the entire list (minus the head that has to be inserted separately)
// merged with the target linked list at once.
void Cleanable::DelegateCleanupsTo(Cleanable* other) {
assert(other != nullptr);
if (cleanup_.function == nullptr) {
return;
}
Cleanup* c = &cleanup_;
other->RegisterCleanup(c->function, c->arg1, c->arg2);
c = c->next;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to remove line 64,65 and just enter the loop directly
this is also related to my comment of not doing cleanup inside RegisterCleanup

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the process for delegating the first Cleanup is different from the rest: the first is an embedded object and its values needs to be copied while the rest are on heap and they can be simply be added to the destination linked list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, my bad

while (c != nullptr) {
Cleanup* next = c->next;
other->RegisterCleanup(c);
c = next;
}
cleanup_.function = nullptr;
cleanup_.next = nullptr;
}

void Cleanable::RegisterCleanup(Cleanable::Cleanup* c) {
assert(c != nullptr);
if (cleanup_.function == nullptr) {
cleanup_.function = c->function;
cleanup_.arg1 = c->arg1;
cleanup_.arg2 = c->arg2;
delete c;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the caller should be responsible for cleanup, what do you think ?
It's also confusing that we cleanup in one branch and not in the other

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth between these two approaches and finally went with doing the delete inside RegisterCleanup. My argument was that there are two cases:

  1. RegisterCleanup takes the ownership and adds the object to its linkedlist
  2. RegisterCleanup does not need the object and want it deleted
    I found it more consistent if RegisterCleanup takes the full ownership and takes care of both cases. Otherwise it would take ownership when it needs the object and would not do that when it wanted to be deleted, which is an inconsistent behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

} else {
c->next = cleanup_.next;
cleanup_.next = c;
}
}

void Cleanable::RegisterCleanup(CleanupFunction func, void* arg1, void* arg2) {
assert(func != nullptr);
Cleanup* c;
Expand Down