Skip to content

Commit

Permalink
MB-52131: Optimize CheckpointCursor::operator< from O(num_items) to O(1)
Browse files Browse the repository at this point in the history
CheckpointCursor's operator< is currently O(n) in the case where two
cursors are pointing at the same item.

This is due to the fact that if they point to the same Checkpoint and
seqno, we must check their distance along the Checkpoint (as one may
be pointing to a meta-item which shares a seqno but it ahread of the
cursor) - however to do so requires calling std::distance(begin(),
cursorIterator), and as the CheckpointCursor itorator is not random
access (it's essentially a std::list::iterator) we must perform an
O(n) iteration from the beginning of the container to the cursor.

This results in the following performance characteristics as the
Checkpoint grows in items:

    Running ./ep_engine_benchmarks
    Run on (10 X 24.0102 MHz CPU s)
    CPU Caches:
      L1 Data 64 KiB (x10)
      L1 Instruction 128 KiB (x10)
      L2 Unified 4096 KiB (x5)
    Load Average: 6.28, 11.37, 18.63
    ---------------------------------------------------------------------------------
    Benchmark                                       Time             CPU   Iterations
    ---------------------------------------------------------------------------------
    CheckpointBench/GetLowestCursor/1            13.8 ns         13.7 ns     51002565
    CheckpointBench/GetLowestCursor/10           23.5 ns         23.4 ns     30190632
    CheckpointBench/GetLowestCursor/100           200 ns          200 ns      3489427
    CheckpointBench/GetLowestCursor/1000         2115 ns         2112 ns       332576
    CheckpointBench/GetLowestCursor/10000       37857 ns        37799 ns        18850
    CheckpointBench/GetLowestCursor/100000     270262 ns       269972 ns         2537

However, as part of patch "MB-48506: Introduce
CheckpointCursor::distance" (74b9cf4) we added a 'distance' member to
each CheckpointCursor which caches its distance from the start of the
container, and is updated whenever a cursor moves.

As such, we can use the cached distanece value in
heckpointCursor::operator<, reducing the cost to O(1) in the case
where two cursors point to the same item:

    Running ./ep_engine_benchmarks
    Run on (10 X 24.1207 MHz CPU s)
    CPU Caches:
      L1 Data 64 KiB (x10)
      L1 Instruction 128 KiB (x10)
      L2 Unified 4096 KiB (x5)
    Load Average: 7.73, 12.32, 19.30
    ---------------------------------------------------------------------------------
    Benchmark                                       Time             CPU   Iterations
    ---------------------------------------------------------------------------------
    CheckpointBench/GetLowestCursor/1            13.8 ns         13.8 ns     50626686
    CheckpointBench/GetLowestCursor/10           13.9 ns         13.8 ns     50958011
    CheckpointBench/GetLowestCursor/100          13.8 ns         13.8 ns     50932798
    CheckpointBench/GetLowestCursor/1000         13.9 ns         13.9 ns     50392340
    CheckpointBench/GetLowestCursor/10000        14.0 ns         14.0 ns     50426827
    CheckpointBench/GetLowestCursor/100000       13.9 ns         13.8 ns     50859526

(cherry picked from commit 5b28136)

Change-Id: I1814155d7eff9773ab29c30f6b76836963a75ac8
Reviewed-on: https://review.couchbase.org/c/kv_engine/+/174849
Well-Formed: Restriction Checker
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>
  • Loading branch information
daverigby committed May 24, 2022
1 parent 68384dc commit df5d562
Show file tree
Hide file tree
Showing 2 changed files with 63 additions and 15 deletions.
64 changes: 58 additions & 6 deletions engines/ep/benchmarks/vbucket_bench.cc
Original file line number Diff line number Diff line change
Expand Up @@ -153,12 +153,15 @@ class CheckpointBench : public EngineFixture {
* @param numItems
* @param valueSize
*/
void loadItemsAndMoveCursor(size_t numItems, size_t valueSize);
void loadItemsAndMovePersistenceCursor(size_t numItems, size_t valueSize);

CheckpointList extractClosedUnrefCheckpoints(CheckpointManager&);

CheckpointManager::ExtractItemsResult extractItemsToExpel(
CheckpointManager&);

std::shared_ptr<CheckpointCursor> getLowestCursor(
CheckpointManager& manager);
};

/**
Expand Down Expand Up @@ -457,6 +460,12 @@ CheckpointManager::ExtractItemsResult CheckpointBench::extractItemsToExpel(
return manager.extractItemsToExpel(lh);
}

std::shared_ptr<CheckpointCursor> CheckpointBench::getLowestCursor(
CheckpointManager& manager) {
std::lock_guard<std::mutex> lh(manager.queueLock);
return manager.getLowestCursor(lh);
}

void CheckpointBench::queueItem(const std::string& key,
const std::string& value) {
queued_item item{new Item(StoredDocKey(key, CollectionID::Default),
Expand All @@ -472,8 +481,8 @@ void CheckpointBench::queueItem(const std::string& key,
item, GenerateBySeqno::Yes, GenerateCas::Yes, nullptr));
}

void CheckpointBench::loadItemsAndMoveCursor(size_t numItems,
size_t valueSize) {
void CheckpointBench::loadItemsAndMovePersistenceCursor(size_t numItems,
size_t valueSize) {
auto& vb = *engine->getKVBucket()->getVBucket(vbid);
auto& manager = *vb.checkpointManager;

Expand Down Expand Up @@ -518,7 +527,7 @@ BENCHMARK_DEFINE_F(CheckpointBench, ExtractClosedUnrefCheckpoints)

// Open checkpoint never removed, so create numCheckpoints+1 for
// removing numCheckpoints
loadItemsAndMoveCursor(numCheckpoints + 1, 0);
loadItemsAndMovePersistenceCursor(numCheckpoints + 1, 0);
ASSERT_EQ(numCheckpoints + 1, manager.getNumCheckpoints());

// Benchmark
Expand Down Expand Up @@ -553,7 +562,7 @@ BENCHMARK_DEFINE_F(CheckpointBench, GetCursorsToDrop)
while (state.KeepRunning()) {
state.PauseTiming();

loadItemsAndMoveCursor(numCheckpoints, 0);
loadItemsAndMovePersistenceCursor(numCheckpoints, 0);
ASSERT_EQ(numCheckpoints, manager.getNumCheckpoints());

// Benchmark
Expand All @@ -571,6 +580,40 @@ BENCHMARK_DEFINE_F(CheckpointBench, GetCursorsToDrop)
}
}

/**
* Benchmark looking up the lowest cursor, when there are two cursors at the
* same position. The lowest cursor is needed during item expel (to know
* where to expel up to).
* Prior to MB-52131 this was an O(n) operation where N is the distance of the
* cursor from the beginning of the checkpoint; so for large checkpoints where
* two cursors were both "up to date" (e.g. replication cursors which are both
* pointing at high-seqno) then getLowestCursor was costly.
*/
BENCHMARK_DEFINE_F(CheckpointBench, GetLowestCursor)
(benchmark::State& state) {
// We want a single checkpoint, but with a large number of items in it.
const size_t _1B = 1000 * 1000 * 1000;
engine->getConfiguration().setCheckpointMaxSize(_1B);
const size_t numItems = state.range(0);
auto& manager = *engine->getKVBucket()->getVBucket(vbid)->checkpointManager;

// Register a second cursor (modelling replication), and then move both
// persistence and test cursor to end of Checkpoint.
auto resisterResult = manager.registerCursorBySeqno(
"test_cursor", 0, CheckpointCursor::Droppable::Yes);
loadItemsAndMovePersistenceCursor(numItems, 0);
std::vector<queued_item> items;
auto cursor = resisterResult.cursor.lock();
ASSERT_TRUE(cursor);
manager.getItemsForCursor(
cursor.get(), items, std::numeric_limits<size_t>::max());

// Benchmark: Request the lowest cursor. Before the fix this was O(numItems)
while (state.KeepRunning()) {
benchmark::DoNotOptimize(getLowestCursor(manager));
}
}

BENCHMARK_DEFINE_F(CheckpointBench, ExtractItemsToExpel)
(benchmark::State& state) {
const auto ckptType = CheckpointType(state.range(0));
Expand All @@ -596,7 +639,7 @@ BENCHMARK_DEFINE_F(CheckpointBench, ExtractItemsToExpel)

// Checkpoint high-seqno never expelled, so load numItems+1 for
// expelling numItems
loadItemsAndMoveCursor(numItems + 1, 1024);
loadItemsAndMovePersistenceCursor(numItems + 1, 1024);
ASSERT_EQ(1, manager.getNumCheckpoints());
ASSERT_EQ(numItems + 1, manager.getNumOpenChkItems());
// Note: Checkpoint type set after loading items, as the above call
Expand Down Expand Up @@ -754,3 +797,12 @@ static void ExtractItemsArgs(benchmark::internal::Benchmark* b) {
BENCHMARK_REGISTER_F(CheckpointBench, ExtractItemsToExpel)
->Apply(ExtractItemsArgs)
->Iterations(1);

// Arguments: numItems
BENCHMARK_REGISTER_F(CheckpointBench, GetLowestCursor)
->Args({1})
->Args({10})
->Args({100})
->Args({1000})
->Args({10000})
->Args({100000});
14 changes: 5 additions & 9 deletions engines/ep/src/checkpoint_cursor.cc
Original file line number Diff line number Diff line change
Expand Up @@ -123,20 +123,16 @@ bool operator<(const CheckpointCursor& a, const CheckpointCursor& b) {
return false;
}

// Same checkpoint and seqno, measure distance from start of checkpoint.
const auto a_distance =
std::distance((*a.currentCheckpoint)->begin(), a.currentPos);
const auto b_distance =
std::distance((*b.currentCheckpoint)->begin(), b.currentPos);
return a_distance < b_distance;
// Same checkpoint and seqno, use distance from start of checkpoint.
return a.distance < b.distance;
}

std::ostream& operator<<(std::ostream& os, const CheckpointCursor& c) {
os << "CheckpointCursor[" << &c << "] with"
<< " name:" << c.name
<< " currentCkpt:{id:" << (*c.currentCheckpoint)->getId()
<< " state:" << to_string((*c.currentCheckpoint)->getState())
<< "} currentSeq:" << (*c.currentPos)->getBySeqno() << " distance:"
<< std::distance((*c.currentCheckpoint)->begin(), c.currentPos);
<< "} currentSeq:" << (*c.currentPos)->getBySeqno()
<< " distance:" << c.distance;
return os;
}
}

0 comments on commit df5d562

Please sign in to comment.