MB-4595 Schedule backfill for a fresh client with empty data

There is a case that causes data loss after rebalance in
a cluster that has been running for only a very short period:

1) Start the one node cluster and load a small number of items
into the node. The node will still have the open checkpoint with
id 1 for each vbucket.
2) Shut down the node and start it again. After warmup, the node
still has the open checkpoint 1 for each vbucket, but each
checkpoint doesn't have any items in its data structure.
3) Add another fresh node to the cluster and rebalance in. In this
case, no backfill tasks are scheduled for vbucket takeovers
because the fresh node starts with the open checkpoint id 1 and
the original node still has the open checkpoint id 1.

Consequently, each vbucket takeover is completed without backfill
but does not send any items from the open checkpoint 1 because
it doesn't have any items. This causes data loss after rebalance.

To resolve the above issue, we always schedule the backfill for
the fresh client with empty data (i.e., checkpoint id 1).
A better and more comprehensive solution would be to restore the
open checkpoint as part of warmup, which we will provide soon.
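
The decision described above can be sketched as a small standalone model.
This is an illustration only, not the engine's code: VBucketModel,
checkpointInMemory, needsBackfillOld, needsBackfillNew and the cursor name
are hypothetical names introduced here, and checkpointInMemory is only a
stand-in for the result that registerTAPCursor returns in the diff below.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical, simplified model of one vbucket on the TAP producer side.
struct VBucketModel {
    uint64_t openCheckpointId;        // id of the open checkpoint held in memory
    bool openCheckpointHasItems;      // false after warmup in the scenario above
    std::set<std::string> tapCursors; // names of already registered TAP cursors

    bool tapCursorExists(const std::string &name) const {
        return tapCursors.find(name) != tapCursors.end();
    }

    // Assumed stand-in for registerTAPCursor(): true if the requested
    // checkpoint is still available in memory.
    bool checkpointInMemory(uint64_t chkId) const {
        return chkId == openCheckpointId;
    }
};

// Behaviour before this change: backfill only if the checkpoint is missing.
bool needsBackfillOld(const VBucketModel &vb, uint64_t chkIdToStart) {
    return !vb.checkpointInMemory(chkIdToStart);
}

// Behaviour after this change: also backfill for a new client that starts
// from checkpoint 1, because the open checkpoint may be empty after warmup.
bool needsBackfillNew(const VBucketModel &vb, const std::string &name,
                      uint64_t chkIdToStart) {
    bool emptyClient = !vb.tapCursorExists(name) && (chkIdToStart == 1);
    bool chkExists = vb.checkpointInMemory(chkIdToStart);
    return !chkExists || emptyClient;
}

// Walks through step 3 of the scenario: a restarted producer with an empty
// open checkpoint 1 and a fresh client that also starts at checkpoint 1.
void checkScenario() {
    VBucketModel vb{1, false, {}};
    assert(!needsBackfillOld(vb, 1));                     // old: items are never sent
    assert(needsBackfillNew(vb, "hypothetical_tap", 1));  // new: backfill is scheduled
}
```

In this model, a client whose cursor is already registered is not treated
as empty, so it does not trigger an extra backfill when it starts from
checkpoint 1.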

Change-Id: Ia115dd7f4cd10bfb79c9c2593366935bea508ea5
Reviewed-by: Michael Wiederhold <>
Tested-by: Chiyoung Seo <>
1 parent 0929fe0 commit 303ab54372e422b122116a85f2f084071b1491ff @chiyoung chiyoung committed Mar 22, 2012
Showing with 17 additions and 4 deletions.
  1. +4 −0
  2. +2 −0 checkpoint.hh
  3. +1 −0
  4. +10 −4
@@ -427,6 +427,10 @@ std::list<std::string> CheckpointManager::getTAPCursorNames() {
return cursor_names;
+bool CheckpointManager::tapCursorExists(const std::string &name) {
+ return tapCursors.find(name) != tapCursors.end();
+}
bool CheckpointManager::isCheckpointCreationForHighMemUsage(const RCPtr<VBucket> &vbucket) {
bool forceCreation = false;
double memoryUsed = static_cast<double>(stats.getTotalMemoryUsed());
@@ -323,6 +323,8 @@ public:
std::list<std::string> getTAPCursorNames();
+ bool tapCursorExists(const std::string &name);
* Queue an item to be written to persistent layer.
* @param item the item to be persisted.
@@ -3025,6 +3025,7 @@ static enum test_result test_tap_takeover(ENGINE_HANDLE *h, ENGINE_HANDLE_V1 *h1
+ case TAP_OPAQUE:
case TAP_NOOP:
@@ -417,11 +417,17 @@ void TapProducer::registerTAPCursor(std::map<uint16_t, uint64_t> &lastCheckpoint
+ // If this tap connection is for a new client with checkpoint 1, we should always
+ // schedule backfill because the tap producer could be restarted with the open
+ // checkpoint 1, but not restore the items in the open checkpoint.
+ bool empty_client =
+ !vb->checkpointManager.tapCursorExists(name) && (chk_id_to_start == 1);
// Check if the unified queue contains the checkpoint to start with.
- if(vb && !vb->checkpointManager.registerTAPCursor(name,
- tapCheckpointState[vbid].currentCheckpointId,
- closedCheckpointOnly, registeredTAPClient)) {
- // Backfill is required because the checkpoint to start with doesn't exist in memory
+ bool chk_exists = vb->checkpointManager.registerTAPCursor(name,
+ chk_id_to_start,
+ closedCheckpointOnly,
+ registeredTAPClient);
+ if(!chk_exists || empty_client) {
uint64_t chk_id;
tap_checkpoint_state cstate;
