Skip to content

Commit

Permalink
CompactRange() refit level should confirm destination level is not em…
Browse files Browse the repository at this point in the history
…pty (facebook#7261)

Summary:
There is potential data race related CompactRange() with level refitting. After the compaction step and refitting step, some automatic compaction could put data to the destination level and cause the DB to be corrupted. Fix the bug by checking the target level to be empty.

Pull Request resolved: facebook#7261

Test Plan: Add a unit test, which would fail with "Corruption: L1 have overlapping ranges '666F6F' seq:6, type:1 vs. '626172' seq:2, type:1", and now it succeeds.

Reviewed By: ajkr

Differential Revision: D23142269

fbshipit-source-id: 28bc14d5ac934c192260b23a4ce3f10a95e3ee91
  • Loading branch information
siying authored and codingrhythm committed Mar 5, 2021
1 parent 6c9205b commit e6be465
Show file tree
Hide file tree
Showing 3 changed files with 101 additions and 11 deletions.
1 change: 1 addition & 0 deletions HISTORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
### Bug fixes
* Fix a performance regression introduced in 6.4 that makes a upper bound check for every Next() even if keys are within a data block that is within the upper bound.
* Fix a possible corruption to the LSM state (overlapping files within a level) when a `CompactRange()` for refitting levels (`CompactRangeOptions::change_level == true`) and another manual compaction are executed in parallel.
* Fix a bug where a level refitting in CompactRange() might race with an automatic compaction that puts the data to the target level of the refitting. The bug has been there for years.

### New Features
* A new option `std::shared_ptr<FileChecksumGenFactory> file_checksum_gen_factory` is added to `BackupableDBOptions`. The default value for this option is `nullptr`. If this option is null, the default backup engine checksum function (crc32c) will be used for creating, verifying, or restoring backups. If it is not null and is set to the DB custom checksum factory, the custom checksum function used in DB will also be used for creating, verifying, or restoring backups, in addition to the default checksum function (crc32c). If it is not null and is set to a custom checksum factory different than the DB custom checksum factory (which may be null), BackupEngine will return `Status::InvalidArgument()`.
Expand Down
77 changes: 77 additions & 0 deletions db/db_compaction_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,16 @@ class DBCompactionDirectIOTest : public DBCompactionTest,
DBCompactionDirectIOTest() : DBCompactionTest() {}
};

// Param = true : target level is non-empty
// Param = false: level between target level and source level
// is not empty.
class ChangeLevelConflictsWithAuto
: public DBCompactionTest,
public ::testing::WithParamInterface<bool> {
public:
ChangeLevelConflictsWithAuto() : DBCompactionTest() {}
};

namespace {

class FlushedFileCollector : public EventListener {
Expand Down Expand Up @@ -5442,6 +5452,73 @@ TEST_F(DBCompactionTest, UpdateUniversalSubCompactionTest) {
ASSERT_TRUE(has_compaction);
}

TEST_P(ChangeLevelConflictsWithAuto, TestConflict) {
// A `CompactRange()` may race with an automatic compaction, we'll need
// to make sure it doesn't corrupte the data.
Options options = CurrentOptions();
options.level0_file_num_compaction_trigger = 2;
Reopen(options);

ASSERT_OK(Put("foo", "v1"));
ASSERT_OK(Put("bar", "v1"));
ASSERT_OK(Flush());
ASSERT_OK(dbfull()->TEST_WaitForFlushMemTable());

{
CompactRangeOptions cro;
cro.change_level = true;
cro.target_level = 2;
ASSERT_OK(dbfull()->CompactRange(cro, nullptr, nullptr));
}
ASSERT_EQ("0,0,1", FilesPerLevel(0));

// Run a qury to refitting to level 1 while another thread writing to
// the same level.
SyncPoint::GetInstance()->LoadDependency({
// The first two dependencies ensure the foreground creates an L0 file
// between the background compaction's L0->L1 and its L1->L2.
{
"DBImpl::CompactRange:BeforeRefit:1",
"AutoCompactionFinished1",
},
{
"AutoCompactionFinished2",
"DBImpl::CompactRange:BeforeRefit:2",
},
});
SyncPoint::GetInstance()->EnableProcessing();

std::thread auto_comp([&] {
TEST_SYNC_POINT("AutoCompactionFinished1");
ASSERT_OK(Put("bar", "v2"));
ASSERT_OK(Put("foo", "v2"));
ASSERT_OK(Flush());
ASSERT_OK(Put("bar", "v3"));
ASSERT_OK(Put("foo", "v3"));
ASSERT_OK(Flush());
dbfull()->TEST_WaitForCompact();
TEST_SYNC_POINT("AutoCompactionFinished2");
});

{
CompactRangeOptions cro;
cro.change_level = true;
cro.target_level = GetParam() ? 1 : 0;
// This should return non-OK, but it's more important for the test to
// make sure that the DB is not corrupted.
dbfull()->CompactRange(cro, nullptr, nullptr);
}
auto_comp.join();
// Refitting didn't happen.
SyncPoint::GetInstance()->DisableProcessing();

// Write something to DB just make sure that consistency check didn't
// fail and make the DB readable.
}

INSTANTIATE_TEST_CASE_P(ChangeLevelConflictsWithAuto,
ChangeLevelConflictsWithAuto, testing::Bool());

TEST_F(DBCompactionTest, ChangeLevelCompactRangeConflictsWithManual) {
// A `CompactRange()` with `change_level == true` needs to execute its final
// step, `ReFitLevel()`, in isolation. Previously there was a bug where
Expand Down
34 changes: 23 additions & 11 deletions db/db_impl/db_impl_compaction_flush.cc
Original file line number Diff line number Diff line change
Expand Up @@ -845,6 +845,9 @@ Status DBImpl::CompactRange(const CompactRangeOptions& options,
}

if (options.change_level) {
TEST_SYNC_POINT("DBImpl::CompactRange:BeforeRefit:1");
TEST_SYNC_POINT("DBImpl::CompactRange:BeforeRefit:2");

ROCKS_LOG_INFO(immutable_db_options_.info_log,
"[RefitLevel] waiting for background threads to stop");
DisableManualCompaction();
Expand Down Expand Up @@ -1280,20 +1283,29 @@ Status DBImpl::ReFitLevel(ColumnFamilyData* cfd, int level, int target_level) {
}

auto* vstorage = cfd->current()->storage_info();
if (to_level > level) {
if (level == 0) {
return Status::NotSupported(
"Cannot change from level 0 to other levels.");
}
// Check levels are empty for a trivial move
for (int l = level + 1; l <= to_level; l++) {
if (vstorage->NumLevelFiles(l) > 0) {
if (to_level != level) {
if (to_level > level) {
if (level == 0) {
return Status::NotSupported(
"Levels between source and target are not empty for a move.");
"Cannot change from level 0 to other levels.");
}
// Check levels are empty for a trivial move
for (int l = level + 1; l <= to_level; l++) {
if (vstorage->NumLevelFiles(l) > 0) {
return Status::NotSupported(
"Levels between source and target are not empty for a move.");
}
}
} else {
// to_level < level
// Check levels are empty for a trivial move
for (int l = to_level; l < level; l++) {
if (vstorage->NumLevelFiles(l) > 0) {
return Status::NotSupported(
"Levels between source and target are not empty for a move.");
}
}
}
}
if (to_level != level) {
ROCKS_LOG_DEBUG(immutable_db_options_.info_log,
"[%s] Before refitting:\n%s", cfd->GetName().c_str(),
cfd->current()->DebugString().data());
Expand Down

0 comments on commit e6be465

Please sign in to comment.