Some checkpoints cannot be opened with `kAbsoluteConsistency` WAL recovery mode #12670

andlr · 2024-05-16T20:45:19Z

Expected behavior

A database can be opened from a checkpoint with wal_recovery_mode=kAbsoluteConsistency.

Actual behavior

Due to a few data races, the active WAL file is sometimes copied in an inconsistent state.
Opening the database then fails with one of these errors when wal_recovery_mode=kAbsoluteConsistency:

Corruption: truncated record body
Corruption: error reading trailing data

Steps to reproduce the behavior

Initially I wrote this heavy and flaky test, which sometimes reproduces this issue:

TEST_F(CheckpointTest, WalCorruption) {
  Options options = CurrentOptions();
  options.wal_recovery_mode = WALRecoveryMode::kAbsoluteConsistency;

  Reopen(options);

  const auto threads_num = 32;
  const auto checkpoints_to_create = 200;
  std::atomic<int> thread_num(0);
  std::vector<port::Thread> threads;
  port::RWMutex mutex;
  bool finished = false;

  std::function<void()> write_func = [&]() {
    int a = thread_num.fetch_add(1);
    bool stop_worker = false;

    while (!stop_worker) {
      for (auto i = 0; i < 10000; ++i) {
        std::string key = "foo" + std::to_string(a) + "_" + std::to_string(i);
        ASSERT_OK(Put(key, "bar"));
      }

      mutex.ReadLock();
      stop_worker = finished;
      mutex.ReadUnlock();
    }
  };

  for (auto i = 0; i < threads_num; ++i) {
    threads.emplace_back(write_func);
  }

  std::vector<std::string> snapshot_names;
  for (auto i = 0; i < checkpoints_to_create; ++i) {
    const auto snapshot_name =
        test::PerThreadDBPath(env_, "snap_" + std::to_string(i));
    std::unique_ptr<Checkpoint> checkpoint;
    Checkpoint* checkpoint_ptr = nullptr;
    ASSERT_OK(Checkpoint::Create(db_, &checkpoint_ptr));
    checkpoint.reset(checkpoint_ptr);

    ASSERT_OK(checkpoint->CreateCheckpoint(snapshot_name));
    snapshot_names.push_back(snapshot_name);
  }

  mutex.WriteLock();
  finished = true;
  mutex.WriteUnlock();

  for (auto& t : threads) {
    t.join();
  }

  Close();

  options.skip_stats_update_on_db_open = true;
  options.skip_checking_sst_file_sizes_on_db_open = true;
  options.max_open_files = 10;

  for (const auto& snapshot_name : snapshot_names) {
    DB* snapshot_db = nullptr;
    ASSERT_OK(DB::Open(options, snapshot_name, &snapshot_db));
    ASSERT_OK(snapshot_db->Close());
    delete snapshot_db;
  }
}

I've also written more precise unit tests using sync points, so I'll include them in my PR along with a suggested fix.

Conditions to reproduce are:

wal_size_for_flush is non-zero, so the WAL file gets copied during the checkpoint;
while the checkpoint is in progress, write operations are happening in the background;
wal_recovery_mode = WALRecoveryMode::kAbsoluteConsistency is set when opening the DB from the checkpoint.

This happens because the size of the active WAL file is captured at an arbitrary moment:

The "truncated record body" error happens when the WAL file size is captured right after a WritableFileWriter flush that was triggered because the in-memory buffer had no space left for new data.
The "error reading trailing data" error happens when a WAL record is split into multiple physical records and the WAL file size is captured before the last fragment has been written.
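To illustrate why an arbitrary size capture is unsafe, here is a minimal, self-contained sketch of a block-based log layout. It is a simplified model, not RocksDB's actual writer or reader: the 32 KiB block size and 7-byte header match the legacy WAL record format, but `LayOutRecords`, `ClassifyCut`, and `CutState` are hypothetical names introduced only for this example. The model classifies a captured file size as landing on a record boundary, inside a physical fragment, or between the fragments of one logical record.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed constants: legacy (non-recyclable) log format.
constexpr std::size_t kBlockSize = 32768;
constexpr std::size_t kHeaderSize = 7;

enum class FragmentType { kFull, kFirst, kMiddle, kLast };

struct Fragment {
  std::size_t begin;  // file offset of the fragment header
  std::size_t end;    // one past the last payload byte
  FragmentType type;
};

// Hypothetical helper: lays out logical records of the given payload sizes
// as physical fragments, splitting at block boundaries.
std::vector<Fragment> LayOutRecords(
    const std::vector<std::size_t>& payload_sizes) {
  std::vector<Fragment> out;
  std::size_t offset = 0;
  for (std::size_t left : payload_sizes) {
    bool first = true;
    do {
      std::size_t room = kBlockSize - offset % kBlockSize;
      if (room < kHeaderSize) {
        offset += room;  // too small for a header: pad to the next block
        room = kBlockSize;
      }
      const std::size_t n = std::min(left, room - kHeaderSize);
      const bool last = (n == left);
      const FragmentType type =
          first ? (last ? FragmentType::kFull : FragmentType::kFirst)
                : (last ? FragmentType::kLast : FragmentType::kMiddle);
      out.push_back({offset, offset + kHeaderSize + n, type});
      offset += kHeaderSize + n;
      left -= n;
      first = false;
    } while (left > 0);
  }
  return out;
}

enum class CutState {
  kClean,        // copied prefix ends on a logical record boundary
  kMidFragment,  // a header or payload is cut short
  kMidRecord     // last whole fragment is kFirst/kMiddle: record incomplete
};

// Hypothetical helper: classifies a file prefix of `size` bytes.
CutState ClassifyCut(const std::vector<Fragment>& frags, std::size_t size) {
  CutState state = CutState::kClean;
  for (const Fragment& f : frags) {
    if (f.begin >= size) break;                       // beyond the prefix
    if (f.end > size) return CutState::kMidFragment;  // partially copied
    const bool completes =
        f.type == FragmentType::kFull || f.type == FragmentType::kLast;
    state = completes ? CutState::kClean : CutState::kMidRecord;
  }
  return state;
}
```

With payload sizes {100, 40000, 10}, the second record spans two blocks. In this model a captured size of 50 is `kMidFragment`, 32768 is `kMidRecord` (only the first fragment of the second record is present), and 107 is `kClean`. Under the description above, only the `kClean` case would be expected to open under kAbsoluteConsistency.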

andlr linked a pull request May 16, 2024 that will close this issue

Copy current WAL in consistent state during checkpoint #12671

Open
