Skip to content

Commit

Permalink
Do not truncate WAL if in read_only mode (#8313)
Browse files Browse the repository at this point in the history
Summary:
I noticed ```openat``` system call with ```O_WRONLY``` flag and ```sync_file_range``` and ```truncate``` on WAL file when using ```rocksdb::DB::OpenForReadOnly``` by way of ```db_bench --readonly=true --benchmarks=readseq --use_existing_db=1 --num=1 ...```

Noticed in ```strace``` after seeing the last modification time of the WAL file change after each run (with ```--readonly=true```).

  I think introduced by 7d7f144 from #8122

I added a test to catch the WAL file being truncated and the modification time on it changing.
I am not sure if a mock filesystem with mock clock could be used to avoid having to sleep 1.1s.
The test could also check the set of files is the same and that the sizes are also unchanged.

Before:

```
[ RUN      ] DBBasicTest.ReadOnlyReopenMtimeUnchanged
db/db_basic_test.cc:182: Failure
Expected equality of these values:
  file_mtime_after_readonly_reopen
    Which is: 1621611136
  file_mtime_before_readonly_reopen
    Which is: 1621611135
  file is: 000010.log
[  FAILED  ] DBBasicTest.ReadOnlyReopenMtimeUnchanged (1108 ms)
```

After:

```
[ RUN      ] DBBasicTest.ReadOnlyReopenMtimeUnchanged
[       OK ] DBBasicTest.ReadOnlyReopenMtimeUnchanged (1108 ms)
```

Pull Request resolved: #8313

Reviewed By: pdillinger

Differential Revision: D28656925

Pulled By: jay-zhuang

fbshipit-source-id: ea9e215cb53e7c830e76bc5fc75c45e21f12a1d6
  • Loading branch information
thatsafunnyname authored and facebook-github-bot committed May 27, 2021
1 parent dfa6b40 commit c75ef03
Show file tree
Hide file tree
Showing 2 changed files with 57 additions and 1 deletion.
3 changes: 2 additions & 1 deletion db/db_impl/db_impl_open.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1259,7 +1259,8 @@ Status DBImpl::RecoverLogFiles(const std::vector<uint64_t>& wal_numbers,
// If there's no data in the WAL, or we flushed all the data, still
// truncate the log file. If the process goes into a crash loop before
// the file is deleted, the preallocated space will never get freed.
GetLogSizeAndMaybeTruncate(wal_numbers.back(), true, nullptr)
const bool truncate = !read_only;
GetLogSizeAndMaybeTruncate(wal_numbers.back(), truncate, nullptr)
.PermitUncheckedError();
}
}
Expand Down
55 changes: 55 additions & 0 deletions db/db_wal_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -2001,6 +2001,61 @@ TEST_F(DBWALTest, TruncateLastLogAfterRecoverWALEmpty) {
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->ClearAllCallBacks();
}

TEST_F(DBWALTest, ReadOnlyRecoveryNoTruncate) {
constexpr size_t kKB = 1024;
Options options = CurrentOptions();
options.env = env_;
options.avoid_flush_during_recovery = true;
if (mem_env_) {
ROCKSDB_GTEST_SKIP("Test requires non-mem environment");
return;
}
if (!IsFallocateSupported()) {
return;
}

// create DB and close with file truncate disabled
std::atomic_bool enable_truncate{false};

SyncPoint::GetInstance()->SetCallBack(
"PosixWritableFile::Close", [&](void* arg) {
if (!enable_truncate) {
*(reinterpret_cast<size_t*>(arg)) = 0;
}
});
SyncPoint::GetInstance()->EnableProcessing();

DestroyAndReopen(options);
size_t preallocated_size =
dbfull()->TEST_GetWalPreallocateBlockSize(options.write_buffer_size);
ASSERT_OK(Put("foo", "v1"));
VectorLogPtr log_files_before;
ASSERT_OK(dbfull()->GetSortedWalFiles(log_files_before));
ASSERT_EQ(1, log_files_before.size());
auto& file_before = log_files_before[0];
ASSERT_LT(file_before->SizeFileBytes(), 1 * kKB);
// The log file has preallocated space.
auto db_size = GetAllocatedFileSize(dbname_ + file_before->PathName());
ASSERT_GE(db_size, preallocated_size);
Close();

// enable truncate and open DB as readonly, the file should not be truncated
// and DB size is not changed.
enable_truncate = true;
ASSERT_OK(ReadOnlyReopen(options));
VectorLogPtr log_files_after;
ASSERT_OK(dbfull()->GetSortedWalFiles(log_files_after));
ASSERT_EQ(1, log_files_after.size());
ASSERT_LT(log_files_after[0]->SizeFileBytes(), 1 * kKB);
ASSERT_EQ(log_files_after[0]->PathName(), file_before->PathName());
// The preallocated space should NOT be truncated.
// the DB size is almost the same.
ASSERT_NEAR(GetAllocatedFileSize(dbname_ + file_before->PathName()), db_size,
db_size / 100);
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
}
#endif // ROCKSDB_FALLOCATE_PRESENT
#endif // ROCKSDB_PLATFORM_POSIX

Expand Down

0 comments on commit c75ef03

Please sign in to comment.