Do not truncate WAL if in read_only mode #8313
I noticed an `openat` system call with the `O_WRONLY` flag, along with `sync_file_range` and `truncate`, on the WAL file when using `rocksdb::DB::OpenForReadOnly` by way of `db_bench --readonly=true --benchmarks=readseq --use_existing_db=1 --num=1 ...`. I think this was introduced by facebook@7d7f144
Summary: Currently, we only truncate the latest alive WAL files when the DB is opened. If the latest WAL file is empty or was flushed during Open, it's not truncated since the file will be deleted later on in the Open path. However, before deletion, a new WAL file is created, and if the process crash loops between the new WAL file creation and deletion of the old WAL file, the preallocated space will keep accumulating and eventually use up all disk space. To prevent this, always truncate the latest WAL file, even if it's empty or the data was flushed.

Tests: Add unit tests to db_wal_test

Pull Request resolved: #8122
Reviewed By: riversand963
Differential Revision: D27366132
Pulled By: anand1976
fbshipit-source-id: f923cc03ef033ccb32b140d36c6a63a8152f0e8e
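To make the interaction concrete, here is a minimal sketch of "truncate the latest WAL on open, unless the DB is read-only". It is not RocksDB code: `MaybeTruncateLog` and `MakePreallocatedLog` are hypothetical helpers, and preallocation is modelled simply as extra file length, which is a simplification of what a real filesystem preallocation does.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not a RocksDB API): shrink a log file to the number of
// bytes actually written, unless the caller opened the DB read-only.
// Returns true if the file was modified.
bool MaybeTruncateLog(const fs::path& log, std::uintmax_t bytes_written,
                      bool read_only) {
  // A read-only open must not write to, truncate, or sync any file.
  if (read_only) return false;
  const std::uintmax_t current_size = fs::file_size(log);
  assert(bytes_written <= current_size);
  if (current_size == bytes_written) return false;  // nothing to reclaim
  fs::resize_file(log, bytes_written);
  return true;
}

// Hypothetical helper: create an empty file in the temp directory and extend
// it to `preallocated` bytes to stand in for a preallocated WAL.
fs::path MakePreallocatedLog(const std::string& name,
                             std::uintmax_t preallocated) {
  const fs::path p = fs::temp_directory_path() / name;
  { std::ofstream out(p, std::ios::binary | std::ios::trunc); }
  fs::resize_file(p, preallocated);
  return p;
}
```

With the guard in place, a call with `read_only == true` leaves the file size (and therefore its modification time) alone, while a normal open still reclaims the unused tail.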
Thanks @thatsafunnyname for the contribution. Would you please add a unit test for this?
This catches the WAL file being truncated and the modification time on it changing.
I am not sure if a mock filesystem with a mock clock could be used to avoid having to sleep 1.1s.
The test could also check that the set of files is the same and that the sizes are also unchanged.

Before:

    [ RUN ] DBBasicTest.ReadOnlyReopenMtimeUnchanged
    db/db_basic_test.cc:182: Failure
    Expected equality of these values:
      file_mtime_after_readonly_reopen
        Which is: 1621611136
      file_mtime_before_readonly_reopen
        Which is: 1621611135
    file is: 000010.log
    [ FAILED ] DBBasicTest.ReadOnlyReopenMtimeUnchanged (1108 ms)

After:

    [ RUN ] DBBasicTest.ReadOnlyReopenMtimeUnchanged
    [ OK ] DBBasicTest.ReadOnlyReopenMtimeUnchanged (1108 ms)
The ci/circleci: build-linux-clang10-ubsan tests failed with "No space left on device".
No worries, it's a known issue, not related to this PR.
I updated the test to be more specific about WAL file truncation.
@jay-zhuang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@thatsafunnyname has updated the pull request. You must reimport the pull request before landing.
LGTM
@jay-zhuang merged this pull request in c75ef03.
I noticed `openat` system call with `O_WRONLY` flag and `sync_file_range` and `truncate` on WAL file when using `rocksdb::DB::OpenForReadOnly` by way of `db_bench --readonly=true --benchmarks=readseq --use_existing_db=1 --num=1 ...`

Noticed in `strace` after seeing the last modification time of the WAL file change after each run (with `--readonly=true`).

I think introduced by 7d7f144 from #8122
I added a test to catch the WAL file being truncated and the modification time on it changing.
I am not sure if a mock filesystem with mock clock could be used to avoid having to sleep 1.1s.
The test could also check the set of files is the same and that the sizes are also unchanged.