Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not truncate WAL if in read_only mode #8313

Closed
wants to merge 13 commits into from

Conversation

thatsafunnyname
Copy link
Contributor

@thatsafunnyname thatsafunnyname commented May 18, 2021

I noticed openat system call with O_WRONLY flag and sync_file_range and truncate on WAL file when using rocksdb::DB::OpenForReadOnly by way of db_bench --readonly=true --benchmarks=readseq --use_existing_db=1 --num=1 ...

Noticed in strace after seeing the last modification time of the WAL file change after each run (with --readonly=true).

I think introduced by 7d7f144 from #8122

I added a test to catch the WAL file being truncated and the modification time on it changing.
I am not sure if a mock filesystem with mock clock could be used to avoid having to sleep 1.1s.
The test could also check the set of files is the same and that the sizes are also unchanged.

Before:

[ RUN      ] DBBasicTest.ReadOnlyReopenMtimeUnchanged
db/db_basic_test.cc:182: Failure
Expected equality of these values:
  file_mtime_after_readonly_reopen
    Which is: 1621611136
  file_mtime_before_readonly_reopen
    Which is: 1621611135
  file is: 000010.log
[  FAILED  ] DBBasicTest.ReadOnlyReopenMtimeUnchanged (1108 ms)

After:

[ RUN      ] DBBasicTest.ReadOnlyReopenMtimeUnchanged
[       OK ] DBBasicTest.ReadOnlyReopenMtimeUnchanged (1108 ms)

Noticed ```openat``` system call with ```O_WRONLY``` flag and ```sync_file_range``` and ```truncate``` on WAL file when using ```rocksdb::DB::OpenForReadOnly``` by way of ```db_bench --readonly=true --benchmarks=readseq --use_existing_db=1 --num=1 ...```
  I think introduced by facebook@7d7f144
thatsafunnyname referenced this pull request May 18, 2021
Summary:
Currently, we only truncate the latest alive WAL files when the DB is opened. If the latest WAL file is empty or was flushed during Open, its not truncated since the file will be deleted later on in the Open path. However, before deletion, a new WAL file is created, and if the process crash loops between the new WAL file creation and deletion of the old WAL file, the preallocated space will keep accumulating and eventually use up all disk space. To prevent this, always truncate the latest WAL file, even if its empty or the data was flushed.

Tests:
Add unit tests to db_wal_test

Pull Request resolved: #8122

Reviewed By: riversand963

Differential Revision: D27366132

Pulled By: anand1976

fbshipit-source-id: f923cc03ef033ccb32b140d36c6a63a8152f0e8e
@jay-zhuang
Copy link
Contributor

Thanks @thatsafunnyname for the contribution. Would you please add an unittest for this?

This catches the WAL file being truncated and the modification time on it changing.
I am not sure if a mock filesystem with mock clock could be used to avoid having to sleep 1.1s.
The test could also check the set of files is the same and that the sizes are also unchanged.

Before:
 
[ RUN      ] DBBasicTest.ReadOnlyReopenMtimeUnchanged
db/db_basic_test.cc:182: Failure
Expected equality of these values:
  file_mtime_after_readonly_reopen
    Which is: 1621611136
  file_mtime_before_readonly_reopen
    Which is: 1621611135
  file is: 000010.log
[  FAILED  ] DBBasicTest.ReadOnlyReopenMtimeUnchanged (1108 ms)


After:
 
[ RUN      ] DBBasicTest.ReadOnlyReopenMtimeUnchanged
[       OK ] DBBasicTest.ReadOnlyReopenMtimeUnchanged (1108 ms)
@thatsafunnyname
Copy link
Contributor Author

The ci/circleci: build-linux-clang10-ubsan tests failed with "No space left on device".
I will trigger another run with a whitespace change.

@jay-zhuang
Copy link
Contributor

The ci/circleci: build-linux-clang10-ubsan tests failed with "No space left on device".
I will trigger another run with a whitespace change.

No worry, it's a known issue, not related to this PR.

@jay-zhuang
Copy link
Contributor

I updated test to be more specify about WAL file truncate.

@facebook-github-bot
Copy link
Contributor

@jay-zhuang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@thatsafunnyname has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jay-zhuang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@jay-zhuang jay-zhuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@facebook-github-bot
Copy link
Contributor

@thatsafunnyname has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@jay-zhuang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jay-zhuang merged this pull request in c75ef03.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants