Compaction with timestamp: input boundaries #6645
c605b86 to c838bcc
@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
c838bcc to 924c370
@riversand963 has updated the pull request. Re-import the pull request
cc @hliu18
35562fc to 56be440
Thanks for the patch @riversand963! Looks good, just some minor comments (please see below). Also, one general question: I suppose we'll need changes in other places, e.g. `CompactionPicker`, `CompactionIterator`, `FileIndexer`, etc.; will those be handled in a separate PR?
}
ASSERT_OK(Flush());
// Wait for compaction to finish
ASSERT_OK(dbfull()->TEST_WaitForCompact());
So we're relying on automatic compactions here, right? I'm wondering if it would be better to call `CompactRange` here for the range consisting only of key 99.
I think they will go through the same code path in `LevelCompactionBuilder`, and thus should behave the same.
Summary: During compaction, timestamp should also be taken into account when computing overlapping ranges. If not, point lookup and range scan will return incorrect results.
Test Plan: make check
56be440 to 7b0155e
Thanks @ltamasi for the detailed review.
LGTM, thanks @riversand963 !
@riversand963 merged this pull request in 0c05624.
Summary: The subcompaction boundary picking logic does not currently guarantee that all user keys that differ only by timestamp get processed by the same subcompaction. This can cause issues with the `CompactionIterator` state machine: for instance, one subcompaction that processes a subset of such KVs might drop a tombstone based on the KVs it sees, while in reality the tombstone might not have been eligible to be optimized out. (See also #6645, which adjusted the way compaction inputs are picked for the same reason.)
Pull Request resolved: #8393
Test Plan: Ran `make check` and the crash test script with timestamps enabled.
Reviewed By: jay-zhuang
Differential Revision: D29071635
Pulled By: ltamasi
fbshipit-source-id: f6c72442122b4e581871e096fabe3876a9e8a5a6
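To make the subcompaction concern above concrete, here is a minimal standalone sketch, not the actual RocksDB change. It assumes the timestamp is a fixed-size suffix of the user key (`kTsSize` bytes), and the names `StripTimestamp`, `NormalizeBoundaries`, and `SubcompactionIndex` are hypothetical, introduced only for illustration. The idea is that once boundaries are timestamp-free and deduplicated, all versions of one user key fall into the same subcompaction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption for this sketch: every user key ends with a fixed-size timestamp.
constexpr std::size_t kTsSize = 8;

// Hypothetical helper: returns the user key with its timestamp suffix removed.
std::string StripTimestamp(const std::string& key_with_ts) {
  assert(key_with_ts.size() >= kTsSize);
  return key_with_ts.substr(0, key_with_ts.size() - kTsSize);
}

// Converts candidate boundaries (timestamps included) into sorted,
// deduplicated, timestamp-free boundaries, so that no two boundaries
// can fall between versions of the same user key.
std::vector<std::string> NormalizeBoundaries(
    const std::vector<std::string>& candidates) {
  std::vector<std::string> out;
  out.reserve(candidates.size());
  for (const auto& c : candidates) {
    out.push_back(StripTimestamp(c));
  }
  std::sort(out.begin(), out.end());
  out.erase(std::unique(out.begin(), out.end()), out.end());
  return out;
}

// Returns the index of the subcompaction that owns a key: the number of
// boundaries that are <= the timestamp-free user key.
std::size_t SubcompactionIndex(const std::vector<std::string>& boundaries,
                               const std::string& key_with_ts) {
  const std::string user_key = StripTimestamp(key_with_ts);
  return static_cast<std::size_t>(
      std::upper_bound(boundaries.begin(), boundaries.end(), user_key) -
      boundaries.begin());
}
```

Because the lookup ignores the timestamp, two entries for the same user key with different timestamps always map to the same index, which is the property the summary asks for.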
Towards making compaction logic compatible with user timestamp.
When computing boundaries and overlapping ranges for compaction inputs, we need to compare SSTs by user key without timestamp.
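As a rough illustration of that comparison (a sketch under stated assumptions, not the code in this PR): the example below assumes a fixed-size timestamp suffix of `kTsSize` bytes on every user key and plain bytewise ordering of the remaining prefix. `FileRange`, `StripTimestamp`, and `OverlapsIgnoringTimestamp` are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Assumption for this sketch: every user key ends with a fixed-size timestamp.
constexpr std::size_t kTsSize = 8;

// Hypothetical helper: returns the user key with its timestamp suffix removed.
std::string_view StripTimestamp(std::string_view key_with_ts) {
  assert(key_with_ts.size() >= kTsSize);
  return key_with_ts.substr(0, key_with_ts.size() - kTsSize);
}

// Hypothetical stand-in for an SST file's user-key range (both ends inclusive).
struct FileRange {
  std::string smallest;  // smallest user key in the file, timestamp included
  std::string largest;   // largest user key in the file, timestamp included
};

// Two files overlap if their ranges intersect once timestamps are ignored.
bool OverlapsIgnoringTimestamp(const FileRange& a, const FileRange& b) {
  return StripTimestamp(a.smallest) <= StripTimestamp(b.largest) &&
         StripTimestamp(b.smallest) <= StripTimestamp(a.largest);
}
```

With this check, a file ending at user key `k` and another starting at user key `k` count as overlapping whatever their timestamps are, so both would be pulled into the same compaction.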
Test plan (devserver):
Several individual tests: