[WritePreparedTxn] CompactionIterator sees consistent view of which keys are committed #9830
I recall discussing this a while back; didn't realize it was this easy :)
I should have been more specific about the PR as "CompactionIterator sees consistent view of what's committed".
Thanks for the PR @riversand963 !
Thanks @ltamasi for the review!
This PR does not affect the functionality of `DB` and write-committed transactions.

`CompactionIterator` uses `KeyCommitted(seq)` to determine if a key in the database is committed. As the name 'write-committed' implies, if write-committed policy is used, a key exists in the database only if it is committed. In fact, the implementation of `KeyCommitted()` is as follows:

With that being said, we focus on write-prepared/write-unprepared transactions.

A few notes:
- We rely on `snapshot_checker_` to determine data visibility. We also require that all writes go through the transaction API instead of the raw `WriteBatch` + `Write`, thus at most one uncommitted version of one user key can exist in the database.
- `CompactionIterator` outputs a key as long as the key is uncommitted.

Due to the above reasons, it is possible that `CompactionIterator` decides to output an uncommitted key without doing further checks on the key (`NextFromInput()`). By the time the key is being prepared for output, the key becomes committed because `snapshot_checker_(seq, kMaxSequence)` becomes true in the implementation of `KeyCommitted()`. Then `CompactionIterator` will try to zero its sequence number and hit an assertion error if the key is a tombstone.

To fix this issue, we should make the `CompactionIterator` see a consistent view of the input keys. Note that for write-prepared/write-unprepared, the background flush/compaction jobs already take a "job snapshot" before starting to process keys. The job snapshot is released only after the entire flush/compaction finishes. We can use this snapshot to determine whether a key is committed or not with a minor change to `KeyCommitted()`. As a result, whether a key is committed or not will remain constant throughout compaction, causing no trouble for `CompactionIterator`'s assertions.

Test plan:
make check
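
For readers who want to see the race in isolation, the following is a minimal, self-contained sketch of the idea described above. It is not the RocksDB implementation. `ToySnapshotChecker`, `KeyCommittedLatest`, `KeyCommittedAtJobSnapshot`, and `SimulateMidJobCommit` are hypothetical names, and the checker is modeled as a plain map from a key's sequence number to its commit sequence number, which is a simplifying assumption. The real snapshot checker for write-prepared transactions is considerably more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>

using SequenceNumber = uint64_t;
constexpr SequenceNumber kMaxSequence =
    std::numeric_limits<SequenceNumber>::max();

// Hypothetical stand-in for a snapshot checker (assumption: a simple map from
// a key's sequence number to the sequence number at which it was committed).
class ToySnapshotChecker {
 public:
  void Commit(SequenceNumber key_seq, SequenceNumber commit_seq) {
    commit_map_[key_seq] = commit_seq;
  }

  // True if the write at `key_seq` was committed at or before `snapshot`.
  bool InSnapshot(SequenceNumber key_seq, SequenceNumber snapshot) const {
    auto it = commit_map_.find(key_seq);
    return it != commit_map_.end() && it->second <= snapshot;
  }

 private:
  std::map<SequenceNumber, SequenceNumber> commit_map_;
};

// Check against the latest state: the answer can change while a job runs.
bool KeyCommittedLatest(const ToySnapshotChecker& checker,
                        SequenceNumber key_seq) {
  return checker.InSnapshot(key_seq, kMaxSequence);
}

// Check against a snapshot taken before the job started: the answer for a
// given key stays fixed for the lifetime of the job.
bool KeyCommittedAtJobSnapshot(const ToySnapshotChecker& checker,
                               SequenceNumber key_seq,
                               SequenceNumber job_snapshot) {
  return checker.InSnapshot(key_seq, job_snapshot);
}

struct Observed {
  bool latest_before, latest_after;  // view against kMaxSequence
  bool job_before, job_after;        // view against the job snapshot
};

// Simulates a key that is uncommitted when the job first looks at it and
// gets committed before the job prepares it for output.
Observed SimulateMidJobCommit() {
  ToySnapshotChecker checker;
  const SequenceNumber job_snapshot = 10;  // taken before processing keys
  const SequenceNumber key_seq = 5;        // prepared, not yet committed

  Observed o{};
  o.latest_before = KeyCommittedLatest(checker, key_seq);
  o.job_before = KeyCommittedAtJobSnapshot(checker, key_seq, job_snapshot);

  checker.Commit(key_seq, /*commit_seq=*/12);  // commit lands mid-job

  o.latest_after = KeyCommittedLatest(checker, key_seq);
  o.job_after = KeyCommittedAtJobSnapshot(checker, key_seq, job_snapshot);
  return o;
}
```

In this sketch, the check against `kMaxSequence` changes its answer once the commit lands, while the check against the job snapshot gives the same answer before and after, which is the property the assertions in `CompactionIterator` depend on.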