-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Blob DB] Add ValueType::kTypeBlobIndex #2886
Conversation
@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! In general, it looks good to me. I look forward to more unit tests to verify it. Also, run basic db_bench tests against ramfs and make sure there isn't regression. Maybe @sagar0 should take another look and I'll defer to him to accept.
db/compaction_iterator.cc
Outdated
@@ -25,6 +25,8 @@ CompactionEventListener::CompactionListenerValueType fromInternalValueType( | |||
kSingleDelete; | |||
case kTypeRangeDeletion: | |||
return CompactionEventListener::CompactionListenerValueType::kRangeDelete; | |||
case kTypeBlobValue: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just call it kTypeBlobIndex.
db/db_iter.cc
Outdated
"Encounter unexpected blob value. Please open DB with " | ||
"rocksdb::blob_db::BlobDB instead."); | ||
valid_ = false; | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't have allow_blob
flag, what will happen? If it's not too bad, maybe we should avoid this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the question is what to do if we see a blob value during scan. Here I set the iterator to invalid and return error status. What to you suggest to handle the case?
@yiwu-arbug has updated the pull request. View: changes, changes since last import |
@yiwu-arbug has updated the pull request. View: changes, changes since last import |
@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
23b53e5
to
753da23
Compare
@yiwu-arbug has updated the pull request. |
@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@yiwu-arbug has updated the pull request. View: changes, changes since last import |
@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@yiwu-arbug has updated the pull request. View: changes, changes since last import |
@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@yiwu-arbug has updated the pull request. View: changes, changes since last import |
@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work. looks good to me.
minor nits, but not blockers.
table/get_context.cc
Outdated
state_ = kBlobIndex; | ||
return false; | ||
} | ||
// intential fallthrough |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/intentional
table/get_context.cc
Outdated
state_ = kBlobIndex; | ||
return false; | ||
} | ||
// intential fallthrough |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be a source of potential bugs in the future, as the code is split between the two cases. Why not move the code between lines 117 -121 to be also under case kTypeValue, and keep case kTypeBlobIndex empty (of course with the comment "intentional fall through").
@yiwu-arbug has updated the pull request. View: changes, changes since last import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yiwu-arbug is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yiwu-arbug is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Add kTypeBlobIndex value type, which will be used by blob db only, to insert a (key, blob_offset) KV pair. The purpose is to 1. Make it possible to open existing rocksdb instance as blob db. Existing value will be of kTypeIndex type, while value inserted by blob db will be of kTypeBlobIndex. 2. Make rocksdb able to detect if the db contains value written by blob db, if so return error. 3. Make it possible to have blob db optionally store value in SST file (with kTypeValue type) or as a blob value (with kTypeBlobIndex type). The root db (DBImpl) basically pretended kTypeBlobIndex are normal value on write. On Get if is_blob is provided, return whether the value read is of kTypeBlobIndex type, or return Status::NotSupported() status if is_blob is not provided. On scan allow_blob flag is pass and if the flag is true, return wether the value is of kTypeBlobIndex type via iter->IsBlob(). Changes on blob db side will be in a separate patch. Closes #2886 Differential Revision: D5838431 Pulled By: yiwu-arbug fbshipit-source-id: 3c5306c62bc13bb11abc03422ec5cbcea1203cca
* Fix Get does not return super version on error Summary: This is caught when I was testing facebook#2886. Closes facebook#2907 Differential Revision: D5863153 Pulled By: yiwu-arbug fbshipit-source-id: 8c54759ba1a0dc101f24ab50423e35731300612d * fix data race Summary: Fix a TSAN failure in `DBRangeDelTest.ValidLevelSubcompactionBoundaries`: https://gist.github.com/miasantreble/712e04b4de2ff7f193c98b1acf07e899 Closes facebook#3691 Differential Revision: D7541400 Pulled By: miasantreble fbshipit-source-id: b0b4538980bce7febd0385e61d6e046580bcaefb * check return status for Sync() and Append() calls to avoid corruption Summary: Right now in `SyncClosedLogs`, `CopyFile`, and `AddRecord`, where `Sync` and `Append` are invoked in a loop, the error status are not checked. This could lead to potential corruption as later calls will overwrite the error status. Closes facebook#3740 Differential Revision: D7678848 Pulled By: miasantreble fbshipit-source-id: 4b0b412975989dfe80348f73217b9c4122a4bd77 * Fix race condition between log_.erase and log_.back Summary: log_ contract specifies that it should not be modified unless both mutex_ and log_write_mutex_ are held. log_.erase however does that with only holding mutex_. This causes a race condition with two_write_queues since logs_.back is read with holding only log_write_mutex_ (which is correct according to logs_ contract) but logs_.erase is called concurrently. This is probably the cause of logs_.back returning nullptr in facebook#3852 although I could not reproduce it. Fixes facebook#3852 Closes facebook#3859 Differential Revision: D8026103 Pulled By: maysamyabandeh fbshipit-source-id: ee394e00fe4aa520d884c5ef87981e9d6b5ccb28 * Sync CURRENT file during checkpoint (facebook#4322) Summary: For the CURRENT file forged during checkpoint, we were forgetting to `fsync` or `fdatasync` it after its creation. This PR fixes it. Differential Revision: D9525939 Pulled By: ajkr fbshipit-source-id: a505483644026ee3f501cfc0dcbe74832165b2e3
…race (#6973) Summary: The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong. Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version. Pull Request resolved: #6973 Test Plan: Will run stress tests for a while to make sure there is no general regression. Reviewed By: ajkr Differential Revision: D22029348 fbshipit-source-id: 94390f93630906796d6e2fec321f44a920953fd1
…race (#6973) Summary: The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong. Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version. Pull Request resolved: #6973 Test Plan: Will run stress tests for a while to make sure there is no general regression. Reviewed By: ajkr Differential Revision: D22029348 fbshipit-source-id: 94390f93630906796d6e2fec321f44a920953fd1
…race (#6973) Summary: The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong. Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version. Pull Request resolved: #6973 Test Plan: Will run stress tests for a while to make sure there is no general regression. Reviewed By: ajkr Differential Revision: D22029348 fbshipit-source-id: 94390f93630906796d6e2fec321f44a920953fd1
Summary:
Add kTypeBlobIndex value type, which will be used by blob db only, to insert a (key, blob_offset) KV pair. The purpose is to
The root db (DBImpl) basically pretended kTypeBlobIndex are normal value on write. On Get if is_blob is provided, return whether the value read is of kTypeBlobIndex type, or return Status::NotSupported() status if is_blob is not provided. On scan allow_blob flag is pass and if the flag is true, return wether the value is of kTypeBlobIndex type via iter->IsBlob().
Changes on blob db side will be in a separate patch.