Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API for writing wide-column entities #10242

Closed
wants to merge 54 commits into from

Conversation

ltamasi
Copy link
Contributor

@ltamasi ltamasi commented Jun 23, 2022

Summary:
The patch builds on #9915 and adds
a new API called PutEntity that can be used to write a wide-column entity
to the database. The new API is added to both DB and WriteBatch. Note
that currently there is no way to retrieve these entities; more precisely, all
read APIs (Get, MultiGet, and iterator) return NotSupported when they
encounter a wide-column entity that is required to answer a query. Read-side
support (as well as other missing functionality like Merge, compaction filter,
and timestamp support) will be added in later PRs.

Test Plan:
make check

…y when reading from write batch; add KV protection logic
@facebook-github-bot
Copy link
Contributor

@ltamasi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ltamasi ltamasi changed the title Add API for writing a wide-column entities Add API for writing wide-column entities Jun 23, 2022
@facebook-github-bot
Copy link
Contributor

@ltamasi has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ltamasi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@@ -519,7 +520,8 @@ void CompactionIterator::NextFromInput() {
// In the previous iteration we encountered a single delete that we could
// not compact out. We will keep this Put, but can drop it's data.
// (See Optimization 3, below.)
if (ikey_.type != kTypeValue && ikey_.type != kTypeBlobIndex) {
if (ikey_.type != kTypeValue && ikey_.type != kTypeBlobIndex &&
ikey_.type != kTypeWideColumnEntity) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means, if a logical entity spans across n column families, then the entity can go through the compaction filter n times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line is part of the SingleDelete processing logic (not compaction filter), right?

Otherwise, yes, when an entity is split across column families, the compaction filter would get called for each part separately. (It might actually be a different compaction filter for each, since compaction filter is a CF option.)

@@ -533,7 +535,7 @@ void CompactionIterator::NextFromInput() {
assert(false);
}

if (ikey_.type == kTypeBlobIndex) {
if (ikey_.type == kTypeBlobIndex || ikey_.type == kTypeWideColumnEntity) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When should compaction convert a wide column entity into a normal PUT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also related to SingleDelete, namely "optimization 3" where we preserve a SingleDelete + Put combination for the purposes of txn conflict checking but clear out the Put's value to save space.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I misread the code before. Thanks

}

Status WriteBatch::PutEntity(ColumnFamilyHandle* column_family,
const Slice& key, const WideColumns& columns) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is new API, I wonder if we should enforce column_family != nullptr from the beginning. Not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point, we can definitely do that. Thanks!

@@ -113,6 +113,13 @@ class WriteBatchWithIndex : public WriteBatchBase {
Status Put(ColumnFamilyHandle* column_family, const Slice& key,
const Slice& ts, const Slice& value) override;

Status PutEntity(ColumnFamilyHandle* /* column_family */,
const Slice& /* key */,
const WideColumns& /* columns */) override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can at least enforce column_family != nullptr.

@facebook-github-bot
Copy link
Contributor

@ltamasi has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ltamasi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@riversand963 riversand963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ltamasi for the PR! Took another pass and left some comments.

@@ -62,6 +63,9 @@ std::pair<WriteBatch, Status> GetWriteBatch(ColumnFamilyHandle* cf_handle,
case WriteBatchOpType::kMerge:
s = wb.Merge(cf_handle, "key", "val");
break;
case WriteBatchOpType::kPutEntity:
s = wb.PutEntity(cf_handle, "key", {{"foo", "bar"}, {"hello", "world"}});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe consider

s = wb.PutEntity(cf_handle, "key", {{"attr_name1", "bar"}, {"attr_name2", "world"}});

const Slice& name = column.name();
if (name.size() >
static_cast<size_t>(std::numeric_limits<uint32_t>::max())) {
return Status::InvalidArgument("Wide column name too long");
}
if (i > 0 && columns[i - 1].name().compare(name) >= 0) {
return Status::Corruption("Wide columns out of order");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we sort the column names in WriteBatchInternal, is it possible that the column names here are out-of-order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be called from other places too but yeah, strictly out of order is not possible during PutEntity. Duplicates are possible though.

@@ -74,6 +74,7 @@ class GetContext {
kCorrupt,
kMerge, // saver contains the current merge result (the operands)
kUnexpectedBlobIndex,
kUnexpectedWideColumnEntity,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be removed in the future if wide column is generally supported in RocksDB? If so, maybe put a TODO.

@@ -533,7 +535,7 @@ void CompactionIterator::NextFromInput() {
assert(false);
}

if (ikey_.type == kTypeBlobIndex) {
if (ikey_.type == kTypeBlobIndex || ikey_.type == kTypeWideColumnEntity) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I misread the code before. Thanks

virtual Status PutEntityCF(uint32_t /* column_family_id */,
const Slice& /* key */,
const Slice& /* entity */) {
return Status::InvalidArgument("PutEntityCF not implemented");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NotSupported or InvalidArgument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change it to NotSupported


class DBWideBasicTest : public DBTestBase {
protected:
DBWideBasicTest()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: explicit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think it makes much of a difference in this context but sure, we can make it explicit

@ltamasi
Copy link
Contributor Author

ltamasi commented Jun 25, 2022

Thanks so much for the review @riversand963 !

@facebook-github-bot
Copy link
Contributor

@ltamasi has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ltamasi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ltamasi has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ltamasi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants