Fail DB::Open if hashing features enabled with incompatible Comparator #10293

Status: Open. Wants to merge 6 commits into base: main.
4 changes: 4 additions & 0 deletions HISTORY.md
@@ -29,6 +29,10 @@
* `rocksdb_level_metadata_t` and its get functions & destroy function.
* `rocksdb_file_metadata_t` and its get functions & destroy functions.
* Add suggest_compact_range() and suggest_compact_range_cf() to C API.
* Minor changes to C API for custom comparators.

### Behavior changes
* DB::Open will now fail if a `Comparator` for which `CanKeysWithDifferentByteContentsBeEqual()` returns true is used with a feature that hashes keys or prefixes, such as Bloom filters or hash indices. RocksDB can return wrong results if these hashing-based features are used with a comparator that can treat different byte strings as equal. Because of this change, some existing custom comparators might need to override `CanKeysWithDifferentByteContentsBeEqual()` to return false where that is correct. (A minimal example of such an override appears after this list.)
* When using block cache strict capacity limit (`LRUCache` with `strict_capacity_limit=true`), DB operations now fail with Status code `kAborted` subcode `kMemoryLimit` (`IsMemoryLimit()`) instead of `kIncomplete` (`IsIncomplete()`) when the capacity limit is reached, because Incomplete can mean other specific things for some operations. In more detail, `Cache::Insert()` now returns the updated Status code and this usually propagates through RocksDB to the user on failure.
* NewClockCache calls temporarily return an LRUCache (with characteristics similar to the desired ClockCache). This is because ClockCache is being replaced by a new version (the old one had unknown bugs), and the replacement is still under development.
* Add two functions `int ReserveThreads(int threads_to_be_reserved)` and `int ReleaseThreads(int threads_to_be_released)` to the `Env` class. In the default implementation, both return 0. Newly added `xxxEnv` classes that inherit `Env` should implement these two functions for thread reservation/releasing features.
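
As a rough sketch of the override mentioned in the first bullet above (illustrative only, not part of this diff; the class name is made up), a custom comparator whose ordering never equates byte-distinct keys could keep Bloom filters and hash indices usable like this:

```cpp
#include <string>

#include "rocksdb/comparator.h"
#include "rocksdb/slice.h"

// Hypothetical comparator: orders keys by length, then bytewise. Two keys
// with different byte contents can never compare equal under this ordering.
class LengthThenBytewiseComparator : public ROCKSDB_NAMESPACE::Comparator {
 public:
  const char* Name() const override { return "LengthThenBytewiseComparator"; }

  int Compare(const ROCKSDB_NAMESPACE::Slice& a,
              const ROCKSDB_NAMESPACE::Slice& b) const override {
    if (a.size() != b.size()) {
      return a.size() < b.size() ? -1 : 1;
    }
    return a.compare(b);
  }

  // Advertise that byte-distinct keys are always unequal, so hashing-based
  // features (Bloom filters, hash indices) remain allowed at DB::Open.
  bool CanKeysWithDifferentByteContentsBeEqual() const override {
    return false;
  }

  // Key-shortening hooks left as no-ops for brevity.
  void FindShortestSeparator(std::string*,
                             const ROCKSDB_NAMESPACE::Slice&) const override {}
  void FindShortSuccessor(std::string*) const override {}
};
```
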
27 changes: 17 additions & 10 deletions db/c.cc
@@ -295,12 +295,13 @@ struct rocksdb_comparator_t : public Comparator {
void (*destructor_)(void*);
int (*compare_)(void*, const char* a, size_t alen, const char* b,
size_t blen);
const char* (*name_)(void*);
int (*compare_ts_)(void*, const char* a_ts, size_t a_tslen, const char* b_ts,
size_t b_tslen);
int (*compare_without_ts_)(void*, const char* a, size_t alen,
unsigned char a_has_ts, const char* b, size_t blen,
unsigned char b_has_ts);
const char* name_;
unsigned char ckwdbcbe_;
Reviewer (Contributor): Can we replace this with the full name?

Author: The full name in snake case is really, really long; it's easy to see what it means with code navigation.


rocksdb_comparator_t() : Comparator() {}

@@ -329,7 +330,11 @@ struct rocksdb_comparator_t : public Comparator {
b.data(), b.size(), b_has_ts);
}

const char* Name() const override { return (*name_)(state_); }
const char* Name() const override { return name_; }

bool CanKeysWithDifferentByteContentsBeEqual() const override {
return ckwdbcbe_ != 0;
}

// No-ops since the C binding does not support key shortening methods.
void FindShortestSeparator(std::string*, const Slice&) const override {}
@@ -4183,20 +4188,19 @@ void rocksdb_compactionfilterfactory_destroy(
}

rocksdb_comparator_t* rocksdb_comparator_create(
void* state,
void (*destructor)(void*),
int (*compare)(
void*,
const char* a, size_t alen,
const char* b, size_t blen),
const char* (*name)(void*)) {
void* state, void (*destructor)(void*),
int (*compare)(void*, const char* a, size_t alen, const char* b,
size_t blen),
const char* name,
unsigned char can_keys_with_different_byte_contents_be_equal) {
rocksdb_comparator_t* result = new rocksdb_comparator_t;
result->state_ = state;
result->destructor_ = destructor;
result->compare_ = compare;
result->name_ = name;
result->compare_ts_ = nullptr;
result->compare_without_ts_ = nullptr;
result->ckwdbcbe_ = can_keys_with_different_byte_contents_be_equal;
return result;
}

@@ -4211,14 +4215,17 @@ rocksdb_comparator_t* rocksdb_comparator_with_ts_create(
int (*compare_without_ts)(void*, const char* a, size_t alen,
unsigned char a_has_ts, const char* b,
size_t blen, unsigned char b_has_ts),
const char* (*name)(void*), size_t timestamp_size) {
const char* name,
unsigned char can_keys_with_different_byte_contents_be_equal,
size_t timestamp_size) {
rocksdb_comparator_t* result = new rocksdb_comparator_t(timestamp_size);
result->state_ = state;
result->destructor_ = destructor;
result->compare_ = compare;
result->compare_ts_ = compare_ts;
result->compare_without_ts_ = compare_without_ts;
result->name_ = name;
result->ckwdbcbe_ = can_keys_with_different_byte_contents_be_equal;
return result;
}

8 changes: 2 additions & 6 deletions db/c_test.c
@@ -225,11 +225,6 @@ static int CmpCompare(void* arg, const char* a, size_t alen,
return r;
}

static const char* CmpName(void* arg) {
(void)arg;
return "foo";
}

// Custom compaction filter
static void CFilterDestroy(void* arg) { (void)arg; }
static const char* CFilterName(void* arg) {
@@ -675,7 +670,8 @@ int main(int argc, char** argv) {
((int) geteuid()));

StartPhase("create_objects");
cmp = rocksdb_comparator_create(NULL, CmpDestroy, CmpCompare, CmpName);
cmp = rocksdb_comparator_create(NULL, CmpDestroy, CmpCompare, "foo",
/*can_keys_...=*/0);
dbpath = rocksdb_dbpath_create(dbpathname, 1024 * 1024);
env = rocksdb_create_default_env();

10 changes: 10 additions & 0 deletions db/column_family.cc
@@ -1428,6 +1428,16 @@ Status ColumnFamilyData::ValidateOptions(
"FIFO compaction only supported with max_open_files = -1.");
}

if (cf_options.comparator->CanKeysWithDifferentByteContentsBeEqual()) {
// Checks for incompatible hashing (see also
// BlockBasedTableFactory::ValidateOptions)
if (cf_options.memtable_prefix_bloom_size_ratio > 0.0) {
return Status::InvalidArgument(
"Memtable Bloom filter not compatible with comparator where "
"different byte contents can be equal");
}
}

return s;
}

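
For orientation (an illustrative sketch, not code from this PR; the path and helper function are hypothetical), the new ValidateOptions check above turns a previously risky configuration into an explicit open-time failure:

```cpp
#include <cassert>

#include "rocksdb/db.h"
#include "rocksdb/options.h"

// `caseless_cmp` stands for any comparator whose
// CanKeysWithDifferentByteContentsBeEqual() returns true (the default),
// e.g. a case-insensitive comparator.
void ExpectOpenFailure(const ROCKSDB_NAMESPACE::Comparator* caseless_cmp) {
  ROCKSDB_NAMESPACE::Options options;
  options.create_if_missing = true;
  options.comparator = caseless_cmp;
  // The memtable Bloom filter hashes keys/prefixes by their raw bytes, which
  // is unsafe when byte-distinct keys may compare equal.
  options.memtable_prefix_bloom_size_ratio = 0.1;

  ROCKSDB_NAMESPACE::DB* db = nullptr;
  ROCKSDB_NAMESPACE::Status s =
      ROCKSDB_NAMESPACE::DB::Open(options, "/tmp/hypothetical_db", &db);
  assert(s.IsInvalidArgument());  // rejected by the check above
  delete db;
}
```
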
9 changes: 9 additions & 0 deletions db/comparator_db_test.cc
@@ -195,6 +195,7 @@ class DoubleComparator : public Comparator {
void FindShortSuccessor(std::string* /*key*/) const override {}
};

// Hash order first, then lexicographic order when hashes match
class HashComparator : public Comparator {
public:
HashComparator() {}
@@ -216,6 +217,10 @@ class HashComparator : public Comparator {
const Slice& /*limit*/) const override {}

void FindShortSuccessor(std::string* /*key*/) const override {}

bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
};

class TwoStrComparator : public Comparator {
@@ -248,6 +253,10 @@ class TwoStrComparator : public Comparator {
const Slice& /*limit*/) const override {}

void FindShortSuccessor(std::string* /*key*/) const override {}

bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
};
} // namespace

8 changes: 8 additions & 0 deletions db/db_bloom_filter_test.cc
@@ -3139,6 +3139,10 @@ class BackwardBytewiseComparator : public Comparator {
const Slice& /*limit*/) const override {}

void FindShortSuccessor(std::string* /*key*/) const override {}

bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
};

const BackwardBytewiseComparator kBackwardBytewiseComparator{};
@@ -3363,6 +3367,10 @@ class WeirdComparator : public Comparator {
const Slice& /*limit*/) const override {}

void FindShortSuccessor(std::string* /*key*/) const override {}

bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
};
const WeirdComparator kWeirdComparator{};

3 changes: 3 additions & 0 deletions db/db_compaction_test.cc
@@ -3248,6 +3248,9 @@ TEST_P(DBCompactionTestWithParam, ForceBottommostLevelCompaction) {
void FindShortSuccessor(std::string* key) const override {
return BytewiseComparator()->FindShortSuccessor(key);
}
bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
} short_key_cmp;
Options options = CurrentOptions();
options.target_file_size_base = 100000000;
54 changes: 48 additions & 6 deletions db/db_test.cc
@@ -14,6 +14,7 @@

#include <algorithm>
#include <set>
#include <string>
#include <thread>
#include <unordered_set>
#include <utility>
@@ -2135,6 +2136,9 @@ TEST_F(DBTest, ComparatorCheck) {
void FindShortSuccessor(std::string* key) const override {
BytewiseComparator()->FindShortSuccessor(key);
}
bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
};
Options new_options, options;
NewComparator cmp;
@@ -2188,7 +2192,41 @@ TEST_F(DBTest, CustomComparator) {
new_options.write_buffer_size = 4096; // Compact more often
new_options.arena_block_size = 4096;
new_options = CurrentOptions(new_options);
DestroyAndReopen(new_options);

bool expect_open_failure = false;
if (new_options.prefix_extractor && new_options.table_factory->IsInstanceOf(
TableFactory::kPlainTableName())) {
expect_open_failure = true;
}
if (new_options.memtable_prefix_bloom_size_ratio > 0.0) {
expect_open_failure = true;
}
const BlockBasedTableOptions* bbto = nullptr;
if (new_options.table_factory->IsInstanceOf(
TableFactory::kBlockBasedTableName())) {
bbto = new_options.table_factory->GetOptions<BlockBasedTableOptions>();
if (bbto->filter_policy) {
expect_open_failure = true;
} else if (bbto->index_type ==
BlockBasedTableOptions::IndexType::kHashSearch) {
expect_open_failure = true;
} else if (bbto->data_block_index_type ==
BlockBasedTableOptions::DataBlockIndexType::
kDataBlockBinaryAndHash) {
expect_open_failure = true;
}
}

Destroy(last_options_);
{
Status s = TryReopen(new_options);
if (expect_open_failure) {
ASSERT_NOK(s);
Reviewer (Contributor): Should we at least assert that the NOK status is InvalidArgument, to prevent a false positive?

Author: Can do that.

continue;
} else {
ASSERT_OK(s);
}
}
CreateAndReopenWithCF({"pikachu"}, new_options);
ASSERT_OK(Put(1, "[10]", "ten"));
ASSERT_OK(Put(1, "[0x14]", "twenty"));
@@ -2202,15 +2240,19 @@ TEST_F(DBTest, CustomComparator) {
Compact(1, "[0]", "[9999]");
}

std::string k;
for (int i = 0; i < 10000; i += 42) {
k = "[" + std::to_string(i) + "]";
ASSERT_OK(Put(1, k, k));
}
for (int run = 0; run < 2; run++) {
for (int i = 0; i < 1000; i++) {
char buf[100];
snprintf(buf, sizeof(buf), "[%d]", i * 10);
ASSERT_OK(Put(1, buf, buf));
for (int i = 0; i < 10000; i += 42) {
k = "[" + std::to_string(i) + "]";
ASSERT_EQ(k, Get(1, k));
}
Compact(1, "[0]", "[1000000]");
}
} while (ChangeCompactOptions());
} while (ChangeOptions());
}

TEST_F(DBTest, DBOpen_Options) {
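
Following up on the review thread above, the suggested tightening of the open-failure assertion might look roughly like the fragment below (a sketch only; the final commits are not shown here):

```cpp
if (expect_open_failure) {
  // Assert the specific failure kind rather than any non-OK status, so an
  // unrelated failure cannot make the test pass for the wrong reason.
  ASSERT_TRUE(s.IsInvalidArgument());
  continue;
} else {
  ASSERT_OK(s);
}
```
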
7 changes: 5 additions & 2 deletions db/db_test_util.cc
@@ -131,7 +131,8 @@ bool DBTestBase::ShouldSkipOptions(int option_config, int skip_mask) {
option_config == kPlainTableCappedPrefix ||
option_config == kPlainTableCappedPrefixNonMmap ||
option_config == kPlainTableAllBytesPrefix ||
option_config == kVectorRep || option_config == kHashLinkList ||
option_config == kVectorRepAndMemtableBloom ||
option_config == kHashLinkList ||
option_config == kUniversalCompaction ||
option_config == kUniversalCompactionMultiLevel ||
option_config == kUniversalSubcompactions ||
@@ -412,10 +413,12 @@ Options DBTestBase::GetOptions(
options.max_sequential_skip_in_iterations = 999999;
set_block_based_table_factory = false;
break;
case kVectorRep:
case kVectorRepAndMemtableBloom:
options.memtable_factory.reset(new VectorRepFactory(100));
options.allow_concurrent_memtable_write = false;
options.unordered_write = false;
options.memtable_prefix_bloom_size_ratio = 0.1;
options.memtable_whole_key_filtering = true;
break;
case kHashLinkList:
options.prefix_extractor.reset(NewFixedPrefixTransform(1));
2 changes: 1 addition & 1 deletion db/db_test_util.h
@@ -1009,7 +1009,7 @@ class DBTestBase : public testing::Test {
kPlainTableCappedPrefix = 4,
kPlainTableCappedPrefixNonMmap = 5,
kPlainTableAllBytesPrefix = 6,
kVectorRep = 7,
kVectorRepAndMemtableBloom = 7,
kHashLinkList = 8,
kMergePut = 9,
kFilter = 10,
4 changes: 4 additions & 0 deletions db/db_with_timestamp_test_util.h
@@ -105,6 +105,10 @@ class DBBasicTestWithTimestampBase : public DBTestBase {
}
return 0;
}

bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
};

std::string Timestamp(uint64_t low, uint64_t high);
4 changes: 4 additions & 0 deletions db/file_indexer_test.cc
@@ -40,6 +40,10 @@ class IntComparator : public Comparator {
const Slice& /*limit*/) const override {}

void FindShortSuccessor(std::string* /*key*/) const override {}

bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
};

class FileIndexerTest : public testing::Test {
4 changes: 4 additions & 0 deletions db/prefix_test.cc
@@ -131,6 +131,10 @@ class TestKeyComparator : public Comparator {
const Slice& /*limit*/) const override {}

void FindShortSuccessor(std::string* /*key*/) const override {}

bool CanKeysWithDifferentByteContentsBeEqual() const override {
return false;
}
};

namespace {
7 changes: 5 additions & 2 deletions include/rocksdb/c.h
@@ -1765,7 +1765,8 @@ extern ROCKSDB_LIBRARY_API rocksdb_comparator_t* rocksdb_comparator_create(
void* state, void (*destructor)(void*),
int (*compare)(void*, const char* a, size_t alen, const char* b,
size_t blen),
const char* (*name)(void*));
const char* name,
unsigned char can_keys_with_different_byte_contents_be_equal);
extern ROCKSDB_LIBRARY_API void rocksdb_comparator_destroy(
rocksdb_comparator_t*);

@@ -1779,7 +1780,9 @@ rocksdb_comparator_with_ts_create(
int (*compare_without_ts)(void*, const char* a, size_t alen,
unsigned char a_has_ts, const char* b,
size_t blen, unsigned char b_has_ts),
const char* (*name)(void*), size_t timestamp_size);
const char* name,
unsigned char can_keys_with_different_byte_contents_be_equal,
size_t timestamp_size);

/* Filter policy */

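
To show the updated C-API registration above in use (a sketch; the callback, name, and wiring are made up, though `rocksdb_options_set_comparator` is the existing hook for attaching a comparator to options):

```cpp
#include <cstring>

#include "rocksdb/c.h"

// Plain length-breaking memcmp compare callback for the C API.
static int MyCompare(void* /*state*/, const char* a, size_t alen,
                     const char* b, size_t blen) {
  size_t n = alen < blen ? alen : blen;
  int r = std::memcmp(a, b, n);
  if (r == 0 && alen != blen) {
    r = alen < blen ? -1 : +1;
  }
  return r;
}

static void MyDestroy(void* /*state*/) {}

void AttachComparator(rocksdb_options_t* options) {
  // The name is now passed as a plain string, and the trailing 0 states that
  // keys with different byte contents are never equal, keeping Bloom filters
  // and hash indices usable with this comparator.
  rocksdb_comparator_t* cmp = rocksdb_comparator_create(
      nullptr, MyDestroy, MyCompare, "my_bytewise_cmp",
      /*can_keys_with_different_byte_contents_be_equal=*/0);
  rocksdb_options_set_comparator(options, cmp);
  // `cmp` must outlive the DB; destroy it with rocksdb_comparator_destroy()
  // after the DB is closed.
}
```
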
13 changes: 7 additions & 6 deletions include/rocksdb/comparator.h
@@ -108,10 +108,11 @@ class Comparator : public Customizable, public CompareInterface {
return false;
}

// return true if two keys with different byte sequences can be regarded
// as equal by this comparator.
// The major use case is to determine if DataBlockHashIndex is compatible
// with the customized comparator.
// Return true if two keys with different byte sequences might be regarded
// as equal by this comparator (safe). Return false only if it is known that
// different byte sequences must be non-equal.
// This is used to detect whether hashing features such as hash indexes and
// Bloom filters are incompatible with the Comparator.
virtual bool CanKeysWithDifferentByteContentsBeEqual() const { return true; }

// if it is a wrapped comparator, may return the root one.
@@ -157,8 +158,8 @@ class Comparator : public Customizable, public CompareInterface {
// must not be deleted.
extern const Comparator* BytewiseComparator();

// Return a builtin comparator that uses reverse lexicographic byte-wise
// ordering.
// Return a builtin comparator that uses the reverse order of
// BytewiseComparator (still left-to-right lexicographic)
extern const Comparator* ReverseBytewiseComparator();

} // namespace ROCKSDB_NAMESPACE
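
As a counterpoint to the test comparators in this PR (illustrative only; the class is hypothetical), a comparator that genuinely equates byte-distinct keys must keep the default answer of true, which is exactly the case the new DB::Open check guards against:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

#include "rocksdb/comparator.h"
#include "rocksdb/slice.h"

// Hypothetical case-insensitive comparator: "abc" and "ABC" compare equal
// although their bytes differ, so hashing the raw bytes (Bloom filters,
// hash indices) could place equal keys in different buckets.
class CaseInsensitiveComparator : public ROCKSDB_NAMESPACE::Comparator {
 public:
  const char* Name() const override { return "CaseInsensitiveComparator"; }

  int Compare(const ROCKSDB_NAMESPACE::Slice& a,
              const ROCKSDB_NAMESPACE::Slice& b) const override {
    size_t n = std::min(a.size(), b.size());
    for (size_t i = 0; i < n; ++i) {
      int ca = std::tolower(static_cast<unsigned char>(a[i]));
      int cb = std::tolower(static_cast<unsigned char>(b[i]));
      if (ca != cb) {
        return ca < cb ? -1 : 1;
      }
    }
    if (a.size() == b.size()) {
      return 0;
    }
    return a.size() < b.size() ? -1 : 1;
  }

  // Deliberately no override of CanKeysWithDifferentByteContentsBeEqual():
  // the base-class default (true) is the only safe answer here, and DB::Open
  // will now refuse to pair this comparator with hashing-based features.

  void FindShortestSeparator(std::string*,
                             const ROCKSDB_NAMESPACE::Slice&) const override {}
  void FindShortSuccessor(std::string*) const override {}
};
```
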
1 change: 1 addition & 0 deletions include/rocksdb/filter_policy.h
@@ -34,6 +34,7 @@

namespace ROCKSDB_NAMESPACE {

class Comparator;
class Slice;
struct BlockBasedTableOptions;
struct ConfigOptions;
Expand Down
Loading