-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataBlockHashIndex: Standalone Implementation with Unit Test #4139
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgwu has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
util/coding.h
Outdated
@@ -32,6 +32,7 @@ namespace rocksdb { | |||
const unsigned int kMaxVarint64Length = 10; | |||
|
|||
// Standard Put... routines append to a string | |||
extern void PutFixed16(std::string* dst, uint16_t value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes here in coding.h might be a good candidate to split into a separate PR, with their own unit test in coding_test.cc, instead of clubbing them with DataBlockHashIndex discussion and implementation.
CMakeLists.txt
Outdated
@@ -921,6 +922,7 @@ if(WITH_TESTS) | |||
table/cleanable_test.cc | |||
table/cuckoo_table_builder_test.cc | |||
table/cuckoo_table_reader_test.cc | |||
table/data_block_hash_index.cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be data_block_hash_index_test.cc ... and that's why the windows build failed.
Summary: The first step of the DataBlockHashIndex implementation. A string based hash table is implemented and unit-tested. DataBlockHashIndexBuilder: Add() takes pairs of <key, restart_index>, and formats it into a string when Finish() is called. DataBlockHashIndex: initialized by the formatted string, and can interpret it as a hash table. Supporting Seek(). Test Plan: Unit test: data_block_hash_index_test make check -j 32 Reviewers: Sagar Vemuri
Summary: The Seek() in the initial implementation is inefficient in that it needs vector creation and emplace operations, which can be eliminated by a iteration implementation.
3b44e3c
to
42c5ce6
Compare
@fgwu has updated the pull request. Re-import the pull request |
@fgwu has updated the pull request. Re-import the pull request |
table/data_block_hash_index.h
Outdated
|
||
namespace rocksdb { | ||
|
||
// The format of the datablock hash map is as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Lets mention here that this is an experimental feature that is being added to improve point-lookup cost.
- Lets also mention that this Index is per block, and will live within the data-block, to avoid people confusing with the per-table Index blocks.
table/data_block_hash_index.h
Outdated
// | ||
// Each bucket B has the following structure: | ||
// [TAG RESTART_INDEX][TAG RESTART_INDEX]...[TAG RESTART_INDEX] | ||
// where TAG is the hash value of the second hash funtion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets explain here how to locate a key/value by giving a example.
Can you also provide more information here about why you decided to use as second hash function? Put a reference to the paper too from which this double-hashing idea is taken, so that the original authors get the credit.
table/data_block_hash_index.cc
Outdated
const uint32_t kSeed_tag = 214; /* second hash seed */ | ||
|
||
inline uint16_t HashToBucket(const Slice& s, uint16_t num_buckets) { | ||
return (uint16_t)rocksdb::Hash(s.data(), s.size(), kSeed) % num_buckets; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static_cast<uint16_t>
table/data_block_hash_index.h
Outdated
private: | ||
const char *data_; | ||
uint16_t size_; | ||
uint16_t num_buckets_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add a few comments here about how you came to the conclusion that 2 bytes should be more than enough.
namespace rocksdb { | ||
|
||
const uint32_t kSeed = 2018; | ||
const uint32_t kSeed_tag = 214; /* second hash seed */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you come up with these seeds? Should we use some prime numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Hash()
I used (util/hash.cc
) is not based on prime, instead, it uses bitwise XOR and rotation to calculate the hash value.
Lines 17 to 31 in 1857576
uint32_t Hash(const char* data, size_t n, uint32_t seed) { | |
// Similar to murmur hash | |
const uint32_t m = 0xc6a4a793; | |
const uint32_t r = 24; | |
const char* limit = data + n; | |
uint32_t h = static_cast<uint32_t>(seed ^ (n * m)); | |
// Pick up four bytes at a time | |
while (data + 4 <= limit) { | |
uint32_t w = DecodeFixed32(data); | |
data += 4; | |
h += w; | |
h *= m; | |
h ^= (h >> 16); | |
} |
It does not have to be prime. Examples:
Line 85 in 1857576
return Hash(s.data(), s.size(), 0); |
Line 27 in 1857576
return Hash(s.data(), s.size(), 397); |
/* push a TAG to avoid false postive */ | ||
/* the TAG is the hash function value of another seed */ | ||
uint16_t tag = static_cast<uint16_t>( | ||
rocksdb::Hash(key.data(), key.size(), kSeed_tag)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rocksdb::Hash() returns uint32_t ... so what affect will the loss of precision have on our algorithm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Casting to uint16 (effectively chops the higher 16 bit of hash off) has the benefit of reducing the space overhead as well as a more compact bucket that is more CPU cache line friendly. The downside is that there will be more hash collision. The current experiment shows that reducing the TAG length improves throughput. So I assume the benefit overweights the downside.
@fgwu has updated the pull request. Re-import the pull request |
@fgwu has updated the pull request. Re-import the pull request |
@fgwu has updated the pull request. Re-import the pull request |
|
||
private: | ||
uint16_t num_buckets_; | ||
std::vector<std::vector<uint16_t>> buckets_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its fine for now, but lets convert the interior std::vector<uint16_t> to something like struct Bucket {uint16_t tag, uint16_t restart_index}
later, so that it is cleaner and immediately obvious what the fields are. That way you can also avoid doing 2 * sizeof(uint16_t)
at various places, and instead do sizeof(Bucket)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgwu is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@fgwu has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgwu has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@fgwu has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgwu has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing the "land" issues.
…k#4139) Summary: The first step of the `DataBlockHashIndex` implementation. A string based hash table is implemented and unit-tested. `DataBlockHashIndexBuilder`: `Add()` takes pairs of `<key, restart_index>`, and formats it into a string when `Finish()` is called. `DataBlockHashIndex`: initialized by the formatted string, and can interpret it as a hash table. Lookup for a key is supported by iterator operation. Pull Request resolved: facebook#4139 Reviewed By: sagar0 Differential Revision: D8866764 Pulled By: fgwu fbshipit-source-id: 7f015f0098632c65979a22898a50424384730b10
Summary:
The first step of the
DataBlockHashIndex
implementation. A string based hash table is implemented and unit-tested.DataBlockHashIndexBuilder
:Add()
takes pairs of<key, restart_index>
, and formats it into a string whenFinish()
is called.DataBlockHashIndex
: initialized by the formatted string, and can interpret it as a hash table. Lookup for a key is supported by iterator operation.Test Plan:
Unit test:
data_block_hash_index_test
make check -j 32