New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: very minimal code that adds b-tree to the codebase #1596
Conversation
8b6d9be
to
0e8e45d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see some comments, mostly nits.
I actually have not looked yet into the actual search / rebalance logic quite yet, but I'm happy to do that as well ofcourse.
src/core/btree_set.h
Outdated
using KeyCompareTo = DefaultCompareTo<T>; | ||
}; | ||
|
||
template <typename T, typename Policy = BTreePolicy<T> > class BTree { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you plan to extend BTreePolicy
in the future?
Because right now it contains Key_t
which is redundant (it's part of BTree
itself) and the default comparer, so maybe make the default comparer a direct template argument?
(that's also how STL does it, like in maps)
Also, no need in modern C++ to have a space between > >
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason behind this - to pass a single template param to the internal classes. That's what absl tree does as well.
I think it's matter of taste.
src/core/btree_set.h
Outdated
using KeyCompareTo = DefaultCompareTo<T>; | ||
}; | ||
|
||
template <typename T, typename Policy = BTreePolicy<T> > class BTree { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call it BPTree
? B trees are a similar yet different data structure
src/core/btree_set.h
Outdated
size_t Size() const { | ||
return count_; | ||
} | ||
|
||
size_t NodeCount() const { | ||
return num_nodes_; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear to me what's the difference between count_
and num_nodes_
..
num_nodes_
seem to track allocated nodes, whereas count_
tracks the ones actually inserted, but aren't they supposed to be the same? Is it for debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked, they're updated correctly. Count is the number of items, num_nodes the number of nodes. Different things in a B+ tree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, each node can hold as much as 31 keys, hence the difference.
src/core/detail/btree_internal.h
Outdated
// 5. We assume we store POD types - this greatly reduces the complexity of the generics | ||
// in the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As such, I'd add static_assert(is_pod_v<T>);
somewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably better to use std::is_trivially_copyable_v<T>
instead 😅
src/core/detail/btree_internal.h
Outdated
|
||
namespace detail { | ||
|
||
// Internal classes related to B+tree implementation. The design is largely based on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd drop this file and move its content to btree_set.h
. I think it's easier to read / use, but maybe it's just me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought in the opposite direction: we use it only for a small number of types, so it can be explicitly instantiated for those and the internals can be hidden away
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To not insult anyone, I will leave it as is :)
src/core/detail/btree_internal.h
Outdated
uint32_t found : 1; | ||
}; | ||
|
||
// Searches for key in the node using binary seach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seach->search
src/core/detail/btree_internal.h
Outdated
unsigned pos; | ||
}; | ||
|
||
Record record_[kMaxDepth]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use std::array<Record, kMaxDepth>
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I change it but inside I weakly disagree. I feel that array is c++ struct that has advantage in SOME cases but in those it does not, it's just another include.
src/core/btree_set.h
Outdated
size_t Size() const { | ||
return count_; | ||
} | ||
|
||
size_t NodeCount() const { | ||
return num_nodes_; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked, they're updated correctly. Count is the number of items, num_nodes the number of nodes. Different things in a B+ tree
src/core/btree_set.h
Outdated
return false; | ||
} | ||
|
||
assert(path.Depth() > 0u); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raw assert to avoid being dependant on glog?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I do not like having glog in header files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is why I like explicit instantiations 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not mind it. If @chakaz does not have strong objections - I can move this to cc file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always prefer implementations to be in the .cc, like.. always :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously except for when it's impossible, like with templates :(
src/core/detail/btree_internal.h
Outdated
|
||
namespace detail { | ||
|
||
// Internal classes related to B+tree implementation. The design is largely based on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought in the opposite direction: we use it only for a small number of types, so it can be explicitly instantiated for those and the internals can be hidden away
src/core/detail/btree_internal.h
Outdated
|
||
Key_t Key(unsigned index) const { | ||
Key_t res; | ||
memcpy(&res, KeyPtr(index), kKeySize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. If the offset will stay 8 bytes in the future, the keys will be aligned correctly so they can be read and written in-place
src/core/detail/btree_internal.h
Outdated
struct SearchResult { | ||
uint32_t index : 31; | ||
uint32_t found : 1; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only only use it to return result from functions and never place it anywhere besides the stack, surprised you use bitfields here 😵
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right.
src/core/detail/btree_internal.h
Outdated
union { | ||
struct { | ||
uint64_t num_items_ : 7; | ||
uint64_t leaf_ : 1; | ||
uint64_t : 56; | ||
}; | ||
|
||
uint64_t val_; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you use val_ only to cleanly initialize the fields above? Why can't you just init them directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can if you prefer to it this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just mentioned it because the top-level union is a little confusing, its not obvious that val_
has no other uses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/core/detail/btree_internal.h
Outdated
uint32_t i = 0; | ||
uint32_t n = num_items_; | ||
typename Policy::KeyCompareTo cmp_op; | ||
while (i < n) { | ||
uint32_t j = (i + n) >> 1; | ||
assert(j < n); | ||
|
||
Key_t item = Key(j); | ||
|
||
int cmp_res = cmp_op(key, item); | ||
if (cmp_res == 0) { | ||
return SearchResult{.index = j, .found = 1}; | ||
} | ||
|
||
if (cmp_res < 0) { | ||
n = j; | ||
} else { | ||
i = j + 1; // we never return indices upto j because they are strictly less than key. | ||
} | ||
} | ||
assert(i == n); | ||
|
||
return {.index = i, .found = 0}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little hard to read, instead of i, n, j, >>1
I'd use lo, hi, mid, /2
or smth else
But again, if we allow getting a Key_t* we can just use stl's lower_bound instead of re-writing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will rename the variables. I do not want to use lower_bound because of inefficient less
abstraction. I prefer a three-way compare because we gonna use bptree with sds strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I see, so you want it to stop immediately if the item was found when placing the mid point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not understand the question :)
I use cmp_op
once per iteration, where lower_bound must use compare
operator twice.
src/core/detail/btree_internal.h
Outdated
template <typename Policy> | ||
std::pair<BTreeNode<Policy>*, unsigned> BTreeNode<Policy>::RebalanceChild(unsigned pos, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A high level comment on rebalancing would make understanding the code easier
src/core/btree_set.h
Outdated
detail::BTreeNode<Policy>* BTree<T, Policy>::CreateNode(bool leaf) { | ||
num_nodes_++; | ||
void* ptr = mr_->allocate(BTreeNode::kTargetNodeSize, 8); | ||
BTreeNode* node = new (ptr) BTreeNode(leaf); | ||
|
||
return node; | ||
} | ||
|
||
template <typename T, typename Policy> void BTree<T, Policy>::DestroyNode(BTreeNode* node) { | ||
void* ptr = node; | ||
mr_->deallocate(ptr, BTreeNode::kTargetNodeSize, 8); | ||
num_nodes_--; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave a high level comment somewhere above BTreeNode to explain why it offsets its this pointer beyond its size. Shahar is right, we could just keep the buffer in the struct to occupy this space, but it would make implementing point 3 on your todo (smaller nodes for small trees) more difficult and you'd have to revert back to this approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. absl implements small object optimization. In short, the answer is - I embrace the fact that high-level languages can not support dynamically created structs. We usually do not use this approach but when it's needed I am not going to avoid it because it's "not nice". And btw, I used a similar approach in dense_set when allocating metadata for expire time of hash elements.
src/core/detail/btree_internal.h
Outdated
// allocate less then 256 bytes (special case) to avoid relative blowups in memory for | ||
// small trees. | ||
|
||
template <typename Policy> class BTreeNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be immovable and non-copyable
The motivation to have our own b-tree to repalce zskiplist is shown by #1567 Based on the results we should greatly reduce the memory overhead per item when using a modern b-tree. Currently the functionality supports Insert method only to reduce the review complexity. The design decisions behind the data structure are described in src/core/detail/btree_internal.h Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Shahar, I do not think it's required and I asked @dranikpg also not to spend time searching for correctness bugs. It's not that I am sure I do not have bugs :), it's just this code is far from being prod ready anyway and I trust tests, assertions, and absl reference code to guide me through most of the obstacles. |
To be more precise I do not look at absl code (which is incomprehensible, imho), I look at the original library coming from Google: https://code.google.com/archive/p/cpp-btree/ |
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given its "very minimal" I don't have further comments 🙂
The motivation to have our own b-tree to repalce zskiplist is shown by #1567 Based on the results we should greatly reduce the memory overhead per item when using a modern b-tree.
Currently the functionality supports Insert method only to reduce the review complexity. The design decisions behind the data structure are described in src/core/detail/btree_internal.h