Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: very minimal code that adds b-tree to the codebase #1596

Merged
merged 2 commits into from Jul 31, 2023
Merged

Conversation

romange
Copy link
Collaborator

@romange romange commented Jul 29, 2023

The motivation to have our own b-tree to repalce zskiplist is shown by #1567 Based on the results we should greatly reduce the memory overhead per item when using a modern b-tree.

Currently the functionality supports Insert method only to reduce the review complexity. The design decisions behind the data structure are described in src/core/detail/btree_internal.h

@romange romange added the enhancement New feature or request label Jul 29, 2023
@romange romange requested a review from dranikpg July 29, 2023 18:14
@romange romange force-pushed the AddBtree1 branch 3 times, most recently from 8b6d9be to 0e8e45d Compare July 30, 2023 04:48
Copy link
Collaborator

@chakaz chakaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see some comments, mostly nits.
I actually have not looked yet into the actual search / rebalance logic quite yet, but I'm happy to do that as well ofcourse.

using KeyCompareTo = DefaultCompareTo<T>;
};

template <typename T, typename Policy = BTreePolicy<T> > class BTree {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to extend BTreePolicy in the future?
Because right now it contains Key_t which is redundant (it's part of BTree itself) and the default comparer, so maybe make the default comparer a direct template argument?
(that's also how STL does it, like in maps)

Also, no need in modern C++ to have a space between > > :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason behind this - to pass a single template param to the internal classes. That's what absl tree does as well.
I think it's matter of taste.

using KeyCompareTo = DefaultCompareTo<T>;
};

template <typename T, typename Policy = BTreePolicy<T> > class BTree {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call it BPTree? B trees are a similar yet different data structure

Comment on lines 50 to 56
size_t Size() const {
return count_;
}

size_t NodeCount() const {
return num_nodes_;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me what's the difference between count_ and num_nodes_..
num_nodes_ seem to track allocated nodes, whereas count_ tracks the ones actually inserted, but aren't they supposed to be the same? Is it for debugging?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked, they're updated correctly. Count is the number of items, num_nodes the number of nodes. Different things in a B+ tree

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, each node can hold as much as 31 keys, hence the difference.

Comment on lines 34 to 35
// 5. We assume we store POD types - this greatly reduces the complexity of the generics
// in the code.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As such, I'd add static_assert(is_pod_v<T>); somewhere

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to use std::is_trivially_copyable_v<T> instead 😅


namespace detail {

// Internal classes related to B+tree implementation. The design is largely based on the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd drop this file and move its content to btree_set.h. I think it's easier to read / use, but maybe it's just me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought in the opposite direction: we use it only for a small number of types, so it can be explicitly instantiated for those and the internals can be hidden away

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To not insult anyone, I will leave it as is :)

uint32_t found : 1;
};

// Searches for key in the node using binary seach.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seach->search

src/core/detail/btree_internal.h Outdated Show resolved Hide resolved
src/core/detail/btree_internal.h Outdated Show resolved Hide resolved
unsigned pos;
};

Record record_[kMaxDepth];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::array<Record, kMaxDepth> instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I change it but inside I weakly disagree. I feel that array is c++ struct that has advantage in SOME cases but in those it does not, it's just another include.

src/core/detail/btree_internal.h Outdated Show resolved Hide resolved
Comment on lines 50 to 56
size_t Size() const {
return count_;
}

size_t NodeCount() const {
return num_nodes_;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked, they're updated correctly. Count is the number of items, num_nodes the number of nodes. Different things in a B+ tree

return false;
}

assert(path.Depth() > 0u);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raw assert to avoid being dependant on glog?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I do not like having glog in header files

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is why I like explicit instantiations 😆

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not mind it. If @chakaz does not have strong objections - I can move this to cc file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always prefer implementations to be in the .cc, like.. always :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously except for when it's impossible, like with templates :(


namespace detail {

// Internal classes related to B+tree implementation. The design is largely based on the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought in the opposite direction: we use it only for a small number of types, so it can be explicitly instantiated for those and the internals can be hidden away


Key_t Key(unsigned index) const {
Key_t res;
memcpy(&res, KeyPtr(index), kKeySize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. If the offset will stay 8 bytes in the future, the keys will be aligned correctly so they can be read and written in-place

Comment on lines 104 to 107
struct SearchResult {
uint32_t index : 31;
uint32_t found : 1;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only only use it to return result from functions and never place it anywhere besides the stack, surprised you use bitfields here 😵

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right.

Comment on lines 187 to 195
union {
struct {
uint64_t num_items_ : 7;
uint64_t leaf_ : 1;
uint64_t : 56;
};

uint64_t val_;
};
Copy link
Contributor

@dranikpg dranikpg Jul 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you use val_ only to cleanly initialize the fields above? Why can't you just init them directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can if you prefer to it this way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just mentioned it because the top-level union is a little confusing, its not obvious that val_ has no other uses

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 248 to 271
uint32_t i = 0;
uint32_t n = num_items_;
typename Policy::KeyCompareTo cmp_op;
while (i < n) {
uint32_t j = (i + n) >> 1;
assert(j < n);

Key_t item = Key(j);

int cmp_res = cmp_op(key, item);
if (cmp_res == 0) {
return SearchResult{.index = j, .found = 1};
}

if (cmp_res < 0) {
n = j;
} else {
i = j + 1; // we never return indices upto j because they are strictly less than key.
}
}
assert(i == n);

return {.index = i, .found = 0};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little hard to read, instead of i, n, j, >>1 I'd use lo, hi, mid, /2 or smth else

But again, if we allow getting a Key_t* we can just use stl's lower_bound instead of re-writing it

Copy link
Collaborator Author

@romange romange Jul 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rename the variables. I do not want to use lower_bound because of inefficient less abstraction. I prefer a three-way compare because we gonna use bptree with sds strings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I see, so you want it to stop immediately if the item was found when placing the mid point

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not understand the question :)
I use cmp_op once per iteration, where lower_bound must use compare operator twice.

Comment on lines 287 to 288
template <typename Policy>
std::pair<BTreeNode<Policy>*, unsigned> BTreeNode<Policy>::RebalanceChild(unsigned pos,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A high level comment on rebalancing would make understanding the code easier

Comment on lines 267 to 279
detail::BTreeNode<Policy>* BTree<T, Policy>::CreateNode(bool leaf) {
num_nodes_++;
void* ptr = mr_->allocate(BTreeNode::kTargetNodeSize, 8);
BTreeNode* node = new (ptr) BTreeNode(leaf);

return node;
}

template <typename T, typename Policy> void BTree<T, Policy>::DestroyNode(BTreeNode* node) {
void* ptr = node;
mr_->deallocate(ptr, BTreeNode::kTargetNodeSize, 8);
num_nodes_--;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave a high level comment somewhere above BTreeNode to explain why it offsets its this pointer beyond its size. Shahar is right, we could just keep the buffer in the struct to occupy this space, but it would make implementing point 3 on your todo (smaller nodes for small trees) more difficult and you'd have to revert back to this approach

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. absl implements small object optimization. In short, the answer is - I embrace the fact that high-level languages can not support dynamically created structs. We usually do not use this approach but when it's needed I am not going to avoid it because it's "not nice". And btw, I used a similar approach in dense_set when allocating metadata for expire time of hash elements.

// allocate less then 256 bytes (special case) to avoid relative blowups in memory for
// small trees.

template <typename Policy> class BTreeNode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be immovable and non-copyable

The motivation to have our own b-tree to repalce zskiplist is shown by #1567
Based on the results we should greatly reduce the memory overhead per item when using a modern b-tree.

Currently the functionality supports Insert method only to reduce the review complexity.
The design decisions behind the data structure are described in src/core/detail/btree_internal.h

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
@romange
Copy link
Collaborator Author

romange commented Jul 30, 2023

Please see some comments, mostly nits. I actually have not looked yet into the actual search / rebalance logic quite yet, but I'm happy to do that as well ofcourse.

Shahar, I do not think it's required and I asked @dranikpg also not to spend time searching for correctness bugs. It's not that I am sure I do not have bugs :), it's just this code is far from being prod ready anyway and I trust tests, assertions, and absl reference code to guide me through most of the obstacles.

@romange
Copy link
Collaborator Author

romange commented Jul 30, 2023

To be more precise I do not look at absl code (which is incomprehensible, imho), I look at the original library coming from Google: https://code.google.com/archive/p/cpp-btree/

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Copy link
Contributor

@dranikpg dranikpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given its "very minimal" I don't have further comments 🙂

@romange romange merged commit 723cc62 into main Jul 31, 2023
10 checks passed
@romange romange deleted the AddBtree1 branch July 31, 2023 10:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants