ART: inline row IDs into node pointers #8112
# Conflicts: # src/execution/index/art/prefix.cpp
…e, also renaming all occurences of 'swizzled' to 'serialized'
# Conflicts: # .github/config/uncovered_files.csv # src/common/enum_util.cpp
Great PR and results!
I have two small comments
- Should we add a test that stresses the chaining of the leaf? As in: insert one row, store, restart, query, insert another row (now no longer inlined but an actual leaf node), repeat until we get a chain, then remove rows all the way back?
- About the chaining, it can potentially add a lot of random access to data that has a lot of repetition. What about dynamic resizing? The disadvantage ofc of resizing is that it can become a peak bottleneck if resizing big arrays, so maybe a combination of both? :-)
@pdet, thanks for having a look! :) I will add that stress test! About the dynamic resizing, I just remembered why that wasn't feasible haha. We have fixed-size allocators for fast node allocations, and they require fixed-size nodes. So different/dynamic node sizes for prefixes/leaves are impossible if we want their memory to be managed by these allocators. And allocating/managing their memory separately is (I think) not worth it/adds too much complexity.
Oh yeah, that makes 100% sense.
Thanks!
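The trade-off discussed in this thread (fixed-size leaves that chain, rather than leaves that resize) can be sketched roughly as follows. This is only an illustration, not DuckDB's code: `Leaf`'s field order, `LeafPool`, `Append`, and `TotalCount` are hypothetical names, and a `std::vector` stands in for the fixed-size allocator. The capacity of four row IDs, the pointer to the next leaf, and the count come from the PR description.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical fixed-size leaf: four row IDs, a link to the next leaf, and a count.
// On typical 64-bit platforms this is 32 + 8 + 1 bytes, padded to 48 bytes.
struct Leaf {
    static constexpr uint8_t CAPACITY = 4;
    int64_t row_ids[CAPACITY] = {};
    uint64_t next = 0; // handle of the next leaf in the chain, 0 means "none"
    uint8_t count = 0;
};

// Stand-in for a fixed-size allocator (assumption): every slot has the same size,
// and a handle is the slot index plus one so that 0 can mean "invalid".
class LeafPool {
public:
    static constexpr uint64_t INVALID = 0;

    uint64_t New() {
        leaves.emplace_back();
        return leaves.size();
    }
    Leaf &Get(uint64_t handle) {
        return leaves[handle - 1];
    }

private:
    std::vector<Leaf> leaves;
};

// Append a row ID to the chain starting at head, adding a new leaf when the tail is full.
inline void Append(LeafPool &pool, uint64_t head, int64_t row_id) {
    uint64_t current = head;
    while (pool.Get(current).next != LeafPool::INVALID) {
        current = pool.Get(current).next;
    }
    if (pool.Get(current).count == Leaf::CAPACITY) {
        uint64_t fresh = pool.New();    // may move the vector's storage
        pool.Get(current).next = fresh; // so re-fetch the tail by handle afterwards
        current = fresh;
    }
    Leaf &tail = pool.Get(current);
    tail.row_ids[tail.count] = row_id;
    tail.count++;
}

// Sum the counts of all leaves in the chain starting at head.
inline uint64_t TotalCount(LeafPool &pool, uint64_t head) {
    uint64_t total = 0;
    for (uint64_t h = head; h != LeafPool::INVALID; h = pool.Get(h).next) {
        total += pool.Get(h).count;
    }
    return total;
}
```

Appending ten row IDs to a single head leaf in this sketch yields a chain of three leaves (4 + 4 + 2), which shows the extra pointer hops that come with many duplicates.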
Overview
This PR supports inlining row IDs into node pointers if a leaf only contains a single row ID, which is the case for most leaves.
Previously, we would create a `Leaf` node of 16 bytes, even if we only wanted to store a single row ID in that leaf. Now, we keep that row ID directly in the node pointer to the leaf, removing the need for an additional leaf node. For example, for an ART on 100k unique integers, we previously allocated 100k leaves, i.e., 1,600,000 bytes = 1.6MB. Thus, with this PR, we save 1.6MB. Inlining row IDs also increases the performance of index operations as we traverse one fewer node (the `Leaf`).
Memory and performance improvements
Here are some numbers on memory improvements (tests taken from `test/sql/index/art/memory/test_art_non_linear.test_slow`). For 100k values, we save ~1.8MB by inlining (because we allocate blocks of 256KB). For many duplicates, we still need to allocate leaves holding the row IDs, so we do not gain any memory improvements.
The performance of index operations (slightly) improves for many workloads, specifically for constraint checking; here are a few examples (taken from `benchmark/micro/index/`).
Next steps
There are still many issues of people running out of memory during `CREATE INDEX` operations (e.g., #8066, #7760, ...). This is most likely due to the overhead in memory allocation in the `Sink` calls of `PhysicalCreateIndex`. I will try to address this issue next. Afterward, I will continue with the improvements documented in #5865.
Implementation details
Leaves
The size of the `Leaf` node increased from 16 bytes to 48 bytes. That is because we now store up to four row IDs in that leaf, a pointer to a consecutive leaf, and a count. If we need to store more than four row IDs in a leaf, then we end up with a chain of leaf nodes, similar to our implementation for prefix nodes.
Node pointers
A `Node` pointer consists of 64 bits. Previously, we were using bit fields for our `Node` class. This approach became more complicated because we either store offset + buffer/block ID in the last 56 bits, or the row ID. Additionally, compiling on Windows in a way that respects the bit field constraints is not trivial (here). Therefore, we would end up with 16 bytes on Windows machines instead of the intended 8 bytes. Thus, we decided to perform all bit shifting/AND/OR operations ourselves. The `Node` class contains the respective `constexpr`s and the `Getters` and `Setters` for the different fields.
Maximum row IDs
DuckDB uses temporary row IDs for local changes during transactions. Because we inline the row ID into 56 bits, we adjust the internal maximum numbers for row IDs.
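As a rough illustration of the manual bit handling and the 56-bit row ID limit described above, a minimal sketch could look like this. It is not the actual `Node` class: the name `NodePtr`, the constant names, the `TYPE_LEAF_INLINED` tag value, and the assumption that the upper 8 bits hold the node type are all hypothetical. Only the 64-bit total and the 56-bit payload come from the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit node pointer: upper 8 bits hold a type tag (assumption),
// lower 56 bits hold the payload, here an inlined row ID.
class NodePtr {
public:
    static constexpr uint64_t SHIFT_TYPE = 56;
    static constexpr uint64_t MASK_PAYLOAD = (uint64_t(1) << SHIFT_TYPE) - 1;
    // Largest row ID that still fits into the 56-bit payload.
    static constexpr uint64_t MAX_INLINED_ROW_ID = MASK_PAYLOAD;
    static constexpr uint8_t TYPE_LEAF_INLINED = 1; // hypothetical tag value

    uint8_t GetType() const {
        return uint8_t(data >> SHIFT_TYPE);
    }
    uint64_t GetRowId() const {
        return data & MASK_PAYLOAD;
    }
    void SetType(uint8_t type) {
        // Keep the payload, overwrite the upper 8 bits.
        data = (data & MASK_PAYLOAD) | (uint64_t(type) << SHIFT_TYPE);
    }
    void SetRowId(uint64_t row_id) {
        assert(row_id <= MAX_INLINED_ROW_ID);
        // Keep the type bits, overwrite the lower 56 bits.
        data = (data & ~MASK_PAYLOAD) | row_id;
    }

private:
    uint64_t data = 0;
};

// Explicit shifts and masks keep the size at 8 bytes regardless of how a
// compiler would lay out bit fields.
static_assert(sizeof(NodePtr) == 8, "node pointer must stay 8 bytes");
```

In this sketch, setting the type and then the row ID (or the other way around) leaves the other field untouched, and any row ID above `MAX_INLINED_ROW_ID` cannot be inlined, which is why the maximum row ID has to fit the payload width.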
Other changes
- `swizzled` to `serialized`
- `GetARTNodeType` to `GetType`
- `SwizzleablePointer` class