	tx       *Tx
	buckets  map[string]*Bucket
	rootNode *node
	nodes    map[pgid]*node
}
The `pending` slice was used for tracking in-progress split nodes during allocation. That way, if an allocation caused a remap of the mmap, all pending nodes could be dereferenced. Since the split/spill phases have been separated, this is no longer necessary.
I'm still catching up on how the nested buckets work. It seems that a nested bucket is a key/value pair where the value is the simple bucket struct {root pgid, sequence uint64}, right? Simple enough. The bit I'm struggling with is what exactly the purpose of Bucket.spill() is. Why does it need to recurse into nested buckets?
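To make the question concrete, here is a minimal sketch of a nested bucket stored as a value under its key in the parent bucket. It assumes `pgid` is a `uint64` and uses `encoding/binary` with little-endian byte order for the layout; the names `bucketHeader`, `encode` and `decodeBucketHeader` are hypothetical and are not taken from this PR, and the actual code may lay out the value differently.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pgid is a page identifier (assumed here to be a uint64).
type pgid uint64

// bucketHeader mirrors the {root pgid, sequence uint64} struct from the question.
type bucketHeader struct {
	root     pgid
	sequence uint64
}

// bucketHeaderSize is the encoded size: two 8-byte fields.
const bucketHeaderSize = 16

// encode serializes the header into the value stored under the nested
// bucket's key. The little-endian layout is an assumption of this sketch.
func (h bucketHeader) encode() []byte {
	buf := make([]byte, bucketHeaderSize)
	binary.LittleEndian.PutUint64(buf[0:8], uint64(h.root))
	binary.LittleEndian.PutUint64(buf[8:16], h.sequence)
	return buf
}

// decodeBucketHeader reads a header back out of a stored value.
func decodeBucketHeader(value []byte) (bucketHeader, error) {
	if len(value) < bucketHeaderSize {
		return bucketHeader{}, fmt.Errorf("bucket value too short: %d bytes", len(value))
	}
	return bucketHeader{
		root:     pgid(binary.LittleEndian.Uint64(value[0:8])),
		sequence: binary.LittleEndian.Uint64(value[8:16]),
	}, nil
}

func main() {
	value := bucketHeader{root: 42, sequence: 7}.encode()
	h, err := decodeBucketHeader(value)
	fmt.Println(h.root, h.sequence, err)
}
```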
OK, it seems spilling is the creation of the storage pages from the materialized nodes during commit. Nodes should already be split and rebalanced as needed by that time, so they just need to be written out, right?
@@ -703,13 +765,12 @@ func TestBucket_Delete_Quick(t *testing.T) {
	db.View(func(tx *Tx) error {
		b := tx.Bucket([]byte("widgets"))
		for j, exp := range items {
			var value = b.Get(exp.Key)
nitpick: `value := b.Get(exp.Key)` for consistency.
@snormore I've been moving toward using `var` since it makes declarations more explicit but you're right -- it's inconsistent. I'll change it back.
LGTM
@mkobetic Yep. You're correct. Spilling simply writes the nodes to byte slices, and those byte slices get written out to disk. Spilling gets called on nested buckets only when those nested buckets have been materialized, so if you don't use a bucket in a transaction then it won't be spilled. Changing the spill to be recursive makes it easy to determine whether a bucket is small enough to fit inline on a parent page.
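A rough sketch of that recursion, under simplified assumptions: `Bucket` here is a stand-in type whose `ownSize` field replaces the real materialized nodes, and `maxInlineSize`, `headerSize` and the `placement` map are hypothetical names and values chosen for illustration. Only nested buckets present in the `buckets` map (that is, touched in the transaction) are visited, and because each child is spilled before its parent, the parent knows the child's final size when deciding where it goes.

```go
package main

import "fmt"

const (
	maxInlineSize = 64 // hypothetical inline threshold in bytes
	headerSize    = 16 // assumed size of a {root pgid, sequence uint64} value
)

// Bucket is a simplified stand-in for the real type.
type Bucket struct {
	ownSize int                // bytes of this bucket's own key/value data
	buckets map[string]*Bucket // nested buckets materialized in this transaction
}

// spill processes materialized nested buckets first and returns the number
// of bytes this bucket needs afterwards. placement records, per nested
// bucket path, whether it was stored inline or given its own pages.
func (b *Bucket) spill(path string, placement map[string]string) int {
	total := b.ownSize
	for name, child := range b.buckets {
		childPath := path + "/" + name
		childSize := child.spill(childPath, placement)
		if childSize <= maxInlineSize {
			// Small enough: the child's bytes live inside the parent.
			placement[childPath] = "inline"
			total += childSize
		} else {
			// Too large: the parent only stores a header pointing at the child.
			placement[childPath] = "pages"
			total += headerSize
		}
	}
	return total
}

func main() {
	root := &Bucket{ownSize: 10, buckets: map[string]*Bucket{
		"small": {ownSize: 20},
		"large": {ownSize: 500},
	}}
	placement := map[string]string{}
	total := root.spill("root", placement)
	fmt.Println(total, placement)
}
```

A nested bucket that was never opened in the transaction has no entry in `buckets`, so the loop never reaches it.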
This commit refactors the split/spill functionality. The previous implementation attempted to avoid additional allocations by doing some fancy depth sorting and looping trickery but it was really hard to follow/debug/extend.
The new implementation adds a `node.children` slice that is populated with in-memory child nodes. This makes the `split()` and `spill()` code a simple recursive algorithm and will make #94 easy to fix. This refactor is also required to make inline buckets (#124) possible.

Although there is an extra allocation required for branch nodes, I would expect it to amortize well over transactions. I'll add line notes to the PR for further explanation.
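As an illustration of why a `children` slice simplifies things, the following sketch walks the in-memory tree recursively. The `node` type is heavily reduced, and the `alloc` callback and `written` slice are hypothetical stand-ins for page allocation and output. It also assumes children are written before their parent so that the parent could record their page ids; that ordering is not stated in this description.

```go
package main

import "fmt"

// pgid is a page identifier (assumed here to be a uint64).
type pgid uint64

// node is a reduced stand-in for the in-memory B+tree node.
type node struct {
	pgid     pgid
	children []*node // in-memory child nodes
}

// spill writes out every in-memory child first (post-order), then assigns
// a page id to this node. alloc is a hypothetical page allocator and
// written collects page ids in the order they were assigned.
func (n *node) spill(alloc func() pgid, written *[]pgid) {
	for _, child := range n.children {
		child.spill(alloc, written)
	}
	n.pgid = alloc()
	*written = append(*written, n.pgid)
}

func main() {
	leafA, leafB := &node{}, &node{}
	root := &node{children: []*node{leafA, leafB}}

	next := pgid(0)
	alloc := func() pgid { next++; return next }

	var written []pgid
	root.spill(alloc, &written)
	fmt.Println(leafA.pgid, leafB.pgid, root.pgid, written)
}
```

No depth sorting is needed in this form, since the recursion visits the tree in the required order on its own.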
Note: This has no effect on the file format or the API. It's just an internal change.
/cc @snormore @mkobetic