
Refactor split/spill #148

Merged
merged 1 commit into from
May 3, 2014

Conversation

benbjohnson
Member

This commit refactors the split/spill functionality. The previous implementation attempted to avoid additional allocations by doing some fancy depth sorting and looping trickery but it was really hard to follow/debug/extend.

The new implementation adds a node.children slice that is populated with in-memory child nodes. This makes the split() and spill() code a simple recursive algorithm and will make #94 easy to fix. This refactor is also required to make inline buckets (#124) possible.
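A minimal sketch of the recursive shape described above, with simplified types. The node.children field and the split()/spill() names come from this PR, but the pageSize parameter, the inode layout, and the write step are illustrative assumptions, not the actual implementation.

```go
package main

import "fmt"

// inode is a simplified in-memory key/value element of a node.
type inode struct {
	key, value []byte
}

// node is an in-memory, deserialized page plus its materialized children.
type node struct {
	inodes   []inode
	children []*node // populated as child nodes are read or modified
}

// spill works bottom-up: spill the children first, then split this node into
// page-sized pieces and write each piece out.
func (n *node) spill(pageSize int) {
	for _, child := range n.children {
		child.spill(pageSize)
	}
	for _, piece := range n.split(pageSize) {
		fmt.Printf("write node with %d inodes\n", len(piece.inodes))
	}
}

// split breaks a node into nodes whose combined key/value size fits in a page.
func (n *node) split(pageSize int) []*node {
	var nodes []*node
	current, size := &node{}, 0
	for _, in := range n.inodes {
		elemSize := len(in.key) + len(in.value)
		if len(current.inodes) > 0 && size+elemSize > pageSize {
			nodes = append(nodes, current)
			current, size = &node{}, 0
		}
		current.inodes = append(current.inodes, in)
		size += elemSize
	}
	return append(nodes, current)
}

func main() {
	root := &node{
		inodes:   []inode{{[]byte("a"), make([]byte, 300)}, {[]byte("b"), make([]byte, 300)}},
		children: []*node{{inodes: []inode{{[]byte("c"), make([]byte, 100)}}}},
	}
	root.spill(512) // the child is written first, then the root as two pieces
}
```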

Although there is an extra allocation required for branch nodes, I would expect it to amortize well over transactions. I'll add line notes to the PR for further explanation.

Note: This has no effect on the file format or the API. It's just an internal change.

/cc @snormore @mkobetic

tx *Tx
buckets map[string]*Bucket
rootNode *node
nodes map[pgid]*node
}
Member Author

The pending slice was used for tracking in-progress split nodes during allocation. That way, if an allocation caused the mmap to be remapped, all pending nodes could be dereferenced.

Since the split/spill phases have been separated this is no longer necessary.

@mkobetic
Contributor

mkobetic commented May 3, 2014

I'm still catching up on how the nested buckets work. It seems that a nested bucket is a key/value pair where the value is the simple bucket struct {root pgid, sequence uint64}, right? Simple enough. The bit I'm struggling with is the exact purpose of Bucket.spill(). Why does it need to recurse into nested buckets?
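For reference, a minimal sketch of the value layout described in this question; the fields match the {root pgid, sequence uint64} shape mentioned above, and treating pgid as a uint64 page id is an assumption for illustration.

```go
// The value stored under a nested bucket's key in its parent: a small,
// fixed-size header rather than the bucket's contents themselves.
type pgid uint64

type bucket struct {
	root     pgid   // page id of the nested bucket's root page
	sequence uint64 // the bucket's sequence counter
}
```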

@mkobetic
Contributor

mkobetic commented May 3, 2014

OK it seems spilling is the creation of the storage pages from the materialized nodes during commit. Nodes should already be split and rebalanced as needed by that time, so they just need to be written out, right?

@@ -703,13 +765,12 @@ func TestBucket_Delete_Quick(t *testing.T) {
db.View(func(tx *Tx) error {
b := tx.Bucket([]byte("widgets"))
for j, exp := range items {
var value = b.Get(exp.Key)
Member

nitpick: value := b.Get(exp.Key) for consistency.

Member Author

@snormore I've been moving toward using var since it makes declarations more explicit but you're right -- it's inconsistent. I'll change it back.

@snormore
Member

snormore commented May 3, 2014

LGTM

@benbjohnson
Member Author

@mkobetic Yep, you're correct. Spilling simply writes the nodes to byte slices and those byte slices get written out to disk. The spill() function calls split() on the nodes because splitting can be somewhat expensive: it has to go through each inode and add up the size. Maybe we can do some caching in the future so we can split as inodes are added.

The spilling gets called on nested buckets only when those nested buckets have been materialized. So if you don't use a bucket in a transaction, it won't be spilled. Changing the spill to be recursive makes it easy to determine whether a bucket is small enough to fit inline on a parent page.
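A hedged sketch of the bucket-level recursion described here: the buckets and rootNode fields mirror the Bucket struct quoted earlier in this PR, while the size accounting and the quarter-page inline threshold are illustrative assumptions rather than the actual logic.

```go
package main

import "fmt"

type inode struct{ key, value []byte }

type node struct{ inodes []inode }

// Bucket holds only the child buckets and nodes materialized in this
// transaction, so spilling skips anything that was never touched.
type Bucket struct {
	buckets  map[string]*Bucket
	rootNode *node
}

// size sums the serialized size of the root node's inodes -- the same kind of
// per-inode accumulation that makes split() somewhat expensive.
func (b *Bucket) size() int {
	s := 16 // assumed fixed page-header overhead
	for _, in := range b.rootNode.inodes {
		s += len(in.key) + len(in.value)
	}
	return s
}

// spill recurses into materialized child buckets first, deciding per child
// whether it is small enough to live inline on the parent's page.
func (b *Bucket) spill(pageSize int) {
	for name, child := range b.buckets {
		if child.size() < pageSize/4 {
			fmt.Printf("bucket %q fits inline in its parent\n", name)
			continue
		}
		child.spill(pageSize)
		fmt.Printf("bucket %q spilled to its own page(s)\n", name)
	}
	// ...then split b.rootNode and write the resulting pages out.
}

func main() {
	small := &Bucket{rootNode: &node{inodes: []inode{{[]byte("k"), []byte("v")}}}}
	big := &Bucket{rootNode: &node{inodes: []inode{{[]byte("k"), make([]byte, 4096)}}}}
	root := &Bucket{
		buckets:  map[string]*Bucket{"small": small, "big": big},
		rootNode: &node{},
	}
	root.spill(4096)
}
```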

benbjohnson added a commit that referenced this pull request May 3, 2014
@benbjohnson benbjohnson merged commit 60624c4 into boltdb:master May 3, 2014
@benbjohnson benbjohnson deleted the split-spill branch May 3, 2014 22:21