Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
In this pull request, I introduce the ImmutableBTree and Builder class following the API proposal posted here: #10 (comment).
In this PR, I follow the design doc posted here: #10 (comment)
However, I substitute the copy-on-write set with a context object which includes both the copy-on-write list and the free list. Introducing the context object obviates the need to store a reference to the btree or freelist in each node object and makes the code more readable.
As stated in the API proposal, the Builder type and BTree type share the same methods. I almost got rid of the Builder type in favour of adding a Set() and Build() method directly to the BTree type to save on API load. This would have been a mistake since since read methods on BTree instances are safe to use with multiple goroutines. In order for the Build method to run in constant time, it must mutate state in the Builder. Having one read method in BTree that isn't safe to use with multiple goroutines would cause confusion and could never be reversed later on. By having a separate Builder class, I can document that Builder instances are not safe to use with multiple goroutines. The Builder pattern is already familiar to developers, so developers are less likely to attempt to use a Builder instance with multiple goroutines. The increased API load that the Builder type brings is worth it as it is less likely that developers will misuse it.
The benchmarks for Insert and Delete on BTree are roughly the same. It is hard to tell with the variance between runs, but I estimate that Insert and Delete on BTree with this pull request run 3% slower than before. Although I made every effort to preserve the original btree code, I do expect a slight slow down because the copying of nodes on write reduces to self assignment of child pointers and a no-op function call for mutable btrees. Self assignment and an extra function call is cheap but not completely free. To ensure that the characteristics of insert and delete don't change at all, it would have been necessary for me to copy and paste code rather than reuse code for the immutable case which would not have been good for the long term maintenance of this repo.
The benchmarks for read methods on BTree are the same with this PR as this PR makes no changes to how btrees are read.
With this PR, I added new benchmarks to measure the performance of the Builder class. I added four new benchmarks:
An operation in 1) runs ~5.5 times slower than a standard insert into a mutable btree. This is to be expected because of the copy-on-write needed to preserve immutability.
An operation in 2) runs ~2.75 times slower than a standard insert into a mutable btree. Operations in 2) run faster than operations in 1) because inserting 100 items does not create 100 intermediate immutable btrees. However, each additional item to insert does require making copies of additional nodes.
The characteristics of 3) and 4) with respect to a standard delete are very similar to the characteristics of 10 and 2) with respect to a standard insert.
On my machine, a standard Get takes 530ns from a btree with 10,000 nodes while one of 100 inserts into an immutable btree with 10,000 nodes using a builder takes 1700ns. Just building a mutable copy of a btree with 10,000 nodes would take 5.3ms, but doing 100 inserts with a Builder takes just 0.17ms
These new benchmarks that I added show that making a modified copy of an ImmutableBTree using a Builder is usually much faster than building a brand new tree from scratch. The benchmarks also show that Builder instances avoid creating unnecessary intermediate instances.
I believe that the ImmutableBTree type will make a great addition to this library as it makes it easier to build highly concurrent applications that must maintain a shared BTree. Using ImmutableBTrees makes it easy to do optimistic locking. A client can use a builder to make all the private modifications to the shared btree they want and then do a test and set to commit their changes. To roll back, the client can just give up in the middle. With conventional BTrees the client would have to do locking all the while they make modifications and rolling back on error would involve keeping track and undoing any changes already made.
This BTree library is fantastically written and easy to read and understand. It has saved me lots of time on my project. I hope you find this PR useful.