Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ImmutableBTree #13

Closed
wants to merge 1 commit into from
Closed

ImmutableBTree #13

wants to merge 1 commit into from

Conversation

keep94
Copy link
Contributor

@keep94 keep94 commented Oct 7, 2016

In this pull request, I introduce the ImmutableBTree and Builder class following the API proposal posted here: #10 (comment).

In this PR, I follow the design doc posted here: #10 (comment)

However, I substitute the copy-on-write set with a context object which includes both the copy-on-write list and the free list. Introducing the context object obviates the need to store a reference to the btree or freelist in each node object and makes the code more readable.

As stated in the API proposal, the Builder type and BTree type share the same methods. I almost got rid of the Builder type in favour of adding a Set() and Build() method directly to the BTree type to save on API load. This would have been a mistake since since read methods on BTree instances are safe to use with multiple goroutines. In order for the Build method to run in constant time, it must mutate state in the Builder. Having one read method in BTree that isn't safe to use with multiple goroutines would cause confusion and could never be reversed later on. By having a separate Builder class, I can document that Builder instances are not safe to use with multiple goroutines. The Builder pattern is already familiar to developers, so developers are less likely to attempt to use a Builder instance with multiple goroutines. The increased API load that the Builder type brings is worth it as it is less likely that developers will misuse it.

The benchmarks for Insert and Delete on BTree are roughly the same. It is hard to tell with the variance between runs, but I estimate that Insert and Delete on BTree with this pull request run 3% slower than before. Although I made every effort to preserve the original btree code, I do expect a slight slow down because the copying of nodes on write reduces to self assignment of child pointers and a no-op function call for mutable btrees. Self assignment and an extra function call is cheap but not completely free. To ensure that the characteristics of insert and delete don't change at all, it would have been necessary for me to copy and paste code rather than reuse code for the immutable case which would not have been good for the long term maintenance of this repo.

The benchmarks for read methods on BTree are the same with this PR as this PR makes no changes to how btrees are read.

With this PR, I added new benchmarks to measure the performance of the Builder class. I added four new benchmarks:

  1. insert one new item into an immutable btree of 10000 items to get a new immutable btree of 10001 items
  2. insert 100 new items into an immutable btree of 10000 items to get a new immutable btree of 10100 items.
  3. Remove 1 item from an immutable btree of 10000 items to get a new immutable btree of 9999 items.
  4. Remove 100 items from an immutable btree of 10000 items to get a new immutable btree of 9900 items.

An operation in 1) runs ~5.5 times slower than a standard insert into a mutable btree. This is to be expected because of the copy-on-write needed to preserve immutability.

An operation in 2) runs ~2.75 times slower than a standard insert into a mutable btree. Operations in 2) run faster than operations in 1) because inserting 100 items does not create 100 intermediate immutable btrees. However, each additional item to insert does require making copies of additional nodes.

The characteristics of 3) and 4) with respect to a standard delete are very similar to the characteristics of 10 and 2) with respect to a standard insert.

On my machine, a standard Get takes 530ns from a btree with 10,000 nodes while one of 100 inserts into an immutable btree with 10,000 nodes using a builder takes 1700ns. Just building a mutable copy of a btree with 10,000 nodes would take 5.3ms, but doing 100 inserts with a Builder takes just 0.17ms

These new benchmarks that I added show that making a modified copy of an ImmutableBTree using a Builder is usually much faster than building a brand new tree from scratch. The benchmarks also show that Builder instances avoid creating unnecessary intermediate instances.

I believe that the ImmutableBTree type will make a great addition to this library as it makes it easier to build highly concurrent applications that must maintain a shared BTree. Using ImmutableBTrees makes it easy to do optimistic locking. A client can use a builder to make all the private modifications to the shared btree they want and then do a test and set to commit their changes. To roll back, the client can just give up in the middle. With conventional BTrees the client would have to do locking all the while they make modifications and rolling back on error would involve keeping track and undoing any changes already made.

This BTree library is fantastically written and easy to read and understand. It has saved me lots of time on my project. I hope you find this PR useful.

@gconnell
Copy link
Collaborator

Hey, keep,

I've been noodling about this idea for a while, and I keep coming back to the fact that the API complexity is a little much. However, I've recently come across a similar problem with needing to clone btrees and use the clones concurrently in different goroutines (with, as you so presciently mentioned, optimistic locking :D). I've come up with a similar-but-simpler approach to this, see #16. I'd really like your feedback on that, if you have a sec and can take a look. Your PR was definitely a huge inspiration in the final product, but the new approach adds a single Clone() function instead of an entirely new ImmutableBTree type. PTAL if you have a sec, any comments/questions greatly appreciated,

--Graeme

@gconnell gconnell closed this Sep 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants