Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Research BTree implementation #3

Closed
bartlomieju opened this issue Jul 19, 2021 · 10 comments
Closed

Research BTree implementation #3

bartlomieju opened this issue Jul 19, 2021 · 10 comments

Comments

@bartlomieju
Copy link
Owner

I did search for crates implementing BTree but found only one crate maintained: https://github.com/nimrodshn/btree

However it's disk-backed and has completely different API than: https://github.com/tidwall/btree

I tried going with BTreeSet and implementing wrapper structs like: OrdByKeyItem and OrdByExpItem but I'm not sure this is a viable path.

@bartlomieju
Copy link
Owner Author

The BTreeSet approach is definitely not viable. I've stubbed out BTree API in https://github.com/bartlomieju/rusty_buntdb/blob/main/src/btree.rs and I believe porting that struct is the best way forward. That said, it's gonna be a non-trivial task, as circular-referencing structs in Rust most often require some hackery and unsafe code.

@tidwall do you have any thoughts on this?

@tidwall
Copy link
Collaborator

tidwall commented Jul 20, 2021

Porting my Go BTree to Rust would be a pretty big undertaking. I also have a C version https://github.com/tidwall/btree.c that may work as a point of reference, or possibly through ffi.

That being said, I would think that the BTreeSet that comes with Rust should suffice.

I'm curious what problem you running into?

@bartlomieju
Copy link
Owner Author

Thanks for pointers. The main problem with BTreeSet is that the only way to "sort" the set is by implementing Ord trait for the objects that are inserted into the set.

It requires to create wrapper structs that implement custom ordering for different trees (db.keys, db.exps, idx.btr) - in your Go implementation those trees carry over a "context" when ascending/descending the tree - AFAIK it's not really feasible to do that with Ord trait unless I'd store that context in each of the structs in the the set, like so:

// `db.keys` (this one has no context)
#[derive(Eq, PartialEq)]
struct OrdByKeyItem {
    item: Arc<DbItem>,
}

// `db.exps`
#[derive(Eq, PartialEq)]
struct OrdByExpItem {
    item: Arc<DbItem>,
    ctx: ExCtx
}

// `idx.btr`
#[derive(Eq, PartialEq)]
struct OrdByIdxItem {
    item: Arc<DbItem>,
    idx: Arc<Index>
}

The only other option would be to always collect all of the items into a Vec before applying custom sorting however I'm worried about performance penalty that would rise from creating those vectors over and over again and discarding them immediately after.

@tidwall
Copy link
Collaborator

tidwall commented Jul 20, 2021

Ah yes I see what you mean.

This is something that I've run into in the past with the rust btree. It's a rigid implementation where its comparison logic is highly bound to the items in the tree itself. Not allowing for things like external contexts, or dynamic comparators.

To do anything tricky, you will need to create new wrapper items, like you are doing, which take up more space per item.

The Go (and C) implementations allow for custom comparators as functions. Which is a more flexible way of doing it.

The nice thing about the Rust way of doing it is that could be a little faster because the comparators from the std::Org trait will get inlined.

@tidwall
Copy link
Collaborator

tidwall commented Jul 20, 2021

I think the bigger issue is that, in some cases, the comparator needs to have a context view of database. Which is the owner of the BTree and thus the item.

That's the circular-reference issue you were talking about.

This is where my brain explodes.

@bartlomieju
Copy link
Owner Author

That's the circular-reference issue you were talking about.

This is where my brain explodes.

Yeah, it's gonna be fun :) I'll lookup the C implementation you've linked and see what I can do. Doing the actual DB parts seems more-or-less straight-forward from now on, but it's pointless without having underlying data structure working, so I'm gonna focus on that bit.

Thanks for your time!

@bartlomieju
Copy link
Owner Author

For now trying to get btree.c work with Rust using FFI: https://github.com/bartlomieju/rusty_btreec

@tidwall
Copy link
Collaborator

tidwall commented Jul 20, 2021

Oh cool. I haven't had a chance to do an FFI library yet myself, so I'm interested in seeing how this works out.

@bartlomieju
Copy link
Owner Author

Going with the FFI version at https://github.com/bartlomieju/rusty_btreec, after some back and forth it works fine; I managed to get first test passing.

I'm sure I'll hit more problems when making the DB work across the threads (needed for background syncing to disk), but I expect it might be solvable with a simple Arc<Mutex<>> around the btree.

@bartlomieju
Copy link
Owner Author

This is resolved, using C based btree over FFI works quite well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants