Look into replacing sled
#294
Comments
Such an abstraction exists:
For now, there aren't any showstoppers that would make me move away from sled, and I don't have the bandwidth to support multiple store implementations. |
Another one: Which has multi-process support, which might be relevant for the iOS use-case where currently two sync loops are going on. One for the notifications and one for the main process. |
note about BonsaiDB, from their docs:
|
don't forget about https://github.com/paritytech/parity-db |
A new contender recently appeared: https://github.com/cberner/redb/ Pure Rust, and its README.md claims it is almost as fast as LMDB. |
Unfortunately it is mmap-based, so fundamentally incompatible with accessing the DB from multiple processes at the same time, which seems likely to be a hard requirement for whatever we replace sled with. |
I stand corrected on BonsaiDB - since 0.1.0 it's been using nebari, not sled:
It's not clear what that means for multi-process support though (I doubt it has that). |
I'm the author of Sanakirja and reached this thread while reading about news in Matrix. If you need any help let me know. See our benchmarks: https://pijul.org/posts/2021-02-06-rethinking-sanakirja/ |
@P-E-Meunier Thanks for the offer! I have started work on a crypto-store based on Sanakirja, but it's not clear to me how I would use arbitrarily-sized (byte)strings in there, both for keys and values. I got the basic setup working and thought |
This isn't a hard limit. Sanakirja is a framework rather than a database: its main duty is to manage memory (including optional reference counting if you need to fork your datastructures) in an mmapped file, in a fully transactional way. You can write many different datastructures on top of that; B trees are just one particular datastructure, but I've worked on others (ropes, for example). It mostly depends on what operations you want, but one workaround for the 510-byte limit is to create a new type of page for storing large chunks of contiguous data, and reference these pages in the tree. The only tricky part is that the allocator now needs to return contiguous pages, but that isn't hard. If you don't need your values to be contiguous in memory (or can afford to copy them rather than work with pointers), things will be easier. One significant advantage you'll gain if you can use the current allocator is that one can implement it on top of more exotic backends: Pijul uses the same code when working on-disk and in zstd-compressed files. You could also potentially use other databases (such as replication/HA tools). Just tell me what you need; I may be able to implement it if it isn't too complicated.
This is at the same time a good and a terrible idea: a good one since it is possibly the most "real-world" use of Sanakirja currently deployed, and a terrible one because at the time of starting Pijul, GATs hadn't even started in unstable Rust. Therefore, I decided to do the compiler's job of monomorphising manually, using macros. But then that gave me too much freedom to express type constraints that made "repos-in-zstd" possible, and these constraints can't be expressed in Rust's type system without the manual monomorphisation. So, Pijul's way is also possibly the most convoluted way to use Sanakirja. |
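To make the large-value workaround described above more concrete, here is a minimal, std-only sketch. It is not Sanakirja code: `Store`, `Stored`, `PAGE_SIZE`, `INLINE_LIMIT` and `alloc_contiguous` are hypothetical names invented for illustration, and the "file" is just a vector of pages in memory. Small values stay inline in the tree, larger ones go into a run of contiguous pages that the tree entry references, and reads copy the bytes out instead of returning pointers.

```rust
use std::collections::BTreeMap;

/// Hypothetical page size and inline limit, chosen only for illustration.
const PAGE_SIZE: usize = 4096;
const INLINE_LIMIT: usize = 510;

/// A value is either stored inline in the tree or spilled into a run of
/// contiguous pages that the tree entry merely references.
#[derive(Debug, Clone, PartialEq)]
enum Stored {
    Inline(Vec<u8>),
    Overflow { first_page: usize, len: usize },
}

/// Toy in-memory "file": a flat vector of pages plus a tree of references.
#[derive(Default)]
struct Store {
    pages: Vec<[u8; PAGE_SIZE]>,
    tree: BTreeMap<u64, Stored>,
}

impl Store {
    /// Allocate `n` contiguous pages at the end of the "file" and return the
    /// index of the first one. A real allocator would also reuse free runs.
    fn alloc_contiguous(&mut self, n: usize) -> usize {
        let first = self.pages.len();
        self.pages.resize(first + n, [0u8; PAGE_SIZE]);
        first
    }

    /// Insert a value. Overwriting a key leaks its old pages in this toy.
    fn put(&mut self, key: u64, value: &[u8]) {
        let stored = if value.len() <= INLINE_LIMIT {
            Stored::Inline(value.to_vec())
        } else {
            // Round up to a whole number of pages.
            let n = (value.len() + PAGE_SIZE - 1) / PAGE_SIZE;
            let first_page = self.alloc_contiguous(n);
            for (i, chunk) in value.chunks(PAGE_SIZE).enumerate() {
                self.pages[first_page + i][..chunk.len()].copy_from_slice(chunk);
            }
            Stored::Overflow { first_page, len: value.len() }
        };
        self.tree.insert(key, stored);
    }

    /// Copy the value out rather than handing back pointers into the pages.
    fn get(&self, key: u64) -> Option<Vec<u8>> {
        match self.tree.get(&key)? {
            Stored::Inline(bytes) => Some(bytes.clone()),
            Stored::Overflow { first_page, len } => {
                let mut out = Vec::with_capacity(*len);
                let mut remaining = *len;
                let mut page = *first_page;
                while remaining > 0 {
                    let take = remaining.min(PAGE_SIZE);
                    out.extend_from_slice(&self.pages[page][..take]);
                    remaining -= take;
                    page += 1;
                }
                Some(out)
            }
        }
    }
}
```

Calling `put` with a 5000-byte value would allocate two pages here and store only `Overflow { first_page, len }` in the tree. The wasted tail of the last page is the disk-space cost that comes with values of widely different sizes.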
Looking at your code, Pijul can handle concurrent readers and writers, you don't need to drop your reader before starting a writer. The exact constraint is much more flexible than this, but a first approximation is that you need to drop the readers before committing a writer. |
The biggest concern for us right now is to get something that works. I'm not at all experienced with low-level details of databases, nor do I want to optimize things a lot right now, except we need to be able to open the DB without immediately exceeding the memory limits for background processes on iOS. We have sled as an existing backend which runs into those limits, which is one reason we are now spending more time on trying out replacements.
I think we can totally afford to copy rather than working with pointers. I guess the alternative would be an
But it reduces contention to drop the reader earlier, right? |
One of my goals was actually to ensure that the database was always at least half full before growing the file, so this could work, depending on how iOS implements mmap (Sanakirja uses multiple maps to avoid this issue).
That could be quite a bit easier, actually, since it doesn't need a new version of the
The only hard contention is on the writers mutex. The readers count is an atomic variable, so depending on the architecture it might result in some contention, but probably not significant. Avoiding concurrency inside a thread would make your code easier to reason about, though. |
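As a rough illustration of the locking scheme just described (a writers mutex plus an atomic readers count), here is a toy model using only the standard library. It is not Sanakirja's implementation: `Env`, `ReadTxn`, `WriteTxn` and `active_readers` are hypothetical names, and refusing the commit while readers exist is only one assumed way to enforce the "drop the readers before committing a writer" first approximation.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard};

/// Toy environment: a model of the locking scheme only, with no storage.
struct Env {
    writer: Mutex<()>,    // writers serialize on this mutex
    readers: AtomicUsize, // readers only touch this atomic counter
}

struct ReadTxn<'e> {
    env: &'e Env,
}

struct WriteTxn<'e> {
    env: &'e Env,
    _guard: MutexGuard<'e, ()>, // held until the transaction is dropped
}

impl Env {
    fn new() -> Self {
        Env { writer: Mutex::new(()), readers: AtomicUsize::new(0) }
    }

    /// Starting a reader never blocks: it only increments the counter.
    fn begin_read(&self) -> ReadTxn<'_> {
        self.readers.fetch_add(1, Ordering::SeqCst);
        ReadTxn { env: self }
    }

    /// Starting a writer blocks only on other writers, not on readers.
    fn begin_write(&self) -> WriteTxn<'_> {
        let guard = self.writer.lock().unwrap();
        WriteTxn { env: self, _guard: guard }
    }

    fn active_readers(&self) -> usize {
        self.readers.load(Ordering::SeqCst)
    }
}

impl Drop for ReadTxn<'_> {
    fn drop(&mut self) {
        self.env.readers.fetch_sub(1, Ordering::SeqCst);
    }
}

impl<'e> WriteTxn<'e> {
    /// Assumed policy for this sketch: hand the transaction back if any
    /// reader is still alive, so the caller can retry after dropping them.
    fn commit(self) -> Result<(), WriteTxn<'e>> {
        if self.env.readers.load(Ordering::SeqCst) == 0 {
            Ok(()) // `self` is dropped here, releasing the writers mutex
        } else {
            Err(self)
        }
    }
}
```

In this model a reader and a writer can be open at the same time; only `commit` cares whether readers are still alive, which is why dropping readers early costs little beyond an atomic decrement.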
In that case, could you give me some hints as to how to get started with that? |
Well, let me try something first ;-) I believe this can be solved during the end of my lunch break. |
Ok, I just published the allocator (Sanakirja 1.3), although I don't think it will be very useful for now. It seems I'll have to think a bit more about the best way to use it; I tried various things without success. The main issue is that if you're storing bytestrings of widely different sizes, you could end up wasting disk space. I tried to mitigate that, but didn't find a convincing way. I'll come back to it in the next few days. |
Sorry, I don't understand – what changed exactly? |
The |
So is that something I can do myself easily enough without understanding how sanakirja works in depth? |
No, but I looked at it again last night and I believe I found a way. I'll keep you posted. |
Alright, using Sanakirja 1.3.1 you can do the following:

```rust
use sanakirja::*;

fn main() {
    // Create an anonymous environment and start a mutable transaction.
    let env = Env::new_anon(409600000, 1).unwrap();
    let mut txn = Env::mut_txn_begin(&env).unwrap();
    let mut db: btree::UDb<u64, Slice> = btree::create_db_(&mut txn).unwrap();

    // Store a short value and a 5000-byte value.
    btree::put(&mut txn, &mut db, &0, &(&b"blabal bilbli"[..]).into()).unwrap();
    let v = vec![b'a'; 5000];
    btree::put(&mut txn, &mut db, &1, &(&v[..]).into()).unwrap();

    // Record the tree's page in root slot 0, then commit.
    txn.set_root(0, db.db);
    txn.commit().unwrap();

    // Reopen the tree from root slot 0 in a new transaction and read both values back.
    let mut txn = Env::mut_txn_begin(&env).unwrap();
    let mut db: btree::UDb<u64, Slice> = btree::UDb::from_page(txn.root(0).unwrap());
    println!("{:?}", btree::get(&txn, &mut db, &0, None).unwrap().unwrap().1.as_bytes(&txn));
    let bb = btree::get(&txn, &mut db, &1, None).unwrap().unwrap().1.as_bytes(&txn).unwrap();
    println!("{:?} {:?}", bb.len(), bb.iter().all(|c| *c == b'a'));
    txn.set_root(0, db.db);
    txn.commit().unwrap();
}
```

Some words of caution:
If you're unsure, you can ask me. The Rust type system isn't sufficiently expressive to provide a safe interface to Sanakirja, which means that the rules aren't that simple. |
So you mean in the example above,
? |
Yes. About So, while using a For example, if you follow this advice, and some failure happens after commit (my usual hypothesis is power failure), no memory will be leaked and no reference will be left hanging. |
Well, UB using functions that are not marked |
Actually, since Sanakirja manipulates pointers to memory not managed by Rust, nothing is really safe, and the borrow checker is unfortunately not sufficient to check everything. Other libraries suffer from this as well: for example in LMDB, mutably borrowing two different tables in LMDB is probably safe, but tables should borrow the transaction mutably to ensure the tables aren't reused after a commit. But if you were using a simpler library before (i.e. not generic in the types), writing a safe wrapper should be really easy. |
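The idea that tables should borrow the transaction mutably can be sketched with plain Rust lifetimes. This is a toy wrapper over in-memory maps, not LMDB or Sanakirja: `Txn`, `Table`, `table`, `put`, `get` and `commit` are hypothetical names used only for illustration. Because `commit` takes the transaction by value, the borrow checker rejects any use of a `Table` after the commit.

```rust
use std::collections::HashMap;

type Rows = HashMap<Vec<u8>, Vec<u8>>;

/// Hypothetical in-memory transaction standing in for a real backend.
#[derive(Default)]
struct Txn {
    tables: HashMap<String, Rows>,
}

/// A table handle that mutably borrows the transaction it came from.
struct Table<'t> {
    rows: &'t mut Rows,
}

impl Txn {
    /// Open (or create) a table. The returned handle keeps `self` mutably
    /// borrowed, so only one table can be open at a time in this sketch.
    fn table(&mut self, name: &str) -> Table<'_> {
        Table { rows: self.tables.entry(name.to_owned()).or_default() }
    }

    /// Consumes the transaction: no `Table` can outlive this call.
    fn commit(self) -> HashMap<String, Rows> {
        self.tables
    }
}

impl Table<'_> {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.rows.insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.rows.get(key).map(|v| v.as_slice())
    }
}

/// Usage: the table handle must go out of scope before `commit`.
fn example() -> HashMap<String, Rows> {
    let mut txn = Txn::default();
    {
        let mut sessions = txn.table("sessions");
        sessions.put(b"k", b"v");
    }
    // Calling `sessions.get(b"k")` after this line would not compile,
    // because `commit` moves `txn` while `sessions` would still borrow it.
    txn.commit()
}
```

The cost of this design is the one mentioned above: two tables cannot be borrowed mutably at the same time, even where that would probably be safe in the underlying library.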
In that case I don't have the confidence needed to continue with the sanakirja crypto store prototype. Long-term, we might be interested in building a higher-level safe API on top of a version of sanakirja with anything that can cause UB marked |
No problem. If you know what you want, I can give you a safe wrapper. How many tables? Do you need nested datastructures (a Db where the keys and/or values are also Db)? |
came across surrealdb this morning... |
The default store backend is sqlite now, and we will likely kill sled entirely soon enough. |
And sled has been killed, RIP. I'm closing this issue :-). |
hm, there was a(n alpha) release just last week |
heed is a high-level wrapper around LMDB. I think it'd be interesting to figure out if it has better performance, compatibility, and caching possibilities compared to sled (at the moment), and/or to allow these database backends to be swapped interchangeably.