Note: While this implementation is used by us and has not caused major problems, we do not recommend it yet production environments. Please use the leveldb or rocksdb crate for this purpose.
This implementation does not aim to reimplement LevelDB. The major differences are:
- Separation of keys and values: Values can be stored seperately to increase compaction speed as outlined in the WiscKey paper
- Concurrent compaction: Multiple threads can compact at the same time for higher write throughput
- Async-support: All API calls are exposed as async functions
- io_uring-support: For async file system access on Linux. Optional and still considered experimental.
- Bloom filters for faster lookups
Currently, the code is only tested with Linux on x86 machines, but it should run on most systems supported by the Rust compiler.
LSM stores data using zerocopy to achieve high performance.
The implementation does not account for endianness so on-disk formats are not portable.
Replication across machines should be handled at a different layer of the system.
However, we may add a converter tool in the future or an endianess
feature flag if needed.
- FLSM: Like PebblesDB LSM-rs will fragment the keyspace to reduce write amplification and increase compaction speed
- Custom sorting functions
- More modularity and configuration options
snappy-compression
: Use the snappy format to compress data on disk (enabled by default)bloom-filters
: Add bloom filters to data blocks for more efficient searching. (enabled by default)async-io
: Usetokio_uring
for I/O instead of that of the standard library. Note, that this only works recent version of the Linux kernel. (disabled by default)wisckey
: Store keys and values separately. This usually results in higher throughput with slightly higher CPU-usage. (disabled by default)
This crate exposes an async API intended to be used with Tokio or a similar runtime. Alternatively, you can use the lsm-sync crate included in this repo, which internally uses Tokio but expose a synchronous API.
You need to serialize your data in a way that its byte representation maintains the same ordering as the unserialized data. For example, you may want to use big endian encoding so that numerical values are ordered correctly.
You can create or open a new databse instance as shown below.
use lsm::{Database, Params};
// Set options here, such as the location of the database files
let params = Params {
db_path,
..Default::default()
};
// Instantiate database
let database = Database::new_with_params(SM, params)
.await
.expect("Failed to create database instance");
To write to the database use the get
call. Note that the crate only supports
writing byte vectors. (De-)serialization is supposed to happen at another layer.
let key = String::from("mykey").into_bytes();
let value = String::from("hello world").into_bytes();
database.put(key, value).await.expect("Writing to database failed");
When reading, LSM will return a reference to the data to avoid copying.
let value_ref = database.get(&key).await.expect("Reading failed");
// Returns a slice to the data
let data: &[u8] = value_ref.get_value();
// Assuming the put from above workd, this will print "hello world"
println!("{}", std::str::from_utf(data).unwrap());
Please refer to the tests for more examples to how to use the crate.
This library ships with several tests. We provide a justfile for convenience:
just test #runs all tests for all configurations
just lint #runs cargo clippy
Currently, the io-uring feature relies on tokio-uring-executor, a simplistic multi-threaded wrapper around tokio-uring
.
Eventually tokio-uring
will support multiple threads natively and this workaround will be removed.
I would also like to add support for more mature io_uring runtimes such as gloomio but only have limited time to work on this crate. Help is very welcome.
This is an incomplete list of crates that provide similar functionality. Please reach out if you know of others to add.
- rust-rocksdb: Rust bindings for RocksDB
- leveldb: Rust bindings for LevelDB
- wickdb: Rust re-implementation of vanilla LevelDB
- agatedb: A WiscKey implementation in Rust for TiKV
These differ significantly in their approach but also provide a key-value store abstraction