A high-performance, file-system based key-value database for Python
FastKV is a pure-Python, high-performance key-value database designed for durability, speed, and simplicity. It implements a log-structured merge-tree (LSM-tree) architecture similar to RocksDB/LevelDB, optimized for SSD storage with asynchronous compaction, bloom filters, and configurable durability modes.
- High Performance: Optimized for write-heavy workloads with sequential I/O patterns
- ACID Compliant: Write-ahead logging (WAL) ensures crash recovery
- Multiple Durability Modes: Choose between speed and safety
- Asynchronous Operations: Built-in async API for non-blocking I/O
- Efficient Storage: SSTable-based storage with Bloom filters and compression support
- Memory Efficient: Configurable memtable sizes and background compaction
- Thread-Safe: Designed for concurrent access
- Zero Dependencies: Pure Python implementation (optional msgpack for better performance)
- Command Line Interface: Built-in CLI for database operations
pip install fastkvfrom fastkv import FastKV
# Open a database
db = FastKV("./my_database")
# Store data
db.put("user:1001", {"name": "Alice", "age": 30, "email": "alice@example.com"})
db.put("user:1002", {"name": "Bob", "age": 25})
db.put("config:theme", "dark")
db.put("counter:visits", 42)
# Retrieve data
user = db.get("user:1001")
print(f"User: {user}") # {"name": "Alice", "age": 30, "email": "alice@example.com"}
# Scan with prefix
users = db.scan("user:")
for key, value in users:
print(f"{key}: {value}")
# Batch operations
db.batch_put([
("order:001", {"item": "Book", "price": 29.99}),
("order:002", {"item": "Pen", "price": 1.99})
])
# Delete keys
db.delete("config:theme")
# Get statistics
stats = db.stats()
print(f"Total keys: {stats['total_keys']}")
print(f"Memtable size: {stats['memtable_size']} bytes")
# Close the database
db.close()import asyncio
from fastkv import AsyncFastKV
async def main():
async with AsyncFastKV("./my_async_db") as db:
# All operations are asynchronous
await db.put("async_key", "async_value")
value = await db.get("async_key")
print(f"Got: {value}")
# Batch operations
await db.batch_put([("a", 1), ("b", 2), ("c", 3)])
# Scan
results = await db.scan(prefix="", limit=10)
for key, val in results:
print(f"{key}: {val}")
asyncio.run(main())FastKV provides a comprehensive command-line interface accessible via python -m fastkv. This allows you to run tests, benchmarks, and use an interactive shell without writing code.
# Run tests
python -m fastkv test
# Run performance benchmarks
python -m fastkv benchmark
# Start interactive shell
python -m fastkv shell --path ./my_database
# Run benchmark with custom database path
python -m fastkv benchmark --path ./benchmark_data
# Start shell with specific database location
python -m fastkv shell --path /var/lib/fastkv/app_dataThe interactive shell provides a REPL (Read-Eval-Print Loop) interface for database operations:
python -m fastkv shell --path ./my_database| Command | Syntax | Description | Example |
|---|---|---|---|
| put | put <key> <value> |
Store a key-value pair | put user:1001 '{"name": "Alice", "age": 30}' |
| get | get <key> |
Retrieve value by key | get user:1001 |
| delete | delete <key> |
Remove a key | delete user:1001 |
| scan | scan [prefix] [limit] |
Scan keys with prefix | scan user: 10 |
| stats | stats |
Show database statistics | stats |
| bulk | bulk <filename.json> |
Bulk load from JSON file | bulk data.json |
| exit | exit or quit |
Exit the shell | exit |
# Start the shell
$ python -m fastkv shell --path ./testdb
Database opened at ./testdb
fastkv>
# Store data
fastkv> put config:app_name "MyApp"
OK
fastkv> put user:1001 '{"name": "Alice", "active": true}'
OK
# Retrieve data
fastkv> get user:1001
{
"name": "Alice",
"active": true
}
# Scan with prefix
fastkv> scan user:
user:1001: {"name": "Alice", "active": true}
Total: 1 items
# Scan with limit
fastkv> scan "" 5
config:app_name: "MyApp"
user:1001: {"name": "Alice", "active": true}
Total: 2 items
# View statistics
fastkv> stats
{
"memtable_size": 2048,
"memtable_keys": 2,
"immutable_memtables": 0,
"immutable_keys": 0,
"total_sstables": 0,
"total_keys": 2,
"sstable_stats": {},
"seq_num": 2
}
# Bulk load from JSON file
fastkv> bulk data.json
Loaded 1000 items
# Exit shell
fastkv> exit
Goodbye!
Database closedCreate a JSON file for bulk loading:
[
["key1", "value1"],
["key2", {"nested": "data"}],
["key3", [1, 2, 3]],
["user:1001", {"name": "Alice", "age": 30}],
["user:1002", {"name": "Bob", "age": 25}]
]Then load it:
python -m fastkv shell --path ./mydb
fastkv> bulk data.json
Loaded 5 itemsThe benchmark mode tests the database performance:
$ python -m fastkv benchmark
Running benchmark...Run the built-in test suite:
$ python -m fastkv test
Running FastKV tests...
✓ Test 1 passed: Basic operations
✓ Test 2 passed: Crash recovery
✓ Test 3 passed: Async operations
All tests passed! ✅from fastkv import FastKV, DurabilityMode
# Custom configuration
db = FastKV(
db_path="./my_data",
durability=DurabilityMode.SYNC, # SYNC, BACKGROUND, or NONE
max_memtable_size=128 * 1024 * 1024 # 128MB memtable
)DurabilityMode.NONE: Maximum performance, data may be lost on crashDurabilityMode.BACKGROUND(default): Good balance, async fsyncDurabilityMode.SYNC: Maximum durability, sync before return
from fastkv import ValueEncoding
# Different serialization formats (default: JSON)
db.put("key", data, encoding=ValueEncoding.JSON)
db.put("key", data, encoding=ValueEncoding.MSGPACK) # Requires msgpack
db.put("key", data, encoding=ValueEncoding.PICKLE)For initial data import, use bulk loading for better performance:
# Generate sample data
items = [(f"item:{i}", {"id": i, "data": "x" * 100}) for i in range(100000)]
# Bulk load (bypasses WAL for speed)
db.bulk_load(items)Compaction runs automatically in the background, but you can monitor it:
# The database automatically schedules compaction
# when certain thresholds are reached
stats = db.stats()
print(stats['sstable_stats']) # View SSTable distributionimport pickle
class CustomObject:
def __init__(self, data):
self.data = data
obj = CustomObject("test")
# Store custom objects
db.put("custom", obj, encoding=ValueEncoding.PICKLE)
# Retrieve
retrieved = db.get("custom")
print(type(retrieved)) # <class '__main__.CustomObject'>FastKV implements an LSM-tree storage engine with these components:
- Ensures durability and crash recovery
- Segmented files with rotation
- Configurable sync modes
- In-memory sorted key-value store
- Automatically flushed to disk when full
- Thread-safe with bisect-based ordering
- Immutable sorted files on disk
- Block-based storage with Bloom filters
- Multi-level compaction strategy
- Background merging of SSTables
- Level-based compaction policy
- Configurable parallelism
- Use appropriate durability:
BACKGROUNDmode offers good balance for most use cases - Batch operations: Use
batch_put()for multiple writes - Bulk load initial data: Use
bulk_load()for initial imports - Monitor memory usage: Adjust
max_memtable_sizebased on available RAM - Use msgpack: Install msgpack for faster serialization
- Use CLI for quick operations: The shell is perfect for debugging and administration
class FastKV:
def __init__(self, db_path: Union[str, Path],
durability: DurabilityMode = DurabilityMode.BACKGROUND,
max_memtable_size: int = 64 * 1024 * 1024)
def put(self, key: str, value: Any,
encoding: ValueEncoding = ValueEncoding.JSON) -> None
def get(self, key: str) -> Optional[Any]
def delete(self, key: str) -> None
def batch_put(self, items: List[Tuple[str, Any]]) -> None
def scan(self, prefix: Optional[str] = None,
limit: Optional[int] = None) -> List[Tuple[str, Any]]
def stats(self) -> Dict[str, Any]
def bulk_load(self, items: List[Tuple[str, Any]]) -> None
def close(self) -> Noneclass AsyncFastKV:
async def open(self) -> None
async def put(self, key: str, value: Any) -> None
async def get(self, key: str) -> Optional[Any]
async def delete(self, key: str) -> None
async def batch_put(self, items: List[Tuple[str, Any]]) -> None
async def scan(self, prefix: Optional[str] = None,
limit: Optional[int] = None) -> List[Tuple[str, Any]]
async def stats(self) -> Dict[str, Any]
async def bulk_load(self, items: List[Tuple[str, Any]]) -> None
async def close(self) -> None# Clone the repository
git clone https://github.com/arifchy369/FastKV.git
cd fastkv
# Install in development mode
pip install -e .
# Run tests
python -m fastkv test
# Run benchmarks
python -m fastkv benchmark
# Start interactive shell
python -m fastkv shellContributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add some amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
MIT License - see LICENSE file for details.
- Issues: GitHub Issues
- Author: Arif Chowdhury (@arifchy369)
- Snapshot and backup functionality
- Transaction support
- Replication and clustering
- More compression algorithms (LZ4, Zstd)
- Query language support
- TTL (time-to-live) for keys
- Windows performance optimizations
FastKV - Fast, durable key-value storage for Python applications.