v0.6.0 — performance, correctness, durability & features
SnapDB v0.6.0 is a performance, correctness, durability, and feature release. It is pure Python with zero runtime dependencies (NumPy is optional and only needed for zero-copy export). Every change was adversarially reviewed, with findings reproduced before fixing, and CI is green across Linux (Python 3.9–3.13) and Windows, plus a benchmark run.
⚡ Performance
- Delta-encoded column reads are now O(1)/O(n) via a lazy reconstruction cache (previously O(n)/O(n²)) — a 20k-row delta-column scan dropped from ~17.2 s to ~11 ms.
- Vectorized multi-condition `select_where()` combines per-column bitmasks with C-speed big-integer `AND`/`OR`, ~2× faster than a row-predicate `select()` on selective queries.
- Vectorized aggregates (array-level `sum`/`min`/`max`) for null-free numeric columns; `__slots__` on hot classes.
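The bitmask idea behind the multi-condition path can be sketched in plain Python. This is a minimal illustration under stated assumptions, not SnapDB's implementation: `OPS`, `column_mask`, `set_bits`, and `select_rows` are hypothetical names, columns are modeled as plain lists, and only six of the listed operators are covered.

```python
import operator

# Hypothetical operator table (a subset of the operators named in these notes).
OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def column_mask(values, op, target):
    """Return an int whose bit i is set when values[i] satisfies the condition."""
    test = OPS[op]
    mask = 0
    for i, value in enumerate(values):
        if value is not None and test(value, target):
            mask |= 1 << i
    return mask


def set_bits(mask):
    """Return the positions of the set bits in mask, lowest first."""
    rows = []
    while mask:
        low = mask & -mask               # isolate the lowest set bit
        rows.append(low.bit_length() - 1)
        mask ^= low                      # clear it and continue
    return rows


def select_rows(columns, conditions, combine="and"):
    """Combine one bitmask per condition with a single big-integer AND or OR."""
    if combine not in ("and", "or"):
        raise ValueError("combine must be 'and' or 'or'")
    n = len(next(iter(columns.values())))
    combined = (1 << n) - 1 if combine == "and" else 0
    for col, op, target in conditions:
        mask = column_mask(columns[col], op, target)
        combined = combined & mask if combine == "and" else combined | mask
    return set_bits(combined)


columns = {
    "age": [25, 41, 33, 58, 19],
    "city": ["Oslo", "Rome", "Oslo", "Oslo", "Rome"],
}
conditions = [("age", "gt", 30), ("city", "eq", "Oslo")]
both = select_rows(columns, conditions, combine="and")    # rows matching both
either = select_rows(columns, conditions, combine="or")   # rows matching either
```

Each `&` or `|` on a Python `int` processes many rows per machine word inside the interpreter's C code, which is the part that avoids calling a Python predicate once per row per condition.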
🛡️ Correctness & durability
- Hash indexes are kept in sync on insert / `batch_insert` / update / delete (they previously went stale after the first build); unified `create_index()` for row and columnar storage; `find()` works without a pre-built index; an emptied index is no longer mistaken for a missing one.
- Delta delete/null corruption fixed: deleting or nulling a delta-encoded row no longer shifts other rows' values.
- Transaction rollback now actually undoes writes (and restores indexes).
- Durable multi-slab persistence — the on-disk bitmap/slab geometry and the last slab's high-water mark are now persisted, so reopening a database larger than one slab no longer loses data.
- Columnar index is now `value → set` (first-match, consistent under duplicates/deletes and before/after auto-indexing); default `to_numpy()` returns an independent copy.
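To make the index fixes above concrete, here is a minimal sketch of a `value → set` hash index that is updated on every write rather than built once. The class and method names are hypothetical and do not mirror SnapDB's internals, and "first-match" is assumed here to mean the lowest matching row id.

```python
from collections import defaultdict


class IndexedColumn:
    """Toy column with a value -> set(row ids) hash index (hypothetical names)."""

    def __init__(self):
        self.rows = {}                  # row id -> value
        self.index = defaultdict(set)   # value -> set of row ids
        self._next_id = 0

    def insert(self, value):
        row_id = self._next_id
        self._next_id += 1
        self.rows[row_id] = value
        self.index[value].add(row_id)   # index updated in the same step
        return row_id

    def update(self, row_id, value):
        self._unindex(row_id)           # drop the stale entry first
        self.rows[row_id] = value
        self.index[value].add(row_id)

    def delete(self, row_id):
        self._unindex(row_id)
        del self.rows[row_id]

    def _unindex(self, row_id):
        old = self.rows[row_id]
        ids = self.index[old]
        ids.discard(row_id)
        if not ids:
            del self.index[old]         # no empty buckets left behind

    def find(self, value):
        """Return the lowest matching row id, or None (assumed first-match rule)."""
        ids = self.index.get(value)
        return min(ids) if ids else None


col = IndexedColumn()
first = col.insert("x")
second = col.insert("x")    # duplicate value shares one bucket
col.insert("y")

found_before = col.find("x")
col.delete(first)
found_after_delete = col.find("x")
col.update(second, "z")
found_after_update = col.find("x")
```

Storing a set per value is what keeps duplicates safe: deleting one of several rows with the same value removes a single id instead of the whole entry.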
✨ Features
- #4 Vectorized predicates: `select_where([(col, op, value), …], combine="and"|"or")` (eq/ne/gt/gte/lt/lte/in/between, projection, limit/offset, dict shorthand).
- #6 Auto-indexing: `SnapDB(auto_index=True, auto_index_threshold=N)`.
- #7 Zero-copy NumPy export: `to_numpy()` / `to_numpy(zero_copy=True)` / `column_buffer()` (PEP 688), NumPy optional.
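The difference between a zero-copy export and an independent copy can be shown with the standard library alone. The sketch below uses `array.array` as a stand-in for a column's backing storage, which is an assumption for illustration; it does not call SnapDB, and the NumPy part only runs if NumPy happens to be installed.

```python
from array import array

# Stand-in for a numeric column's contiguous storage (64-bit signed ints).
storage = array("q", [10, 20, 30])

view = memoryview(storage)      # zero-copy: exposes the same memory
snapshot = array("q", storage)  # independent copy: owns its own memory

storage[0] = 99                 # mutate the original after exporting

seen_by_view = view[0]          # follows the change
seen_by_copy = snapshot[0]      # unaffected by the change

try:
    import numpy as np
except ImportError:             # NumPy is optional here as well
    np = None

if np is not None:
    shared = np.frombuffer(storage, dtype=np.int64)  # shares storage's memory
    owned = shared.copy()                            # detached from storage
```

A shared view avoids copying but stays tied to the source buffer, so later writes show through it; a copy costs memory up front and is safe to keep.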
🧰 Tooling & docs
- `benchmarks/bench_suite.py`: reproducible, cross-platform benchmark with JSON + Markdown output.
- GitHub Actions CI: ruff lint, pytest matrix (Linux 3.9–3.13 + Windows), benchmark artifact.
- README refreshed with honest, reproducible numbers: SnapDB columnar is the lowest-memory engine in the comparison (~2.2 MB per 100K rows vs. 2.9 MB for SQLite, 11 MB for pandas, and 22 MB for dict) and is ~3.3× faster than in-memory SQLite on full-scan aggregation. NumPy-backed math is faster, and the README states that plainly.
🗺️ Roadmap (tracked issues)
FOR encoding (#11), sys.monitoring profiler (#12), faster row bulk-insert (#13), optional NumPy-accelerated filters/aggregates (#14).