What's New
Index Building Optimizations (42% faster at 1M scale)
- Combined vocabulary + term_to_id in single pass
- Per-document array concatenation for COO matrix construction
- Rolled back set.union (was 5.6x slower at scale)
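The first two items can be illustrated with a minimal sketch. It is not vajra's actual code: the function name `build_coo_triplets`, the input format (a list of token lists), and the use of raw term counts as values are all assumptions made for illustration. It assigns term ids in the same pass that counts terms, and collects one small array per document that is concatenated once at the end.

```python
import numpy as np


def build_coo_triplets(tokenized_docs):
    """Hypothetical sketch: build vocabulary, term_to_id and COO triplets in one pass.

    Returns (vocabulary, term_to_id, rows, cols, vals, shape), where rows are
    term ids, cols are document ids and vals are raw term counts.
    """
    term_to_id = {}   # term -> row index
    vocabulary = []   # row index -> term
    row_parts, col_parts, val_parts = [], [], []

    for doc_id, tokens in enumerate(tokenized_docs):
        counts = {}
        for tok in tokens:
            tid = term_to_id.get(tok)
            if tid is None:
                # First time this term is seen: assign the next id right away,
                # so no separate vocabulary-building pass is needed.
                tid = len(vocabulary)
                term_to_id[tok] = tid
                vocabulary.append(tok)
            counts[tid] = counts.get(tid, 0) + 1

        if not counts:
            continue  # empty document contributes no entries

        n = len(counts)
        # One small array per document; these are concatenated once at the end.
        row_parts.append(np.fromiter(counts.keys(), dtype=np.int32, count=n))
        col_parts.append(np.full(n, doc_id, dtype=np.int32))
        val_parts.append(np.fromiter(counts.values(), dtype=np.float32, count=n))

    if row_parts:
        rows = np.concatenate(row_parts)
        cols = np.concatenate(col_parts)
        vals = np.concatenate(val_parts)
    else:
        rows = np.empty(0, dtype=np.int32)
        cols = np.empty(0, dtype=np.int32)
        vals = np.empty(0, dtype=np.float32)

    shape = (len(vocabulary), len(tokenized_docs))
    return vocabulary, term_to_id, rows, cols, vals, shape
```

The returned triplets can be passed to `scipy.sparse.coo_matrix((vals, (rows, cols)), shape=shape)` if SciPy is available.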
Benchmark Results (1M Wikipedia documents)
| Engine | Build Time | Latency | QPS |
|--------|------------|---------|-----|
| vajra | 17.0 min | 3.40ms | 294 |
| bm25s | 11.3 min | 5.44ms | 184 |
- Build time is 42% lower than before (17.0 min vs 29.5 min previously)
- Queries are 1.6x faster than bm25s (3.40ms vs 5.44ms)
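The headline figures can be reproduced from the raw numbers with a few lines of arithmetic. The helper names below are hypothetical, and treating QPS as the inverse of per-query latency is an assumption (it holds for sequential, single-threaded querying), though it matches the reported values.

```python
def percent_reduction(new, old):
    """Percentage by which `new` is smaller than `old`."""
    return (old - new) / old * 100


def speedup(fast, slow):
    """How many times faster `fast` is than `slow` (both are durations)."""
    return slow / fast


def qps_from_latency_ms(latency_ms):
    """Queries per second, assuming queries run one after another."""
    return 1000 / latency_ms


build_gain = percent_reduction(17.0, 29.5)   # build time now vs previously, in minutes
query_speedup = speedup(3.40, 5.44)          # vajra vs bm25s latency, in ms
vajra_qps = qps_from_latency_ms(3.40)
bm25s_qps = qps_from_latency_ms(5.44)

print(f"build time reduction: {build_gain:.0f}%")
print(f"query speedup: {query_speedup:.1f}x")
print(f"QPS: vajra {vajra_qps:.0f}, bm25s {bm25s_qps:.0f}")
```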
Installation
pip install vajra-bm25==0.3.1
Full Changelog
v0.3.0...v0.3.1