v0.3.1 - Performance Optimizations

aiexplorations released this 10 Jan 04:26

e6f358d

What's New

Index Building Optimizations (42% faster at 1M scale)

Combined vocabulary + term_to_id in single pass
Per-document array concatenation for COO matrix construction
Rolled back set.union (was 5.6x slower at scale)
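
A minimal sketch of what the first two items could look like, not the actual vajra implementation. It assumes documents arrive as lists of tokens; the function name build_coo_triplets and its return layout are hypothetical. The vocabulary list and the term_to_id mapping are filled in the same pass, and each document contributes its own small arrays that are appended to the global COO triplets (row, col, data). The sketch uses only the standard library.

```python
from array import array
from collections import Counter


def build_coo_triplets(tokenized_docs):
    """Hypothetical helper: build vocabulary, term_to_id and COO triplets in one pass."""
    term_to_id = {}      # term -> row index
    vocabulary = []      # row index -> term, filled in the same pass as term_to_id
    rows = array("q")    # term ids (matrix rows)
    cols = array("q")    # document ids (matrix columns)
    data = array("q")    # term frequencies

    for doc_id, tokens in enumerate(tokenized_docs):
        counts = Counter(tokens)
        doc_rows = array("q")
        doc_data = array("q")
        for term, freq in counts.items():
            term_id = term_to_id.get(term)
            if term_id is None:
                # First time this term is seen: register it in both structures.
                term_id = len(vocabulary)
                term_to_id[term] = term_id
                vocabulary.append(term)
            doc_rows.append(term_id)
            doc_data.append(freq)
        # Per-document concatenation: one bulk extend per array per document.
        rows.extend(doc_rows)
        cols.extend(array("q", [doc_id]) * len(doc_rows))
        data.extend(doc_data)

    shape = (len(vocabulary), len(tokenized_docs))
    return vocabulary, term_to_id, (rows, cols, data), shape
```

If SciPy is available, the triplets can be passed to scipy.sparse.coo_matrix((data, (rows, cols)), shape=shape) to obtain a sparse term-by-document matrix.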

Benchmark Results (1M Wikipedia documents)

Engine	Build Time	Latency	QPS
vajra	17.0 min	3.40ms	294
bm25s	11.3 min	5.44ms	184

42% shorter build time than before (17.0 min vs 29.5 min previously); bm25s still builds faster at 11.3 min
1.6x faster queries than bm25s (3.40ms vs 5.44ms)
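
The headline figures follow from the numbers above. The short check below is a sketch with hypothetical helper names; it assumes QPS was derived as 1000 / latency in ms (queries issued one at a time), which is consistent with the table but is not stated in the release.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline` (lower values are better)."""
    return baseline / candidate


def qps_from_latency_ms(latency_ms):
    """Queries per second, assuming sequential queries (an assumption, see above)."""
    return 1000 / latency_ms


build_gain = round(percent_reduction(29.5, 17.0))   # build time in minutes
query_speedup = round(speedup(5.44, 3.40), 1)       # latency in ms
vajra_qps = round(qps_from_latency_ms(3.40))
bm25s_qps = round(qps_from_latency_ms(5.44))
print(build_gain, query_speedup, vajra_qps, bm25s_qps)  # 42 1.6 294 184
```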

Installation

pip install vajra-bm25==0.3.1

Full Changelog

v0.3.0...v0.3.1
