v0.2.2
Performance
- Stats-based row-group pruning for top-N. Pass 1 reads each RG's max (or min, for ASC) from the Parquet footer, sorts best-first, and decodes sequentially with early break when the heap converges.
parquet10m_orderby_top10: 69 ms to 15 ms. Now beats DuckDB (23 ms). - Direct
string_temit in PhysicalWindow. Vectorised emit no longer goes throughValue::VARCHARboxing for varchar columns. 10M-row varchar emit drops from 3 s to 300 ms. - Lazy column-major Python results.
QueryResultnow holds the C result alive until garbage collected. New methods:fetchnumpy()returns a dict of numpy arrays. Numeric columns wrap the C buffer vianp.frombufferwith a defensive copy.fetchdf()builds a pandas DataFrame fromfetchnumpy(), skipping row-tuple materialisation entirely.
10M-row int conversion drops from ~4 s to ~50 ms.
Bench
13 win, 5 tie, 0 slow on the 18-query DuckDB suite (was 11 / 5 / 2 in 0.2.1).
Install
pip install --upgrade slothdb
npm install @slothdb/wasm@0.2.2
403 of 403 doctest tests passing on Windows, Linux, macOS.