Rust Polars 0.38.2
🏆 Highlights
- Streaming outer joins (#14828)
🚀 Performance improvements
- Ensure parallel encoding/compression in `sink_parquet` (#14964)
- hoist errors out of iterators in parquet (#14945)
- add basic AVX-512 filters (#14892)
- improve join-asof materialization (#14884)
- Optimize chunked-id gather for binaryviews (#14878)
- rework scalar filter kernels (#14865)
- Reduce size of optional join-indexes (#14856)
- Streaming outer joins (#14828)
- Set sorted flag for `cum_count` on columns (#14849)
✨ Enhancements
- Support writing `Array` type in parquet (#14943)
- Sort decimal fields (#14649)
- Import `NamedFrom` in `df!` macro (#14860)
- try-improve concurrency tuner (#14827)
- Streaming outer joins (#14828)
- Set sorted flag for `cum_count` on columns (#14849)
- Ensure binview types are rle-encoded in parquet write (#14818)
- Implement strict/nonstrict conversion for primitive AnyValues (#14186)
- Disable timeouts (#14809)
- cleanup spill disks in process (#14807)
🐞 Bug fixes
- Fix invalid partitionable query (#14966)
- allow nonstrict cast of categorical/enum to enum (#14910)
- `count_rows` multi-threaded under-counting in parser.rs (#14963)
- raise proper error instead of panicking when result of truncation is non-existent datetime (#14958)
- ooc-sort issues (#14959)
- Do not raise when constructing from a list of Series with Nones (#14942)
- Don't access out-of-bounds for null indices in bitmap gather (#14932)
- std when ddof>=n_values returns None even in rolling context (#11750)
- Don't rechunk categoricals when moving to physical (#14934)
- parquet rle boolean decoder (#14931)
- boolean filter gave overly large buffers to Bitmap::from_u8_vec (#14924)
- Fix sliced dictionary state in parquet (#14917)
- Fix possibly incorrect order of columns when using ipc stream `with_columns` (#14859)
- Fully qualify `polars_bail!` in `polars_ensure!` (#14901)
- Fix `DataFrame.min`/`max` for decimals (#14890)
- Assert chunks are equal after physical cast to prevent OOB (#14873)
- not all cpu feature flag tests were mocked (#14864)
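
The `std` fix above (#11750) concerns what happens when `ddof` is at least the number of values. The following is a minimal standalone sketch of that rule in plain Rust. It is not the Polars implementation: the function names `std_with_ddof` and `rolling_std` are hypothetical, and the rolling helper only illustrates that the same rule can apply per window.

```rust
/// Standard deviation with a configurable delta degrees of freedom (ddof).
/// Returns `None` when `ddof >= n`, because the divisor `n - ddof`
/// would be zero or negative. (Hypothetical helper, not a Polars API.)
fn std_with_ddof(values: &[f64], ddof: usize) -> Option<f64> {
    let n = values.len();
    if ddof >= n {
        return None;
    }
    let mean = values.iter().sum::<f64>() / n as f64;
    let sum_sq: f64 = values.iter().map(|v| (v - mean).powi(2)).sum();
    Some((sum_sq / (n - ddof) as f64).sqrt())
}

/// Applies the same rule to every full window of length `window`.
/// (Hypothetical helper; a zero-length window yields an empty result.)
fn rolling_std(values: &[f64], window: usize, ddof: usize) -> Vec<Option<f64>> {
    if window == 0 {
        return Vec::new();
    }
    values
        .windows(window)
        .map(|w| std_with_ddof(w, ddof))
        .collect()
}

fn main() {
    let data = [1.0, 2.0, 4.0, 7.0];
    // Enough values for ddof = 1: a number is returned.
    println!("{:?}", std_with_ddof(&data, 1));
    // Windows of length 1 with ddof = 1: every window yields None.
    println!("{:?}", rolling_std(&data, 1, 1));
}
```

Returning `Option<f64>` rather than `NaN` or an error keeps the "no defined result" case explicit for callers, which mirrors the behaviour described in the changelog entry.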
📖 Documentation
- Remove some repetition in comments/docstrings (#14912)
- Update contributing link (#14882)
- Fix some word-repetition in code comments (#14825)
- Separate `asof` from join strategy, change parameter from `strategy` to `how` in user guide (#14793)
🛠️ Other improvements
- fix features (#14977)
- fix chrono deprecation warnings (#14928)
- Update Cargo.lock and remove cmake limit workaround (#14905)
- Simplify streaming placeholder replacement. (#14915)
- Optional deps should include `fastexcel` (#14907)
- Deduplicate `POLARS_FORCE_ASYNC` env var parsing (#14909)
- Make assumption about column name to index conversion having occurred explicit (#14894)
- Make assumption about wildcards having been resolved explicit (#14899)
- reactivate argminmax simd (#14679)
- sort by 'idx' after outer join (#14867)
- Simplify computation of `with_columns` attribute in physical csv scanner of default engine. (#14837)
- centrally define IdxSize (#14854)
- run and fix pext64_polyfill test (#14852)
- introduce partitioned table (#14819)
- add missing deprecation directive in groupby.count (#14817)
- Extract key value construction (#14812)
- Fix Makefile build commands (#14806)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @Sol-Hee, @alexander-beedie, @ambidextrous, @battmdpkq, @deanm0000, @dependabot, @dependabot[bot], @eitsupi, @flisky, @geekvest, @mcrumiller, @mickvangelderen, @nameexhaustion, @orlp, @petrosbar, @ritchie46 and @stinodego