v0.2.0 — DataFusion SQL, PyArrow, pandas, Polars, DuckDB

@bintocher bintocher released this 24 Mar 23:11
· 28 commits to main since this release

What's New

DataFusion SQL (feature datafusion_support)

  • Register QVD files as tables and query with SQL
  • JOIN multiple QVD files
  • QvdTableProvider implements the DataFusion TableProvider trait

PyArrow / pandas / Polars (Python)

  • table.to_arrow() — zero-copy QVD -> PyArrow RecordBatch via Arrow C Data Interface
  • table.to_pandas() — QVD -> pandas DataFrame
  • table.to_polars() — QVD -> Polars DataFrame
  • QvdTable.from_arrow(batch) — PyArrow -> QVD (round-trip)
  • read_qvd_to_arrow(), read_qvd_to_pandas(), read_qvd_to_polars() — convenience functions
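
As a rough illustration of the convenience functions listed above, the sketch below loads one file into all three representations. The function names come from these notes; the import name `qvd`, the single path argument, and the helper `qvd_to_frames` are assumptions made for the example and may not match the real package.

```python
def qvd_to_frames(path, qvd_module=None):
    """Load one QVD file as Arrow, pandas and Polars objects.

    `qvd_module` is a hypothetical injection point so the helper can be
    exercised without the real package installed.
    """
    if qvd_module is None:
        # Assumption: the Python bindings are importable as `qvd`.
        import qvd as qvd_module

    # Assumption: each convenience function takes the file path as its
    # only required argument.
    return {
        "arrow": qvd_module.read_qvd_to_arrow(path),
        "pandas": qvd_module.read_qvd_to_pandas(path),
        "polars": qvd_module.read_qvd_to_polars(path),
    }
```

If the assumptions hold, `qvd_to_frames("sales.qvd")["pandas"]` would give a pandas DataFrame for a hypothetical `sales.qvd` file.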

DuckDB Integration

  • Use QVD data in DuckDB via Arrow bridge (Rust and Python)

Bug Fixes

  • Fix critical XML parser bug: table-level Offset/Length was incorrectly read from field-level tags, causing scrambled data in Arrow/Parquet round-trips
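
The class of bug described above can be reproduced in a few lines. This is a minimal sketch in Python, not the library's actual parser: the header is a simplified, made-up QVD-style XML fragment in which both each field and the table itself carry `Offset` and `Length` tags, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up header: real QVD headers contain many more tags.
HEADER = """<QvdTableHeader>
  <TableName>demo</TableName>
  <Fields>
    <QvdFieldHeader>
      <FieldName>id</FieldName>
      <Offset>0</Offset>
      <Length>40</Length>
    </QvdFieldHeader>
    <QvdFieldHeader>
      <FieldName>name</FieldName>
      <Offset>40</Offset>
      <Length>25</Length>
    </QvdFieldHeader>
  </Fields>
  <NoOfRecords>10</NoOfRecords>
  <Offset>65</Offset>
  <Length>20</Length>
</QvdTableHeader>"""

root = ET.fromstring(HEADER)


def table_extent_buggy(root):
    # ".//" searches all descendants in document order, so the first
    # match is the Offset/Length of the first *field*, not the table.
    return int(root.find(".//Offset").text), int(root.find(".//Length").text)


def table_extent_fixed(root):
    # A bare tag name only matches direct children of the root element,
    # which are the table-level values.
    return int(root.findtext("Offset")), int(root.findtext("Length"))


print(table_extent_buggy(root))  # (0, 40)
print(table_extent_fixed(root))  # (65, 20)
```

With the wrong table-level offset and length, a reader would decode the row index from the wrong byte range, which is consistent with the scrambled data mentioned in the fix.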

Breaking Changes

  • PyO3 upgraded from 0.23 to 0.28

Other

  • New feature flag: datafusion_support
  • Python feature now enables arrow/pyarrow and arrow/ffi
  • 77 Python integration tests (Arrow, pandas, Polars, DuckDB, ExistsIndex, Parquet, data types)