Releases: delta-io/delta-rs
python-v0.6.3
What's Changed
- Added without_files flag to DeltaTable constructor by @MykhailoHevak in #866
- chore: update object-store dependency by @wjones127 in #884
- build(deps): bump serde from 1.0.145 to 1.0.147 by @dependabot in #895
- build(deps): bump anyhow from 1.0.65 to 1.0.66 by @dependabot in #897
- build(deps): bump serde_json from 1.0.86 to 1.0.87 by @dependabot in #896
- build(deps): bump futures from 0.3.24 to 0.3.25 by @dependabot in #899
- build(deps): bump async-trait from 0.1.57 to 0.1.58 by @dependabot in #898
- build(deps): bump once_cell from 1.15.0 to 1.16.0 by @dependabot in #907
- build(deps): bump libc from 0.2.135 to 0.2.137 by @dependabot in #905
- build(deps): bump lambda_runtime from 0.6.1 to 0.7.0 by @dependabot in #903
- Fix parsing struct stats after schema evolution by @Tom-Newton in #901
- fix: pass storage options down when getting delta table by @wjones127 in #893
- Fix cargo clippy issues 0.1.65 in Rust by @fvaleye in #923
- Bump version of the Python binding to 0.6.3 by @fvaleye in #924
New Contributors
- @MykhailoHevak made their first contribution in #866
Full Changelog: python-v0.6.2...python-v0.6.3
python-v0.6.2
What's Changed
- build(deps): bump url from 2.2.2 to 2.3.0 by @dependabot in #800
- build(deps): bump lambda_runtime from 0.6.0 to 0.6.1 by @dependabot in #801
- build(deps): bump parquet2 from 0.16.2 to 0.16.3 by @dependabot in #802
- build(deps): bump criterion from 0.3.6 to 0.4.0 by @dependabot in #803
- build(deps): bump percent-encoding from 2.1.0 to 2.2.0 by @dependabot in #804
- build(deps): bump url from 2.3.0 to 2.3.1 by @dependabot in #805
- feat: integrate object_store for read/write with pyarrow by @roeap in #799
- cleanup errors and make DeltaWriterError internal by @roeap in #784
- Refactoring of the Github Action python release by @fvaleye in #810
- Re-allow writing to non-existent local paths by @wjones127 in #811
- [Python][Docs] Add description to landing page by @wjones127 in #817
- [Python] Fix handling of null stats in write_deltalake by @wjones127 in #815
- feat(python): Set smaller defaults on row group and file size by @wjones127 in #818
- chore: bump arrow and friends by @roeap in #814
- feat: improve storage configuration by @roeap in #822
- Datafusion-imports by @roeap in #823
- build(deps): bump tokio-stream from 0.1.9 to 0.1.10 by @dependabot in #824
- build(deps): bump anyhow from 1.0.64 to 1.0.65 by @dependabot in #825
- build(deps): bump once_cell from 1.13.1 to 1.14.0 by @dependabot in #826
- build(deps): bump thiserror from 1.0.34 to 1.0.35 by @dependabot in #827
- build(deps): bump env_logger from 0.7.1 to 0.9.1 by @dependabot in #829
- build(deps): bump tokio from 1.21.0 to 1.21.1 by @dependabot in #828
- Fix warnings by @wjones127 in #839
- comment cleanup by @houqp in #840
- add codeowners by @houqp in #841
- build(deps): bump thiserror from 1.0.35 to 1.0.36 by @dependabot in #843
- build(deps): bump libc from 0.2.132 to 0.2.133 by @dependabot in #844
- build(deps): bump serde from 1.0.144 to 1.0.145 by @dependabot in #846
- build(deps): bump reqwest from 0.11.11 to 0.11.12 by @dependabot in #847
- build(deps): bump once_cell from 1.14.0 to 1.15.0 by @dependabot in #849
- use rustls for delta checkpoint lambda by @houqp in #842
- Add invariant enforcement support by @wjones127 in #834
- Add contributing page with roadmap and good first issues link by @wjones127 in #853
- build(deps): bump tokio from 1.21.1 to 1.21.2 by @dependabot in #856
- build(deps): bump libc from 0.2.133 to 0.2.134 by @dependabot in #857
- build(deps): bump openssl from 0.10.41 to 0.10.42 by @dependabot in #858
- build(deps): bump thiserror from 1.0.36 to 1.0.37 by @dependabot in #859
- build(deps): bump uuid from 1.1.2 to 1.2.1 by @dependabot in #872
- build(deps): bump serde_json from 1.0.85 to 1.0.86 by @dependabot in #874
- build(deps): bump pyo3 from 0.17.1 to 0.17.2 by @dependabot in #873
- Bump Python binding version to 0.6.2 by @fvaleye in #876
- Fix the Python Release Github Action by @fvaleye in #877
Full Changelog: python-v0.6.1...python-v0.6.2
python-v0.6.1
What's Changed
- feat: add gcs integration tests by @roeap in #779
- build(deps): bump lz4-sys from 1.9.2 to 1.9.4 in /aws/delta-checkpoint by @dependabot in #782
- build(deps): bump lz4-sys from 1.9.2 to 1.9.4 in /delta-inspect by @dependabot in #783
- build(deps): bump tokio from 1.20.1 to 1.21.0 by @dependabot in #790
- build(deps): bump thiserror from 1.0.32 to 1.0.34 by @dependabot in #792
- build(deps): bump pretty_assertions from 1.2.1 to 1.3.0 by @dependabot in #791
- build(deps): bump anyhow from 1.0.62 to 1.0.64 by @dependabot in #793
- build(deps): bump env_logger from 0.7.1 to 0.9.0 by @dependabot in #794
- hotfix: python object store paths by @roeap in #787
- prepare python release 0.6.1 by @roeap in #795
Full Changelog: python-v0.6.0...python-v0.6.1
python-v0.6.0
What's Changed
- JSON Writer writing partitions values fix by @Blajda in #658
- Support date32 and decimal stats in write_deltalake by @wjones127 in #659
- bugfix: Make sure vacuum works on relative paths by @wjones127 in #664
- Fix linting / build on main by @mrk-its in #670
- feat: add support for HTTPS_PROXY env var by @xfrancois in #665
- Utilise struct stats when available by @Tom-Newton in #656
- fix: inconsistent path in azure list by @roeap in #673
- Factor vacuum and implement a builder by @Blajda in #672
- Bump openssl-src from 111.21.0+1.1.1p to 111.22.0+1.1.1q by @dependabot in #674
- Bump openssl-src from 111.20.0+1.1.1o to 111.22.0+1.1.1q in /aws/delta-checkpoint by @dependabot in #675
- fix: get docs building and add to CI checks by @wjones127 in #679
- fix: omit common prefixes in azure list_objs by @roeap in #683
- fix: traverse directories in local list_objs by @roeap in #681
- Implement vacuum tests and general test setup utils by @Blajda in #682
- Setup Github dependabot for Rust by @fvaleye in #687
- Bump regex from 1.5.6 to 1.6.0 by @dependabot in #694
- Bump serde_json from 1.0.81 to 1.0.82 by @dependabot in #692
- Bump serial_test from 0.7.0 to 0.8.0 by @dependabot in #699
- Bump hyper from 0.14.19 to 0.14.20 by @dependabot in #702
- feat: sharable reference to storage backend by @roeap in #697
- Bump openssl from 0.10.40 to 0.10.41 by @dependabot in #700
- Get file size from Pyarrow directly (>= 9.0.0) by @Bernolt in #704
- Bump bytes from 1.1.0 to 1.2.0 by @dependabot in #707
- Bump serde from 1.0.137 to 1.0.140 by @dependabot in #708
- Bump crossbeam from 0.8.1 to 0.8.2 by @dependabot in #709
- Bump tokio from 1.19.2 to 1.20.0 by @dependabot in #710
- Fix usage documentation in Python binding by @fvaleye in #716
- Bump bytes from 1.2.0 to 1.2.1 by @dependabot in #719
- Bump tokio from 1.20.0 to 1.20.1 by @dependabot in #720
- Bump lambda_runtime from 0.3.0 to 0.6.0 by @dependabot in #711
- feat: integrate with object_store / datafusion APIs by @roeap in #703
- Bump async-trait from 0.1.56 to 0.1.57 by @dependabot in #730
- Bump serde from 1.0.140 to 1.0.142 by @dependabot in #726
- Bump anyhow from 1.0.58 to 1.0.60 by @dependabot in #727
- Bump libc from 0.2.126 to 0.2.127 by @dependabot in #728
- Bump serde_json from 1.0.82 to 1.0.83 by @dependabot in #731
- Bump thiserror from 1.0.31 to 1.0.32 by @dependabot in #732
- Prune scanned files on column stats by @roeap in #724
- Fix parsing null counts for struct type columns in the struct stats by @Tom-Newton in #714
- Python: fix: fix minimal test and bump minimum pyarrow version by @wjones127 in #733
- Implement Python Schema in Rust by @wjones127 in #684
- fix: Address clippy lint warnings. by @tsh56 in #742
- Bump serde from 1.0.142 to 1.0.143 by @dependabot in #737
- Bump libc from 0.2.127 to 0.2.132 by @dependabot in #743
- build(deps): bump anyhow from 1.0.60 to 1.0.62 by @dependabot in #744
- build(deps): bump serial_test from 0.8.0 to 0.9.0 by @dependabot in #745
- build(deps): bump chrono from 0.4.20 to 0.4.22 by @dependabot in #748
- build(deps): bump futures from 0.3.21 to 0.3.23 by @dependabot in #747
- Cast min and max too when parsing stats by @wjones127 in #753
- build(deps): bump serde_json from 1.0.83 to 1.0.85 by @dependabot in #759
- build(deps): bump serde from 1.0.143 to 1.0.144 by @dependabot in #760
- turn table state version into a private field by @houqp in #772
- build(deps): bump pyo3 from 0.16.5 to 0.16.6 by @dependabot in #773
- Remove Python version 3.6 support And run multiple python versions by @fvaleye in #770
- parquet2 implementation backed by parquet2 feature gate by @houqp in #465
- Adopt ObjectStore by @roeap in #761
- chore: cleanup by @roeap in #774
- Bump version of the Python binding to 0.6.0 by @fvaleye in #762
New Contributors
- @mrk-its made their first contribution in #670
- @xfrancois made their first contribution in #665
- @Bernolt made their first contribution in #704
- @tsh56 made their first contribution in #742
Full Changelog: python-v0.5.8...python-v0.6.0
python-v0.5.8
What's Changed
- Expose read and write options in public API by @george-zubrienko in #581
- [proof] make sure lock at least expires once by @houqp in #591
- Python API - delta.appendOnly enforcement by @WarSame in #590
- Avoid building pandas and numpy from source by @wjones127 in #595
- Introduce require_files for tracking the add files in table state by @mosyp in #594
- Make sure pandas is optional by @wjones127 in #597
- High level Delta Operations with Datafusion by @roeap in #584
- Re-enable datafusion tests and improve supported types. by @roeap in #601
- default to root for empty path in azure store by @roeap in #603
- publish dynamodb_lock to crates.io by @houqp in #605
- Configure Azure storage using a map (#555) by @Blajda in #598
- Azure options by @roeap in #606
- Update rusoto dependencies to 0.48 by @ahmedriza in #611
- upgrade to datafusion 8 by @houqp in #612
- fix: cap sphinx version to avoid bug in 5.0 by @wjones127 in #615
- Provide Python aarch64 wheels for Linux. by @fvaleye in #613
- Refactoring of the Python release Github action by @fvaleye in #616
- fix: Use relative paths for add paths by @wjones127 in #618
- Bin packing optimization by @Blajda in #607
- feat: impl rename_noreplace with std::fs::hard_link by default by @wjones127 in #621
- feat(python): validate schema in write_deltalake by @wjones127 in #624
- Fix the AWS_REGION environment variable configuration in S3 backend by @fvaleye in #633
- Refactor azure storage with crate updates by @roeap in #644
- Defer creation of storage backend in DeltaTableBuilder by @Blajda in #639
- fix: Add correct size and null partition values to add actions by @wjones127 in #625
- Bump flatbuffers from 0.8.4 to 2.1.2 in /aws/delta-checkpoint by @dependabot in #626
- Bump hyper from 0.14.9 to 0.14.19 in /aws/delta-checkpoint by @dependabot in #628
- Bump regex from 1.5.4 to 1.5.5 in /aws/delta-checkpoint by @dependabot in #629
- Bump regex from 1.5.4 to 1.5.6 in /delta-inspect by @dependabot in #630
- Bump thread_local from 1.1.3 to 1.1.4 in /aws/delta-checkpoint by @dependabot in #646
- fix: Prevent warning spam when reading tables generated by delta 1.2.1 by @Tom-Newton in #651
- refactor: move version field to DeltaTableState by @roeap in #649
- feat: add enforce_retention_duration param to vacuum method by @houqp in #648
- fix: read vacuumed delta log without _last_checkpoint by @roeap in #643
- feat: Upgrade to arrow/parquet 15 and datafusion 9 by @xianwill in #652
- Release of the Python binding version 0.5.8 by @fvaleye in #640
New Contributors
- @george-zubrienko made their first contribution in #581
- @WarSame made their first contribution in #590
- @dependabot made their first contribution in #626
- @Tom-Newton made their first contribution in #651
Full Changelog: python-v0.5.7...python-v0.5.8
python-v0.5.7
What's Changed
- Upgrade DataFusion, Arrow, Parquet dependencies by @Dandandan in #562
- fix clippy warnings by @houqp in #567
- Azure improvements by @thovoll in #556
- Update ADLSGen2-HOWTO.md by @dgcaron in #560
- Parse partition values before handing to PyArrow by @wjones127 in #565
- [Python] Test in minimal and latest Python environments by @wjones127 in #572
- [Python] Initial PyArrow writer by @wjones127 in #566
- Upgrade arrow, parquet, datafusion version by @zemelLeong in #583
- Record Batch Writer by @roeap in #573
- Replace table location prefix from s3a to s3 by @novakov-alexey in #585
- Allow metadata for write_deltalake by @PadenZach in #587
- Change private time_utils module to public. by @xianwill in #586
- Release of the Python binding version 0.5.7 by @fvaleye in #589
New Contributors
- @dgcaron made their first contribution in #560
- @zemelLeong made their first contribution in #583
- @novakov-alexey made their first contribution in #585
- @PadenZach made their first contribution in #587
Full Changelog: python-v0.5.6...python-v0.5.7
python-v0.5.6
- Bump version of Python binding to 0.5.6 (#558)
- Move delta-inspect to its own crate (#557)
- Fix VACUUM by using table_uri when filtering files to delete (#551)
- Formally verify S3 atomic rename (#540)
- Implement missing Azure storage backend methods (#499)
- Implement polling for table updates (#550)
- Add target in Python release Github action workflow. (#548)
Credits:
QP Hou, Thomas Vollmer, David Blajda, Florian Valeye
Full Changelog: python-v0.5.5...python-v0.5.6
python-v0.5.5
- Add storage options for backends (#544)
- Remove coupling of DynamoDbLockClient from S3 storage (#535)
- add macOS 11 support in python binding release (#541)
- Refresh Python usage documentation (#539)
- [Python] Create PyArrow dataset fragments from delta log (#525)
- Fix Delta metadata transaction schema (#531)
- Add gcs test and improve credential error (#533)
- Return complete history (#526)
- Move dynamodb lock into its own crate (#508)
- Add datafusion examples to docs (#519)
- Fix S3 list_objs and cleanup_metadata (#518)
- Add support for creating List and Map schema types (#517)
- Update datafusion version to 6 (#516)
- Retry S3 get request on 500 Internal Server Error (#510)
- Fix memory overhead when creating checkpoint (#502)
- Fix nullable partition values (#498)
- Fix cleanup_expired_logs timestamp (#503)
- Add bool config enableExpiredLogCleanup. (#500)
- pin arrow to major version (#501)
Credits:
Florian Valeye, ahmedriza, Will Jones, Liang-Chi Hsieh, Gabriel J. Michael, Matthew Turner, Mykhailo Osypov, Andrei Ionescu, QP Hou
Full Changelog: python-v0.5.4...python-v0.5.5
python-v0.5.4
- Clean up expired delta table commit logs after checkpoint (#484)
- Add authorization options for azure storage backend (#486)
- Bump arrow to 6.1.0 (#494)
- Add DeltaTableError in Python binding. Add markers for integration tests with pytest. (#496)
- Change Rust edition from 2018 to 2021 (#490)
- Add docs for ADLS Gen2. (#492)
- Add gt, gte, lt and lte partition filters. (#478)
- Fix python build (#487)
- Try to fix flaky rename under Windows (#485)
- Update azure crates (#474)
- Update README.adoc (#482)
- Fix documentation for the DeltaStorageHandler (#483)
- Throw an error when filter key is not in partitioned columns. (#475)
- Add GCS feature to the Python Cargo.toml file (#476)
- Make file storage backend's atomic rename async (#471)
- materialize tables in python via native storage backend (#463)
- Fix coverage of the Python tests (#467)
- Support hash lookup by path string for Remove action (#462)
- Add new module for DeltaTableState (#464)
- Avoid table stats override in datafusion extension. (#459)
- Fix action reconciliation for add after remove (#456)
- Add pool_idle_timeout options for s3 and sts clients (#458)
- Generate new session name on assume role credentials provider refresh (#451)
- return lazy iterator in get tombstone methods (#452)
- Support no tombstone loading & new table builder API (#445)
- Fix broken tombstones metadata when extended_file_metadata is different between tombstones in state (#450)
- README: mark Checkpoint creation as done for Rust (#449)
- Add maturin develop command with extra (#448)
- Run all tests under s3 feature flag (#447)
- Update datafusion links (#446)
- Batch-apply remove actions in tombstone handling (#444)
- Fixing test to compare sorted vec (#443)
- Add delete_lock and fix release_lock (#440)
Credits:
Liang-Chi Hsieh, Robert Pack, Mykhailo Osypov, Florian Valeye, Thomas Vollmer, Yuan Zhou, roeap, Denny Lee, Kelvin S. do Prado, QP Hou, Thomas Peiselt, Bruno Bigras, Akshay Ghiya
python-v0.5.3
- Add history command in delta-rs (#428)
- reenable datafusion integration with temporary fork (#436)
- Decode path in Add and Remove actions. (#434)
- Optimize remove action apply with early iteration exit #424 (#431)
- Clean up DeltaTransactionError (#432)
- Add is_non_acquirable field to the dynamodb lock (#429)
- Expose valid primitive type list to public doc (#430)
- Support partition value string deserialization for timestamp/binary (#371)
- Bump arrow to 6.0.0-SNAPSHOT and bring map support to schema (#375)
- Update README.adoc (#426)
- Introduce DeltaConfig and tombstones retention policy (#420)
- Sync Action attributes with delta (#380)
- Add LICENSE file in the Python binding and refer it in the pyproject.toml (#422)
- Change checkpoint creation logs from info to debug (#423)
- Add the Glue Data Catalog for reading the DeltaTable (#419)
- Add S3StorageOptions to allow configuring S3 backend explicitly (#418)
- BUGFIX: writes to gcs must include the content length header
- Ensures that all table schemas are of StructType (#415)
- Fix reading nullable action fields from parquet (#417)
- Add filesystem argument for reading DeltaTable in Python binding (#414)
- Add implementation for load_with_datetime in Python package. (#411)
- Add a Makefile build task in the Python binding (#410)
- Use update_incremental in update (#398)
- Use tokio::fs::rename input_obj. (#403)
- Update python readme (#406)
- Update pyproject definition in pyproject.toml (#405)
- Add examples for reading delta table with Rust API. (#400)
- Implement delete_objs in fs and s3 storage backends. (#395)
- Remove version param from create_checkpoint_from_table (#399)
- Google cloud storage backend (#355)
- added initial commit info on create method for a DeltaTable (#387)
- Upgrade to DataFusion 5.0 (#389)
- additional error handling to atomic_rename (#386)
- Reuse table/storage instances in checkpoints (#384)
- Add sts assume role creds for S3 (#383)
- Update datafusion and ballista links in README (#382)
- Merge Cargo.toml into pyproject.toml (#381)
- Implement consistent behavior in Windows with regard to swap parameter. (#379)
- Refactoring of black, isort, mypy tools usages into pyproject.toml (#378)
- Wrap DeltaTransactionError with DeltaTableError. (#374)
- Allow filesystem backend put_obj to overwrite existing (#376)
- Make Format.options to be required field (#370)
- Implement atomic put_obj. (#367)
- support partition value string deserialization for float/double/date (#363)
- Add '.tmp' suffix to temporary file of prepared commit (#366)
- cache cargo builds in CI (#359)