Releases: delta-io/delta-rs
python-v0.18.0: CDC for update operation, added `set table properties` operation
New features
- feat: adopt kernel schema types by @roeap in #2495
- feat: add stats to convert-to-delta operation by @gruuya in #2491
- feat(python, rust): add `set table properties` operation by @ion-elgreco in #2264
- feat: implement transaction identifiers - continued by @roeap in #2539
- feat: introduce CDC write-side support for the Update operations by @rtyler in #2486
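For readers new to change data capture: the Delta protocol describes an update as a pair of change rows tagged through a `_change_type` column, `update_preimage` for the old values and `update_postimage` for the new ones. The following is a minimal pure-Python sketch of that idea only. The `cdc_rows_for_update` helper and the dict-based row layout are hypothetical and are not how delta-rs implements #2486.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def cdc_rows_for_update(
    rows: Iterable[Row],
    predicate: Callable[[Row], bool],
    updates: Row,
) -> List[Row]:
    """Return change rows for every input row matched by `predicate`.

    Hypothetical helper: each matched row yields two change rows, the
    values before the update and the values after it.
    """
    changes: List[Row] = []
    for row in rows:
        if not predicate(row):
            continue  # unmatched rows produce no change data
        changes.append({**row, "_change_type": "update_preimage"})
        changes.append({**row, **updates, "_change_type": "update_postimage"})
    return changes


# Illustrative data, not taken from the release notes.
table = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
changes = cdc_rows_for_update(table, lambda r: r["id"] == 2, {"status": "done"})
for change in changes:
    print(change)
```

Only the row with `id == 2` matches, so the sketch emits one preimage row and one postimage row for it.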
Bug Fixes
- fix(rust, python): fixed differences in storage options between log and object stores by @mightyshazam in #2500
- fix: enable field_with_name to support nested fields with '.' delimiter by @alexwilcoxson-rel in #2519
- fix(python): release GIL on most operations by @adriangb in #2512
- fix: clippy warnings by @imor in #2548
- fix: remove deprecated overwrite_schema configuration which has incorrect behavior by @rtyler in #2554
- fix: update deltalake crate examples for crate layout and TimestampNtz by @jhoekx in #2559
- fix: consistently use raise_if_key_not_exists in CreateBuilder by @vegarsti in #2569
- fix: cast support fields nested in lists and maps by @HawaiianSpork in #2541
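The nested-field lookup mentioned in #2519 can be pictured as splitting the requested name on `.` and walking one struct level per segment. The sketch below illustrates that general idea in plain Python. The dict-based schema and this `field_with_name` function are hypothetical stand-ins, not the delta-rs implementation.

```python
from typing import Any, Dict, Optional

# Hypothetical schema representation: a struct is a dict mapping a field
# name to either a leaf type name (str) or a nested struct (dict).
Schema = Dict[str, Any]


def field_with_name(schema: Schema, name: str) -> Optional[Any]:
    """Resolve `name`, treating '.' as a separator between nested fields."""
    current: Any = schema
    for part in name.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # the path does not exist in the schema
        current = current[part]
    return current


schema = {"id": "long", "address": {"city": "string", "geo": {"lat": "double"}}}
print(field_with_name(schema, "address.geo.lat"))
print(field_with_name(schema, "address.zip"))
```

A limitation of this simple approach is that a field whose own name contains a `.` cannot be distinguished from a nested path.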
Other Changes
- docs: fix typo by @avriiil in #2508
- chore: tidying up builds without datafusion feature and clippy by @rtyler in #2516
- chore: fixing some clips by @rtyler in #2521
- fix: msrv in workspace by @roeap in #2524
- feat(rust): make PartitionWriter public by @adriangb in #2525
- docs: improve daft integration docs by @avriiil in #2496
- chore: bump python 0.17.5 by @ion-elgreco in #2531
- chore(deps): update itertools requirement from 0.12 to 0.13 by @dependabot in #2526
- docs: dask write syntax fix by @avriiil in #2543
- docs: pull delta from conda not pip by @avriiil in #2535
- docs: clarify locking mechanism requirement for S3 by @inigohidalgo in #2558
- chore(deps): update sqlparser requirement from 0.46 to 0.47 by @dependabot in #2563
- docs: dt.delete add context + api docs link by @avriiil in #2560
New Contributors
- @imor made their first contribution in #2548
- @inigohidalgo made their first contribution in #2558
- @vegarsti made their first contribution in #2565
- @HawaiianSpork made their first contribution in #2541
Full Changelog: python-v0.17.4...python-v0.18.0
python-v0.17.4: stats collection according to config
New features
- feat(python): add parameter to DeltaTable.to_pyarrow_dataset() by @adriangb in #2465
- feat(python, rust): respect column stats collection configurations by @ion-elgreco in #2428
Bug Fixes
- fix(rust): implement abort commit for S3DynamoDBLogStore by @PeterKeDer in #2452
- fix(python, rust): use new schema for stats parsing instead of old by @ion-elgreco in #2480
- fix: check to see if the file exists before attempting to rename by @rtyler in #2482
- fix(rust): unable to read delta table when table contains both null and non-null add stats by @yjshen in #2476
- fix(python, rust): region lookup wasn't working correctly for dynamo by @mightyshazam in #2488
- fix: return unsupported error for merging schemas in the presence of partition columns by @emcake in #2469
- fix(python): reuse state in `to_pyarrow_dataset` by @ion-elgreco in #2485
Other Changes
- chore(deps): update sqlparser requirement from 0.44 to 0.46 by @dependabot in #2483
- test: add test for concurrent checkpoint during table load by @alexwilcoxson-rel in #2151
Full Changelog: python-v0.17.3...python-v0.17.4
python-v0.17.3: CDF read support
New features
- feat(rust): advance state in post commit by @ion-elgreco in #2396
- feat: cdf reader for delta tables by @hntd187 in #2048
- feat(python, rust): add OBJECT_STORE_CONCURRENCY_LIMIT setting for ObjectStoreFactory by @zZKato in #2458
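The motivation behind a concurrency limit such as `OBJECT_STORE_CONCURRENCY_LIMIT` is to cap how many storage requests are in flight at once, which helps in constrained environments. The general technique can be sketched with a semaphore, as below. This is an illustration using Python's `asyncio`, not the delta-rs code path. The `fetch` and `fetch_all` names, the file names, and the sleep that stands in for a real request are all assumptions.

```python
import asyncio
from typing import Dict, List


async def fetch(path: str, limit: asyncio.Semaphore, stats: Dict[str, int]) -> str:
    """Pretend to read `path`, holding one semaphore slot while doing so."""
    async with limit:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for a real object store call
        stats["active"] -= 1
    return path


async def fetch_all(paths: List[str], max_concurrency: int) -> Dict[str, int]:
    """Run all fetches, never allowing more than `max_concurrency` at once."""
    limit = asyncio.Semaphore(max_concurrency)
    stats = {"active": 0, "peak": 0}
    await asyncio.gather(*(fetch(p, limit, stats) for p in paths))
    return stats


paths = [f"part-{i}.parquet" for i in range(20)]  # hypothetical file names
stats = asyncio.run(fetch_all(paths, max_concurrency=4))
print("peak concurrent requests:", stats["peak"])
```

All twenty tasks are scheduled immediately, but the semaphore ensures the recorded peak never exceeds the configured limit.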
Bug Fixes
Other Changes
- chore(rust): bump arrow v51 and datafusion v37.1 by @lasantosr in #2395
New Contributors
Full Changelog: python-v0.17.2...python-v0.17.3
rust-v0.17.3 (2024-05-01)
Implemented enhancements:
- Limit concurrent ObjectStore access to avoid resource limitations in constrained environments #2457
- How to get a DataFrame in Rust? #2404
- Allow checkpoint creation when partion column is "timestampNtz " #2381
- is there a way to make writing timestamp_ntz optional #2339
- Update arrow dependency #2328
- Release GIL in deltalake.write_deltalake #2234
- Unable to retrieve custom metadata from tables in rust #2153
- Refactor commit interface to be a Builder #2131
Fixed bugs:
- Handle rate limiting during write contention #2451
- regression : delta.logRetentionDuration don't seems to be respected #2447
- Issue writing to mounted storage in AKS using delta-rs library #2445
- TableMerger - when_matched_delete() fails when Column names contain special characters #2438
- Generic DeltaTable error: External error: Arrow error: Invalid argument error: arguments need to have the same data type - while merge data in to delta table #2423
- Merge on predicate throw error on date colum: Unable to convert expression to string #2420
- Writing Tables with Append mode errors if the schema metadata is different #2419
- Logstore issues on AWS Lambda #2410
- Datafusion timestamp type doesn't respect delta lake schema #2408
- Compacting produces smaller row groups than expected #2386
- ValueError: Partition value cannot be parsed from string. #2380
- Very slow s3 connection after 0.16.1 #2377
- Merge update+insert truncates a delta table if the table is big enough #2362
- Do not add readerFeatures or writerFeatures keys under checkpoint files if minReaderVersion or minWriterVersion do not satisfy the requirements #2360
- Create empty table failed on rust engine #2354
- Getting error message when running in lambda: message: "Too many open files" #2353
- Temporary files filling up _delta_log folder - increasing table load time #2351
- compact fails with merged schemas #2347
- Cannot merge into table partitioned by date type column on 0.16.3 #2344
- Merge breaks using logical datatype decimal128 #2343
- Decimal types are not checked against max precision/scale at table creation #2331
- Merge update+insert truncates a delta table #2320
- Extract `add.stats_parsed` with wrong type #2312
- Process fails without error message when executing merge #2310
- delta_rs don't seems to respect the row group size #2309
- Auth error when running inside VS Code #2306
- Unable to read deltatables with binary columns: Binary is not supported by JSON #2302
- Schema evolution not coercing with Large arrow types #2298
- Panic in `deltalake_core::kernel::snapshot::log_segment::list_log_files_with_checkpoint::{{closure}}` #2290
- Checkpoint does not preserve reader and writer features for the table protocol. #2288
- Z-Order with larger dataset resulting in memory error #2284
- Successful writes return error when using concurrent writers #2279
- Rust writer should raise when decimal types are incompatible (currently writers and puts table in invalid state) #2275
- Generic DeltaTable error: Version mismatch with new schema merge functionality in AWS S3 #2262
- DeltaTable is not resilient to corrupted checkpoint state #2258
- Inconsistent units of time #2256
- Partition column comparison is an assertion rather than if block with raise exception #2242
- Unable to merge column names starting from numbers #2230
- Merging to a table with multiple distinct partitions in parallel fails #2227
- cleanup_metadata not respecting custom `logRetentionDuration` #2180
- Merge predicate fails with a field with a space #2167
- When_matched_update causes records to be lost with explicit predicate #2158
- Merge execution time grows exponetially with the number of column #2107
- _internal.DeltaError when merging #2084
python-v0.17.2
What's Changed
- chore: introduce the Operation trait to enforce consistency between operations by @rtyler in #2435
- fix(python): reuse table state in write engine by @ion-elgreco in #2453
Full Changelog: python-v0.17.1...python-v0.17.2
python-v0.17.1
Bug Fixes
- fix(python, rust): use from_name during column projection creation by @ion-elgreco in #2441
- fix(python, rust): check timestamp_ntz in nested fields, add check_can_write in pyarrow writer by @ion-elgreco in #2443
- fix(python, rust): remove imds calls from profile auth and region by @mightyshazam in #2442
Full Changelog: python-v0.17.0...python-v0.17.1
python-v0.17.0: checkpoint hook
New features
- feat(rust): post commit hook (v2), create checkpoint hook by @ion-elgreco in #2391
- feat: added configuration variables to handle EC2 metadata service by @mightyshazam in #2385
- feat: lazy static runtime in python by @ion-elgreco in #2424
- feat: implement repartitioned for DeltaScan by @jkylling in #2421
Bug Fixes
- fix(python, rust): expr parsing date/timestamp by @ion-elgreco in #2357
- fix(rust): remove flush after writing every batch by @PeterKeDer in #2387
- fix: return error when checkpoints and metadata get out of sync by @esarili in #2406
- fix: time travel when checkpointed and logs removed by @ion-elgreco in #2389
- fix(rust): timestamp deserialization format, missing type by @ion-elgreco in #2383
- fix(rust): stats_parsed has different number of records with stats by @yjshen in #2405
- fix(python): load_as_version with datetime object with no timezone specified by @t1g0rz in #2429
- fix(python,rust): missing remove actions during create_or_replace specified by @ion-elgreco in #2437
Other Changes
- chore: bump chrono by @universalmind303 in #2372
- docs: document required aws permissions by @ale-rinaldi in #2393
- docs: add Daft integration by @avriiil in #2402
New Contributors
- @PeterKeDer made their first contribution in #2387
- @ale-rinaldi made their first contribution in #2393
- @esarili made their first contribution in #2406
- @jkylling made their first contribution in #2421
- @t1g0rz made their first contribution in #2429
Full Changelog: python-v0.16.4...python-v0.17.0
python-v0.16.4
Bug Fixes
- fix(python): wrong batch size by @ion-elgreco in #2314
- fix(rust): raise schema mismatch when decimal is not subset by @ion-elgreco in #2330
- fix: make struct fields nullable in stats schema by @qinix in #2346
- fix: remove tmp files in cleanup_metadata by @ion-elgreco in #2356
- fix(python,rust): optimize compact on schema evolved table by @ion-elgreco in #2358
- fix: add config for parquet pushdown on delta scan by @Blajda in #2364
- feat(rust): derive Copy on some public enums by @lasantosr in #2329
- fix: add snappy compression on checkpoint files by @ion-elgreco in #2365
Other Changes
- chore(rust): bump datafusion to 36 by @universalmind303 in #2249
- chore: bump python 0.16.4 by @ion-elgreco in #2371
New Contributors
- @lasantosr made their first contribution in #2329
Full Changelog: python-v0.16.3...python-v0.16.4
python-v0.16.3
New features
Bug Fixes
- fix: try to fix timeouts by @ion-elgreco in #2318
- fix: handle conflict checking in optimize correctly by @emcake in #2208
- fix: merge concurrency control by @ion-elgreco in #2324
- fix: merge pushdown handling by @Blajda in #2326
- fix(rust): serialize MetricDetails from compaction runs to a string by @liamphmurphy in #2317
- fix(rust): adhere to protocol for Decimal by @ion-elgreco in #2332
Other Changes
- chore: object store 0.9.1 by @ion-elgreco in #2311
- docs: add example in to_pyarrow_dataset by @ion-elgreco in #2315
- Revert 2291 merge predicate fix by @Blajda in #2323
New Contributors
- @liamphmurphy made their first contribution in #2317
Full Changelog: python-v0.16.2...python-v0.16.3
python-v0.16.2
Bug Fixes
- fix: schema evolution not coercing with large arrow types by @aersam in #2305
- fix: clean up some non-datafusion builds by @rtyler in #2303
- fix: checkpoint features format below v3,7 by @ion-elgreco in #2307
- fix: merge predicate for concurrent writes by @JonasDev1 in #2291
- fix(rust): add missing chrono-tz feature by @ion-elgreco in #2295
Other Changes
New Contributors
Full Changelog: python-v0.16.1...python-v0.16.2