Merged
134 commits
b5caa71
chore: remove cdf feature
ion-elgreco Apr 5, 2025
3ef4cf5
Correct Python docs for incremental compaction on OPTIMIZE
roykim98 Mar 5, 2025
0daa911
fix: added restored metadata as action to the next committed version
Nordalf Mar 5, 2025
ad580b0
chore: add a regression test to ensure restore respects metadata actions
rtyler Apr 5, 2025
64f6369
chore: fix some minor build warnings
rtyler Apr 4, 2025
edaa9a3
fix: handle unknown features
roeap Apr 8, 2025
e2f633f
fix: update to latest kernel state
roeap Apr 8, 2025
3941c0c
test: update or disable tests with unsupported features
roeap Apr 8, 2025
ad61c51
refactor: move transaction module to kernel
roeap Apr 11, 2025
2269a67
chore: clippy
roeap Apr 11, 2025
50e6c0a
chore: move proofs into dedicated folder
roeap Apr 12, 2025
7538c58
refactor: move storage module into logstore
roeap Apr 12, 2025
5b0ddfb
feat: harmonize storage config parsing
roeap Apr 12, 2025
ff7b553
refactor: remove RetryConfigParse trait
roeap Apr 12, 2025
e038eec
feat!: formalize parsing of storage options
roeap Apr 13, 2025
44cc251
feat: centrally apply object store layers
roeap Apr 13, 2025
a641665
refactor: isolate factories for storage / log store integrations
roeap Apr 14, 2025
aa4f112
fix: url parsing inconsistencies
roeap Apr 14, 2025
ba80e15
fix: PR feedback
roeap Apr 15, 2025
b2dd8fa
fix: PR feedback
roeap Apr 15, 2025
41b4265
fix: clippy warnings
alamb Apr 16, 2025
4cf5d5e
feat: derive macro for config implementations
roeap Apr 16, 2025
34975a9
feat: error handling in derive macro
roeap Apr 16, 2025
08e0361
refactor: move str_ist_truthy to config
roeap Apr 16, 2025
494e0b0
chore: clippy
roeap Apr 16, 2025
82661b8
Chore: put a couple symbols behind the right feature gate
rtyler Apr 20, 2025
792a635
update for kernel 0.10.0
zachschuermann Apr 29, 2025
338a0bf
fix daft docs
zachschuermann Apr 29, 2025
d4586c9
Fix the default target size
HiromuHota Apr 30, 2025
06aa6c0
gate RetryConfig usage on 'cloud' feature
zeevm Apr 24, 2025
1f5d8d8
add compile_error if neither 'rustls' not 'native-tls' are enabled
zeevm Apr 25, 2025
0ac6e39
feat: Update to Datafusion 47.0.0
alamb Apr 11, 2025
da2e201
feat: Update to Datafusion 47.0.0
alamb Apr 11, 2025
2bc1937
chore: re-enable hdfs support and add a teensy tiny unit test
rtyler May 1, 2025
131a6e2
chore: tighten up the checking on the predicate comparison
rtyler May 1, 2025
07afbfd
chore: bump versions of rust crates for another release party
rtyler May 1, 2025
b3ed075
chore: remove unused dependencies
rtyler May 1, 2025
68782da
chore: ensure derive is ready for publishing too
rtyler May 1, 2025
1e0a7c2
chore: the mount crate reqiores the cloud feature now
rtyler May 3, 2025
53448c9
chore: hdfs requires the cloud feature
rtyler May 3, 2025
bd17370
chore: modify the publish script to take the required crate ordering …
rtyler May 3, 2025
b30c2ed
chore: remove unnecessary datafusion dependency for mount
rtyler May 3, 2025
e47bba5
chore: reduce feature/dependency footprint for subcrates
rtyler May 3, 2025
13acb2f
feat: introduce VacuumMode::Full for cleaning up orphaned files
rtyler May 3, 2025
6babfb6
fix: if field contains space in constraint expression, the check will…
Nordalf Apr 9, 2025
89ddf07
chore: add test for handling fields with spaces in constraints
rtyler May 3, 2025
394697a
chore(deps): Update sqlparser requirement from 0.53.0 to 0.56.0
dependabot[bot] May 5, 2025
9954bff
chore(deps): Update foyer requirement from 0.16.1 to 0.17.0
dependabot[bot] May 5, 2025
58c32e0
chore: setup dat test scaffolding
rtyler May 4, 2025
09b240e
chore: bring dat test loading into the root
rtyler May 4, 2025
3be102d
chore: enable dat testing with the existing code
roeap Jan 16, 2025
e13080b
chore: missed a version bump for core
rtyler May 5, 2025
696171a
fix: build Unity Catalog crate without DataFusion
linhr May 7, 2025
38d1891
fix: drop column earlier
ion-elgreco May 5, 2025
24c1017
chore: add a regression test for #3413
rtyler May 7, 2025
3d2bbef
chore: include license file in deltalake-derive crate
ankane May 6, 2025
83fe3e4
chore(deps): bump foyer to v0.17.2 to prevent from wrong result
MrCroxx May 14, 2025
3a30e8e
fix: pin arrow to 55.0.0
ion-elgreco May 13, 2025
98eec11
feat: during LakeFS file operations, skip merge when 0 changes
smeyerre Mar 25, 2025
a713ee3
Fix broken test
smeyerre Mar 25, 2025
4f9e1b9
Fix lakefs diff API parameter order
smeyerre May 12, 2025
a99e502
Fix formatting
smeyerre May 13, 2025
1c29aa6
added gc valid check
JustinRush80 May 14, 2025
78c3a4e
chore: bump crate versions which are due for release
rtyler May 15, 2025
3fe9e9a
feat: spawn io with spawn service
ion-elgreco May 13, 2025
00416ea
fix: pin arrow to 55.0.0
ion-elgreco May 13, 2025
b41bd3d
chore: rely on the testing during coverage generation to speed up tests
rtyler May 15, 2025
5f6fc4d
chore: make codecov more vigorously enforced to help ensure quality
rtyler May 15, 2025
e985f95
chore: prepare py-1.0 release
ion-elgreco May 16, 2025
95e6a35
Upgrade load_with_datetime to ignore any uncommited deltas in any sub…
corwinjoy May 9, 2025
e976586
feat(datafusion): file pruning based on pushdown limit for partition…
aditanase Apr 14, 2025
3803af8
feat(datafusion): optmize partition pruning, pushdown full predicates…
aditanase Apr 14, 2025
8d130f7
chore: experiment with using sccache in GitHub Actions
rtyler May 16, 2025
c786adb
chore: cleanup the CODEOWNERS a bit for more accurate review assignments
rtyler May 16, 2025
c6c9d47
chore: only check our documentation, not dependencies
rtyler May 16, 2025
73f4da3
chore: refactor the Rust build to use as much as possible of sccache
rtyler May 16, 2025
23f01ed
chore: remove unused code and deps
roeap May 17, 2025
b9f3c65
chore: remove peek_next_commit on DeltaTable which has been deprecate…
rtyler May 17, 2025
7392171
chore: refactor some symbols out of table/mod.rs into their own files
rtyler May 17, 2025
0e1df33
docs: add 1.0.0 migration guide
ion-elgreco May 17, 2025
9890d9c
refactor: more specific factory parameter names
roeap May 17, 2025
435cc81
feat: expose kernel Engine on LogStore
roeap May 18, 2025
b058e86
chore: pr feedback and test fixes
roeap May 18, 2025
9bc3b1b
test: avoid circular dependency with core/test crates
roeap May 19, 2025
0b1c615
refactor: use LogStore in Snapshot / LogSegment APIs
roeap May 20, 2025
efd09f8
chore: build default tests with the crate in CI
rtyler May 21, 2025
2000473
chore: enable the datafusion feature for integration tests which need it
rtyler May 21, 2025
1cf629c
chore: annotate tests which require datafusion appropriately
rtyler May 21, 2025
6b34d0f
ci: add spellchecker to pr tests
roeap May 23, 2025
2fe2ec5
chore: mark more tests which require datafusion
rtyler May 23, 2025
92e75bb
refactor: move from pyarrow to arro3
ion-elgreco May 24, 2025
888b03f
chore: pr feedback
ion-elgreco May 24, 2025
9fc5204
refactor: use root store in log processing
roeap May 24, 2025
e71e507
fix: use more accurate log path parsing
roeap May 24, 2025
d455431
chore: set correct markers
ion-elgreco May 25, 2025
3d673d8
chore: update kernel
roeap May 24, 2025
80da7f3
chore: update kernel
roeap May 24, 2025
8a6a870
fix: remove problematic typos configuration and fix Spellcheck issues
fvaleye May 25, 2025
235b088
feat: use kernel checkpoint writer
roeap May 24, 2025
1cdb1ca
refactor: use kernel log segment for some log inspection
roeap May 25, 2025
784e65e
chore: remove unused time_utils
roeap May 25, 2025
9b24a3d
chore: more typos
roeap May 25, 2025
ecf6ebc
refactor: remove protocol error
roeap May 26, 2025
bfb8c7c
feat: add table description and name API for Python
fvaleye May 25, 2025
cc3e348
feat: add validator crate and use to have update table metadata valid…
fvaleye May 26, 2025
226c397
chore: remove unused stats parsed field
roeap May 27, 2025
36b4b77
fix: arro3 schema conversion logic
ion-elgreco May 28, 2025
d014e95
chore: update migration docs
ion-elgreco May 28, 2025
c40d41e
chore: improve wording
ion-elgreco May 28, 2025
e53a7e0
chore: update kernel to 0.11
roeap May 28, 2025
e57929e
fix: set casting safe param to False
ion-elgreco May 28, 2025
3197b92
chore: add xfail to flaky test
ion-elgreco May 28, 2025
4f14b8e
fix bullet list formatting
avriiil May 28, 2025
8d8ace1
refactor!: get transaction versions for specific applications
roeap May 28, 2025
fbdcc34
test: improve storage config testing
roeap May 28, 2025
7b0b4a2
chore: exclude Invariants from the default writer v2 feature set
rtyler May 28, 2025
0f346ec
refactor!: remove and deprecate some python methods
roeap May 28, 2025
5ea3d15
fix: ensure projecting only columns that exist in new files afte sche…
alexwilcoxson-rel May 28, 2025
e8a7c40
docs: update link to df
rluvaton May 28, 2025
48cf336
chore: update runner
ion-elgreco May 29, 2025
a31f241
ci: improve coverage collection
roeap May 29, 2025
2f096a9
chore: prepare for the next python release
rtyler May 29, 2025
e5a7963
chore!: remove get_earliest_version
roeap May 29, 2025
6baef44
refactor!: have DeltaTable::version return an Option
roeap May 29, 2025
e06e47d
Revert "chore: add test for handling fields with spaces in constraints"
ion-elgreco May 30, 2025
1ec381d
Revert "fix: if field contains space in constraint expression, the ch…
ion-elgreco May 30, 2025
79907fd
fix: spaced columns parsing
ion-elgreco May 30, 2025
1c648b8
chore: update tests
ion-elgreco May 30, 2025
2f990fe
chore: fmt
ion-elgreco May 30, 2025
36a7696
fix: wrong schema set in table provider
ion-elgreco Jun 1, 2025
caaf56a
chore: bump version
ion-elgreco Jun 1, 2025
028db69
refactor: move LazyTableProvider into python crate
roeap May 30, 2025
f8dcef3
feat: add convenience extensions for kernel engine types
roeap Jun 1, 2025
b907b08
Merge tag 'python-v1.0.2' into update-main
hamidgh09 Jul 7, 2025
17 changes: 12 additions & 5 deletions .github/CODEOWNERS
@@ -1,7 +1,14 @@
crates/ @wjones127 @roeap @rtyler @hntd187 @ion-elgreco
delta-inspect/ @wjones127 @rtyler
crates/core @roeap @rtyler @hntd187 @ion-elgreco
crates/deltalake @roeap @rtyler @hntd187 @ion-elgreco
crates/aws @rtyler
crates/lakefs @ion-elgreco
crates/catalog-unity @roeap @hntd187

delta-inspect/ @rtyler

proofs/ @houqp
python/ @wjones127 @fvaleye @roeap @ion-elgreco
python/ @wjones127 @roeap @ion-elgreco

tlaplus/ @houqp
.github/ @wjones127 @rtyler
docs/ @MrPowers

.github/ @rtyler
4 changes: 2 additions & 2 deletions .github/actions/setup-env/action.yml
@@ -4,12 +4,12 @@ description: "Set up Python, virtual environment, and Rust toolchain"
inputs:
python-version:
description: "The Python version to set up"
required: true
required: false
default: "3.10"

rust-toolchain:
description: "The Rust toolchain to set up"
required: true
required: false
default: "stable"

runs:
11 changes: 6 additions & 5 deletions .github/codecov.yml
@@ -1,17 +1,18 @@

coverage:
status:
project:
default:
# allow some leniency on the deviation of pull requests
threshold: '1%'
informational: true
threshold: 1
if_ci_failed: error
informational: false
patch:
default:
informational: true

if_ci_failed: error
informational: false

ignore:
- "delta-inspect/"
- "proofs/"
- "**/*.toml"
- "crates/benchmarks/"
2 changes: 1 addition & 1 deletion .github/scripts/retry_integration_test.sh
@@ -5,7 +5,7 @@ MAX_RETRIES=$2
RETRY_DELAY=$3
ATTEMPT=1
run_command() {
uv run --no-sync pytest -m "($TEST_NAME and integration)" --doctest-modules 2>&1
uv run --no-sync pytest -m "($TEST_NAME and integration and pyarrow)" --doctest-modules 2>&1
}
until [ $ATTEMPT -gt $MAX_RETRIES ]
do
219 changes: 167 additions & 52 deletions .github/workflows/build.yml
@@ -14,89 +14,80 @@ env:
# Disable incremental builds by cargo for CI which should save disk space
# and hopefully avoid final link "No space left on device"
CARGO_INCREMENTAL: 0
SCCACHE_GHA_ENABLED: "true"
RUSTC_WRAPPER: "sccache"

jobs:
default_build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3

- name: Install minimal stable with clippy and rustfmt
uses: actions-rs/toolchain@v1
with:
profile: default
toolchain: '1.82'
override: true

- name: Build
run: (cd crates/deltalake && cargo build)

format:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3

- name: Install minimal stable with clippy and rustfmt
uses: actions-rs/toolchain@v1
with:
profile: default
toolchain: '1.82'
override: true

- name: Format
run: cargo fmt -- --check

# run various build configurations, fmt, and clippy.
build:
strategy:
fail-fast: false
fail-fast: true
matrix:
os:
- ubuntu-latest
- windows-latest
- macos-latest
runs-on: ${{ matrix.os }}

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4

- name: Run sccache-cache
uses: mozilla-actions/sccache-action@v0.0.9

- name: Install minimal stable with clippy and rustfmt
uses: actions-rs/toolchain@v1
with:
profile: default
toolchain: '1.82'
toolchain: "1.82"
override: true

- name: Format
run: cargo fmt -- --check

- name: Default build
run: (cd crates/deltalake && cargo build --tests)

- name: build and lint with clippy
run: cargo clippy --features ${{ env.DEFAULT_FEATURES }} --tests

- name: Spot-check build for native-tls features
run: cargo clippy --no-default-features --features azure,datafusion,s3-native-tls,gcs,glue --tests

- name: Check docs
run: cargo doc --features ${{ env.DEFAULT_FEATURES }}

- name: Check no default features (except rustls)
run: cargo check --no-default-features --features rustls

test:
- name: Check docs
run: cargo doc --no-deps --features ${{ env.DEFAULT_FEATURES }}

unit_test:
name: Unit Tests
strategy:
fail-fast: false
fail-fast: true
matrix:
os:
- ubuntu-latest
- windows-latest
- macos-latest
runs-on: ${{ matrix.os }}

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4

- name: Run sccache-cache
uses: mozilla-actions/sccache-action@v0.0.9

- name: Install minimal stable with clippy and rustfmt
uses: actions-rs/toolchain@v1
with:
profile: default
toolchain: '1.82'
toolchain: "1.82"
override: true

- name: Run tests
run: cargo test --verbose --features ${{ env.DEFAULT_FEATURES }}
run: |
make setup-dat
cargo test --features ${{ env.DEFAULT_FEATURES }}

integration_test:
name: Integration Tests
@@ -115,13 +106,71 @@ jobs:
AZURE_STORAGE_CONNECTION_STRING: "DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://localhost:10000/devstoreaccount1;QueueEndpoint=http://localhost:10001/devstoreaccount1;"

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4

- name: Run sccache-cache
uses: mozilla-actions/sccache-action@v0.0.9

- name: Install minimal stable with clippy and rustfmt
uses: actions-rs/toolchain@v1
with:
profile: default
toolchain: "1.82"
override: true

- name: Install cargo-llvm-cov
uses: taiki-e/install-action@cargo-llvm-cov

- name: Start emulated services
run: docker compose up -d

- name: Run tests with rustls (default)
run: |
gmake setup-dat
cargo llvm-cov \
--features integration_test,${{ env.DEFAULT_FEATURES }} \
--workspace \
--exclude delta-inspect \
--exclude deltalake-hdfs \
--exclude deltalake-lakefs \
--codecov \
--output-path codecov.json

- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
files: codecov.json
fail_ci_if_error: true
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}

integration_test_native_tls:
name: Integration Tests (Native TLS)
runs-on: ubuntu-latest
env:
# https://github.com/rust-lang/cargo/issues/10280
CARGO_NET_GIT_FETCH_WITH_CLI: "true"
AWS_DEFAULT_REGION: "us-east-1"
AWS_ACCESS_KEY_ID: deltalake
AWS_SECRET_ACCESS_KEY: weloverust
AWS_ENDPOINT_URL: http://localhost:4566
AWS_ALLOW_HTTP: "1"
AZURE_USE_EMULATOR: "1"
AZURE_STORAGE_ALLOW_HTTP: "1"
AZURITE_BLOB_STORAGE_URL: "http://localhost:10000"
AZURE_STORAGE_CONNECTION_STRING: "DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://localhost:10000/devstoreaccount1;QueueEndpoint=http://localhost:10001/devstoreaccount1;"

steps:
- uses: actions/checkout@v4

- name: Run sccache-cache
uses: mozilla-actions/sccache-action@v0.0.9

- name: Install minimal stable with clippy and rustfmt
uses: actions-rs/toolchain@v1
with:
profile: default
toolchain: '1.82'
toolchain: "1.82"
override: true

# Install Java and Hadoop for HDFS integration tests
@@ -139,15 +188,63 @@ jobs:
- name: Start emulated services
run: docker compose up -d

- name: Run tests with rustls (default)
run: |
cargo test --features integration_test,${{ env.DEFAULT_FEATURES }}

- name: Run tests with native-tls
run: |
cargo clean
gmake setup-dat
cargo test --no-default-features --features integration_test,s3-native-tls,datafusion

integration_test_hdfs:
name: Integration Tests (HDFS)
runs-on: ubuntu-latest
env:
# https://github.com/rust-lang/cargo/issues/10280
CARGO_NET_GIT_FETCH_WITH_CLI: "true"

steps:
- uses: actions/checkout@v4

- name: Run sccache-cache
uses: mozilla-actions/sccache-action@v0.0.9

- name: Install minimal stable with clippy and rustfmt
uses: actions-rs/toolchain@v1
with:
profile: default
toolchain: "1.82"
override: true

- name: Install cargo-llvm-cov
uses: taiki-e/install-action@cargo-llvm-cov

# Install Java and Hadoop for HDFS integration tests
- uses: actions/setup-java@v4
with:
distribution: "temurin"
java-version: "17"

- name: Download Hadoop
run: |
wget -q https://dlcdn.apache.org/hadoop/common/hadoop-3.4.0/hadoop-3.4.0.tar.gz
tar -xf hadoop-3.4.0.tar.gz -C $GITHUB_WORKSPACE
echo "$GITHUB_WORKSPACE/hadoop-3.4.0/bin" >> $GITHUB_PATH

- name: Run tests with rustls (default)
run: |
gmake setup-dat
cargo llvm-cov \
--features integration_test \
--package deltalake-hdfs \
--codecov \
--output-path codecov.json

- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
files: codecov.json
fail_ci_if_error: true
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}

integration_test_lakefs:
name: Integration Tests (LakeFS v1.48)
runs-on: ubuntu-latest
@@ -156,15 +253,21 @@ jobs:
CARGO_NET_GIT_FETCH_WITH_CLI: "true"

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4

- name: Run sccache-cache
uses: mozilla-actions/sccache-action@v0.0.9

- name: Install minimal stable with clippy and rustfmt
uses: actions-rs/toolchain@v1
with:
profile: default
toolchain: '1.82'
toolchain: "1.82"
override: true

- name: Install cargo-llvm-cov
uses: taiki-e/install-action@cargo-llvm-cov

- name: Download Lakectl
run: |
wget -q https://github.com/treeverse/lakeFS/releases/download/v1.48.1/lakeFS_1.48.1_Linux_x86_64.tar.gz
@@ -176,5 +279,17 @@ jobs:

- name: Run tests with rustls (default)
run: |
cargo test --features integration_test_lakefs,lakefs,datafusion

gmake setup-dat
cargo llvm-cov \
--package deltalake-lakefs \
--features integration_test_lakefs \
--codecov \
--output-path codecov.json

- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
files: codecov.json
fail_ci_if_error: true
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}