[python] Fix Daft fallback filter limit pushdown #7965

Merged: JingsongLi merged 1 commit into apache:master from QuakeWang:fix/daft-fallback-filter on May 26, 2026


@QuakeWang
Contributor

Purpose

The Daft Paimon datasource used the configured read builder for scan planning, but fallback split tasks rebuilt a bare TableRead. As a result, fallback reads (PK merge, non-Parquet formats, BLOB columns, and deletion-vector paths) could lose the pushed-down predicate, projection, and limit state.
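To illustrate the failure mode described above, here is a minimal, self-contained sketch. `ToyReadBuilder` and `ToyRead` are hypothetical stand-ins invented for this example. They are not the pypaimon classes and do not mirror their signatures. The sketch only shows how a reader rebuilt without the configured state silently drops a pushed predicate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]


@dataclass
class ToyRead:
    """Hypothetical reader: applies the predicate only if it was handed one."""
    rows: List[Row]
    predicate: Optional[Callable[[Row], bool]] = None

    def to_rows(self) -> List[Row]:
        if self.predicate is None:
            return list(self.rows)
        return [r for r in self.rows if self.predicate(r)]


@dataclass
class ToyReadBuilder:
    """Hypothetical builder: carries the pushdown state into each reader."""
    rows: List[Row]
    predicate: Optional[Callable[[Row], bool]] = None

    def new_read(self) -> ToyRead:
        return ToyRead(self.rows, self.predicate)


data = [{"id": 1}, {"id": 2}, {"id": 3}]
configured = ToyReadBuilder(data, predicate=lambda r: r["id"] > 1)

# A fallback task that rebuilds a bare builder loses the pushed predicate.
bare_rows = ToyReadBuilder(data).new_read().to_rows()

# A fallback task that reuses the configured builder keeps it.
configured_rows = configured.new_read().to_rows()
```

If the engine assumes the source already handled the filter, the rows in `bare_rows` would reach the user unfiltered.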

This is correctness-sensitive because Daft may treat pushed filters as already handled by the source. A query filtering a fallback table could plan the right split but still emit unfiltered rows from that split. There was also a related limit-ordering issue: applying the limit inside the fallback reader while Daft still had row or partition filters left to evaluate could truncate rows before those filters ran.
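The limit-ordering issue can be reproduced with plain Python lists. This sketch does not use Daft or pypaimon. The function names and sample rows are made up for illustration, and `remaining_filter` stands for any filter the engine still has to evaluate after the source returns rows.

```python
from itertools import islice

# Ten sample rows; odd ids belong to "eu", even ids to "us".
rows = [{"id": i, "region": "eu" if i % 2 else "us"} for i in range(10)]


def read_with_limit_first(rows, limit, remaining_filter):
    # Unsafe order: the source truncates first, then the engine filters.
    truncated = list(islice(rows, limit))
    return [r for r in truncated if remaining_filter(r)]


def read_with_filter_first(rows, limit, remaining_filter):
    # Safe order: all filters run before the limit is applied.
    return list(islice((r for r in rows if remaining_filter(r)), limit))


def is_eu(row):
    return row["region"] == "eu"


unsafe = read_with_limit_first(rows, 3, is_eu)
safe = read_with_filter_first(rows, 3, is_eu)
```

With a limit of 3, the unsafe order returns a single matching row, while the safe order returns three.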

This patch makes fallback tasks use a configured TableRead, keeps the required filter columns available for fallback execution, and centralizes the source-side limit decision so that the limit is pushed down only when the source can safely apply it before returning rows.
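One way to picture the centralized limit decision and the retained filter columns is the sketch below. It is not the code from this PR: `PushdownState`, `safe_source_limit`, and `columns_to_read` are hypothetical names, and the rule shown (push the limit only when no filters remain for the engine to evaluate) is an assumption drawn from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PushdownState:
    """Hypothetical container for what the engine pushed to the source."""
    projection: List[str]
    filter_columns: List[str] = field(default_factory=list)
    limit: Optional[int] = None
    has_remaining_filters: bool = False


def safe_source_limit(state: PushdownState) -> Optional[int]:
    # Single place that decides whether the source may apply the limit.
    # If the engine still has filters to run, truncating early is unsafe.
    if state.limit is None or state.has_remaining_filters:
        return None
    return state.limit


def columns_to_read(state: PushdownState) -> List[str]:
    # Keep filter columns readable even when they are not projected,
    # preserving order and dropping duplicates.
    return list(dict.fromkeys(state.projection + state.filter_columns))


no_filters = PushdownState(projection=["a"], limit=5)
with_filters = PushdownState(projection=["a"], limit=5, has_remaining_filters=True)
needs_extra = PushdownState(projection=["a", "b"], filter_columns=["b", "c"])
```

Keeping the decision in one function means the scan planner and the fallback reader cannot disagree about whether the limit was applied.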

Tests

python -m py_compile \
  paimon-python/pypaimon/daft/daft_datasource.py \
  paimon-python/pypaimon/tests/daft/daft_data_test.py

python -m pytest paimon-python/pypaimon/tests/daft/daft_data_test.py -q

python -m pytest \
  paimon-python/pypaimon/tests/daft/daft_sink_test.py::TestBlobType::test_write_read_blob_type -q

Comment thread paimon-python/pypaimon/daft/daft_datasource.py Outdated
@XiaoHongbo-Hope
Contributor

Looks good to me, with one minor comment.

@QuakeWang force-pushed the fix/daft-fallback-filter branch from e583140 to 408da42 on May 26, 2026 04:42
@JingsongLi
Contributor

+1

@JingsongLi merged commit 98d913e into apache:master on May 26, 2026
6 checks passed

3 participants