@LiaCastaneda LiaCastaneda commented Nov 20, 2025

Cherry picked commit from (apache#18799)

  • Closes #.

Dynamic filter pushdown in DataFusion currently lacks an API to determine when filters are "complete" (all contributing partitions have reported). This creates an ambiguity: it is impossible to differentiate between:

  1. Complete filter with no data: Build side produced 0 rows, filter remains as placeholder lit(true), no more updates coming
  2. Incomplete filter: Filter is still being computed, updates are pending

I think this could be especially useful if we want to apply filter updates progressively in the future.

  • Calls mark_complete() after barrier completes, regardless of whether bounds exist.
  • Exposes an is_complete() function on DynamicFilterPhysicalExpr.
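
To illustrate the intent, here is a minimal, self-contained sketch of the completion flag. It is not the DataFusion implementation: `ToyDynamicFilter`, `BoundsAccumulator`, `report`, and the string predicate are hypothetical stand-ins, and only the names `mark_complete()` and `is_complete()` come from this PR. A simple countdown takes the place of the barrier.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical stand-in for a dynamic filter (NOT DataFusion's type).
struct ToyDynamicFilter {
    /// Current predicate as text; starts as the placeholder `lit(true)`.
    current: RwLock<String>,
    /// Set once every contributing partition has reported.
    complete: AtomicBool,
}

impl ToyDynamicFilter {
    fn new() -> Self {
        Self {
            current: RwLock::new("lit(true)".to_string()),
            complete: AtomicBool::new(false),
        }
    }
    fn update(&self, predicate: &str) {
        *self.current.write().unwrap() = predicate.to_string();
    }
    fn mark_complete(&self) {
        self.complete.store(true, Ordering::Release);
    }
    fn is_complete(&self) -> bool {
        self.complete.load(Ordering::Acquire)
    }
    fn current(&self) -> String {
        self.current.read().unwrap().clone()
    }
}

/// Hypothetical accumulator that merges per-partition min/max bounds.
struct BoundsAccumulator {
    remaining: AtomicUsize,
    bounds: Mutex<Option<(i64, i64)>>,
    filter: Arc<ToyDynamicFilter>,
}

impl BoundsAccumulator {
    fn new(partitions: usize, filter: Arc<ToyDynamicFilter>) -> Self {
        Self {
            remaining: AtomicUsize::new(partitions),
            bounds: Mutex::new(None),
            filter,
        }
    }

    /// Called once per build-side partition; `None` means it produced 0 rows.
    fn report(&self, partition_bounds: Option<(i64, i64)>) {
        if let Some((lo, hi)) = partition_bounds {
            let mut merged = self.bounds.lock().unwrap();
            let prev = *merged;
            *merged = Some(match prev {
                Some((l, h)) => (l.min(lo), h.max(hi)),
                None => (lo, hi),
            });
        }
        // `fetch_sub` returns the previous value, so 1 means this was the last partition.
        if self.remaining.fetch_sub(1, Ordering::AcqRel) == 1 {
            if let Some((lo, hi)) = *self.bounds.lock().unwrap() {
                self.filter.update(&format!("col >= {lo} AND col <= {hi}"));
            }
            // Mark complete regardless of whether any bounds exist.
            self.filter.mark_complete();
        }
    }
}

fn main() {
    // Empty build side: the predicate stays `lit(true)`, but the filter is now
    // known to be final rather than still pending.
    let filter = Arc::new(ToyDynamicFilter::new());
    let acc = BoundsAccumulator::new(2, Arc::clone(&filter));
    acc.report(None);
    assert!(!filter.is_complete()); // one partition has not reported yet
    acc.report(None);
    assert!(filter.is_complete());
    assert_eq!(filter.current(), "lit(true)");
}
```

With the flag in place, a consumer that sees `lit(true)` can check `is_complete()` to decide whether to keep waiting for an update or to treat the placeholder as the final filter.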

I didn't add any tests because the change is minimal, and comprehensive testing would require either making the DynamicFilterPhysicalExpr public or running through the full optimizer pipeline.

Exposing is_complete() function.

(cherry picked from commit 7fa2a69)


…ete or updated (apache#18799)


@LiaCastaneda LiaCastaneda force-pushed the lia/cherry-pick-new-dynamic-filter-api-function branch from 67cc93b to eaaecc2 Compare November 20, 2025 10:44
@LiaCastaneda
Author

This commit will not be included in the new V51 release and has to be cherry picked when we do the next upgrade.

@LiaCastaneda LiaCastaneda marked this pull request as ready for review November 20, 2025 12:07
@LiaCastaneda
Author

LiaCastaneda commented Nov 20, 2025

Most of the tests are failing with "No space left on device".

@LiaCastaneda LiaCastaneda merged commit 4cc47d3 into branch-50 Nov 20, 2025
53 of 60 checks passed