proto: preserve DynamicFilterPhysicalExpr identity across round-trip #21786
Open
adriangb wants to merge 7 commits into apache:main
Without this change a plan with a DynamicFilterPhysicalExpr referenced from two sites (e.g. a HashJoinExec and a pushed-down ParquetSource predicate) loses referential integrity when serialized and deserialized: the filter is snapshotted away at serialize time, and even if it survived, the existing Arc-pointer dedup scheme would give the two sites different ids because the pushed-down side comes from `with_new_children` with a different outer Arc.

Changes:

- `PhysicalExpr::expression_id(&self) -> Option<u64>` new trait method, defaulting to None. Only DynamicFilterPhysicalExpr reports an identity.
- DynamicFilterPhysicalExpr stores a stable random u64 id that follows the shared inner Arc through `with_new_children`. Custom Debug hides the random id so plan snapshots stay deterministic.
- New proto variant PhysicalDynamicFilterExprNode carrying the current expression, the site's children view, generation, and is_complete.
- serialize_physical_expr_with_converter stops calling snapshot_physical_expr at the top, so dynamic filters survive as themselves. HashTableLookupExpr still gets the lit(true) replacement.
- DeduplicatingSerializer is now stateless: it stamps `expr_id = expr.expression_id()`. The old session_id/Arc::as_ptr/pid hashing is dropped; dedup only fires for expressions that opt in via expression_id. Restoring within-process dedup for other types is a follow-up (implement expression_id on InList, literals, etc.).
- DeduplicatingDeserializer has a single unified cache path: on miss parse + cache, on hit parse once to recover this site's children and overlay via `with_new_children` on the cached canonical. This gives DynamicFilter its shared-inner semantics without any type-specific code in the deserializer.
- Three integration tests in roundtrip_physical_plan.rs cover shared inner preservation, per-site remapped children + update propagation including column remap, and the FilterExec → ProjectionExec → DataSourceExec(ParquetSource) shape from the PR description.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
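
The opt-in identity hook described in this commit message can be illustrated with a minimal, std-only sketch. Every name here (`Expr`, `Literal`, `Inner`, `DynamicFilter`) is a hypothetical stand-in rather than the DataFusion type, a counter replaces the random id, and keeping the id inside the shared state is an assumption of the sketch, not something the commit message confirms.

```rust
use std::fmt;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the `PhysicalExpr` trait.
trait Expr: fmt::Debug {
    /// Opt-in identity hook; expressions report no identity by default.
    fn expression_id(&self) -> Option<u64> {
        None
    }
}

/// An ordinary expression that keeps the default (no identity).
#[derive(Debug)]
struct Literal(i64);
impl Expr for Literal {}

/// Mutable state shared by every wrapper derived from one filter.
struct Inner {
    id: u64,
    current: String, // stand-in for the current predicate expression
    generation: u64,
}

/// Simplified dynamic filter: per-site children plus a shared `inner`.
struct DynamicFilter {
    children: Vec<String>,
    inner: Arc<RwLock<Inner>>,
}

// Assumption: a counter instead of a random id keeps the sketch std-only.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

impl DynamicFilter {
    fn new(children: Vec<String>, current: &str) -> Self {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let inner = Inner { id, current: current.to_string(), generation: 1 };
        Self { children, inner: Arc::new(RwLock::new(inner)) }
    }

    /// New outer value, same shared `inner`, so the id travels with it.
    fn with_new_children(&self, children: Vec<String>) -> Self {
        Self { children, inner: Arc::clone(&self.inner) }
    }

    /// Replaces the current predicate and bumps the generation.
    fn update(&self, new_current: &str) {
        let mut inner = self.inner.write().unwrap();
        inner.current = new_current.to_string();
        inner.generation += 1;
    }

    fn current(&self) -> String {
        self.inner.read().unwrap().current.clone()
    }

    fn generation(&self) -> u64 {
        self.inner.read().unwrap().generation
    }
}

impl Expr for DynamicFilter {
    fn expression_id(&self) -> Option<u64> {
        Some(self.inner.read().unwrap().id)
    }
}

/// Custom Debug that omits the id so printed plans stay deterministic.
impl fmt::Debug for DynamicFilter {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("DynamicFilter")
            .field("children", &self.children)
            .field("current", &self.current())
            .finish()
    }
}
```

In this sketch a wrapper produced by `with_new_children` reports the same `expression_id()` as its source and observes any later `update()`, which is the property an Arc-pointer key on the outer value cannot provide.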
Replace the 5-arg with_id_and_state(id, children, inner, generation,
is_complete) constructor with three fluent setters on Self that match
the HashJoinExecBuilder style used elsewhere in the crate:

    DynamicFilterPhysicalExpr::new(children, inner)
        .with_id(id)
        .with_generation(generation)
        .with_is_complete(is_complete)

new() keeps the default case (random id, generation 1, not complete).
The setters are intended for the deserialize side of proto round-trip
and assume the filter hasn't been shared yet; update() and
mark_complete() remain the correct path for live mutation.
No behavior change — only from_proto is migrated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
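
The fluent-setter shape described above can be sketched as follows. `SketchFilter`, `DecodedNode`, and `from_decoded` are hypothetical names used only for illustration, the fields are simplified to strings and integers, and a counter stands in for the random id.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Assumption: a counter stands in for the random id of the real type.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

/// Hypothetical, simplified stand-in for `DynamicFilterPhysicalExpr`.
#[derive(Debug)]
struct SketchFilter {
    children: Vec<String>,
    inner: String,
    id: u64,
    generation: u64,
    is_complete: bool,
}

impl SketchFilter {
    /// Default case: fresh id, generation 1, not complete.
    fn new(children: Vec<String>, inner: &str) -> Self {
        Self {
            children,
            inner: inner.to_string(),
            id: NEXT_ID.fetch_add(1, Ordering::Relaxed),
            generation: 1,
            is_complete: false,
        }
    }

    /// Each setter consumes `self` and returns it, so calls can be chained.
    fn with_id(mut self, id: u64) -> Self {
        self.id = id;
        self
    }

    fn with_generation(mut self, generation: u64) -> Self {
        self.generation = generation;
        self
    }

    fn with_is_complete(mut self, is_complete: bool) -> Self {
        self.is_complete = is_complete;
        self
    }
}

/// Hypothetical decoded wire fields, as the deserialize side might see them.
struct DecodedNode {
    id: u64,
    generation: u64,
    is_complete: bool,
}

/// Rebuilds a filter from decoded fields before it has been shared.
fn from_decoded(node: &DecodedNode, children: Vec<String>, inner: &str) -> SketchFilter {
    SketchFilter::new(children, inner)
        .with_id(node.id)
        .with_generation(node.generation)
        .with_is_complete(node.is_complete)
}
```

Because the setters take `self` by value, they can only be applied to a filter the caller still owns outright, which matches the stated intent that they run before the filter is shared.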
Change HashJoinExecBuilder::with_dynamic_filter from a crate-private `Option<HashJoinExecDynamicFilter>` setter into a public API that takes just `Arc<DynamicFilterPhysicalExpr>` and wraps it internally. The inner `HashJoinExecDynamicFilter` and its `SharedBuildAccumulator` stay private; only the filter itself crosses the API boundary.

Promote `dynamic_filter_for_test` to a plain `pub fn dynamic_filter()` accessor, since proto serialization has a legitimate non-test use for it. Existing test caller migrated.

No behavior change. Sets up HashJoinExec.dynamic_filter to be round-trippable through proto in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirroring HashJoinExecBuilder::with_dynamic_filter: expose a fluent setter on SortExec that installs a caller-provided `Arc<DynamicFilterPhysicalExpr>` as the TopK dynamic filter, replacing any auto-created one. Add a matching `dynamic_filter()` accessor that reads through the `TopKDynamicFilters` wrapper.

No behavior change. Sets up SortExec.filter to be round-trippable through proto in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add optional PhysicalExprNode dynamic_filter field to HashJoinExecNode. Emit it from to_proto via the new HashJoinExec::dynamic_filter() accessor; on deserialize, parse it and install via HashJoinExecBuilder::with_dynamic_filter.

Critically, because the scan-side pushed-down predicate is already in the id cache by the time we deserialize the HashJoinExec's field, the cache hit returns the same canonical Arc, so the join's filter and the scan's filter share `inner` automatically. Build-side update() during execution is observed by the scan, and the scan prunes rows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
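
The id-cache behavior this commit relies on (on miss, parse and cache; on hit, parse to recover the site's children and overlay them on the cached canonical) can be sketched in a few lines. `WireExpr`, `Filter`, `DedupCache`, and `parse` are hypothetical simplifications, not the `datafusion-proto` types, and the shared state is reduced to a single string.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical wire form of an expression node.
struct WireExpr {
    expr_id: Option<u64>,
    children: Vec<String>,
    current: String,
}

/// Simplified filter: per-site children plus shared mutable state.
#[derive(Debug)]
struct Filter {
    children: Vec<String>,
    inner: Arc<RwLock<String>>,
}

impl Filter {
    /// New outer wrapper that keeps pointing at the same shared state.
    fn with_new_children(&self, children: Vec<String>) -> Filter {
        Filter { children, inner: Arc::clone(&self.inner) }
    }

    fn shares_inner_with(&self, other: &Filter) -> bool {
        Arc::ptr_eq(&self.inner, &other.inner)
    }

    fn update(&self, new_current: &str) {
        *self.inner.write().unwrap() = new_current.to_string();
    }

    fn current(&self) -> String {
        self.inner.read().unwrap().clone()
    }
}

/// Plain parse: always builds fresh shared state.
fn parse(node: &WireExpr) -> Filter {
    Filter {
        children: node.children.clone(),
        inner: Arc::new(RwLock::new(node.current.clone())),
    }
}

/// Cache keyed by the id stamped at serialize time.
#[derive(Default)]
struct DedupCache {
    by_id: HashMap<u64, Arc<Filter>>,
}

impl DedupCache {
    fn deserialize(&mut self, node: &WireExpr) -> Arc<Filter> {
        let parsed = parse(node);
        // Expressions that did not opt in carry no id and are never cached.
        let id = match node.expr_id {
            Some(id) => id,
            None => return Arc::new(parsed),
        };
        // Hit: keep this site's children, reuse the canonical shared state.
        if let Some(canonical) = self.by_id.get(&id) {
            return Arc::new(canonical.with_new_children(parsed.children));
        }
        // Miss: the freshly parsed value becomes the canonical entry.
        let canonical = Arc::new(parsed);
        self.by_id.insert(id, Arc::clone(&canonical));
        canonical
    }
}
```

In this sketch the scan-side node is deserialized first and becomes the canonical entry; the join-side node with the same id then gets its own children but the same shared state, so an `update()` made through the join's handle is visible to the scan's.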
Add optional PhysicalExprNode dynamic_filter field to SortExecNode. Emit from to_proto via SortExec::dynamic_filter(); on deserialize parse it and install via SortExec::with_dynamic_filter after the usual with_fetch / with_preserve_partitioning chain.

The with_fetch step may auto-create a TopK filter when fetch is set; with_dynamic_filter then replaces it with the one from the sender so the id matches the pushed-down scan's copy (shared via the id cache).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n tests

Introduce a new workspace member `datafusion-tests` at datafusion/tests dedicated to integration tests that need to depend on both `datafusion` and `datafusion-proto`. Placing such tests in either of those crates' own `tests/` directory would close a dev-dependency cycle caught by the workspace's circular-dependency check (see `dev/depcheck`).

The first two tests exercise end-to-end SQL round-trips proving the DynamicFilterPhysicalExpr wiring added in preceding commits actually drives scan-level pruning:

- `hash_join_dynamic_filter_prunes_via_sql`: INNER JOIN with a WHERE on the build side; after round-trip the probe-side `ParquetSource` emits fewer rows than the full table.
- `topk_dynamic_filter_prunes_files_via_sql`: ORDER BY ... LIMIT 1 over two single-row parquet files; after round-trip the second file is pruned by row-group statistics because TopK's dynamic filter tightens after reading the first, and the scan sees exactly one row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
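
The pruning that the TopK test asserts can be modeled with a toy, std-only simulation. This is a sketch under loud assumptions: the sort is taken to be ascending, files are read one at a time in order, and `SketchFile` and `scan_top1` are hypothetical names that do not correspond to any DataFusion or parquet API.

```rust
/// Hypothetical stand-in for one parquet file: its sort-key values plus
/// the min statistic a reader would get from row-group metadata.
struct SketchFile {
    min_key: i64,
    keys: Vec<i64>,
}

impl SketchFile {
    fn new(keys: Vec<i64>) -> Self {
        let min_key = keys.iter().copied().min().expect("file must not be empty");
        Self { min_key, keys }
    }
}

/// Simulates `ORDER BY key ASC LIMIT 1` over files read in order.
/// Returns the winning key and how many rows the scan emitted.
fn scan_top1(files: &[SketchFile]) -> (Option<i64>, usize) {
    let mut best: Option<i64> = None;
    let mut rows_emitted = 0;
    for file in files {
        // The dynamic filter is effectively `key < best`. If the file's
        // minimum is not below `best`, no row in it can pass, so skip it.
        if let Some(b) = best {
            if file.min_key >= b {
                continue;
            }
        }
        for &key in &file.keys {
            rows_emitted += 1;
            // TopK tightens its threshold whenever it sees a better key.
            if best.map_or(true, |b| key < b) {
                best = Some(key);
            }
        }
    }
    (best, rows_emitted)
}
```

With files holding key 1 and key 2 read in that order, the second file is skipped and one row is emitted; in the opposite order both files are read, which is why the test pins `target_partitions=1` and a fixed file order matters for the assertion.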
Which issue does this PR close?
Rationale for this change
Given a plan like
after serialize/deserialize the two `DynamicFilterPhysicalExpr` wrappers should still share the same mutable `inner`, so that a `HashJoinExec` update during execution is visible at the pushed-down `ParquetSource` and the scan prunes rows. Today this breaks for several reasons:

- `serialize_physical_expr_with_converter` calls `snapshot_physical_expr`, which replaces `DynamicFilterPhysicalExpr` with its current inner expression (often `lit(true)`), so identity is lost.
- The existing dedup is keyed on `Arc::as_ptr`, but the `HashJoinExec` side and the `ParquetSource` side hold different outer `Arc`s (one comes from `with_new_children`), so they never share `expr_id`.
- `HashJoinExec` and `SortExec` (TopK) don't serialize their own `dynamic_filter` field; on deserialize they mint a fresh filter that's disconnected from anything pushed into the scan.

What changes are included in this PR?
Seven commits, grouped by concern. Refactor commits change API shape only and should be reviewable independently. Change commits add behavior.
Base
- proto: preserve DynamicFilterPhysicalExpr identity across round-trip
  Adds `PhysicalExpr::expression_id(&self) -> Option<u64>` (default `None`). `DynamicFilterPhysicalExpr` gains a stable `u64` id that follows the shared `Arc<RwLock<Inner>>` across `with_new_children`. New proto variant `PhysicalDynamicFilterExprNode { current_expr, children, generation, is_complete }`: the inner expr tree is serialized natively instead of snapshotted away. `DeduplicatingSerializer` stamps `expr_id = expr.expression_id()` (no more `Arc::as_ptr` / `session_id` / `pid` hashing). `DeduplicatingDeserializer` uses a single unified cache path: on miss, parse + cache; on hit, parse to recover this site's children and overlay via `with_new_children` on the cached canonical, which gives `DynamicFilter` its shared-inner semantics without any type-specific code in the deserializer. Three proto-level round-trip tests cover shared-inner preservation, per-site remapped children with update propagation, and the `FilterExec → ProjectionExec → DataSourceExec(ParquetSource)` shape.

Refactor
- physical-expr: builder setters on DynamicFilterPhysicalExpr
  Replaces a growing 5-arg constructor with fluent `with_id(u64)` / `with_generation(u64)` / `with_is_complete(bool)` setters on `Self`, following the `HashJoinExecBuilder` style. `new(children, inner)` keeps the default case.
- hash_join: public builder setter + accessor for the dynamic filter
  Makes `HashJoinExecBuilder::with_dynamic_filter` public and changes its signature from the crate-private `Option<HashJoinExecDynamicFilter>` to plain `Arc<DynamicFilterPhysicalExpr>`, wrapping it internally. The inner `SharedBuildAccumulator` stays private; only the filter crosses the API boundary. Promotes `dynamic_filter_for_test` to `dynamic_filter()`.
- sort: add SortExec::with_dynamic_filter builder setter + accessor
  Mirror for `SortExec`. Fluent setter on `Self` + a `dynamic_filter()` accessor that reads through the `TopKDynamicFilters` wrapper.

Change
- proto: round-trip HashJoinExec dynamic_filter
  Adds `optional PhysicalExprNode dynamic_filter` to `HashJoinExecNode`. `to_proto` emits the filter via the new accessor; `from_proto` parses it (hitting the id cache populated by the scan-side pushed-down predicate) and installs it via `HashJoinExecBuilder::with_dynamic_filter`. Because the cache returns the same canonical `Arc`, the join's filter and the scan's filter share `inner` automatically.
- proto: round-trip SortExec's TopK dynamic filter
  Same pattern for `SortExecNode`. `with_fetch` may auto-create a TopK filter; `with_dynamic_filter` then replaces it with the sender's so the id matches the pushed-down scan's copy.

Integration tests
- test: add datafusion-tests workspace crate for cross-crate integration tests
  Introduces a new `datafusion-tests` workspace member at `datafusion/tests/`. Tests living here can dev-depend on both `datafusion` and `datafusion-proto` simultaneously; putting them in either of those crates' own `tests/` directory closes a dev-dep cycle caught by `dev/depcheck`. Hosts two end-to-end SQL tests that prove the round-trip actually drives scan-level pruning (see next section).

Comparison with #20416
PR #20416 introduces a generic `PhysicalExprId { exact, shallow }` and keeps an Arc-ptr default so every expression is stamped. This PR instead makes the identity hook opt-in per type (`expression_id()`), restricting the blast radius to the one type that needs it today. A follow-up can implement `expression_id()` on other types (e.g. `InList`, literals) to re-introduce generic within-process dedup on a case-by-case basis.

Are these changes tested?
Yes.
Unit tests (`datafusion/physical-expr/src/expressions/dynamic_filters.rs`):

- `test_expression_id_stable_across_with_new_children`: the id survives `with_new_children`.

Proto-level round-trip tests (`datafusion/proto/tests/cases/roundtrip_physical_plan.rs`):

- `roundtrip_dynamic_filter_preserves_shared_inner`: two wrappers in a `BinaryExpr(And)` predicate share `inner` after round-trip; an `update()` on one is observable via the other.
- `roundtrip_dynamic_filter_preserves_remapped_children`: two wrappers with different effective children; each preserves its site-specific projection, both share identity, `update()` propagates, and `current()` on the remapped side applies the column substitution.
- `roundtrip_dynamic_filter_in_parquet_pushdown`: the plan shape from the PR description (`FilterExec → ProjectionExec → DataSourceExec(ParquetSource)`).

End-to-end SQL tests (`datafusion/tests/tests/dynamic_filter_proto_roundtrip.rs`):

- `hash_join_dynamic_filter_prunes_via_sql`: registers a parquet file, runs `INNER JOIN ... WHERE b.n_nationkey = 5`, round-trips via `DeduplicatingProtoConverter`, executes, and asserts the probe-side scan emitted strictly fewer rows than the full table.
- `topk_dynamic_filter_prunes_files_via_sql`: writes two single-row parquet files (`a.parquet` key=1, `b.parquet` key=2), runs `ORDER BY ... LIMIT 1` with `target_partitions=1`, round-trips, executes, and asserts the scan emitted exactly 1 row. `b.parquet` is pruned by row-group statistics once TopK's dynamic filter tightens after reading `a.parquet`.

Removed: the pre-existing `test_expression_deduplication_arc_sharing`, `test_deduplication_within_plan_deserialization`, `test_deduplication_within_expr_deserialization`, and two `test_session_id_rotation_*` tests. They asserted the generic Arc-ptr dedup contract that this PR deliberately drops.

CI is green: `cargo fmt`, `clippy`, `circular dependency check`, `detect-unused-dependencies`, all `cargo test` matrixes pass.

Are there any user-facing changes?
- `PhysicalExpr::expression_id(&self) -> Option<u64>` with a safe default (`None`). Existing implementations keep compiling.
- `DynamicFilterPhysicalExpr` surface: `with_id`, `with_generation`, `with_is_complete`, `is_complete()`. The `Debug` impl now hides the random `id` field so plan snapshots stay deterministic.
- `HashJoinExecBuilder::with_dynamic_filter` promoted to `pub` with an `Arc<DynamicFilterPhysicalExpr>` signature; `HashJoinExec::dynamic_filter()` accessor promoted from `dynamic_filter_for_test`. Callers of the renamed method need to update.
- `SortExec::with_dynamic_filter` setter + `SortExec::dynamic_filter()` accessor.
- `DeduplicatingProtoConverter` no longer deduplicates arbitrary expressions by Arc pointer. Plans that relied on that (e.g. for a large shared `InList`) will serialize independently until a per-type `expression_id()` is added. Dynamic filters now round-trip with shared state, which is the primary motivation.
- New workspace crate `datafusion-tests` (internal; `publish = false`) hosting cross-crate integration tests.

🤖 Generated with Claude Code