feat: AQE DPP broadcast reuse for Iceberg native scans #4033
Open
mbutrovich wants to merge 5 commits into apache:main
…eflection hack with a columnar rule that wires CometSubqueryBroadcastExec to reuse the join's already-materialized broadcast exchange, eliminating the double execution of the dim table.
Which issue does this PR close?
Closes #4022.
Rationale for this change
Under AQE, Spark's PlanAdaptiveDynamicPruningFilters converts SubqueryAdaptiveBroadcastExec to SubqueryBroadcastExec for DPP broadcast reuse. However, CometIcebergNativeScanExec wraps BatchScanExec and hides its runtimeFilters from the plan's expression tree. Spark's rule can't see the DPP expressions, so the SAB stays unconverted and the dim table executes independently — a double broadcast execution.

The existing workaround used reflection to set InSubqueryExec's private result field, bypassing executeCollect() (which throws on SAB). This was fragile and didn't achieve broadcast reuse.

What changes are included in this PR?
New rule: CometPlanAdaptiveDynamicPruningFilters

A columnar rule (registered in postColumnarTransitions) that converts SubqueryAdaptiveBroadcastExec to CometSubqueryBroadcastExec inside CometIcebergNativeScanExec.originalPlan.runtimeFilters. The subquery wraps the join's already-materialized BroadcastQueryStageExec, achieving true broadcast reuse — no re-execution of the dim table.

Key design decisions:
- Registered in postColumnarTransitions, not as a queryStageOptimizerRule. CometExecRule runs in preColumnarTransitions and recreates scan instances, which would discard earlier modifications.
- Matches on buildKeys exprIds to disambiguate multiple broadcast joins in the same plan.
- Matches both CometBroadcastHashJoinExec and BroadcastHashJoinExec to handle Spark fallback (e.g., disabled Comet BHJ config). Uses CometSubqueryBroadcastExec for Comet broadcasts (Arrow data) and SubqueryBroadcastExec for Spark broadcasts (HashedRelation).
- Falls back to Literal.TrueLiteral when no matching broadcast join exists (e.g., SortMergeJoin). This disables DPP but produces correct results.
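The rewrite the rule performs can be sketched with toy stand-in types (none of the names below are Comet's real classes; this is a minimal illustration, not the actual implementation). The wrapper scan exposes no expressions, so Spark's rule sees nothing to convert; the Comet rule instead reaches into the scan's own filter list, matches each SAB to a broadcast join by buildKeys exprIds, and either rewires it to the join's materialized broadcast stage or falls back to a true literal:

```scala
sealed trait Filter
case class SAB(buildKeyId: Long) extends Filter                // stand-in for SubqueryAdaptiveBroadcastExec
case class CometSubqueryBroadcast(stageId: Int) extends Filter // wraps the reused broadcast stage
case object TrueLiteral extends Filter                         // DPP disabled, results still correct

case class WrappedScan(runtimeFilters: Seq[Filter]) {
  // Hidden from the expression tree: Spark's DPP rule sees nothing to convert.
  def expressions: Seq[Filter] = Seq.empty
}
case class BroadcastJoin(buildKeyIds: Set[Long], stageId: Int)

def rewriteScan(scan: WrappedScan, joins: Seq[BroadcastJoin]): WrappedScan =
  scan.copy(runtimeFilters = scan.runtimeFilters.map {
    case SAB(keyId) =>
      // Disambiguate multiple broadcast joins by matching the SAB's key exprId.
      joins.find(_.buildKeyIds.contains(keyId)) match {
        case Some(j) => CometSubqueryBroadcast(j.stageId) // reuse the join's broadcast
        case None    => TrueLiteral                       // e.g. SortMergeJoin: nothing to reuse
      }
    case other => other
  })

val joins = Seq(BroadcastJoin(Set(1L, 2L), stageId = 7))
val reused   = rewriteScan(WrappedScan(Seq(SAB(2L))), joins) // -> CometSubqueryBroadcast(7)
val fallback = rewriteScan(WrappedScan(Seq(SAB(9L))), joins) // -> TrueLiteral
```

The key property is that correctness never depends on the match succeeding: a missed match only costs the pruning optimization, not result accuracy.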
Metrics: LazyIcebergMetric

Replaces the capturedMetricValues -> serializedPartitionData chain with LazyIcebergMetric, whose value getter lazily triggers planning. This decouples metrics map construction (accessed by SparkPlanInfo before AQE runs) from DPP resolution, which must happen after the rule converts the SAB.

equals fix

CometIcebergNativeScanExec.equals now includes runtimeFilters. Without this, transformUp can't detect changes when the rule replaces SAB expressions, because the old and new scans are "equal" by the old definition.

Shim changes
- ShimSubqueryBroadcast: adds createSubqueryBroadcastExec (a version-safe constructor) and resolveSubqueryAdaptiveBroadcast (a reflection fallback on 3.4, an unreachable throw on 3.5+).
- ShimCometSparkSessionExtensions: moved from spark-3.x/ to spark-3.4/ and spark-3.5/ (originally needed for the injectQueryStageOptimizerRule shim; kept as-is since the split is harmless).
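The 3.4 reflection fallback follows the usual private-field pattern. As a minimal sketch with a hypothetical placeholder class (LegacyNode and its result field stand in for the real Spark 3.4 node; they are not actual Spark types):

```scala
// Placeholder for a Spark 3.4 plan node whose materialized result
// is only reachable through a private field on that version.
class LegacyNode(r: String) { private var result: String = r }

// Read a private field via the Java reflection API when no public
// accessor exists on the targeted Spark version.
def readPrivateField(obj: AnyRef, fieldName: String): AnyRef = {
  val f = obj.getClass.getDeclaredField(fieldName)
  f.setAccessible(true) // bypass the `private` modifier
  f.get(obj)
}

readPrivateField(new LegacyNode("materialized rows"), "result")
```

This is exactly why such code lives behind a shim: the field name and type are version-specific implementation details, so the 3.5+ shim can replace the whole path with an unreachable throw once the rule-based conversion applies.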
Reflection hack removal

The inline reflection code in CometIcebergNativeScanExec.serializedPartitionData (setInSubqueryResult, Cast matching, manual column index lookup) is removed on 3.5+. On 3.4, the reflection fallback is preserved in the shim since the rule API (injectColumnar) works on 3.4 but CometPlanAdaptiveDynamicPruningFilters may not convert the SAB in all edge cases.

How are these changes tested?
8 new tests in CometIcebergNativeSuite, asserting:
- CometSubqueryBroadcastExec with a BroadcastQueryStageExec child, plus correct query results
- a single CometBroadcastExchangeExec in the plan (broadcast reuse)
- reuse of the join's BroadcastQueryStageExec
- the SubqueryBroadcastExec path for Spark broadcasts
- the Literal.TrueLiteral fallback when no broadcast join matches