[SPARK-32801][SQL] Make InferFiltersFromConstraints take into account EqualNullSafe#29650
tanelk wants to merge 11 commits into apache:master
# Conflicts: # sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala
wangyum left a comment:
Are you sure we can infer from EqualNullSafe? Some EqualNullSafe from Alias:
I do not follow, why should this make
Kubernetes integration test starting
Kubernetes integration test status failure
@cloud-fan, it seems you worked on this
Test build #133059 has finished for PR 29650 at commit
Test build #134102 has finished for PR 29650 at commit
cc @maryannxue too
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
InferFiltersFromConstraints only infers new filters using EqualTo; this PR generalizes it to also include EqualNullSafe.

This introduced an infinite loop in InferFiltersFromConstraintsSuite. It has been fixed in the Optimizer by restructuring the batches (#19149); I did the same in the test suite.

It also revealed that InferFiltersFromConstraints is not idempotent, for two cases:
"x.a".attr === "x.b".attr && "x.a".attr === "y.a".attr && "x.b".attr === "y.b".attr, which should infer "y.a".attr === "y.b".attr

These were also fixed.
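
The idempotency problem can be illustrated with a small toy model. The sketch below is not Spark's implementation: it represents each equality as an unordered pair of attribute names and assumes a hypothetical `infer_once` that, like a single rule application, only substitutes through the constraints it was given. Under those assumptions, one pass over the three equalities from the example does not produce `y.a = y.b`, while repeating the pass until nothing new appears does.

```python
def eq(a, b):
    """Toy equality constraint: an unordered pair of attribute names."""
    return frozenset((a, b))


def infer_once(constraints):
    """One substitution pass (hypothetical helper, not Spark's code).

    For every equality l = r, rewrite l -> r and r -> l in each *other*
    constraint and collect the results that are not already known.
    """
    inferred = set()
    for c in constraints:
        if len(c) != 2:
            continue  # skip degenerate a = a
        left, right = tuple(c)
        for other in constraints - {c}:
            for src, dst in ((left, right), (right, left)):
                if src in other:
                    new = frozenset(dst if x == src else x for x in other)
                    if len(new) == 2:
                        inferred.add(new)
    return inferred - constraints


def infer_to_fixpoint(constraints):
    """Repeat the pass until it adds nothing, which makes it idempotent."""
    current = set(constraints)
    while True:
        new = infer_once(current)
        if not new:
            return current
        current |= new


start = {eq("x.a", "x.b"), eq("x.a", "y.a"), eq("x.b", "y.b")}
first_pass = infer_once(start)
closed = infer_to_fixpoint(start)
```

Here `first_pass` contains only `x.b = y.a` and `x.a = y.b`; the pair `y.a = y.b` appears once those new constraints are fed back in, which is why a second application of the rule changes the plan again.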
Why are the changes needed?
Possible performance improvement from new inferred constraints
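
As a rough illustration of why a null-safe equality can drive the same kind of inference, the sketch below models SQL three-valued logic with Python's `None` and checks, over a small sample domain, that whenever `a <=> b` holds, a predicate on `a` evaluates identically on `b`. The helper names and the sample predicate `> 1` are assumptions made for this example and are not taken from the PR.

```python
def equal_null_safe(a, b):
    """Toy model of a <=> b: true when both are NULL or both are equal."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def greater_than(a, threshold):
    """Toy model of a > threshold: NULL input yields NULL (None)."""
    return None if a is None else a > threshold


values = [None, 0, 1, 2, 3]

# Whenever a <=> b holds, the filter gives the same result on either side,
# so a filter on `a` can be copied to `b` (checked on this sample only).
substitutable = all(
    greater_than(a, 1) == greater_than(b, 1)
    for a in values
    for b in values
    if equal_null_safe(a, b)
)
```

This only checks one predicate on a handful of values, so it is a sanity check of the intuition, not a proof that every inference is safe.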
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests (UT).