[SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict caused by InferFiltersFromConstraints #19149
Test build #81469 has finished for PR 19149 at commit
Test build #81470 has finished for PR 19149 at commit
0be3b67 to 6fc7140
Test build #81475 has finished for PR 19149 at commit
Test build #81476 has finished for PR 19149 at commit
Test build #81483 has finished for PR 19149 at commit
Test build #81495 has finished for PR 19149 at commit
retest this please
Test build #81502 has finished for PR 19149 at commit
Hi, @gatorsmile.
Except that, Isolation of
Can we add a test?
I think the idea is good, but maybe the fix can be refined by identifying which are the rules that conflict with
At least part of the issue was solved by #19201.
Actually, the root issue is not resolved by #19201. Will add a unit test case later.
1a22533 to 9b6fe36
Test build #85015 has finished for PR 19149 at commit
retest this please
Test build #85018 has finished for PR 19149 at commit
PushDownPredicate,
LimitPushDown,
ColumnPruning,
InferFiltersFromConstraints,
we still have it in the big batch?
Yeah, it is filtered out in the first and the third batch.
just curious, why not remove it from the big batch?
retest this please
Test build #85095 has finished for PR 19149 at commit
retest this please
retest this please
LGTM
Test build #85108 has finished for PR 19149 at commit
Thanks! Merged to master.
## What changes were proposed in this pull request?

Previously, PR #19201 fixed the problem of non-converging constraints. After that, PR #19149 improved the loop so that constraints are inferred only once, so the problem of non-converging constraints is gone.

However, the case below will fail.

```
spark.range(5).write.saveAsTable("t")
val t = spark.read.table("t")
val left = t.withColumn("xid", $"id" + lit(1)).as("x")
val right = t.withColumnRenamed("id", "xid").as("y")
val df = left.join(right, "xid").filter("id = 3").toDF()
checkAnswer(df, Row(4, 3))
```

This is because `aliasMap` replaces all the aliased children. See the test case in the PR for details.

This PR fixes the bug by removing the now-useless code for preventing non-converging constraints. It can also be fixed with #20270, but this approach is much simpler and cleans up the code.

## How was this patch tested?

Unit test

Author: Wang Gengliang <ltnwgl@gmail.com>

Closes #20278 from gengliangwang/FixConstraintSimple.

(cherry picked from commit 8598a98)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?

The optimizer rule `InferFiltersFromConstraints` could cause our batch `Operator Optimizations` to exceed the max iteration limit (i.e., 100), so that the final plan might not be properly optimized. The rule `InferFiltersFromConstraints` could conflict with the other Filter/Join predicate reduction rules. Thus, we need to separate `InferFiltersFromConstraints` from the other rules.

This PR separates `InferFiltersFromConstraints` from the main batch `Operator Optimizations`.

How was this patch tested?

The existing test cases.
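
For illustration only, the batch layout discussed in the review (the rule filtered out of the first and third batches, and applied once in between) can be sketched as a toy model. This is not Spark's `RuleExecutor` or `Optimizer` code: the plan is reduced to a set of predicate strings, and `Rule`, `dropTrueLiteral`, `runToFixedPoint`, and `optimize` are hypothetical names introduced for this sketch.

```scala
object BatchLayoutSketch {
  // Toy plan: a set of predicate strings (real Catalyst plans are trees).
  type Plan = Set[String]

  // Hypothetical rule wrapper: a name plus a plan-to-plan function.
  final case class Rule(name: String, run: Plan => Plan)

  // Hypothetical stand-in for InferFiltersFromConstraints:
  // derives a null check from an equality predicate.
  val inferFilters: Rule = Rule(
    "InferFiltersFromConstraints",
    p => if (p.contains("a = 1")) p + "a IS NOT NULL" else p
  )

  // Hypothetical stand-in for a predicate reduction rule.
  val dropTrueLiteral: Rule = Rule("DropTrueLiteral", p => p - "true")

  // The "big batch" still lists the inference rule, as in the reviewed diff.
  val operatorRules: Seq[Rule] = Seq(dropTrueLiteral, inferFilters)

  // Apply all rules in order, repeating until the plan stops changing
  // or the iteration limit is reached.
  def runToFixedPoint(rules: Seq[Rule], maxIterations: Int)(plan: Plan): Plan = {
    var current = plan
    var iteration = 0
    var changed = true
    while (changed && iteration < maxIterations) {
      val next = rules.foldLeft(current)((p, r) => r.run(p))
      changed = next != current
      current = next
      iteration += 1
    }
    current
  }

  def optimize(plan: Plan, maxIterations: Int = 100): Plan = {
    // First and third batch: the big batch with the inference rule filtered out.
    val withoutInfer = operatorRules.filterNot(_.name == inferFilters.name)
    val before = runToFixedPoint(withoutInfer, maxIterations)(plan)
    // Second batch: the inference rule alone, applied exactly once.
    val inferred = inferFilters.run(before)
    runToFixedPoint(withoutInfer, maxIterations)(inferred)
  }
}
```

In this sketch the inference rule can no longer interact with the other rules inside a fixed-point loop, because it only ever runs once between two loops that exclude it.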