-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-22141][SQL] Propagate empty relation before checking Cartesian products #19362
[SPARK-22141][SQL] Propagate empty relation before checking Cartesian products #19362
Conversation
@@ -200,6 +200,30 @@ class JoinSuite extends QueryTest with SharedSQLContext { | |||
Nil) | |||
} | |||
|
|||
test("inner join, propagate empty relation before checking Cartesian products") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It make make the test more concise if you just test the various join types in a single test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - pending jenkins
Test build #82229 has finished for PR 19362 at commit
|
Merging to master. Thanks! Can you open a backport if we need to port it to 2.2? |
Test build #82230 has finished for PR 19362 at commit
|
… products When inferring constraints from children, Join's condition can be simplified as None. For example, ``` val testRelation = LocalRelation('a.int) val x = testRelation.as("x") val y = testRelation.where($"a" === 2 && !($"a" === 2)).as("y") x.join.where($"x.a" === $"y.a") ``` The plan will become ``` Join Inner :- LocalRelation <empty>, [a#23] +- LocalRelation <empty>, [a#224] ``` And the Cartesian products check will throw exception for above plan. Propagate empty relation before checking Cartesian products, and the issue is resolved. Unit test Author: Wang Gengliang <ltnwgl@gmail.com> Closes apache#19362 from gengliangwang/MoveCheckCartesianProducts.
@hvanhovell Got it, I have created #19366 for the back port. |
… Cartesian products Back port #19362 to branch-2.2 ## What changes were proposed in this pull request? When inferring constraints from children, Join's condition can be simplified as None. For example, ``` val testRelation = LocalRelation('a.int) val x = testRelation.as("x") val y = testRelation.where($"a" === 2 && !($"a" === 2)).as("y") x.join.where($"x.a" === $"y.a") ``` The plan will become ``` Join Inner :- LocalRelation <empty>, [a#23] +- LocalRelation <empty>, [a#224] ``` And the Cartesian products check will throw exception for above plan. Propagate empty relation before checking Cartesian products, and the issue is resolved. ## How was this patch tested? Unit test Author: Wang Gengliang <ltnwgl@gmail.com> Closes #19366 from gengliangwang/branch-2.2.
@@ -136,6 +134,8 @@ abstract class Optimizer(sessionCatalog: SessionCatalog) | |||
Batch("LocalRelation", fixedPoint, | |||
ConvertToLocalRelation, | |||
PropagateEmptyRelation) :: | |||
Batch("Check Cartesian Products", Once, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should also add a comment here about the positioning ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. The PR is merged, what should I do now..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create a follow-up PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gatorsmile I see, thanks!
@@ -136,6 +134,8 @@ abstract class Optimizer(sessionCatalog: SessionCatalog) | |||
Batch("LocalRelation", fixedPoint, | |||
ConvertToLocalRelation, | |||
PropagateEmptyRelation) :: | |||
Batch("Check Cartesian Products", Once, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment of CheckCartesianProducts
should also be updated and add this constrain.
## What changes were proposed in this pull request? Add comments for specifying the position of batch "Check Cartesian Products", as rxin suggested in apache#19362 . ## How was this patch tested? Unit test Author: Wang Gengliang <ltnwgl@gmail.com> Closes apache#19379 from gengliangwang/SPARK-22141-followup.
… Cartesian products Back port apache#19362 to branch-2.2 ## What changes were proposed in this pull request? When inferring constraints from children, Join's condition can be simplified as None. For example, ``` val testRelation = LocalRelation('a.int) val x = testRelation.as("x") val y = testRelation.where($"a" === 2 && !($"a" === 2)).as("y") x.join.where($"x.a" === $"y.a") ``` The plan will become ``` Join Inner :- LocalRelation <empty>, [a#23] +- LocalRelation <empty>, [a#224] ``` And the Cartesian products check will throw exception for above plan. Propagate empty relation before checking Cartesian products, and the issue is resolved. ## How was this patch tested? Unit test Author: Wang Gengliang <ltnwgl@gmail.com> Closes apache#19366 from gengliangwang/branch-2.2.
What changes were proposed in this pull request?
When inferring constraints from children, Join's condition can be simplified as None.
For example,
The plan will become
And the Cartesian products check will throw exception for above plan.
Propagate empty relation before checking Cartesian products, and the issue is resolved.
How was this patch tested?
Unit test