[SPARK-35845][SQL] OuterReference resolution should reject ambiguous column names by cloud-fan · Pull Request #33004 · apache/spark

cloud-fan · 2021-06-21T18:33:47Z

What changes were proposed in this pull request?

The current OuterReference resolution is odd: when the outer plan has more than one child, it tries to resolve the OuterReference against the output of each child in turn, left to right.

This is incorrect for a join: the column name is ambiguous when both the left and right sides output a column with that name, but the left-to-right lookup takes the first match instead of reporting the ambiguity.

This PR fixes the bug by resolving OuterReference with outerPlan.resolveChildren, instead of something like outerPlan.children.foreach(_.resolve(...)).
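To illustrate the difference between the two lookup strategies described above, here is a minimal sketch in plain Scala. It is a toy model, not the Catalyst code touched by this PR: Child, OuterRefSketch, resolvePerChild and resolveAcrossChildren are hypothetical names, a "child" is reduced to a list of output column names, and the error text is made up for the example.

```scala
// Toy model only: these are NOT Spark's Catalyst classes.
// A child plan is reduced to a name plus the column names it outputs.
final case class Child(name: String, output: Seq[String])

object OuterRefSketch {
  // Per-child lookup, left to right: the first child that outputs the
  // column wins, so an ambiguous name is silently bound to the left side.
  def resolvePerChild(children: Seq[Child], col: String): Option[String] =
    children.collectFirst {
      case c if c.output.contains(col) => s"${c.name}.$col"
    }

  // Lookup over the combined output of all children: zero matches means
  // "not found", one match resolves, several matches are rejected.
  def resolveAcrossChildren(
      children: Seq[Child],
      col: String): Either[String, Option[String]] = {
    val matches = children.filter(_.output.contains(col)).map(c => s"${c.name}.$col")
    matches match {
      case Seq()       => Right(None)
      case Seq(single) => Right(Some(single))
      case many        => Left(s"Reference '$col' is ambiguous, could be: ${many.mkString(", ")}")
    }
  }
}
```

With two children that both output a column named a, resolvePerChild returns the left side's column, while resolveAcrossChildren returns a Left carrying an ambiguity message. That is the behavior change the description attributes to switching to outerPlan.resolveChildren.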

Why are the changes needed?

Bug fix.

Does this PR introduce any user-facing change?

The problem only occurs with joins, and join conditions do not support correlated subqueries yet, so this PR only improves the error message. Before this PR, users see:

java.lang.UnsupportedOperationException
Cannot generate code for expression: outer(t1a#291)

How was this patch tested?

A new test.

cloud-fan · 2021-06-21T18:34:22Z

cc @allisonwang-db @maropu

SparkQA · 2021-06-21T19:33:24Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44622/

cloud-fan · 2021-06-21T19:39:58Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala

@@ -220,7 +220,7 @@ object PullupCorrelatedPredicates extends Rule[LogicalPlan] with PredicateHelper
    */


The changes in this file are not strictly necessary; they just match the code on the analyzer side: when we need to pass around an outer plan, pass the plan itself instead of its children.

SparkQA · 2021-06-21T19:42:47Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44622/

SparkQA · 2021-06-21T20:12:06Z

Test build #140094 has finished for PR 33004 at commit 9173bb3.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-06-21T20:27:33Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44626/

SparkQA · 2021-06-21T20:38:49Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44626/

SparkQA · 2021-06-21T20:51:14Z

Test build #140098 has finished for PR 33004 at commit 20c3c5c.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

SparkQA · 2021-06-22T08:31:05Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44655/

SparkQA · 2021-06-22T08:39:06Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44655/

SparkQA · 2021-06-22T12:08:35Z

Test build #140127 has finished for PR 33004 at commit 01f5833.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gengliangwang · 2021-06-23T06:32:10Z

Thanks, merging to master

OuterReference resolution should reject ambiguous column names

9173bb3

github-actions bot added the SQL label Jun 21, 2021

more updates

20c3c5c

allisonwang-db reviewed Jun 21, 2021

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (outdated, resolved)

fix

01f5833

allisonwang-db approved these changes Jun 22, 2021

gengliangwang approved these changes Jun 23, 2021

gengliangwang closed this in 20edfdd Jun 23, 2021

cloud-fan wants to merge 3 commits into apache:master from cloud-fan:outer-ref
