Skip to content

[SPARK-35845][SQL] OuterReference resolution should reject ambiguous column names#33004

Closed
cloud-fan wants to merge 3 commits intoapache:masterfrom
cloud-fan:outer-ref
Closed

[SPARK-35845][SQL] OuterReference resolution should reject ambiguous column names#33004
cloud-fan wants to merge 3 commits intoapache:masterfrom
cloud-fan:outer-ref

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

The current OuterReference resolution is a bit weird: when the outer plan has more than one child, it resolves OuterReference from the output of each child, one by one, left to right.

This is incorrect in the case of join, as the column name can be ambiguous if both left and right sides output this column.

This PR fixes this bug by resolving OuterReference with outerPlan.resolveChildren, instead of something like outerPlan.children.foreach(_.resolve(...))

Why are the changes needed?

bug fix

Does this PR introduce any user-facing change?

The problem only occurs in join, and join condition doesn't support correlated subquery yet. So this PR only improves the error message. Before this PR, people see

java.lang.UnsupportedOperationException
Cannot generate code for expression: outer(t1a#291)

How was this patch tested?

a new test

@github-actions github-actions bot added the SQL label Jun 21, 2021
@cloud-fan
Copy link
Contributor Author

cc @allisonwang-db @maropu

@SparkQA
Copy link

SparkQA commented Jun 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44622/

@@ -220,7 +220,7 @@ object PullupCorrelatedPredicates extends Rule[LogicalPlan] with PredicateHelper
*/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this file are not quite necessary, but just to match the code in the analyzer side: when we need to pass around an outer plan, just pass it instead of its children.

@SparkQA
Copy link

SparkQA commented Jun 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44622/

@SparkQA
Copy link

SparkQA commented Jun 21, 2021

Test build #140094 has finished for PR 33004 at commit 9173bb3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44626/

@SparkQA
Copy link

SparkQA commented Jun 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44626/

@SparkQA
Copy link

SparkQA commented Jun 21, 2021

Test build #140098 has finished for PR 33004 at commit 20c3c5c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 22, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44655/

@SparkQA
Copy link

SparkQA commented Jun 22, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44655/

@SparkQA
Copy link

SparkQA commented Jun 22, 2021

Test build #140127 has finished for PR 33004 at commit 01f5833.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Copy link
Member

Thanks, merging to master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants