[SPARK-17863] [SQL] should not add column into Distinct #15489

Closed. davies wants to merge 1 commit into master from davies/order_distinct

@davies
Contributor

davies commented Oct 14, 2016

What changes were proposed in this pull request?

When resolving an attribute in a Sort, we pull up columns from the grandchild into the child. That is wrong when the child is a Distinct, because the added column changes the behavior of Distinct, so we should not do that.
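To illustrate why an extra column under Distinct changes the result, here is a minimal sketch that uses plain Scala collections in place of Spark's logical plan. The `Struct` case class, the sample rows, and both function names are assumptions made up for this example; none of them come from the patch.

```scala
// Illustrative only: plain Scala collections stand in for Spark's logical plan.
object DistinctSortSketch {
  // Hypothetical input row: a struct with three fields.
  final case class Struct(a: Int, b: Int, c: Int)

  // Two rows that agree on (a, b) and differ only in c.
  val input: Seq[Struct] = Seq(Struct(1, 2, 3), Struct(1, 2, 4))

  // Intended shape: Sort(Distinct(Project(a, b))).
  def distinctThenSort(rows: Seq[Struct]): Seq[(Int, Int)] =
    rows.map(s => (s.a, s.b)).distinct.sorted

  // Faulty shape: the whole struct is added to the projection below Distinct
  // so that the sort can refer to struct.a and struct.b. Distinct now compares
  // (a, b, struct), and the extra column is dropped only after sorting.
  def distinctWithExtraColumn(rows: Seq[Struct]): Seq[(Int, Int)] =
    rows
      .map(s => (s.a, s.b, s))
      .distinct
      .sortBy { case (_, _, s) => (s.a, s.b) }
      .map { case (a, b, _) => (a, b) }
}
```

With the sample rows, `distinctThenSort` returns a single `(1, 2)`, while `distinctWithExtraColumn` returns `(1, 2)` twice, because the two rows are no longer duplicates once the struct takes part in the comparison.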

How was this patch tested?

Added regression test.

|order by struct.a, struct.b
|""".stripMargin)
}
assert(error.message contains "cannot resolve '`struct.a`' given input columns: [a, b]")

@yhuai

yhuai Oct 14, 2016

Contributor

Is it possible to suggest the workaround in the error message?

@davies

davies Oct 14, 2016

Contributor

What would the suggestion look like?

@yhuai

yhuai Oct 14, 2016

Contributor

Oh, I see. It seems this error is thrown by check analysis, which has no information about the query, so it is hard to figure out what workaround message to put here.

@yhuai

yhuai Oct 14, 2016

Contributor

LGTM.

@SparkQA

SparkQA Oct 14, 2016

Test build #66975 has finished for PR 15489 at commit 961c062.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai

yhuai Oct 14, 2016

Contributor

Thanks! Merging to master and branch 2.0.

@asfgit asfgit closed this in da9aeb0 Oct 14, 2016

asfgit pushed a commit that referenced this pull request Oct 14, 2016

[SPARK-17863][SQL] should not add column into Distinct
## What changes were proposed in this pull request?

When resolving an attribute in a Sort, we pull up columns from the grandchild into the child. That is wrong when the child is a Distinct, because the added column changes the behavior of Distinct, so we should not do that.

## How was this patch tested?

Added regression test.

Author: Davies Liu <davies@databricks.com>

Closes #15489 from davies/order_distinct.

(cherry picked from commit da9aeb0)
Signed-off-by: Yin Huai <yhuai@databricks.com>

ThySinner pushed a commit to ThySinner/spark that referenced this pull request Oct 19, 2016

Davies Liu Cao Jianghe
[SPARK-17863][SQL] should not add column into Distinct

robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016

Davies Liu Robert Kruszewski
[SPARK-17863][SQL] should not add column into Distinct

uzadude added a commit to uzadude/spark that referenced this pull request Jan 27, 2017

[SPARK-17863][SQL] should not add column into Distinct