[SPARK-21165][SQL] FileFormatWriter should handle mismatched attribute ids between logical and physical plan #19483

cloud-fan · 2017-10-12T14:08:47Z

What changes were proposed in this pull request?

Because the optimizer removes some unnecessary aliases, the logical and physical plans may have different output attribute ids. FileFormatWriter should handle this mismatch when it creates the physical sort node.
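To make the mismatch concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark code and not the code in this patch: `Attribute`, `bind_by_id`, `bind_by_position`, and the ids used are all hypothetical names chosen for illustration. It assumes the logical and physical outputs list the same columns in the same order, so a column that cannot be found by id can still be resolved by its position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Toy stand-in for a plan output attribute: a name plus a unique id."""
    name: str
    expr_id: int


def bind_by_id(attr, output):
    """Return the ordinal of `attr` in `output`, matching on expr_id only."""
    for ordinal, candidate in enumerate(output):
        if candidate.expr_id == attr.expr_id:
            return ordinal
    raise LookupError(f"cannot bind {attr.name}#{attr.expr_id}")


def bind_by_position(attr, logical_output, physical_output):
    """Resolve `attr` against the logical output, then take the physical
    attribute at the same ordinal (assumes both outputs are aligned)."""
    ordinal = bind_by_id(attr, logical_output)
    return physical_output[ordinal]


# Hypothetical ids: the logical plan still carries the alias (part#3),
# while the physical plan exposes the underlying attribute (part#2)
# because the alias was removed during optimization.
logical_output = [Attribute("value", 1), Attribute("part", 3)]
physical_output = [Attribute("value", 1), Attribute("part", 2)]

# The sort column is taken from the logical plan.
sort_column = logical_output[1]

# Binding it directly against the physical output fails: id 3 is not there.
try:
    bind_by_id(sort_column, physical_output)
    direct_binding_failed = False
except LookupError:
    direct_binding_failed = True

# Going through the ordinal in the logical output finds the right column.
resolved = bind_by_position(sort_column, logical_output, physical_output)
print(direct_binding_failed, resolved)
```

The point of the sketch is only that a sort expression built from one plan's attributes cannot be looked up by id in the other plan's output once the ids have diverged, so some other key (here, position) is needed to line the two up.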

How was this patch tested?

New regression test.

cloud-fan · 2017-10-12T14:09:08Z

cc @gatorsmile

gatorsmile

LGTM

SparkQA · 2017-10-12T16:09:03Z

Test build #82683 has finished for PR 19483 at commit d90a0e4.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-10-12T16:18:05Z

Test build #82685 has finished for PR 19483 at commit 3bd5b11.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-10-13T02:39:12Z

It sounds like we are facing various issues because we are using the analyzed plan. Is it possible to just add an extra Project, using the analyzed plan's output, at the end of the optimizer?

cloud-fan · 2017-10-13T02:53:20Z

I'll refactor it later to use requiredChildOrdering to do the sort. I just want to keep this bug fix as simple as possible.

tejasapatil · 2017-10-13T03:04:13Z

> I'll refactor it later, to use requiredChildOrdering to do the sort.

The Hive bucketing PR does that: #19001. I can isolate that piece and put out a PR.

cloud-fan · 2017-10-13T03:17:57Z

That will be great, thanks @tejasapatil!

SparkQA · 2017-10-13T04:47:33Z

Test build #82712 has finished for PR 19483 at commit f4a7337.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

…ry schema

## What changes were proposed in this pull request?

#18386 fixes SPARK-21165 but breaks SPARK-22252. This PR reverts #18386 and picks the patch from #19483 to fix SPARK-21165.

## How was this patch tested?

new regression test

Author: Wenchen Fan <wenchen@databricks.com>

Closes #19484 from cloud-fan/bug.

cloud-fan · 2017-10-13T05:09:48Z

Thanks for the review, merging to master!


cloud-fan mentioned this pull request Oct 12, 2017

[SPARK-22252][SQL][2.2] FileFormatWriter should respect the input query schema #19484

Closed

cloud-fan force-pushed the bug2 branch from d90a0e4 to 3bd5b11 on October 12, 2017 14:26

gatorsmile approved these changes Oct 12, 2017


FileFormatWriter should only rely on attributes from analyzed plan

f4a7337

cloud-fan force-pushed the bug2 branch from 3bd5b11 to f4a7337 on October 13, 2017 01:58

cloud-fan changed the title ~~[SPARK-21165][SQL] FileFormatWriter should only rely on attributes from analyzed plan~~ [SPARK-21165][SQL] FileFormatWriter should handle mismatched attribute ids between logical and physical plan Oct 13, 2017

asfgit closed this in ec12220 Oct 13, 2017

tejasapatil mentioned this pull request Jan 9, 2018

[SPARK-19256][SQL] Remove ordering enforcement from FileFormatWriter and let planner do that #20206

Closed
