[SPARK-17237][SQL] Remove backticks in a pivot result schema #14812

maropu · 2016-08-25T19:31:51Z

What changes were proposed in this pull request?

Pivoting adds backticks (e.g. 3_count(`c`)) in column names and, in some cases,
this causes analysis exceptions like:

scala> val df = Seq((2, 3, 4), (3, 4, 5)).toDF("a", "x", "y")
scala> df.groupBy("a").pivot("x").agg(count("y"), avg("y")).na.fill(0)
org.apache.spark.sql.AnalysisException: syntax error in attribute name: `3_count(`y`)`;
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.e$1(unresolved.scala:134)
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.parseAttributeName(unresolved.scala:144)
...

So, this PR proposes to remove these backticks from column names.
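
To illustrate why nested backticks are a problem, here is a minimal pure-Scala sketch. It is a simplified stand-in, not Spark's actual `parseAttributeName`; the object name `BacktickNameSketch`, the method `parseName`, and the exact rules are assumptions made for illustration. The rule it models is that a backtick-quoted segment must end either at the end of the name or right before a dot, so an inner backtick closes the quoted segment too early.

```scala
import scala.collection.mutable.ArrayBuffer

object BacktickNameSketch {
  // Hypothetical, simplified parser (NOT Spark's implementation): splits a
  // dotted attribute name into parts, treating a backtick-quoted segment as
  // a single part.
  def parseName(name: String): Either[String, Seq[String]] = {
    val err = Left(s"syntax error in attribute name: $name")
    val parts = ArrayBuffer.empty[String]
    val current = new StringBuilder
    var inBacktick = false
    var i = 0
    while (i < name.length) {
      val c = name(i)
      if (inBacktick) {
        if (c == '`') {
          inBacktick = false
          // A closing backtick must end the name or be followed by a dot.
          if (i + 1 < name.length && name(i + 1) != '.') return err
        } else {
          current += c
        }
      } else if (c == '`') {
        // An opening backtick is only legal at the start of a part.
        if (current.nonEmpty) return err
        inBacktick = true
      } else if (c == '.') {
        // Reject leading, trailing, or doubled dots.
        if (i == 0 || name(i - 1) == '.' || i == name.length - 1) return err
        parts += current.toString
        current.clear()
      } else {
        current += c
      }
      i += 1
    }
    if (inBacktick) err
    else {
      parts += current.toString
      Right(parts.toSeq)
    }
  }
}
```

Under these assumed rules, the name from the error message above is rejected because the backtick before `y` closes the quoted segment early, while the same name without the inner backticks is accepted as a single part.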

How was this patch tested?

Added a test in DataFrameAggregateSuite.
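
A check along the following lines could exercise the fix. This is a sketch, not the test that was added to DataFrameAggregateSuite: the object name `PivotBacktickCheck` and the local `SparkSession` setup are assumptions, and it needs Spark SQL on the classpath. It only asserts that no pivoted column name contains a backtick and that `na.fill(0)` can be evaluated, without assuming the exact output column names.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, count}

object PivotBacktickCheck {
  def main(args: Array[String]): Unit = {
    // Assumed setup: a throwaway local session just for this check.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("pivot-backtick-check")
      .getOrCreate()
    import spark.implicits._

    try {
      // Same data and query as in the PR description.
      val df = Seq((2, 3, 4), (3, 4, 5)).toDF("a", "x", "y")
      val pivoted = df.groupBy("a").pivot("x").agg(count("y"), avg("y"))

      // With the fix, the result schema should contain no backticks.
      assert(
        pivoted.columns.forall(!_.contains("`")),
        s"unexpected backticks in: ${pivoted.columns.mkString(", ")}")

      // Before the fix this call raised an AnalysisException.
      pivoted.na.fill(0).collect()
    } finally {
      spark.stop()
    }
  }
}
```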

SparkQA · 2016-08-25T21:37:36Z

Test build #64432 has finished for PR 14812 at commit 530d5c0.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-11-18T10:41:56Z

Test build #68844 has finished for PR 14812 at commit 9aa5d7d.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-11-18T18:25:51Z

Test build #68860 has finished for PR 14812 at commit 22743c7.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2016-11-19T00:28:38Z

@gatorsmile Do you have time to check this? Thanks!

SparkQA · 2017-01-10T17:13:14Z

Test build #71132 has finished for PR 14812 at commit 3aac14f.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-01-10T19:51:19Z

sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala

      limit2Df.select($"id"))
  }
+
+  test("handle missing data after pivoting") {


The test case name is misleading. Maybe just use the PR title here.

gatorsmile · 2017-01-10T19:54:27Z

Sorry, I missed this ping. Could you fix the test case failure? Thanks!

gatorsmile · 2017-01-10T20:00:20Z

The fix looks good to me. We just need to resolve the test case failure. Thanks!

maropu · 2017-01-11T00:05:23Z

okay, thanks! I'll check again soon

SparkQA · 2017-01-12T14:03:06Z

Test build #71256 has finished for PR 14812 at commit 2e567cc.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2017-01-12T14:10:58Z

@gatorsmile okay, fixed.

gatorsmile · 2017-01-12T17:43:29Z

LGTM

## What changes were proposed in this pull request?

Pivoting adds backticks (e.g. 3_count(\`c\`)) in column names and, in some cases, this causes analysis exceptions like:

```
scala> val df = Seq((2, 3, 4), (3, 4, 5)).toDF("a", "x", "y")
scala> df.groupBy("a").pivot("x").agg(count("y"), avg("y")).na.fill(0)
org.apache.spark.sql.AnalysisException: syntax error in attribute name: `3_count(`y`)`;
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.e$1(unresolved.scala:134)
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.parseAttributeName(unresolved.scala:144)
...
```

So, this PR proposes to remove these backticks from column names.

## How was this patch tested?

Added a test in `DataFrameAggregateSuite`.

Author: Takeshi YAMAMURO <linguin.m.s@gmail.com>

Closes #14812 from maropu/SPARK-17237.

(cherry picked from commit 5585ed9)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>

gatorsmile · 2017-01-12T17:48:26Z

Thanks! Merging to master/2.1.

Could you please open a PR to backport it to 2.0?

gatorsmile · 2017-01-12T17:49:16Z

@maropu JIRA is down. Will update the JIRA later.

maropu · 2017-01-12T23:56:20Z

okay, thanks!

maropu force-pushed the SPARK-17237 branch from 530d5c0 to 9aa5d7d Compare November 18, 2016 09:05

maropu changed the title ~~[SPARK-17237][SQL] Remove unnecessary backticks in a pivot result schema~~ [SPARK-17237][SQL] Remove backticks in a pivot result schema Nov 19, 2016

maropu added 2 commits January 11, 2017 00:39

Fix a bug to handle missing data after pivoting

ea0e62d

Fix tests in pivoting

3aac14f

maropu force-pushed the SPARK-17237 branch from 22743c7 to 3aac14f Compare January 10, 2017 15:40

gatorsmile reviewed Jan 10, 2017

Apply comments

2e567cc

asfgit closed this in 5585ed9 Jan 12, 2017

maropu mentioned this pull request Jan 13, 2017

[SPARK-17237][SPARK-17458][SQL][Backport-2.0] Preserve aliases that are given for pivot aggregations #16565

Closed

maropu deleted the SPARK-17237 branch July 5, 2017 11:44
