
[SPARK-29225][SQL] Change Spark SQL DESC FORMATTED format same with Hive#25917

Closed
AngersZhuuuu wants to merge 6 commits into apache:master from AngersZhuuuu:SPARK-29225

Conversation

@AngersZhuuuu
Contributor

@AngersZhuuuu AngersZhuuuu commented Sep 24, 2019

What changes were proposed in this pull request?

As I mentioned in SPARK-29225, the output of Spark SQL's DESC FORMATTED table_name differs from Hive's format in many places.
Some JDBC query platforms, such as Hue, parse the query result assuming Hive's format.
Take column information as an example:
Hue runs a DESC FORMATTED table query and parses the column information out of the result based on its format.
Since Spark SQL produces a different format, the displayed result is broken.

Why are the changes needed?

For better interaction with JDBC query platforms such as Hue.

Does this PR introduce any user-facing change?

Create a table as below:

create table order_partition(
    id string,
    name string,
    num int,
    order_number string,
    event_time string)
PARTITIONED BY (event_month string)

Then run DESC FORMATTED order_partition.

Original format: [screenshot omitted]

Current format: [screenshot omitted]

Changes
When DESC FORMATTED table_name is called:

  1. Add a header before the non-partition columns
  2. Add an empty line between the header and the columns
  3. Add an empty line between the normal columns and # Partition Information
  4. Remove the partition columns from the normal column information; show them only below # Partition Information
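
For illustration, the Hive-style layout these changes aim at looks roughly like this for the example table above (a sketch only, not the PR's exact output):

```
# col_name              data_type       comment

id                      string
name                    string
num                     int
order_number            string
event_time              string

# Partition Information
# col_name              data_type       comment

event_month             string
```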

How was this patch tested?

MT

Member

@HyukjinKwon HyukjinKwon left a comment


@AngersZhuuuu I think it affects user-facing output. Please describe before/after outputs

@AngersZhuuuu
Contributor Author

@AngersZhuuuu I think it affects user-facing output. Please describe before/after outputs

Added screenshots of the before/after results, since the formatted result strings are too long to paste as text.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-29225][SQL] Change Spark SQL DESC FORMATTED TABLE_NAME format same with Hive [SPARK-29225][SQL] Change Spark SQL DESC FORMATTED format same with Hive Oct 4, 2019
@dongjoon-hyun
Member

ok to test

@SparkQA

SparkQA commented Oct 4, 2019

Test build #111786 has finished for PR 25917 at commit 78b3079.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 4, 2019

Test build #111789 has finished for PR 25917 at commit 78b3079.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

The failures look relevant to this PR, @AngersZhuuuu.

@AngersZhuuuu
Contributor Author

The failures look relevant to this PR, @AngersZhuuuu.

Got it, the UT needs to be updated.

@SparkQA

SparkQA commented Oct 5, 2019

Test build #111799 has finished for PR 25917 at commit d704d2f.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Contributor Author

@dongjoon-hyun
UT changed, and it passes now, but the build failed with error code -9.

@dongjoon-hyun
Member

Retest this please.

if (partColNames.isEmpty) {
  schema
} else {
  StructType(schema.filter(col => !partColNames.contains(col.name)))
}
Member


Although this function is used only for data from the metadata for now, this is a usual suspect for case-sensitivity failures.

We don't know who will use this function later, and since partColNames is Seq[String], it's not safe. We had better take case sensitivity into account. Please refer to SchemaUtils.normalizePartitionSpec and SchemaUtils.checkColumnNameDuplication. You can use resolver to be robust.
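
A minimal sketch of the resolver-based comparison being suggested (hypothetical, self-contained code, not the PR's actual change: Spark's real StructType/StructField and analysis.Resolver are simplified to plain strings here):

```scala
// Hypothetical sketch of resolver-based partition-column filtering.
// Spark models a resolver as (String, String) => Boolean; plain column
// names stand in for StructField to keep the example dependency-free.
object NonPartitionSchema {
  type Resolver = (String, String) => Boolean

  val caseInsensitiveResolution: Resolver = (a, b) => a.equalsIgnoreCase(b)
  val caseSensitiveResolution: Resolver = (a, b) => a == b

  // Drop partition columns from the column list, comparing names with
  // the session's resolver instead of a plain `contains` check.
  def nonPartitionColumns(
      columns: Seq[String],
      partColNames: Seq[String],
      resolver: Resolver): Seq[String] = {
    if (partColNames.isEmpty) {
      columns
    } else {
      columns.filterNot(col => partColNames.exists(p => resolver(p, col)))
    }
  }
}
```

With caseInsensitiveResolution, a partition column declared as event_month also matches Event_Month in the schema, which the plain partColNames.contains(col.name) check would miss.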

Contributor Author


@dongjoon-hyun
Updated. The current approach should be robust enough; since this only renders the table description, I use resolver to check column-name equality.
By the way, it's PartitionUtils.normalizePartitionSpec.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111810 has finished for PR 25917 at commit d704d2f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111812 has finished for PR 25917 at commit 3d8160d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111816 has finished for PR 25917 at commit 52eb432.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111817 has finished for PR 25917 at commit 950bcfc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Contributor Author

@dongjoon-hyun @HyukjinKwon Any more suggestions for this issue?

@github-actions

github-actions bot commented Feb 3, 2020

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Feb 3, 2020
@github-actions github-actions bot closed this Feb 4, 2020
