
[SPARK-29225][SQL] Change Spark SQL DESC FORMATTED format same with Hive#25917

Closed
AngersZhuuuu wants to merge 6 commits into apache:master from AngersZhuuuu:SPARK-29225

Conversation

@AngersZhuuuu
Contributor

@AngersZhuuuu AngersZhuuuu commented Sep 24, 2019

What changes were proposed in this pull request?

As I mentioned in SPARK-29225, the output of Spark SQL's DESC FORMATTED table_name differs from Hive's format in many places.
Some JDBC query platforms, such as Hue, parse the query result assuming Hive's format.
Take column information as an example:
Hue runs a DESC FORMATTED table query and parses the column information out of the result based on its format.
Since Spark SQL produces a different format, the displayed result is broken.

Why are the changes needed?

For better interaction with JDBC query platforms such as Hue.

Does this PR introduce any user-facing change?

Create a table as below:

create table order_partition(
    id string,
    name string,
    num int,
    order_number string,
    event_time string)
PARTITIONED BY (event_month string)

Then run DESC FORMATTED order_partition.

Original format: [screenshot omitted]

Current format: [screenshot omitted]

Changes
When DESC FORMATTED table_name is called:

  1. Add a header before the non-partition columns
  2. Add an empty line between the header and the columns
  3. Add an empty line between the normal columns and # Partition Information
  4. Remove the partition columns from the normal column information; show them only below # Partition Information
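
For illustration, the Hive-style layout these changes aim at looks roughly like this for the example table above (a sketch only, not the PR's exact output):

```
# col_name              data_type       comment

id                      string
name                    string
num                     int
order_number            string
event_time              string

# Partition Information
# col_name              data_type       comment

event_month             string
```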

How was this patch tested?

MT

Member

@HyukjinKwon HyukjinKwon left a comment


@AngersZhuuuu I think it affects user-facing output. Please describe before/after outputs

@AngersZhuuuu
Contributor Author

@AngersZhuuuu I think it affects user-facing output. Please describe before/after outputs

Added screenshots of the before/after results, since the formatted result strings are too long to paste as text.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-29225][SQL] Change Spark SQL DESC FORMATTED TABLE_NAME format same with Hive [SPARK-29225][SQL] Change Spark SQL DESC FORMATTED format same with Hive Oct 4, 2019
@dongjoon-hyun
Member

ok to test

@SparkQA

SparkQA commented Oct 4, 2019

Test build #111786 has finished for PR 25917 at commit 78b3079.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 4, 2019

Test build #111789 has finished for PR 25917 at commit 78b3079.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

The failures look relevant to this PR, @AngersZhuuuu.

@AngersZhuuuu
Contributor Author

The failures look relevant to this PR, @AngersZhuuuu.

Got it, the UT needs to be updated.

@SparkQA

SparkQA commented Oct 5, 2019

Test build #111799 has finished for PR 25917 at commit d704d2f.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Contributor Author

@dongjoon-hyun
UT changed, and it passes now, but the build failed with error code -9.

@dongjoon-hyun
Member

Retest this please.

if (partColNames.isEmpty) {
  schema
} else {
  StructType(schema.filter(col => !partColNames.contains(col.name)))
}
Member


Although this function is used only for data from the metadata for now, this is a usual suspect for case-sensitivity failures.

We don't know who will use this function later, and since partColNames is Seq[String], it's not safe. We had better take case sensitivity into account. Please refer to SchemaUtils.normalizePartitionSpec and SchemaUtils.checkColumnNameDuplication. You can use resolver to be robust.
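
A minimal sketch of the resolver-based comparison being suggested (hypothetical, self-contained code, not the PR's actual change: Spark's real StructType/StructField and analysis.Resolver are simplified to plain strings here):

```scala
// Hypothetical sketch of resolver-based partition-column filtering.
// Spark models a resolver as (String, String) => Boolean; plain column
// names stand in for StructField to keep the example dependency-free.
object NonPartitionSchema {
  type Resolver = (String, String) => Boolean

  val caseInsensitiveResolution: Resolver = (a, b) => a.equalsIgnoreCase(b)
  val caseSensitiveResolution: Resolver = (a, b) => a == b

  // Drop partition columns from the column list, comparing names with
  // the session's resolver instead of a plain `contains` check.
  def nonPartitionColumns(
      columns: Seq[String],
      partColNames: Seq[String],
      resolver: Resolver): Seq[String] = {
    if (partColNames.isEmpty) {
      columns
    } else {
      columns.filterNot(col => partColNames.exists(p => resolver(p, col)))
    }
  }
}
```

With caseInsensitiveResolution, a partition column declared as event_month also matches Event_Month in the schema, which the plain partColNames.contains(col.name) check would miss.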

Contributor Author


@dongjoon-hyun
Updated. The current approach should be robust enough; since this only renders the table description, I use resolver to check column-name equality.
By the way, it's PartitionUtils.normalizePartitionSpec.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111810 has finished for PR 25917 at commit d704d2f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111812 has finished for PR 25917 at commit 3d8160d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111816 has finished for PR 25917 at commit 52eb432.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 6, 2019

Test build #111817 has finished for PR 25917 at commit 950bcfc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Contributor Author

@dongjoon-hyun @HyukjinKwon Any more suggestions for this issue?

@github-actions

github-actions bot commented Feb 3, 2020

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Feb 3, 2020
@github-actions github-actions bot closed this Feb 4, 2020
