Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20067] [SQL] Unify and Clean Up Desc Commands Using Catalog Interface #17394

Closed
wants to merge 15 commits into from

Conversation

gatorsmile
Copy link
Member

@gatorsmile gatorsmile commented Mar 23, 2017

What changes were proposed in this pull request?

This PR is to unify and clean up the outputs of DESC EXTENDED/FORMATTED and SHOW TABLE EXTENDED by moving the logics into the Catalog interface. The output formats are improved. We also add the missing attributes. It impacts the DDL commands like SHOW TABLE EXTENDED, DESC EXTENDED and DESC FORMATTED.

In addition, by following what we did in Dataset API printSchema, we can use treeString to show the schema in the more readable way.

Below is the current way:

Schema: STRUCT<`a`: STRING (nullable = true), `b`: INT (nullable = true), `c`: STRING (nullable = true), `d`: STRING (nullable = true)>

After the change, it should look like

Schema: root
 |-- a: string (nullable = true)
 |-- b: integer (nullable = true)
 |-- c: string (nullable = true)
 |-- d: string (nullable = true)

How was this patch tested?

describe.sql and show-tables.sql

@SparkQA
Copy link

SparkQA commented Mar 23, 2017

Test build #75082 has finished for PR 17394 at commit 8720919.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 23, 2017

Test build #75087 has finished for PR 17394 at commit d999147.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member Author

cc @cloud-fan

@@ -97,16 +97,21 @@ DESC EXTENDED t
struct<col_name:string,data_type:string,comment:string>
-- !query 6 output
# Detailed Table Information CatalogTable(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to wrap the information with CatalogTable(...)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Let me remove it and move it to the next line

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output looks weird if we sort the Command, so file a PR: #17414

|-- a: string (nullable = true)
|-- b: integer (nullable = true)
|-- c: string (nullable = true)
|-- d: string (nullable = true)
Copy link
Member Author

@gatorsmile gatorsmile Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan Do we still need Schema, if the above info already has it? The only missing part is the nullability, which can be put in comment or add a separate column.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea we do not need, but I think CatalogTable.toString should still have the schema. Actually why do we have DESC EXTENDED? I think DESC FORMATTED is good enough

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below is what Programming Hive said:

Replacing EXTENDED with FORMATTED provides more readable but also more verbose output.

For most end-users, they should use DESC FORMATTED .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can keep the output of DESC EXTENDED as it is. Then, we need to revisit DESC FORMATTED to ensure it includes all the user-facing metadata info.

@SparkQA
Copy link

SparkQA commented Mar 25, 2017

Test build #75185 has finished for PR 17394 at commit 3859097.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 25, 2017

Test build #75200 has finished for PR 17394 at commit 1d72079.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Mar 25, 2017

Test build #75221 has finished for PR 17394 at commit 1d72079.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -273,28 +278,32 @@ case class CatalogTable(
if (bucketColumnNames.nonEmpty) s"Bucket Columns: $bucketColumnsString" else "",
if (sortColumnNames.nonEmpty) s"Sort Columns: $sortColumnsString" else ""
)

case _ => Nil
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should de-duplicate the codes.

@gatorsmile gatorsmile changed the title [SPARK-20067] [SQL] Use treeString to print out the table schema for CatalogTable [SPARK-20067] [SQL] Unify and Clean Up Desc Commands Using Catalog Interface Mar 26, 2017
@SparkQA
Copy link

SparkQA commented Mar 26, 2017

Test build #75232 has finished for PR 17394 at commit 68bb05c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 26, 2017

Test build #75243 has finished for PR 17394 at commit ac3f351.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -21,6 +21,7 @@ import java.net.URI
import java.util.Date

import com.google.common.base.Objects
import scala.collection.mutable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we put the imports of scala library above third-party.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change the order, can scalastyle pass?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually, the import order is:

import java.util.Date

import scala.collection.mutable

import com.google.common.base.Objects

Dataset.scala is an example.

@viirya
Copy link
Member

viirya commented Mar 27, 2017

Please update the PR description too since you update the title.

@cloud-fan
Copy link
Contributor

after this PR, what's the difference between DESC EXTENDED and DESC FORMATTED?

@gatorsmile
Copy link
Member Author

After refactoring, I think DESC FORMATTED might be more useful for users.

DESC FORMATTED outputs the metadata in a more formatted way. Users can query it using the key value (stored in col_name) to get the result. They do not need to parse the string to extract the data value, unlike DESC EXTENDED.

Below is the output of DESC FORMATTED

+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+-------+
|col_name                    |data_type                                                                                                                                         |comment|
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+-------+
|key                         |int                                                                                                                                               |test   |
|value                       |string                                                                                                                                            |null   |
|ds                          |string                                                                                                                                            |null   |
|hr                          |string                                                                                                                                            |null   |
|# Partition Information     |                                                                                                                                                  |       |
|# col_name                  |data_type                                                                                                                                         |comment|
|ds                          |string                                                                                                                                            |null   |
|hr                          |string                                                                                                                                            |null   |
|                            |                                                                                                                                                  |       |
|# Detailed Table Information|                                                                                                                                                  |       |
|Table                       |`default`.`tab1`                                                                                                                                  |       |
|Owner                       |xiao                                                                                                                                              |       |
|Created                     |Mon Mar 27 21:25:07 PDT 2017                                                                                                                      |       |
|Last Access                 |Wed Dec 31 16:00:00 PST 1969                                                                                                                      |       |
|Type                        |EXTERNAL                                                                                                                                          |       |
|Provider                    |hive                                                                                                                                              |       |
|Comment                     |Apache Spark                                                                                                                                      |       |
|Properties                  |[serialization.format=1]                                                                                                                          |       |
|Location                    |file:/private/var/folders/6r/15tqm8hn3ldb3rmbfqm1gf4c0000gn/T/spark-8d941400-8f59-4080-8094-611fa517f11e                                          |       |
|Serde Library               |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                                                                                                |       |
|InputFormat                 |org.apache.hadoop.mapred.TextInputFormat                                                                                                          |       |
|OutputFormat                |org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat                                                                                        |       |
|Partition Provider          |Catalog                                                                                                                                           |       |
|Partition Columns           |[`ds`, `hr`]                                                                                                                                      |       |
|Schema                      |root
 |-- key: integer (nullable = true)
 |-- value: string (nullable = true)
 |-- ds: string (nullable = true)
 |-- hr: string (nullable = true)
|       |
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+-------+

Below is the output of DESC EXTENDED

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+-------+
|col_name                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |data_type|comment|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+-------+
|key                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |int      |test   |
|value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |string   |null   |
|ds                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |string   |null   |
|hr                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |string   |null   |
|# Partition Information                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |         |       |
|# col_name                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |data_type|comment|
|ds                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |string   |null   |
|hr                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |string   |null   |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |         |       |
|# Detailed Table Information                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |         |       |
|Table: `default`.`tab1`
Owner: xiao
Created: Mon Mar 27 21:25:07 PDT 2017
Last Access: Wed Dec 31 16:00:00 PST 1969
Type: EXTERNAL
Provider: hive
Comment: Apache Spark
Properties: [serialization.format=1]
Location: file:/private/var/folders/6r/15tqm8hn3ldb3rmbfqm1gf4c0000gn/T/spark-8d941400-8f59-4080-8094-611fa517f11e
Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Partition Provider: Catalog
Partition Columns: [`ds`, `hr`]
Schema: root
 |-- key: integer (nullable = true)
 |-- value: string (nullable = true)
 |-- ds: string (nullable = true)
 |-- hr: string (nullable = true)
|         |       |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+-------+


@SparkQA
Copy link

SparkQA commented Mar 28, 2017

Test build #75295 has started for PR 17394 at commit 29817ea.

@SparkQA
Copy link

SparkQA commented Mar 28, 2017

Test build #75294 has finished for PR 17394 at commit bef1134.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

seems DESC EXTENDED is totally useless... How about we deprecate DESC EXTENDED at parser and implement it with DESC FORMATTED?

@gatorsmile
Copy link
Member Author

Yes. We do not need to keep the difference between DESC EXTENDED and DESC FORMATTED

@SparkQA
Copy link

SparkQA commented Apr 1, 2017

Test build #75440 has finished for PR 17394 at commit e116018.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -214,6 +215,7 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
// Returns true if the plan is supposed to be sorted.
def isSorted(plan: LogicalPlan): Boolean = plan match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe call it needSort?


# Detailed Table Information
Table `default`.`t`
Created[removed by SQLQueryTestSuite]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks weird, how about Created [not included in comparison]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Location[removed by SQLQueryTestSuite]sql/core/spark-warehouse/t
Partition Provider Catalog
Partition Columns [`c`, `d`]
Schema root
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this information is duplicated

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me remove both

d string

# Detailed Table Information
Table `default`.`t`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we separate it into 2 lines, Database and Table?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Detailed Partition Information CatalogPartition(
Partition Values: [c=Us, d=1]
Storage(Location: sql/core/spark-warehouse/t/c=Us/d=1)
Partition Parameters:{})
a string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should add a header here # col_name data_type comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

buffer: ArrayBuffer[Row]): Unit = {
append(buffer, "", "", "")
append(buffer, "Detailed Partition Information " + partition.toString, "", "")
if (isExtended) describeFormattedDetailedPartitionInfo(table, metadata, partition, result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to this PR, but it looks weird that DESC tbl and DESC tbl PARTITION (xxx) has the same result.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function describeDetailedPartitionInfo will only be called for the DDL command

  DESCRIBE [EXTENDED|FORMATTED] table_name PARTITION (partitionVal*)


# Detailed Partition Information
Database default
Table t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to this PR, but why do we put the Database and Table in # Detailed Partition Information? Did we follow hive?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

hive> DESC EXTENDED page_view PARTITION(dt='part1', country='part2');
OK
viewtime            	int                 	                    
userid              	bigint              	                    
page_url            	string              	                    
referrer_url        	string              	                    
ip                  	string              	IP Address of the User
dt                  	string              	                    
country             	string              	                    
	 	 
# Partition Information	 	 
# col_name            	data_type           	comment             
	 	 
dt                  	string              	                    
country             	string              	                    
	 	 
Detailed Partition Information	Partition(values:[part1, part2], dbName:default, tableName:page_view, createTime:1491066325, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:viewtime, type:int, comment:null), FieldSchema(name:userid, type:bigint, comment:null), FieldSchema(name:page_url, type:string, comment:null), FieldSchema(name:referrer_url, type:string, comment:null), FieldSchema(name:ip, type:string, comment:IP Address of the User), FieldSchema(name:dt, type:string, comment:null), FieldSchema(name:country, type:string, comment:null)], location:file:/user/hive/warehouse/page_view/dt=part1/country=part2, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:{totalSize=0, numRows=0, rawDataSize=0, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=0, transient_lastDdlTime=1491066325})	

@SparkQA
Copy link

SparkQA commented Apr 1, 2017

Test build #75446 has finished for PR 17394 at commit a6db8a3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}
}

case class SimpleDDLScan(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two classes are moved to DataSourceTest.scala

@SparkQA
Copy link

SparkQA commented Apr 2, 2017

Test build #75455 has finished for PR 17394 at commit 43668be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class DDLScanSource extends RelationProvider
  • case class SimpleDDLScan(



-- !query 7
-- !query 10
DESC t PARTITION (c='Us', d=1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the same result with DESC t, is it same with Hive?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

hive> DESC page_view PARTITION(dt='part1', country='part2');
OK
viewtime            	int                 	                    
userid              	bigint              	                    
page_url            	string              	                    
referrer_url        	string              	                    
ip                  	string              	IP Address of the User
dt                  	string              	                    
country             	string              	                    
	 	 
# Partition Information	 	 
# col_name            	data_type           	comment             
	 	 
dt                  	string              	                    
country             	string              	                    
Time taken: 0.12 seconds, Fetched: 13 row(s)
hive> DESC page_view;
OK
viewtime            	int                 	                    
userid              	bigint              	                    
page_url            	string              	                    
referrer_url        	string              	                    
ip                  	string              	IP Address of the User
dt                  	string              	                    
country             	string              	                    
	 	 
# Partition Information	 	 
# col_name            	data_type           	comment             
	 	 
dt                  	string              	                    
country             	string              	                    
Time taken: 0.022 seconds, Fetched: 13 row(s)

Partition Values [c=Us, d=1]
Location [not included in comparison]sql/core/spark-warehouse/t/c=Us/d=1

# Table Storage Information
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we have a new # Table Storage Information section for DESC EXTENDED t PARTITION (xxx)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hive does it. See the following output.

hive> DESC FORMATTED page_view PARTITION(dt='part1', country='part2');
OK
# col_name            	data_type           	comment             
	 	 
viewtime            	int                 	                    
userid              	bigint              	                    
page_url            	string              	                    
referrer_url        	string              	                    
ip                  	string              	IP Address of the User
	 	 
# Partition Information	 	 
# col_name            	data_type           	comment             
	 	 
dt                  	string              	                    
country             	string              	                    
	 	 
# Detailed Partition Information	 	 
Partition Value:    	[part1, part2]      	 
Database:           	default             	 
Table:              	page_view           	 
CreateTime:         	Sat Apr 01 17:05:25 UTC 2017	 
LastAccessTime:     	UNKNOWN             	 
Location:           	file:/user/hive/warehouse/page_view/dt=part1/country=part2	 
Partition Parameters:	 	 
	COLUMN_STATS_ACCURATE	{\"BASIC_STATS\":\"true\"}
	numFiles            	0                   
	numRows             	0                   
	rawDataSize         	0                   
	totalSize           	0                   
	transient_lastDdlTime	1491066325          
	 	 
# Storage Information	 	 
SerDe Library:      	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	 
InputFormat:        	org.apache.hadoop.mapred.SequenceFileInputFormat	 
OutputFormat:       	org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat	 
Compressed:         	No                  	 
Num Buckets:        	-1                  	 
Bucket Columns:     	[]                  	 
Sort Columns:       	[]                  	 
Storage Desc Params:	 	 
	serialization.format	1                   
Time taken: 0.124 seconds, Fetched: 39 row(s)

hive> DESC EXTENDED page_view PARTITION(dt='part1', country='part2');
OK
viewtime            	int                 	                    
userid              	bigint              	                    
page_url            	string              	                    
referrer_url        	string              	                    
ip                  	string              	IP Address of the User
dt                  	string              	                    
country             	string              	                    
	 	 
# Partition Information	 	 
# col_name            	data_type           	comment             
	 	 
dt                  	string              	                    
country             	string              	                    
	 	 
Detailed Partition Information	Partition(values:[part1, part2], dbName:default, tableName:page_view, createTime:1491066325, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:viewtime, type:int, comment:null), FieldSchema(name:userid, type:bigint, comment:null), FieldSchema(name:page_url, type:string, comment:null), FieldSchema(name:referrer_url, type:string, comment:null), FieldSchema(name:ip, type:string, comment:IP Address of the User), FieldSchema(name:dt, type:string, comment:null), FieldSchema(name:country, type:string, comment:null)], location:file:/user/hive/warehouse/page_view/dt=part1/country=part2, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:{totalSize=0, numRows=0, rawDataSize=0, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=0, transient_lastDdlTime=1491066325})	
Time taken: 0.116 seconds, Fetched: 15 row(s)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hive use # Storage Information, which seems to be the storage of the partition, not the table

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Just checked the source code of Hive. This section in Hive is the storage for Partition, but the info was originally from the table storage, as shown in the codes. We did the same thing.

The only difference is our BucketSpec is not part of CatalogStorageFormat. Should we follow Hive to move it into CatalogStorageFormat? Or do it as a follow-up? After the move, we can easily output the exactly the same things as Hive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we rename this section to Storage Information? We may have per-partition storage in the future, although for now this is always table storage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, let me make the change.

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Apr 4, 2017

Test build #75497 has finished for PR 17394 at commit 862a4d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member Author

Thanks! Merging to master.

@asfgit asfgit closed this in 51d3c85 Apr 4, 2017
@dongjoon-hyun
Copy link
Member

Hi, All .
This seems to be incompatible changes even in DESC t from 2.2.0.
Is it better to be compatible on DESC t which has no EXTENDED|FORMATTED?

@gatorsmile
Copy link
Member Author

Add the following line when issuing DESC t on partitioned tables?
# col_name data_type comment

@dongjoon-hyun
Copy link
Member

Yep. That one.

@dongjoon-hyun
Copy link
Member

It's reported at https://issues.apache.org/jira/browse/SPARK-20954 .
If the current one is better, could you give some advice on that? :)

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Jun 5, 2017

What I mean is to remove # col_name data_type comment in case of DESC t.

@gatorsmile
Copy link
Member Author

Sure, we can remove it. Feel free to submit a PR.

@dongjoon-hyun
Copy link
Member

Thank you for advice. I'll make a PR soon.

@dongjoon-hyun
Copy link
Member

Hi, All.
I made a PR, #18203.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants