[SPARK-38216][SQL] Fail early if all the columns are partitioned columns when creating a Hive table #35527

Closed
yikf wants to merge 1 commit into apache:master from yikf:ct-hive

Conversation

@yikf yikf commented Feb 15, 2022

What changes were proposed in this pull request?

In Hive, the schema and the partition columns must be disjoint sets. If all columns of a Hive table are partition columns, the set of remaining columns is empty, and Hive fails when it creates the table with the following error:

throw new HiveException( "at least one column must be specified for the table")
This happens because toHiveTable separates the partition columns from the other columns.

So when creating a Hive table, fail early if all the columns are partition columns.
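A minimal sketch of the fail-early check follows. It is a simplified, self-contained model, not Spark's code: the Column type and the checkNotAllPartitionColumns helper are hypothetical names introduced for illustration, and partition column names are assumed to be already normalized. Only the condition and the message text are taken from the diff and test in this PR.

```scala
// Simplified model for illustration; these are not Spark's classes.
object PartitionCheckSketch {
  // Hypothetical stand-in for a schema field.
  final case class Column(name: String, dataType: String)

  // Message text taken from the unit test in this PR.
  val AllPartitionColumnsMessage = "Cannot use all columns for partition columns"

  // Returns Left(message) when every schema column is a partition column,
  // mirroring the condition
  //   schema.nonEmpty && normalizedPartitionCols.length == schema.length
  // Assumption: partitionCols is already normalized against the schema.
  def checkNotAllPartitionColumns(
      schema: Seq[Column],
      partitionCols: Seq[String]): Either[String, Unit] = {
    if (schema.nonEmpty && partitionCols.length == schema.length) {
      Left(AllPartitionColumnsMessage)
    } else {
      Right(())
    }
  }
}
```

With this model, a table declared as (c1 int) and partitioned by c1 is rejected during analysis, while a table with at least one non-partition column passes the check.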

Why are the changes needed?

Unify the analysis error message raised when creating a table in which all the columns are partition columns.

Does this PR introduce any user-facing change?

Yes, but only the error message changes.

How was this patch tested?

Added a unit test.

@github-actions github-actions bot added the SQL label Feb 15, 2022
@yikf yikf changed the title Fail early if all the columns are partitioned columns when creating a Hive table [SPARK-38216][SQL] Fail early if all the columns are partitioned columns when creating a Hive table Feb 15, 2022
yikf commented Feb 15, 2022

Could you please take a look when you have time? Thanks in advance, @cloud-fan

conf.resolver)

if (schema.nonEmpty && normalizedPartitionCols.length == schema.length) {
if (DDLUtils.isHiveTable(table)) {

Contributor

Can we check the commit history here? It seems we intentionally excluded Hive tables here.

Contributor Author

There doesn't seem to be any relevant information in the commit history messages. Judging from the code comment, it seems we did this on purpose.

But in HiveClientImpl.toHiveTable, we split the schema into partition cols and other cols. If all columns are partition columns, hiveTable.getFields returns an empty result, so Hive throws an exception saying that at least one column must be specified.

If Hive allows cols to inherit partitioned columns, we should not do the split in toHiveTable; if not, we should fail early. I'm sorry, I'm not sure about that.
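The behaviour described in this comment can be sketched as follows. This is a simplified model under stated assumptions, not the code of HiveClientImpl or of Hive: Field, split, and validate are hypothetical names, and only the error message text comes from this thread.

```scala
// Simplified model for illustration; not HiveClientImpl or Hive code.
object ToHiveTableSketch {
  // Hypothetical stand-in for a schema field.
  final case class Field(name: String, dataType: String)

  // Splits the schema into (partition columns, other columns) by name.
  def split(
      schema: Seq[Field],
      partitionColumnNames: Seq[String]): (Seq[Field], Seq[Field]) =
    schema.partition(f => partitionColumnNames.contains(f.name))

  // Hypothetical stand-in for the validity check on the Hive side:
  // it rejects a table whose non-partition column list is empty.
  def validate(otherCols: Seq[Field]): Unit =
    if (otherCols.isEmpty) {
      throw new IllegalStateException(
        "at least one column must be specified for the table")
    }
}
```

When every field is a partition column, the second element of the split is empty, which is the situation in which the late error would surface.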

Contributor

@somani can you take a look?

@somani somani Feb 17, 2022

Sorry, just got to this.

If Hive allows cols to inherit partitioned columns, we should not do partition in toHiveTable, if not, we should fail early, I'm sorry I'm not sure about that

... what do we mean by "cols to inherit partitioned columns"?

@AmplabJenkins

Can one of the admins verify this patch?

test("SPARK-38216: Fail early if all the columns are partitioned columns") {
assertAnalysisError(
"CREATE TABLE tab (c1 int) PARTITIONED BY (c1) STORED AS PARQUET",
"Cannot use all columns for partition columns")

Contributor

what was the result of this query before this PR?

Contributor Author

Hive threw the exception new HiveException( "at least one column must be specified for the table").

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in bd79378 Feb 17, 2022