[SPARK-38216][SQL] Fail early if all the columns are partitioned columns when creating a Hive table #35527
yikf wants to merge 1 commit into apache:master from yikf:ct-hive
Could you please take a look when you have time? Thanks in advance, @cloud-fan.
    conf.resolver)

  if (schema.nonEmpty && normalizedPartitionCols.length == schema.length) {
    if (DDLUtils.isHiveTable(table)) {
Can we check the commit history here? It seems we intentionally exclude Hive tables here.
There doesn't seem to be any relevant information in the commit messages, but judging from the code comment, it seems we did this on purpose.
However, in HiveClientImpl.toHiveTable we split the schema into the partition columns and the other columns. If all columns are partitioned columns, hiveTable.getFields returns an empty result, so Hive throws an exception saying that the table must have at least one column.
If Hive allows cols to inherit partitioned columns, we should not do the partitioning in toHiveTable; if not, we should fail early. I'm sorry, I'm not sure about that.
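The split described above can be illustrated with a minimal, Spark-free sketch. `SplitSketch`, `Column`, and `split` are hypothetical names introduced only for illustration; they are not Spark or Hive APIs, and the sketch ignores case-insensitive name resolution. It only shows that when every column is a partition column, nothing is left over for the regular column list.

```scala
object SplitSketch {
  // Hypothetical stand-in for a table column; not Spark's StructField.
  final case class Column(name: String, dataType: String)

  // Split a schema into (partition columns, remaining data columns) by name.
  def split(
      schema: Seq[Column],
      partitionColumnNames: Seq[String]): (Seq[Column], Seq[Column]) =
    schema.partition(c => partitionColumnNames.contains(c.name))

  def main(args: Array[String]): Unit = {
    // Mirrors CREATE TABLE tab (c1 int) PARTITIONED BY (c1)
    val schema = Seq(Column("c1", "int"))
    val (partCols, dataCols) = split(schema, Seq("c1"))
    println(s"partition columns: ${partCols.map(_.name)}")
    println(s"data columns empty: ${dataCols.isEmpty}")
  }
}
```

With a second, non-partition column in the schema, the data column list would be non-empty and the situation discussed here would not arise.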
Sorry just got to this.
"If Hive allows cols to inherit partitioned columns, we should not do the partitioning in toHiveTable; if not, we should fail early. I'm sorry, I'm not sure about that."
... what do we mean by "cols to inherit partitioned columns"?
Can one of the admins verify this patch?
  test("SPARK-38216: Fail early if all the columns are partitioned columns") {
    assertAnalysisError(
      "CREATE TABLE tab (c1 int) PARTITIONED BY (c1) STORED AS PARQUET",
      "Cannot use all columns for partition columns")
what was the result of this query before this PR?
Hive threw an exception: new HiveException("at least one column must be specified for the table")
|
thanks, merging to master!
What changes were proposed in this pull request?
In Hive, the schema and the partition columns must be disjoint sets. If all columns of a Hive table are partitioned columns, the set of remaining columns is empty and table creation fails in Hive with the following error:
throw new HiveException("at least one column must be specified for the table")
That's because we perform the disjoint operation in toHiveTable. So, when creating a Hive table, fail early if all the columns are partitioned columns.
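As a rough illustration of the fail-early idea, the sketch below applies the condition visible in the reviewed diff (`schema.nonEmpty && normalizedPartitionCols.length == schema.length`) to plain column names. `FailEarlySketch`, `AnalysisError`, and `checkPartitionColumns` are hypothetical names, not Spark APIs, and the sketch assumes the partition column names have already been normalized and validated against the schema.

```scala
object FailEarlySketch {
  // Hypothetical stand-in for Spark's analysis-time exception.
  final class AnalysisError(message: String) extends Exception(message)

  // Reject a table definition in which every column is a partition column.
  def checkPartitionColumns(
      columnNames: Seq[String],
      partitionColumnNames: Seq[String]): Unit = {
    if (columnNames.nonEmpty && partitionColumnNames.length == columnNames.length) {
      throw new AnalysisError("Cannot use all columns for partition columns")
    }
  }

  def main(args: Array[String]): Unit = {
    // Passes: c1 remains as a regular column.
    checkPartitionColumns(Seq("c1", "c2"), Seq("c2"))
    // Fails during the check, before any table would be created.
    try {
      checkPartitionColumns(Seq("c1"), Seq("c1"))
    } catch {
      case e: AnalysisError => println(e.getMessage)
    }
  }
}
```

Checking at this stage means the user sees the same message as for other table types, rather than an exception raised later by Hive.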
Why are the changes needed?
Unify the analysis error message raised when creating a table in which all the columns are partitioned columns.
Does this PR introduce any user-facing change?
Yes, but only the error message changes.
How was this patch tested?
Added a unit test.