[SPARK-43969][SQL] Refactor & Assign names to the error class _LEGACY_ERROR_TEMP_1170 #41458
Conversation
```diff
     query: LogicalPlan): Throwable = {
       new AnalysisException(
-        errorClass = "INSERT_COLUMN_ARITY_MISMATCH",
+        errorClass = "INSERT_COLUMN_ARITY_MISMATCH.TOO_MANY_DATA_COLUMNS",
         messageParameters = Map(
           "tableName" -> tableName,
```
Use `toSQLId`, please.
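For context, a minimal sketch of what an identifier-quoting helper like `toSQLId` does. This is an illustrative re-implementation under assumptions, not Spark's actual code: each name part is wrapped in backticks, with embedded backticks doubled.

```scala
object IdQuoting {
  // Illustrative toSQLId-style quoting (an assumption, not Spark's source):
  // wrap each identifier part in backticks, doubling any embedded backtick.
  private def quote(part: String): String =
    "`" + part.replace("`", "``") + "`"

  // Quote a multi-part identifier given as separate parts.
  def toSQLId(parts: Seq[String]): String =
    parts.map(quote).mkString(".")

  // Quote a dotted identifier string by splitting it into parts first.
  def toSQLId(id: String): String =
    toSQLId(id.split("\\.").toSeq)
}
```

Under this sketch, `IdQuoting.toSQLId("spark_catalog.default.tbl")` renders each part in backticks, which is the formatting the reviewer asks for in the `tableName` message parameter.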
```diff
@@ -1651,6 +1665,11 @@
       ],
       "sqlState" : "46110"
     },
+    "NOT_SUPPORTED_COMMAND_WITHOUT_HIVE_SUPPORT" : {
+      "message" : [
+        "<cmd> is not supported, if you want to enable it, please set `spark.sql.catalogImplementation` to `hive`."
```
Please quote the SQL config and its value in the same way as `toSQLConf` and `toSQLConfVal`.
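A minimal sketch of what such config-quoting helpers could look like. This is an assumption about their behavior (consistent double-quoting of config keys and values in error messages), not Spark's actual implementation:

```scala
object ConfQuoting {
  // Illustrative sketch (assumption, not Spark's source): wrap the config key
  // and its value in double quotes so they render consistently in messages.
  def toSQLConf(conf: String): String = "\"" + conf + "\""
  def toSQLConfVal(value: String): String = "\"" + value + "\""
}

// With these helpers, the message template would render e.g. as:
//   <cmd> is not supported, if you want to enable it,
//   please set "spark.sql.catalogImplementation" to "hive".
```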
```diff
-      errorClass = "_LEGACY_ERROR_TEMP_1170",
-      messageParameters = Map("detail" -> detail))
+      errorClass = "NOT_SUPPORTED_COMMAND_WITHOUT_HIVE_SUPPORT",
+      messageParameters = Map("cmd" -> cmd))
```
Is it a SQL statement? If so, quote it by `toSQLStmt`.
```diff
+    "subClass" : {
+      "NOT_ENOUGH_DATA_COLUMNS" : {
+        "message" : [
+          "not enough data columns: ",
```
detail reason
```diff
+      },
+      "TOO_MANY_DATA_COLUMNS" : {
+        "message" : [
+          "too many data columns: ",
```
detail reason
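For orientation, the overall shape of an error class with sub-classes in `error-classes.json` looks roughly like the following. The top-level message text here is illustrative; the `sqlState` value `21S01` is the one that appears in the test output later in this thread:

```json
"INSERT_COLUMN_ARITY_MISMATCH" : {
  "message" : [
    "..."
  ],
  "subClass" : {
    "NOT_ENOUGH_DATA_COLUMNS" : {
      "message" : [ "not enough data columns: ..." ]
    },
    "TOO_MANY_DATA_COLUMNS" : {
      "message" : [ "too many data columns: ..." ]
    }
  },
  "sqlState" : "21S01"
}
```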
```diff
-      "dataColumns" -> "'1', '2', '3'",
       "staticPartCols" -> "`b`, `c`",
       "tableColumns" -> "`a`, `d`, `b`, `c`",
+      "dataColumns" -> "`1`, `2`, `3`",
+      "tableName" -> s"`spark_catalog`.`default`.`$tableName`")
```
Wonder why `$tableName` still starts from `$`?
Because it is a variable, its value in this case may be 'hive_table' or 'ds_table'. Perhaps ``s"`spark_catalog`.`default`.`${tableName}`"`` can better express this?
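The point about the interpolated variable can be shown in a self-contained snippet (the table-name values are the examples mentioned in the discussion):

```scala
// `$tableName` stays in the source because the test is parameterized over the
// table name; at runtime the s-interpolator substitutes the current value.
val tableName = "hive_table" // may also be "ds_table" in the parameterized test
val expected = s"`spark_catalog`.`default`.`$tableName`"
// expected == "`spark_catalog`.`default`.`hive_table`"
```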
```diff
-      "reason" -> "not enough data columns",
-      "tableColumns" -> "'a', 'b'",
-      "dataColumns" -> "'a'"))
+      "tableName" -> toSQLId("unknown"),
```
Just quote it with backticks.
Does it mean that in tests we should try not to use `toSQLId` as much as possible? I'm actually a bit confused. 😄
First of all, we need to be consistent with other tests, where we just place the expected values. Second, if you have a bug in `toSQLId`, the test should catch it. But the reason is the first one.
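The reviewer's second point can be sketched as follows (the helper here is an illustrative stand-in, not Spark's actual `toSQLId`):

```scala
// Illustrative helper, not Spark's source: backtick-quote each dotted part.
def toSQLId(id: String): String =
  id.split("\\.").map("`" + _ + "`").mkString(".")

// Fragile pattern: a test that re-derives the expectation with the helper
// under test would still pass if the helper were broken, because both sides
// would contain the same bug:
//   assert(actual == toSQLId("unknown"))

// Robust pattern: a literal expected string catches regressions in the
// quoting logic itself.
assert(toSQLId("unknown") == "`unknown`")
```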
```diff
-      errorClass = "_LEGACY_ERROR_TEMP_1170",
-      messageParameters = Map("detail" -> detail))
+      errorClass = "NOT_SUPPORTED_COMMAND_WITHOUT_HIVE_SUPPORT",
+      messageParameters = Map("cmd" -> toSQLStmt(cmd)))
```
`cmd` is not a SQL statement in general; see what is passed to here:
"INSERT OVERWRITE DIRECTORY with the Hive format"
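A sketch of why `toSQLStmt` does not fit here. Assuming it normalizes its input to upper case (as a statement-keyword formatter would), it mangles the free-text part of the command description:

```scala
import java.util.Locale

// Assumed behavior of a statement-keyword formatter, not Spark's source:
// upper-case the whole input.
def toSQLStmt(text: String): String = text.toUpperCase(Locale.ROOT)

val cmd = "INSERT OVERWRITE DIRECTORY with the Hive format"
// toSQLStmt would turn the descriptive suffix "with the Hive format" into
// "WITH THE HIVE FORMAT", which reads as if it were SQL syntax. The trailing
// words are prose, not keywords, so the plain value should be passed through.
```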
OK
Waiting for CI. @panbingkun Could you re-trigger GAs, please.
Done.
@panbingkun Could you fix the test failure, please.
Done, let's wait for CI.
+1, LGTM. Merging to master.
…_ERROR_TEMP_1170

### What changes were proposed in this pull request?
The pr aims to:
- Refactor `PreWriteCheck` to use error framework.
- Make `INSERT_COLUMN_ARITY_MISMATCH` more generic & avoid embedding the error's text in source code.
- Assign name to _LEGACY_ERROR_TEMP_1170.
- In `INSERT_PARTITION_COLUMN_ARITY_MISMATCH` error message, replace '' with `toSQLId` for table column name.

### Why are the changes needed?
The changes improve the error framework.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Manually test.
- Pass GA.

Closes apache#41458 from panbingkun/refactor_PreWriteCheck.

Lead-authored-by: panbingkun <pbk1982@gmail.com>
Co-authored-by: panbingkun <84731559@qq.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
### What changes were proposed in this pull request?
#41458 updated `numeric.sql.out` but did not update `numeric.sql.out.java21`; this pr updates `numeric.sql.out.java21` for Java 21.

### Why are the changes needed?
Fix golden file for Java 21.
https://github.com/apache/spark/actions/runs/5362442727/jobs/9729315685

```
[info] - postgreSQL/numeric.sql *** FAILED *** (1 minute, 4 seconds)
[info]   postgreSQL/numeric.sql
[info]   Expected "...OLUMN_ARITY_MISMATCH[",
[info]     "sqlState" : "21S01",
[info]     "messageParameters" : {
[info]       "dataColumns" : "'id', 'id', 'val', 'val', '(val * val)'",
[info]       "reason" : "too many data columns",
[info]       "tableColumns" : "'id1', 'id2', 'result']",
[info]       "tableName" :...", but got "...OLUMN_ARITY_MISMATCH[.TOO_MANY_DATA_COLUMNS",
[info]     "sqlState" : "21S01",
[info]     "messageParameters" : {
[info]       "dataColumns" : "`id`, `id`, `val`, `val`, `(val * val)`",
[info]       "tableColumns" : "`id1`, `id2`, `result`]",
[info]       "tableName" :..." Result did not match for query #474
[info]   INSERT INTO num_result SELECT t1.id, t2.id, t1.val, t2.val, t1.val * t2.val
[info]     FROM num_data t1, num_data t2 (SQLQueryTestSuite.scala:848)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.funsuite.AnyFunSuite.newAssertionFailedException(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Assertions.assertResult(Assertions.scala:847)
[info]   at org.scalatest.Assertions.assertResult$(Assertions.scala:842)
[info]   at org.scalatest.funsuite.AnyFunSuite.assertResult(AnyFunSuite.scala:1564)
[info]   at org.apache.spark.sql.SQLQueryTestSuite.$anonfun$readGoldenFileAndCompareResults$3(SQLQueryTestSuite.scala:848)
```

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual checked using Java 21

Closes #41720 from LuciferYang/SPARK-43969-FOLLOWUP-2.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>