[SPARK-39392][SQL][3.3] Refine ANSI error messages for try_* function hints

### What changes were proposed in this pull request?

Refine the ANSI error messages: replace the generic 'To return NULL instead' phrasing with hints that state which condition each `try_*` function tolerates.
This PR is a backport of #36780 from `master`.
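
For illustration, here is how the refined hint reads for a division by zero under ANSI mode (message text taken from the updated `error-classes.json` in this diff; the session setup is assumed):

```sql
-- Assumes a spark-sql session with ANSI mode enabled.
SET spark.sql.ansi.enabled = true;

SELECT 1 / 0;
-- Before: Division by zero. To return NULL instead, use `try_divide`. ...
-- After:  Division by zero. Use `try_divide` to tolerate divisor being 0
--         and return NULL instead. If necessary set "spark.sql.ansi.enabled"
--         to "false" (except for ANSI interval type) to bypass this error.
```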

### Why are the changes needed?

Improve error messaging for ANSI mode, since the user may not even be aware that the query was returning NULLs.
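
A minimal sketch of the pitfall (hypothetical input value; `try_cast` is the opt-in the message now explains):

```sql
-- With ANSI mode off, a malformed cast silently yields NULL:
SET spark.sql.ansi.enabled = false;
SELECT CAST('not-a-number' AS INT);      -- NULL, easy to miss

-- With ANSI mode on, the same query fails, and the refined message names
-- the condition being tolerated before suggesting the try_* alternative:
SET spark.sql.ansi.enabled = true;
SELECT CAST('not-a-number' AS INT);      -- error: CAST_INVALID_INPUT
SELECT try_cast('not-a-number' AS INT);  -- NULL, but opted in explicitly
```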

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit tests

Closes #36792 from vli-databricks/SPARK-39392-3.3.

Authored-by: Vitalii Li <vitalii.li@databricks.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
vitaliili-db authored and MaxGekk committed Jun 8, 2022
1 parent 86f1b6b commit 3a95293
Showing 22 changed files with 128 additions and 124 deletions.
core/src/main/resources/error/error-classes.json (10 changes: 5 additions & 5 deletions)
@@ -26,11 +26,11 @@
"message" : [ "Cannot use a mixture of aggregate function and group aggregate pandas UDF" ]
},
"CAST_INVALID_INPUT" : {
"message" : [ "The value <value> of the type <sourceType> cannot be cast to <targetType> because it is malformed. Correct the value as per the syntax, or change its target type. To return NULL instead, use `try_cast`. If necessary set <config> to \"false\" to bypass this error." ],
"message" : [ "The value <value> of the type <sourceType> cannot be cast to <targetType> because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set <config> to \"false\" to bypass this error." ],
"sqlState" : "42000"
},
"CAST_OVERFLOW" : {
"message" : [ "The value <value> of the type <sourceType> cannot be cast to <targetType> due to an overflow. To return NULL instead, use `try_cast`. If necessary set <config> to \"false\" to bypass this error." ],
"message" : [ "The value <value> of the type <sourceType> cannot be cast to <targetType> due to an overflow. Use `try_cast` to tolerate overflow and return NULL instead. If necessary set <config> to \"false\" to bypass this error." ],
"sqlState" : "22005"
},
"CONCURRENT_QUERY" : {
@@ -41,7 +41,7 @@
"sqlState" : "22008"
},
"DIVIDE_BY_ZERO" : {
"message" : [ "Division by zero. To return NULL instead, use `try_divide`. If necessary set <config> to \"false\" (except for ANSI interval type) to bypass this error." ],
"message" : [ "Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set <config> to \"false\" (except for ANSI interval type) to bypass this error." ],
"sqlState" : "22012"
},
"DUPLICATE_KEY" : {
@@ -93,7 +93,7 @@
"message" : [ "The index <indexValue> is out of bounds. The array has <arraySize> elements. If necessary set <config> to \"false\" to bypass this error." ]
},
"INVALID_ARRAY_INDEX_IN_ELEMENT_AT" : {
"message" : [ "The index <indexValue> is out of bounds. The array has <arraySize> elements. To return NULL instead, use `try_element_at`. If necessary set <config> to \"false\" to bypass this error." ]
"message" : [ "The index <indexValue> is out of bounds. The array has <arraySize> elements. Use `try_element_at` to tolerate accessing element at invalid index and return NULL instead. If necessary set <config> to \"false\" to bypass this error." ]
},
"INVALID_FIELD_NAME" : {
"message" : [ "Field name <fieldName> is invalid: <path> is not a struct." ],
@@ -115,7 +115,7 @@
"sqlState" : "42000"
},
"MAP_KEY_DOES_NOT_EXIST" : {
"message" : [ "Key <keyValue> does not exist. To return NULL instead, use `try_element_at`. If necessary set <config> to \"false\" to bypass this error." ]
"message" : [ "Key <keyValue> does not exist. Use `try_element_at` to tolerate non-existent key and return NULL instead. If necessary set <config> to \"false\" to bypass this error." ]
},
"MISSING_COLUMN" : {
"message" : [ "Column '<columnName>' does not exist. Did you mean one of the following? [<proposal>]" ],
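
Each `try_*` function named in these messages returns NULL on exactly the condition the message now spells out; a quick sketch with hypothetical inputs:

```sql
SELECT try_divide(1, 0);                  -- NULL instead of DIVIDE_BY_ZERO
SELECT try_cast('2022-99-99' AS DATE);    -- NULL instead of CAST_INVALID_INPUT
SELECT try_cast(12345678901 AS INT);      -- NULL instead of CAST_OVERFLOW
SELECT try_element_at(map('a', 1), 'b');  -- NULL instead of MAP_KEY_DOES_NOT_EXIST
```
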
@@ -125,7 +125,8 @@ class SparkThrowableSuite extends SparkFunSuite {

// Does not fail with too many args (expects 0 args)
assert(getMessage("DIVIDE_BY_ZERO", Array("foo", "bar", "baz")) ==
"Division by zero. To return NULL instead, use `try_divide`. If necessary set foo " +
"Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. " +
"If necessary set foo " +
"to \"false\" (except for ANSI interval type) to bypass this error.")
}

@@ -493,7 +493,9 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase {
message: String,
hint: String = "",
errorContext: String = ""): ArithmeticException = {
-    val alternative = if (hint.nonEmpty) s" To return NULL instead, use '$hint'." else ""
+    val alternative = if (hint.nonEmpty) {
+      s" Use '$hint' to tolerate overflow and return NULL instead."
+    } else ""
new SparkArithmeticException(
errorClass = "ARITHMETIC_OVERFLOW",
messageParameters = Array(message, alternative, SQLConf.ANSI_ENABLED.key),
@@ -1093,7 +1095,8 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase {
value: Any, from: DataType, to: DataType, errorContext: String): Throwable = {
val valueString = toSQLValue(value, from)
new DateTimeException(s"Invalid input syntax for type ${toSQLType(to)}: $valueString. " +
s"To return NULL instead, use 'try_cast'. If necessary set ${SQLConf.ANSI_ENABLED.key} " +
s"Use `try_cast` to tolerate malformed input and return NULL instead. " +
s"If necessary set ${SQLConf.ANSI_ENABLED.key} " +
s"to false to bypass this error." + errorContext)
}

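
The `hint` parameter above feeds functions such as `try_add`; under ANSI mode an integer overflow then reads roughly as below (a sketch; the exact message prefix is supplied by the caller):

```sql
SET spark.sql.ansi.enabled = true;

SELECT 2147483647 + 1;
-- error: ARITHMETIC_OVERFLOW: integer overflow. Use 'try_add' to tolerate
-- overflow and return NULL instead. ...

SELECT try_add(2147483647, 1);  -- NULL
```
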
@@ -168,7 +168,7 @@ select element_at(array(1, 2, 3), 5)
struct<>
-- !query output
org.apache.spark.SparkArrayIndexOutOfBoundsException
-The index 5 is out of bounds. The array has 3 elements. To return NULL instead, use `try_element_at`. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+The index 5 is out of bounds. The array has 3 elements. Use `try_element_at` to tolerate accessing element at invalid index and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.


-- !query
@@ -177,7 +177,7 @@ select element_at(array(1, 2, 3), -5)
struct<>
-- !query output
org.apache.spark.SparkArrayIndexOutOfBoundsException
-The index -5 is out of bounds. The array has 3 elements. To return NULL instead, use `try_element_at`. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+The index -5 is out of bounds. The array has 3 elements. Use `try_element_at` to tolerate accessing element at invalid index and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.


-- !query
@@ -337,7 +337,7 @@ select element_at(array(1, 2, 3), 5)
struct<>
-- !query output
org.apache.spark.SparkArrayIndexOutOfBoundsException
-The index 5 is out of bounds. The array has 3 elements. To return NULL instead, use `try_element_at`. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+The index 5 is out of bounds. The array has 3 elements. Use `try_element_at` to tolerate accessing element at invalid index and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.


-- !query
@@ -346,7 +346,7 @@ select element_at(array(1, 2, 3), -5)
struct<>
-- !query output
org.apache.spark.SparkArrayIndexOutOfBoundsException
-The index -5 is out of bounds. The array has 3 elements. To return NULL instead, use `try_element_at`. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+The index -5 is out of bounds. The array has 3 elements. Use `try_element_at` to tolerate accessing element at invalid index and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.


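As the updated golden files show, `try_element_at` is the opt-in path for these cases (illustration, using the same array as the tests; assumes ANSI mode on):

```sql
SET spark.sql.ansi.enabled = true;

SELECT element_at(array(1, 2, 3), 5);       -- error: index 5 out of bounds
SELECT try_element_at(array(1, 2, 3), 5);   -- NULL
SELECT try_element_at(array(1, 2, 3), -5);  -- NULL
```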
