[WIP][SPARK-47338][SQL] Introduce UNCLASSIFIED for default error class #45457
itholic wants to merge 22 commits into apache:master

Conversation
@itholic Why did you name it with the prefix _LEGACY_? Do you plan to eliminate it in the future?
Because I thought that not having an error class assigned basically meant it was a LEGACY error, but I don't have a very strong opinion. Do you have any preference? Also cc @srielau FYI
Because I thought that not having an error class assigned basically meant it was a LEGACY error
I would say it is true. SparkException can still be raised with just a message since it is not fully ported to error classes. For instance:
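A minimal illustrative sketch (not the original snippet from this comment; it assumes SparkException's message-only constructor and the current getErrorClass behavior):

```scala
import org.apache.spark.SparkException

// Illustrative only: an exception raised with just a message carries no
// error class, so getErrorClass currently falls back to null.
val e = new SparkException("Task failed while writing rows")
assert(e.getErrorClass == null)
```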
Since we know the cases where the error class is not set, how about just naming the error class something like UNCLASSIFIED?
Sounds reasonable to me. Let me address it.
PR title changed from "_LEGACY_ERROR_UNKNOWN for default error class" to "UNCLASSIFIED for default error class"
Updated PR title & description. Let me take a look at the CI failure.
LGTM once CI passed, thank you!
pairs.saveAsNewAPIHadoopFile[NewFakeFormatWithCallback]("ignored")
}
assert(e.getCause.getMessage contains "failed to write")
assert(e.getCause.getMessage contains "Task failed while writing rows")
How did it happen that you had to change this?
That is also my question. I believe this error message should not have been affected by the current change, but CI keeps complaining about it.
So I modified it for testing purposes to see whether this would really change the CI result.
struct<>
-- !query output
org.apache.spark.api.python.PythonException
pyspark.errors.exceptions.base.PySparkRuntimeError: [UDTF_EXEC_ERROR] User defined table function encountered an error in the 'eval' or 'terminate' method: Column 0 within a returned row had a value of None, either directly or within array/struct/map subfields, but the corresponding column type was declared as non-nullable; please update the UDTF to return a non-None value at this location or otherwise declare the column type as nullable.
The deleted error message seems reasonable. Do you know why it was replaced?
Yeah, I agree that this looks a bit weird.
The reason is that UDTF_EXEC_ERROR is defined on the PySpark side, so technically it is UNCLASSIFIED from the JVM's perspective, as it is not defined in error-classes.json.
But the existing error message still shows up in user space, such as:
org.apache.spark.SparkException: [UNCLASSIFIED] pyspark.errors.exceptions.base.PySparkRuntimeError: [UDTF_EXEC_ERROR] User defined table function encountered an error in the 'eval' or 'terminate' method: Column 0 within a returned row had a value of None, either directly or within array/struct/map subfields, but the corresponding column type was declared as non-nullable; please update the UDTF to return a non-None value at this location or otherwise declare the column type as nullable.
I'm not sure if it would be better to keep the existing error message for PythonException or mark it as UNCLASSIFIED.
Let me mark it as a draft for now, as I haven't been able to find a clear cause as to why the CI is complaining.
Sorry, but let me reopen this PR against the current master branch since it has been stale for too long.
What changes were proposed in this pull request?
This PR proposes to introduce UNCLASSIFIED as the default error class when an error class is not defined.

Why are the changes needed?
In Spark, when an errorClass is not explicitly defined for an exception, the method getErrorClass has so far returned null. This behavior can lead to ambiguity and makes debugging more challenging, since there is no clear indication that the error class was not set.
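A minimal sketch of the idea (the trait name and structure here are assumptions for illustration, not the actual patch):

```scala
// Hypothetical sketch: default to UNCLASSIFIED instead of null when no
// error class has been assigned to the throwable.
trait SparkThrowableSketch {
  def errorClass: Option[String]

  // Previously: errorClass.orNull, which surfaces null to callers.
  // With this change: a stable placeholder class is always returned.
  def getErrorClass: String = errorClass.getOrElse("UNCLASSIFIED")
}
```

With such a fallback, the formatted message would carry an [UNCLASSIFIED] prefix (as in the PythonException example above) instead of carrying no error class at all.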
Does this PR introduce any user-facing change?
No API changes, but the user-facing error message will contain UNCLASSIFIED when the error class is not specified.

How was this patch tested?
Updated the existing UT (SparkThrowableSuite).

Was this patch authored or co-authored using generative AI tooling?
No