[SPARK-42911][PYTHON] Introduce more basic exceptions by ueshin · Pull Request #40538 · apache/spark

ueshin · 2023-03-24T00:17:30Z

What changes were proposed in this pull request?

Introduces more basic exceptions.

- ArithmeticException
- ArrayIndexOutOfBoundsException
- DateTimeException
- NumberFormatException
- SparkRuntimeException

Why are the changes needed?

There are more exceptions that Spark throws but PySpark doesn't capture.

We should introduce more basic exceptions; otherwise we still see Py4JJavaError or SparkConnectGrpcException.

>>> spark.conf.set("spark.sql.ansi.enabled", True)
>>> spark.sql("select 1/0")
DataFrame[(1 / 0): double]
>>> spark.sql("select 1/0").show()
Traceback (most recent call last):
...
py4j.protocol.Py4JJavaError: An error occurred while calling o44.showString.
: org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
select 1/0
       ^^^

	at org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:225)
... JVM's stacktrace

>>> spark.sql("select 1/0").show()
Traceback (most recent call last):
...
pyspark.errors.exceptions.connect.SparkConnectGrpcException: (org.apache.spark.SparkArithmeticException) [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
select 1/0
       ^^^
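
To illustrate the idea behind the change, here is a minimal, self-contained sketch of mapping JVM exception class names to dedicated Python exception classes. It is not PySpark's actual implementation: `CapturedException`, `_JVM_TO_PYTHON`, and this `convert_exception` signature are hypothetical stand-ins, and the JVM class names in the table are illustrative assumptions.

```python
from typing import Dict, List, Type


class CapturedException(Exception):
    """Hypothetical stand-in base class for errors surfaced from the JVM."""


class ArithmeticException(CapturedException):
    pass


class ArrayIndexOutOfBoundsException(CapturedException):
    pass


class DateTimeException(CapturedException):
    pass


class NumberFormatException(CapturedException):
    pass


class SparkRuntimeException(CapturedException):
    pass


class UnknownException(CapturedException):
    """Fallback when no known JVM class is found."""


# Illustrative lookup table (an assumption, not PySpark's real mapping).
# More specific JVM classes are listed before more general ones.
_JVM_TO_PYTHON: Dict[str, Type[CapturedException]] = {
    "java.lang.ArrayIndexOutOfBoundsException": ArrayIndexOutOfBoundsException,
    "java.lang.NumberFormatException": NumberFormatException,
    "java.lang.ArithmeticException": ArithmeticException,
    "java.time.DateTimeException": DateTimeException,
    "org.apache.spark.SparkRuntimeException": SparkRuntimeException,
}


def convert_exception(jvm_classes: List[str], message: str) -> CapturedException:
    """Return the first matching Python exception for a JVM class hierarchy."""
    for jvm_name, py_class in _JVM_TO_PYTHON.items():
        if jvm_name in jvm_classes:
            return py_class(message)
    return UnknownException(message)
```

Without an entry for a given JVM class, the sketch falls back to `UnknownException`, which mirrors the situation described above where users only see a generic wrapper error.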

Does this PR introduce any user-facing change?

The error message is more readable.

>>> spark.sql("select 1/0").show()
Traceback (most recent call last):
...
pyspark.errors.exceptions.captured.ArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
select 1/0
       ^^^

or, with a remote Spark session:

>>> spark.sql("select 1/0").show()
Traceback (most recent call last):
...
pyspark.errors.exceptions.connect.ArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
select 1/0
       ^^^
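
With dedicated exception classes, user code can catch the specific error instead of a generic wrapper. The following is a sketch only: it assumes `ArithmeticException` is importable from `pyspark.errors` in a PySpark release that includes this change, and `error_class_of` and `show_or_report` are hypothetical helper names introduced for illustration.

```python
import re
from typing import Optional


def error_class_of(message: str) -> Optional[str]:
    """Extract a leading bracketed error class such as DIVIDE_BY_ZERO."""
    match = re.match(r"\s*\[([A-Z][A-Z0-9_.]*)\]", message)
    return match.group(1) if match else None


def show_or_report(spark, query: str) -> Optional[str]:
    """Run a query; return the error class if it fails arithmetically."""
    # Imported lazily so error_class_of works without PySpark installed.
    # Assumption: the class is exported from pyspark.errors.
    from pyspark.errors import ArithmeticException

    try:
        spark.sql(query).show()
    except ArithmeticException as exc:
        return error_class_of(str(exc))
    return None
```

For example, with ANSI mode enabled as in the session above, `show_or_report(spark, "select 1/0")` would be expected to return the error class parsed from the message rather than raise.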

How was this patch tested?

Added the related tests.

ueshin · 2023-03-24T00:17:56Z

cc @itholic

itholic

LGTM, just left one nit question.

itholic · 2023-03-24T00:26:38Z

+class PythonException(SparkConnectGrpcException, BasePythonException):
    """
-    Exception thrown because of Spark upgrade from Spark Connect
+    Exceptions thrown from Spark Connect server.


qq: Are "Spark Connect server" and "Spark Connect" different?

Only PythonException says it's thrown from the Spark Connect "server".

The comment is carried over from the previous version. We can change it to "Spark Connect" while we are here.

itholic · 2023-03-24T00:29:27Z

btw, the examples in "Does this PR introduce any user-facing change?" are the same??

ueshin · 2023-03-24T01:26:44Z

> the examples in "Does this PR introduce any user-facing change?" are the same??

No. Previously we still saw py4j.protocol.Py4JJavaError or SparkConnectGrpcException; now we only see the actual exception classes.

itholic · 2023-03-24T02:00:27Z

Ah, I see. One is for a regular Spark session and the other is for a remote Spark session.
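
The two tracebacks differ only in the module path (`captured` vs. `connect`). The class declaration quoted in the review diff above, `class PythonException(SparkConnectGrpcException, BasePythonException)`, suggests that each flavor inherits from a shared base class. Below is a minimal sketch of that pattern using stand-in classes; the names `BaseArithmeticException`, `CapturedArithmeticException`, `ConnectArithmeticException`, and `describe` are assumptions for illustration, not PySpark's real definitions.

```python
class PySparkException(Exception):
    """Stand-in root for all errors in this sketch."""


class BaseArithmeticException(PySparkException):
    """Session-independent base that user code would catch."""


class CapturedException(PySparkException):
    """Stand-in for errors captured from a regular (JVM-backed) session."""


class SparkConnectGrpcException(PySparkException):
    """Stand-in for errors returned by a remote (Spark Connect) session."""


class CapturedArithmeticException(CapturedException, BaseArithmeticException):
    pass


class ConnectArithmeticException(SparkConnectGrpcException, BaseArithmeticException):
    pass


def describe(err: Exception) -> str:
    """Classify an error the way a single except clause would."""
    try:
        raise err
    except BaseArithmeticException as exc:
        # Matches both session flavors through the shared base class.
        return f"arithmetic: {exc}"
    except PySparkException as exc:
        return f"other: {exc}"
```

Under this design, one `except` clause on the shared base handles errors from either kind of session, so user code does not need to know which one it is running against.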

HyukjinKwon · 2023-03-24T10:13:04Z

Merged to master.

HyukjinKwon · 2023-03-24T10:14:13Z

@ueshin it has a conflict with branch-3.4. Would you mind creating a backport PR?

Closes apache#40538 from ueshin/issues/SPARK-42911/exceptions.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

ueshin · 2023-03-24T18:28:48Z

@HyukjinKwon #40547

Introduce more basic exceptions.

04c1811

github-actions Bot added BUILD CONNECT CORE PYTHON SQL labels Mar 24, 2023

itholic approved these changes Mar 24, 2023

HyukjinKwon approved these changes Mar 24, 2023

Fix.

619efa2

ueshin marked this pull request as draft March 24, 2023 03:15

Fix.

d04faf3

ueshin marked this pull request as ready for review March 24, 2023 03:22

HyukjinKwon closed this in fa9b6c3 Mar 24, 2023

ueshin wants to merge 3 commits into apache:master from ueshin:issues/SPARK-42911/exceptions
