[SPARK-47200][SS] Error class for Foreach batch sink user function error #45299

Closed
micheal-o wants to merge 4 commits into apache:master from micheal-o:ForeachBatchSinkUserCodeError

@micheal-o
Contributor

What changes were proposed in this pull request?

The user-provided function for ForeachBatchSink can throw any exception. We want to classify this class of errors, including errors from Python (Py4JException) and Scala functions.
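As a rough illustration of the idea (not the code in this PR), the sketch below wraps whatever a user function throws in a single exception type that carries an error class. It uses plain Python with no Spark dependency. `ClassifiedUserFunctionError`, `run_user_function`, and the error class string are hypothetical names chosen for this example only.

```python
class ClassifiedUserFunctionError(Exception):
    """Hypothetical wrapper type; not a Spark class."""

    def __init__(self, error_class, cause):
        super().__init__(f"[{error_class}] user function failed: {cause!r}")
        self.error_class = error_class


def run_user_function(user_fn, batch, batch_id):
    """Call the user function and classify any exception it raises."""
    try:
        user_fn(batch, batch_id)
    except Exception as exc:
        # The original exception stays reachable through __cause__.
        raise ClassifiedUserFunctionError(
            "HYPOTHETICAL_USER_FUNCTION_ERROR", exc
        ) from exc


def failing_user_fn(batch, batch_id):
    # Stand-in for arbitrary user code that fails.
    raise ValueError("bad record")


try:
    run_user_function(failing_user_fn, [1, 2, 3], 0)
except ClassifiedUserFunctionError as err:
    print(err.error_class, type(err.__cause__).__name__)
```

The point of the single wrapper type is that callers can match on one error class regardless of what the user code raised, while the original exception is still available as the cause.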

Why are the changes needed?

The user-provided function can throw any type of error. This change uses the new error framework to provide better error messages and classification.

Does this PR introduce any user-facing change?

Yes, better error message with error class for ForeachBatchSink user function failures.

How was this patch tested?

Updated existing tests and added a new one. Covers Python and Scala.

Was this patch authored or co-authored using generative AI tooling?

No

@micheal-o
Contributor Author

cc @HeartSaVioR

Contributor

@HeartSaVioR HeartSaVioR left a comment

+1
I'll wait for a day (or over the weekend) for @MaxGekk to chime in. I'll merge the PR once he approves or no major comment is made from him.

@HeartSaVioR
Contributor

Thanks! Merging to master.

TakawaAkirayo pushed a commit to TakawaAkirayo/spark that referenced this pull request Mar 4, 2024
### What changes were proposed in this pull request?
The user-provided function for ForeachBatchSink can throw any exception. We want to classify this class of errors, including errors from Python (Py4JException) and Scala functions.

### Why are the changes needed?
The user-provided function can throw any type of error. This change uses the new error framework to provide better error messages and classification.

### Does this PR introduce _any_ user-facing change?
Yes, better error message with error class for ForeachBatchSink user function failures.

### How was this patch tested?
Updated existing tests and added a new one. Covers Python and Scala.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#45299 from micheal-o/ForeachBatchSinkUserCodeError.

Authored-by: micheal-o <micheal.okutubo@gmail.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
ericm-db pushed a commit to ericm-db/spark that referenced this pull request Mar 5, 2024
HeartSaVioR pushed a commit that referenced this pull request Mar 10, 2024
…g backward compatible

### What changes were proposed in this pull request?
A previous PR of mine (#45299) handles and classifies exceptions thrown in user-provided functions for the foreach batch sink. This change makes it backward compatible so that it does not break current users, who may depend on getting the user code error from `StreamingQueryException.cause` rather than `StreamingQueryException.cause.cause`.
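To make the compatibility concern concrete, here is a small plain-Python sketch of the two cause-chain layouts. It does not use Spark and does not show how the fix is implemented. `QueryError` is a stand-in for `StreamingQueryException`, and `WrappedUserError` and `user_error_from` are hypothetical names for this example.

```python
class QueryError(Exception):
    """Stand-in for StreamingQueryException; `cause` mirrors its cause field."""

    def __init__(self, message, cause):
        super().__init__(message)
        self.cause = cause


class WrappedUserError(Exception):
    """Hypothetical intermediate wrapper around the user code error."""

    def __init__(self, cause):
        super().__init__(f"user function failed: {cause!r}")
        self.cause = cause


def user_error_from(query_error):
    """What an existing handler does: read the user error from `.cause`."""
    return query_error.cause


user_error = ValueError("bad record")

# Layout that existing handlers expect: the user error is the direct cause.
compatible = QueryError("query failed", cause=user_error)

# Layout with an extra wrapper: the user error moves one level deeper.
nested = QueryError("query failed", cause=WrappedUserError(user_error))

assert user_error_from(compatible) is user_error
assert user_error_from(nested) is not user_error
assert nested.cause.cause is user_error
```

A handler written against the first layout silently receives the wrapper instead of the user error in the second layout, which is the breakage this change is meant to avoid.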

### Why are the changes needed?
To prevent breaking existing usage pattern.

### Does this PR introduce _any_ user-facing change?
Yes, better error message with error class for ForeachBatchSink user function failures.

### How was this patch tested?
Updated existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #45449 from micheal-o/ForeachBatchExBackwardCompat.

Authored-by: micheal-o <micheal.okutubo@gmail.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
HeartSaVioR pushed a commit that referenced this pull request Sep 5, 2024
… error

### What changes were proposed in this pull request?

Similar to the classification that micheal-o did for the ForeachBatch sink in PR #45299: the user-provided function for the Foreach sink can throw any exception. We want to classify this class of errors, including errors from Python (Py4JException) and Scala functions.

### Why are the changes needed?

The user-provided function can throw any type of error. This change uses the new error framework to provide better error messages and classification.

### Does this PR introduce _any_ user-facing change?

Yes, better error message with error class for Foreach sink user function failures.

### How was this patch tested?

Updated existing tests. Covers Python and Scala.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47819 from jingz-db/classify-foreach-error.

Authored-by: jingz-db <jing.zhan@databricks.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>