[SPARK-51055][SS][CONNECT] Streaming foreachBatch should call init logic inside a try by WweiL · Pull Request #49757 · apache/spark

WweiL · 2025-02-01T01:34:22Z

What changes were proposed in this pull request?

Call the streaming foreachBatch init logic inside a try, so that any error raised there can be propagated to the JVM, especially Python version issues.

`handle_worker_exception` writes a special int to the JVM, which is caught here:

spark/core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala

Lines 99 to 103 in c920210

    
    val resFromPython = dataIn.readInt()
    if (resFromPython != 0) {
      val errMessage = PythonWorkerUtils.readUTF(dataIn)
      throw streamingPythonRunnerInitializationFailure(resFromPython, errMessage)
    }
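
The handshake described above can be sketched in Python as a rough illustration. This is not the PySpark implementation: `ERROR_MARKER`, `report_error`, `run_worker`, `read_init_result`, and `InitializationFailure` are hypothetical names, the marker value is a stand-in, and the wire layout (a 4-byte big-endian status int, followed on failure by a length-prefixed UTF-8 message) is an assumption chosen to mirror the `readInt` / `readUTF` calls in the Scala snippet.

```python
import io
import struct

# Hypothetical stand-in for the "special int" written on failure;
# the real value used by PySpark is not stated in this PR.
ERROR_MARKER = -2


def write_int(value: int, out: io.BytesIO) -> None:
    """Write a 4-byte big-endian signed int (assumed wire format)."""
    out.write(struct.pack("!i", value))


def write_utf(text: str, out: io.BytesIO) -> None:
    """Write a length-prefixed UTF-8 string (assumed wire format)."""
    data = text.encode("utf-8")
    write_int(len(data), out)
    out.write(data)


def report_error(exc: BaseException, out: io.BytesIO) -> None:
    """Illustrative analogue of handle_worker_exception: marker, then message."""
    write_int(ERROR_MARKER, out)
    write_utf(f"{type(exc).__name__}: {exc}", out)


def run_worker(init, out: io.BytesIO) -> None:
    """Run the init logic inside the try so its failures reach the reader."""
    try:
        init()
        write_int(0, out)  # 0 signals successful initialization
    except Exception as exc:
        report_error(exc, out)


class InitializationFailure(Exception):
    """Hypothetical counterpart of the error thrown on the JVM side."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"init failed with code {code}: {message}")
        self.code = code
        self.message = message


def read_init_result(data_in: io.BytesIO) -> None:
    """Mirror of the Scala check: read the status int, raise if non-zero."""
    (res_from_python,) = struct.unpack("!i", data_in.read(4))
    if res_from_python != 0:
        (length,) = struct.unpack("!i", data_in.read(4))
        err_message = data_in.read(length).decode("utf-8")
        raise InitializationFailure(res_from_python, err_message)
```

If `init()` ran outside the `try`, an exception there would end the worker without writing a status int, and the reader would have no error message to surface. With the call inside the `try`, the reader gets a non-zero code together with the original error text.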

Why are the changes needed?

Spark Connect improvements

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit test

Was this patch authored or co-authored using generative AI tooling?

No

hvanhovell

LGTM

HeartSaVioR · 2025-02-02T23:53:26Z

@WweiL Would you mind looking at the CI build? The failure seems to be relevant.

WweiL · 2025-02-03T05:16:28Z

@HeartSaVioR Yes it indeed looks relevant, I'll take a look, thanks!

jiateoh

LGTM after the build fix, thanks!

WweiL · 2025-02-04T06:13:12Z

@HyukjinKwon can we merge this : ) TY!

HyukjinKwon · 2025-02-04T07:15:56Z

Merged to master and branch-4.0.

…gic inside a try

### What changes were proposed in this pull request?

So that any error can be propagated to the jvm. Especially the python version issue.

`handle_worker_exception` would write out a special int to the jvm, and will be caught here:

https://github.com/apache/spark/blob/c92021091502b15b6020e6e4cc9b148009450ba5/core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala#L99-L103

### Why are the changes needed?

Spark Connect improvements

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #49757 from WweiL/feb-worker-init-error-propagate.

Authored-by: Wei Liu <wei.liu@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 505c644)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

…gic inside a try

### What changes were proposed in this pull request?

So that any error can be propagated to the jvm. Especially the python version issue.

`handle_worker_exception` would write out a special int to the jvm, and will be caught here:

https://github.com/apache/spark/blob/cd1f1b3b1884c165b077120820706b1816e111d8/core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala#L99-L103

### Why are the changes needed?

Spark Connect improvements

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#49757 from WweiL/feb-worker-init-error-propagate.

Authored-by: Wei Liu <wei.liu@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit ec0931d)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

WweiL added 2 commits January 31, 2025 17:32

done

fe25067

retrigger

7a29f87

github-actions bot added the STRUCTURED STREAMING, PYTHON, and CONNECT labels Feb 1, 2025

fmt

4280077

grundprinzip approved these changes Feb 1, 2025

hvanhovell approved these changes Feb 1, 2025

jiateoh approved these changes Feb 3, 2025

fix

347253c

github-actions bot added the CORE label Feb 3, 2025

WweiL added 2 commits February 3, 2025 14:51

retrigger

c8476ce

retrigger

153f635

github-actions bot removed the CORE label Feb 3, 2025

HyukjinKwon approved these changes Feb 4, 2025

HyukjinKwon closed this in 505c644 Feb 4, 2025

WweiL wants to merge 6 commits into apache:master from WweiL:feb-worker-init-error-propagate

6 participants
