Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-45859][ML] Make UDF objects in ml.functions lazy #43739

Closed
wants to merge 4 commits into from

Conversation

zhengruifeng
Copy link
Contributor

@zhengruifeng zhengruifeng commented Nov 9, 2023

What changes were proposed in this pull request?

Since JVM runs static codes only once, if loading functions$ fails, it will always report java.lang.NoClassDefFoundError: Could not initialize class

23/11/01 23:06:21 WARN TaskSetManager: Lost task 136.0 in stage 9565.0 (TID 4557384) (10.4.35.209 executor 16): TaskKilled (Stage cancelled: Job aborted due to stage failure: Task 2 in stage 9565.0 failed 4 times, most recent failure: Lost task 2.3 in stage 9565.0 (TID 4558369) (10.4.56.6 executor 71): java.io.IOException: unexpected exception type
	at java.io.ObjectStreamClass.throwMiscException(ObjectStreamClass.java:1750)
	at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1280)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2222)
	…
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:900)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:795)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.GeneratedMethodAccessor520.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at java.lang.invoke.SerializedLambda.readResolve(SerializedLambda.java:230)
	at sun.reflect.GeneratedMethodAccessor224.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1274)
	... 388 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.ml.functions$
	... 396 more

This PR just changes functions.* as lazy avoid hitting this issue because the initialization codes of a lazy val is not in static codes.

Why are the changes needed?

to fix a intermittent bug

Does this PR introduce any user-facing change?

no

How was this patch tested?

new UT

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the ML label Nov 9, 2023
@zhengruifeng zhengruifeng marked this pull request as ready for review November 9, 2023 18:56
@zsxwing
Copy link
Member

zsxwing commented Nov 9, 2023

Could you add a unit test for this change?

Copy link
Contributor

@WeichenXu123 WeichenXu123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add unit test first.

Copy link
Member

@zsxwing zsxwing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending tests

test("SPARK-45859: 'functions$' should not be affected by a broken class loader") {
quietly {
// Only one SparkContext should be running in this JVM (see SPARK-2243)
sc.stop()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we add this after each test run ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, we'd better close sc in this UT, in case contributors add new UTs after this one and tested with non-expected sc


// this UT should be the last one in this test suite, since it uses
// a different `sc` from the standard one.
// stop it here in case new UTs are added after this one.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to create a separate suite, and document this on the top to avoid such mistakes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, let me move it to a separate file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@zhengruifeng
Copy link
Contributor Author

@WeichenXu123 would you mind taking another look?

@HyukjinKwon
Copy link
Member

Merged to master.

@zhengruifeng zhengruifeng deleted the ml_lazy_udfs branch November 13, 2023 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants