[SPARK-35261][SQL] Support static magic method for stateless Java ScalarFunction #32407

sunchao · 2021-04-30T06:21:20Z

What changes were proposed in this pull request?

This allows ScalarFunction implemented in Java to optionally specify the magic method invoke to be static, which can be used if the UDF is stateless. Comparing to the non-static method, it can potentially give better performance due to elimination of dynamic dispatch, etc.

Also added a benchmark to measure performance of: the default produceResult, non-static magic method and static magic method.

Why are the changes needed?

For UDFs that are stateless (e.g., no need to maintain intermediate state between each function call), it's better to allow users to implement the UDF function as static method which could potentially give better performance.

Does this PR introduce any user-facing change?

Yes. Spark users can now have the choice to define static magic method for ScalarFunction when it is written in Java and when the UDF is stateless.

How was this patch tested?

Added new UT.

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

dongjoon-hyun

Thank you, @sunchao .
BTW, in general, we recommend to generate both Java8 and Java11 benchmark results to avoid any regression in one Java version. Could you generate Java8 result file and add to this PR, please?

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

sunchao · 2021-04-30T06:54:50Z

Thanks @dongjoon-hyun . Yes will add result for JDK 8 too.

sunchao · 2021-04-30T07:23:08Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

-            val caller = Literal.create(scalarFunc, ObjectType(scalarFunc.getClass))
-            Invoke(caller, MAGIC_METHOD_NAME, scalarFunc.resultType(),
-              arguments, returnNullable = scalarFunc.isResultNullable)
+            StaticInvoke(scalarFunc.getClass, scalarFunc.resultType(),


Ideally we'd want to check if this method is actually static, otherwise there could be runtime error. However this only works for methods defined in Java; for Scala seems there is no easy way.

Scala doesn't have the concept of truly static methods, right? The equivalent (object methods) are actually just instance methods on a singleton.

Yes that's correct. The StaticInvoke calls the static method on the non-anonymous class which just forward to the non-static method defined in anonymous/singleton Java class (i.e., the class with $ at the end of its name).

For instance, for the LongAddWithStatic class, this is the method defined in LongAddWithStaticMagic.class:

public static long staticInvoke(long, long); Code: 0: getstatic #16 // Field org/apache/spark/sql/connector/functions/LongAddWithStaticMagic$.MODULE$:Lorg/apache/spark/sql/connector/functions/LongAddWithStaticMagic$; 3: lload_0 4: lload_2 5: invokevirtual #51 // Method org/apache/spark/sql/connector/functions/LongAddWithStaticMagic$.staticInvoke:(JJ)J 8: lreturn

and the same method defined in the singleton class LongAddWithStaticMagic$:

public long staticInvoke(long, long); Code: 0: lload_1 1: lload_3 2: ladd 3: lreturn

So I was expecting worse performance from Scala since it calls invokevirtual underneath while Java uses invokestatic, but the result doesn't look so. It could be that the performance is dominated by other factors.

Very interesting, thanks for the explanation!

SparkQA · 2021-04-30T07:36:19Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42620/

SparkQA · 2021-04-30T07:36:20Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42620/

SparkQA · 2021-04-30T11:11:22Z

Test build #138100 has finished for PR 32407 at commit 227ce76.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

xkrogen · 2021-04-30T15:08:38Z

sql/core/src/test/scala/org/apache/spark/sql/connector/functions/FunctionBenchmark.scala

+  }
+
+  private def scalarBenchmark(N: Long, resultNullable: Boolean): Unit = {
+    withSQLConf(s"spark.sql.catalog.$catalogName" -> classOf[InMemoryCatalog].getName) {


Just for my curiosity/education, why do we need to override the catalog here? Can't we just use the default in-memory catalog?

Hmm I need a way to specify the V2 InMemoryCatalog here since it implements the FunctionCatalog. What is the default in-memory catalog? it seems it is not the same as this InMemoryCatalog?

Ah. It appears I'm mixing up spark.sql.catalogImplementation (which defaults to org.apache.spark.sql.catalyst.catalog.InMemoryCatalog) vs. spark.sql.catalog (which defaults to V2SessionCatalog but you are overriding it to org.apache.spark.sql.connector.catalog.InMemoryCatalog). These two similarly named configurations and classes are quite confusing!

Ah yea, it's easy to confuse the two. We can perhaps change its name if necessary. There are a few other classes with the same name between DSv1 and v2, e.g., Table, AggregateFunction,

xkrogen · 2021-04-30T15:12:23Z

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

+scalar function (long + long) -> long/notnull wholestage off:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+--------------------------------------------------------------------------------------------------------------------------------------------
+with long_add_default                                                 28601          28964         496         17.5          57.2       1.0X
+with long_add_magic                                                    6986           7156         150         71.6          14.0       4.1X
+with long_add_static_magic                                             6509           6539          32         76.8          13.0       4.4X


IIRC, one of the concerns with the magic-method approach was that things would get really slow when codegen wasn't used because of reflection overhead. But, these results look pretty much identical to the ones above with codegen on. Do we know why we don't see the performance impact that was expected? (or am I misremembering the concern?)

Yes very good point! I'm puzzled by this result too. There are some interesting stuff in the result:

why the magic method approach is so much faster than the produceResult approach: we didn't see that much difference in @cloud-fan 's benchmark.

why codegen on/off doesn't affect the result much

why we don't see Java outperform Scala when using the static magic method approach (since it uses invokestatic). I did some benchmark with Java UDFs but didn't include the result in this PR.

Let me do some profiling to find out why. Will update later.

So the reason for 1) is because ApplyFunctionExpression (especially BoundReference.eval within the method) is quite expensive as it involves lots of boxing and unboxing

This wasn't present in @cloud-fan 's benchmark.

For 2), there is another config spark.sql.codegen.factoryMode which controls whether to do codegen on expressions (e.g., in projection). Therefore even though whole stage codegen is turned off, underneath we are still calling Invoke.genCode to execute the UDF. After turning spark.sql.codegen.factoryMode off I'm seeing that the magic method is actually half slower than the produceResult approach. Looking right now why it is so much slower.

For 3) I think it is actually normal, apparently JIT does optimizations which makes the performance almost the same in these two scenarios.

viirya · 2021-04-30T21:02:49Z

sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.java

- * codegen, removal of Java boxing, etc.
- *
- * For example, a scalar UDF for adding two integers can be defined as follow with the magic
+ * a static magic method with name {@link #STATIC_MAGIC_METHOD_NAME}, or non-static magic


Does static magic method have significant benefit over non-static magic one? We shouldn't create the UDF object per row, so I think the cost of non-static magic method is not very different than static one?

The benchmark also doesn't show much difference.

Three different entry points to the UDF API look a bit verbose.

So far I didn't observe any major difference in terms of performance. I'm happy to re-purpose this JIRA/PR to just adding benchmark if others also feel that the static method is not worth it.

Another approach is, we allow UDFs defined in Java to add the static keyword for the invoke method, and Spark will translate it to StaticInvoke accordingly (by looking at Method modifiers). This keeps it simple as users only need to remember produceResult and invoke, and gives them flexibility to define the magic method as static if necessary.

we allow UDFs defined in Java to add the static keyword for the invoke method

Sounds like a better API! But we need to think about if it works for Scala.

Since there is no static method in Scala, invoke defined in object will not have the static modifier from the Method instance obtained through reflection, and so it will just go with the Invoke path. We might sacrifice a bit of performance such as unnecessary null check on receiver in Invoke but it shouldn't matter much.

Overall I don't see obvious benefit over non-static from static. Having static as a extra support in Java UDF, sounds okay as to support a language-specific feature.

So for Scala, as it doesn't have static method at all, it seems to me it is also natural to have only non-static invoke path.

SGTM, let's fix https://issues.apache.org/jira/browse/SPARK-35281 first and see how much speedup the new static UDF API can provide.

Turns out zipWithIndex in ApplyFunctionExpression is also quite expensive comparing to our simple add function. After removing that and fixing SPARK-35281, I no longer see regression when result is nullable, and the gap between the produceResult and magic method becomes narrower:

[info] OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Mac OS X 10.16 [info] Intel(R) Core(TM) i9-10910 CPU @ 3.60GHz [info] Java scalar function (long + long) -> long/notnull wholestage on: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------------------------------ [info] with long_add_default 19794 19860 113 25.3 39.6 1.0X [info] with long_add_magic 7627 7772 228 65.6 15.3 2.6X [info] with long_add_static_magic 6631 6725 82 75.4 13.3 3.0X [info] OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Mac OS X 10.16 [info] Intel(R) Core(TM) i9-10910 CPU @ 3.60GHz [info] Java scalar function (long + long) -> long/nullable wholestage on: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------------------------------- [info] with long_add_default 19990 20243 275 25.0 40.0 1.0X [info] with long_add_magic 7285 7435 141 68.6 14.6 2.7X [info] with long_add_static_magic 7077 7170 145 70.6 14.2 2.8X

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

SparkQA · 2021-05-05T03:09:09Z

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42678/

SparkQA · 2021-05-05T05:41:57Z

Test build #138157 has finished for PR 32407 at commit ca7ea47.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-05-05T23:38:59Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42704/

SparkQA · 2021-05-05T23:39:00Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42704/

sunchao · 2021-05-06T01:09:20Z

@cloud-fan @xkrogen @viirya @dongjoon-hyun this is ready for another round of review - could you take a look? Thanks!

sunchao · 2021-05-06T01:10:02Z

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

+Java scalar function (long + long) -> long result_nullable = true codegen = false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------
+with long_add_default                                                                      64039          64306         293          7.8         128.1       1.0X
+with long_add_magic                                                                       199121         199232         144          2.5         398.2       0.3X


A few optimizations can be done to bring this to ~0.8X. I'll do it as a follow-up.

If that is orthogonal to this PR, let's have a separate PR for that instead of the follow-up.

As you know, @sunchao . This is a request as one of the downstream users. I hope each patch is orthogonally backportable although we will not backport this to Apache Spark 3.1.

Yeah by "follow-up" I mean a separate PR :)

cloud-fan · 2021-05-06T02:29:38Z

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

+-----------------------------------------------------------------------------------------------------------------------------------------------------------
+with long_add_default                                                                62105          62467         341          8.1         124.2       1.0X
+with long_add_magic                                                                  20721          22705        1729         24.1          41.4       3.0X
+


do we have result for long_add_static_magic when codegen is on?

For Scala, no, since it'll always use Invoke and so there is no difference. We have long_add_static_magic for Java UDFs above.

SparkQA · 2021-05-06T03:22:49Z

Test build #138183 has finished for PR 32407 at commit 5a0b243.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.java

...alyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ApplyFunctionExpression.scala

dongjoon-hyun · 2021-05-06T06:20:48Z

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

@@ -0,0 +1,60 @@
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure


Oh, this is Azure!

dongjoon-hyun · 2021-05-06T06:24:40Z

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt

+------------------------------------------------------------------------------------------------------------------------------------------------------------------
+with long_add_default                                                                       60515          60887         453          8.3         121.0       1.0X
+with long_add_magic                                                                        201052         202036         957          2.5         402.1       0.3X
+with long_add_static_magic                                                                 202037         202639         584          2.5         404.1       0.3X


Oh, in this case, static method is slower than the instance method?

Yes slightly. I think it could be within the margin of benchmark though.

SparkQA · 2021-05-07T06:53:26Z

Test build #138228 has finished for PR 32407 at commit a1382fc.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-05-07T10:20:37Z

Test build #138238 has finished for PR 32407 at commit ac5303f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2021-05-07T20:16:12Z

sql/core/src/test/scala/org/apache/spark/sql/connector/functions/V2FunctionBenchmark.scala

+            s"codegen = $codegenEnabled"
+        val benchmark = new Benchmark(name, N, output = output)
+        benchmark.addCase(s"native_long_add", numIters = 3) { _ =>
+          spark.range(N).selectExpr("id + id").noop()


nit: this is a bit unfair that, native add is always not nullable, even though resultNullable = true. We can create a special expression for this benchmark

case class NativeAdd(left: Expression, right: Expression, nullable: Boolean) extends BinaryArithmetic { ... } ... spark.range(N).select(Column(NativeAdd($"id".expr, $"id".expr, resultNullable)))

Good point. I've updated the benchmark and result.

SparkQA · 2021-05-07T20:52:24Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42784/

SparkQA · 2021-05-07T20:52:25Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42784/

SparkQA · 2021-05-08T00:25:43Z

Test build #138262 has finished for PR 32407 at commit 51d3edc.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun

+1, LGTM.
The only left comment is the following, @sunchao ?

[SPARK-35261][SQL] Support static magic method for stateless Java ScalarFunction #32407 (comment)

sunchao · 2021-05-08T01:02:51Z

Yes, was waiting for benchmark to finish. Updated.

SparkQA · 2021-05-08T01:51:45Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42792/

SparkQA · 2021-05-08T01:51:46Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42792/

dongjoon-hyun

Thank you for updating, @sunchao .
Merged to master for Apache Spark 3.2.0.

cloud-fan · 2021-05-08T05:12:46Z

sql/core/src/test/scala/org/apache/spark/sql/connector/functions/V2FunctionBenchmark.scala

+      left: Expression,
+      right: Expression,
+      override val nullable: Boolean) extends BinaryArithmetic {
+    override protected val failOnError: Boolean = true


I think it's better to set it to false. The UDF just does left + right, while this native add is doing math.addExact(left, right), which is a bit unfair.

Oops you are right. I didn't realize this could impact performance too. Let me do it in a follow-up.

SparkQA · 2021-05-08T05:58:40Z

Test build #138270 has finished for PR 32407 at commit 5096fac.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
case class NativeAdd(

github-actions bot added the SQL label Apr 30, 2021

sunchao commented Apr 30, 2021

View reviewed changes

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt Outdated Show resolved Hide resolved

dongjoon-hyun reviewed Apr 30, 2021

View reviewed changes

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt Outdated Show resolved Hide resolved

sunchao commented Apr 30, 2021

View reviewed changes

xkrogen reviewed Apr 30, 2021

View reviewed changes

viirya reviewed Apr 30, 2021

View reviewed changes

HyukjinKwon reviewed May 3, 2021

View reviewed changes

sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt Outdated Show resolved Hide resolved

sunchao force-pushed the SPARK-35261 branch 2 times, most recently from c999638 to ca7ea47 Compare May 3, 2021 22:38

sunchao force-pushed the SPARK-35261 branch from ca7ea47 to 5a0b243 Compare May 5, 2021 22:45

sunchao commented May 6, 2021

View reviewed changes

cloud-fan reviewed May 6, 2021

View reviewed changes

dongjoon-hyun reviewed May 6, 2021

View reviewed changes

sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.java Outdated Show resolved Hide resolved

dongjoon-hyun reviewed May 6, 2021

View reviewed changes

sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.java Show resolved Hide resolved

dongjoon-hyun reviewed May 6, 2021

View reviewed changes

sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.java Show resolved Hide resolved

dongjoon-hyun reviewed May 6, 2021

View reviewed changes

...alyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ApplyFunctionExpression.scala Outdated Show resolved Hide resolved

dongjoon-hyun reviewed May 6, 2021

View reviewed changes

sunchao force-pushed the SPARK-35261 branch 2 times, most recently from 923d39e to 51d3edc Compare May 7, 2021 19:32

sunchao changed the title ~~[SPARK-35261][SQL] Support static magic method for stateless ScalarFunction~~ [SPARK-35261][SQL] Support static magic method for stateless Java ScalarFunction May 7, 2021

cloud-fan approved these changes May 7, 2021

View reviewed changes

cloud-fan reviewed May 7, 2021

View reviewed changes

viirya approved these changes May 7, 2021

View reviewed changes

sunchao added 8 commits May 7, 2021 14:03

first commit

20e8f1c

remove staticInvoke and only apply static for Java

108123b

address test failures

d2ff91d

add benchmark results

994d3b1

address comments

ae18622

change benchmark format

1f322a4

address more comments

2fcb3f3

update benchmark after SPARK-35333

2b7afc7

dongjoon-hyun approved these changes May 8, 2021

View reviewed changes

change native add to consider result nullness

5096fac

sunchao force-pushed the SPARK-35261 branch from 51d3edc to 5096fac Compare May 8, 2021 01:02

dongjoon-hyun approved these changes May 8, 2021

View reviewed changes

dongjoon-hyun closed this in f47e0f8 May 8, 2021

cloud-fan reviewed May 8, 2021

View reviewed changes

		@@ -0,0 +1,60 @@
		OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure

[SPARK-35261][SQL] Support static magic method for stateless Java ScalarFunction #32407

[SPARK-35261][SQL] Support static magic method for stateless Java ScalarFunction #32407

Conversation

sunchao commented Apr 30, 2021 • edited Loading

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

dongjoon-hyun left a comment

Choose a reason for hiding this comment

sunchao commented Apr 30, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sunchao Apr 30, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented Apr 30, 2021

SparkQA commented Apr 30, 2021

SparkQA commented Apr 30, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sunchao Apr 30, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented May 5, 2021

SparkQA commented May 5, 2021

SparkQA commented May 5, 2021

SparkQA commented May 5, 2021

sunchao commented May 6, 2021

sunchao May 6, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dongjoon-hyun May 6, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented May 6, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented May 7, 2021

SparkQA commented May 7, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented May 7, 2021

SparkQA commented May 7, 2021

SparkQA commented May 8, 2021

dongjoon-hyun left a comment

Choose a reason for hiding this comment

sunchao commented May 8, 2021

SparkQA commented May 8, 2021

SparkQA commented May 8, 2021

dongjoon-hyun left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented May 8, 2021

sunchao commented Apr 30, 2021 •

edited

Loading

sunchao Apr 30, 2021 •

edited

Loading

sunchao Apr 30, 2021 •

edited

Loading

sunchao May 6, 2021 •

edited

Loading

dongjoon-hyun May 6, 2021 •

edited

Loading