[SPARK-23582][SQL] StaticInvoke should support interpreted execution #20753
Test build #88028 has finished for PR 20753 at commit
val parms = arguments.map(e => e.eval(input).asInstanceOf[Object])
val method = staticObject.getDeclaredMethod(functionName, parmTypes : _*)
val ret = method.invoke(null, parms : _*)
val retClass = CallMethodViaReflection.typeMapping.getOrElse(dataType,
We have to support more types (e.g. `ArrayType`).
ping @hvanhovell
val method = staticObject.getDeclaredMethod(functionName, parmTypes : _*)
val ret = method.invoke(null, parms : _*)
val retClass = CallMethodViaReflection.typeMapping.getOrElse(dataType,
  Seq(dataType.asInstanceOf[ObjectType].cls))(0)
Will `dataType` always be an `ObjectType` here? What if the returned data type is `CalendarIntervalType`?
val parmTypes = arguments.map(e =>
  CallMethodViaReflection.typeMapping.getOrElse(e.dataType,
    Seq(e.dataType.asInstanceOf[ObjectType].cls))(0))
The external types of the native types `CalendarIntervalType` and `BinaryType` are not `ObjectType`. If we have arguments of those types, we cannot assume `ObjectType`.
You are right. I have to support other types before merging this change.
This is a small prototype for discussing whether we use reflection or not.
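To illustrate the concern about the unconditional `asInstanceOf[ObjectType]` cast, here is a minimal sketch of resolving a Java class by pattern matching with a fallback. The `DataType` hierarchy and the `typeJavaMapping` table below are simplified stand-ins written for this example, not Spark's actual classes, and `javaClassOf` is a hypothetical helper name.

```scala
// Simplified stand-ins for Catalyst types (assumptions for illustration only).
sealed trait DataType
case object IntegerType extends DataType
case object BinaryType extends DataType
case object CalendarIntervalType extends DataType
final case class ObjectType(cls: Class[_]) extends DataType

object TypeResolution {
  // Hypothetical mapping from native types to Java classes.
  val typeJavaMapping: Map[DataType, Class[_]] = Map(
    IntegerType -> classOf[Int],
    BinaryType -> classOf[Array[Byte]])

  // Match on ObjectType instead of casting, and fall back to Object
  // for types that are neither ObjectType nor present in the mapping.
  def javaClassOf(dt: DataType): Class[_] = dt match {
    case ObjectType(cls) => cls
    case other => typeJavaMapping.getOrElse(other, classOf[java.lang.Object])
  }
}
```

With this shape, a type such as `CalendarIntervalType` that is missing from the mapping resolves to `java.lang.Object` rather than failing with a `ClassCastException`.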
Test build #88029 has finished for PR 20753 at commit
- override def eval(input: InternalRow): Any =
-   throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
+ override def eval(input: InternalRow): Any = {
+   if (staticObject == null) {
Do we need this check?
We don't need this check, for sure. See line 132:
val objectName = staticObject.getName.stripSuffix("$")
This runs as part of the constructor, which will throw an NPE if `staticObject` is `null`, so the null check here is redundant.
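A small sketch of why the later null check cannot be reached with a null value: a `val` in a Scala class body is initialized during construction. `Holder` is a hypothetical class for this example, not the actual `StaticInvoke` definition.

```scala
// Hypothetical stand-in: only the objectName initializer mirrors the line quoted above.
class Holder(staticObject: Class[_]) {
  // Runs as part of the constructor, so a null staticObject throws
  // a NullPointerException before any other method can be called.
  val objectName: String = staticObject.getName.stripSuffix("$")
}

object HolderDemo {
  // Returns true when construction with null fails with an NPE.
  def constructionFailsOnNull(): Boolean =
    try {
      new Holder(null)
      false
    } catch {
      case _: NullPointerException => true
    }
}
```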
val parmTypes = arguments.map(e =>
  CallMethodViaReflection.typeMapping.getOrElse(e.dataType,
    Seq(e.dataType.asInstanceOf[ObjectType].cls))(0))
val parms = arguments.map(e => e.eval(input).asInstanceOf[Object])
Do we need null checks here for the inputs? Also, can we add a common function in `InvokeLike` to handle input arguments for the other `InvokeLike` exprs? (I mean the interpreted version of `InvokeLike.prepareArguments`.)
Yeah, that is good since it would be used in `Invoke`, too. After supporting the other types, I will create a utility function.
I posted this early version to discuss the design choice (reflection or method handles).
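As a rough sketch of what a shared, interpreted argument-preparation helper could look like, the following evaluates every argument and reports whether a null should short-circuit the call. The name `prepareArguments`, the thunk-based argument representation, and the `propagateNull` flag as used here are assumptions for illustration, not the utility that was eventually added.

```scala
object ArgumentPreparation {
  // Evaluates each argument thunk and boxes the result.
  // Returns None when propagateNull is set and any argument evaluated to null,
  // signalling that the caller should return null without invoking the method.
  def prepareArguments(args: Seq[() => Any], propagateNull: Boolean): Option[Seq[AnyRef]] = {
    val values = args.map(f => f().asInstanceOf[AnyRef])
    if (propagateNull && values.exists(_ == null)) None else Some(values)
  }
}
```

A caller would invoke the target method only when the result is `Some(values)`.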
@hvanhovell Is it better to implement the method handle approach, too?
@kiszk I think we should benchmark it. My rationale for considering method handles is that they seem to be made for this purpose, and they should become more performant with newer versions of Java.
Which use case is typical: is the same method called many times, or are different methods called? If the same method is called repeatedly, reflection can also improve performance by generating custom bytecode.
Ok, let's go with reflection then. cc @rednaxelafx
and cc @viirya
Sure
CallMethodViaReflection.typeMapping.getOrElse(e.dataType,
  Seq(e.dataType.asInstanceOf[ObjectType].cls))(0))
val parms = arguments.map(e => e.eval(input).asInstanceOf[Object])
val method = staticObject.getDeclaredMethod(functionName, parmTypes : _*)
Looks like we can also move `getDeclaredMethod` outside `eval`?
Unfortunately, moving it caused the reflection call to fail.
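The thread does not say why hoisting the lookup failed. One common pitfall with this kind of change is that `java.lang.reflect.Method` is not `Serializable`, so a cached `Method` field is usually declared `@transient lazy val`. The sketch below shows that pattern; `StaticCall` is a hypothetical class, and whether this matches the failure seen in the PR is an assumption.

```scala
import java.lang.reflect.Method

// Hypothetical stand-in for an expression that calls a static method reflectively.
class StaticCall(cls: Class[_], name: String, paramTypes: Seq[Class[_]]) extends Serializable {
  // Resolved once on first use instead of on every eval call.
  // @transient keeps the non-serializable Method out of the serialized form;
  // the lazy val re-resolves it after deserialization.
  @transient private lazy val method: Method = cls.getDeclaredMethod(name, paramTypes: _*)

  def eval(args: Seq[AnyRef]): AnyRef = method.invoke(null, args: _*)
}
```

For example, `new StaticCall(classOf[Math], "abs", Seq(classOf[Int])).eval(Seq(Int.box(-5)))` looks up `Math.abs(int)` once and then reuses the `Method` on later calls.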
One comment following @maropu's.
BTW, on the choice between reflection and method handles, the latter might be slightly faster if the code pattern is right (one that the underlying JVM likes...). But in general, invoking via old-school reflection and invoking via `MethodHandle`s are not that different. We always have the option to put the reflection-based version in first and then see if we can further improve it with `MethodHandle`s.
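For readers comparing the two options discussed here, this is a minimal side-by-side sketch of calling the same static method through `java.lang.reflect.Method` and through a `MethodHandle`. It uses `Math.max(int, int)` purely as a convenient target and makes no performance claim; `invokeWithArguments` is chosen for simplicity and is not the fastest way to call a handle.

```scala
import java.lang.invoke.{MethodHandles, MethodType}

object ReflectionVsMethodHandle {
  // Classic reflection: look up the Method, then invoke with boxed arguments.
  def viaReflection(a: Int, b: Int): Int = {
    val m = classOf[Math].getDeclaredMethod("max", classOf[Int], classOf[Int])
    m.invoke(null, Int.box(a), Int.box(b)).asInstanceOf[Int]
  }

  // Method handle: describe the signature with a MethodType, then find the static method.
  def viaMethodHandle(a: Int, b: Int): Int = {
    val mt = MethodType.methodType(classOf[Int], classOf[Int], classOf[Int])
    val mh = MethodHandles.lookup().findStatic(classOf[Math], "max", mt)
    mh.invokeWithArguments(Int.box(a), Int.box(b)).asInstanceOf[Int]
  }
}
```

Both paths box the arguments and the result here, so any difference between them would have to be measured with a benchmark, as suggested earlier in the thread.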
@hvanhovell @kiszk Yes, the old school JDK reflection does perform bytecode generation to speed it up, since the JDK1.4 era. But the advantage of fully optimized But note that even in OpenJDK8, the HotSpot VM is still exploring ways to optimize |
Force-pushed from f9f5701 to 9a36442.
Test build #88090 has finished for PR 20753 at commit
Test build #88091 has finished for PR 20753 at commit
Test build #88093 has finished for PR 20753 at commit
Test build #88097 has finished for PR 20753 at commit
Test build #88089 has finished for PR 20753 at commit
Test build #88100 has finished for PR 20753 at commit
  ret
} else {
  // cast a primitive value using Boxed class
  val boxedClass = CallMethodViaReflection.typeBoxedJavaMapping(dataType)
nit: I think we can do this directly, without the `if`:
val boxedClass = CallMethodViaReflection.typeBoxedJavaMapping.get(dataType)
boxedClass.map(_.cast(ret)).getOrElse(ret)
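A self-contained sketch of the `Option`-based form suggested above. The `typeBoxedJavaMapping` table here is keyed by a plain type name and is a stand-in written for this example, not the real mapping in Spark; `castResult` is a hypothetical helper name.

```scala
object BoxedCast {
  // Hypothetical stand-in for the boxed-class mapping, keyed by a type name.
  val typeBoxedJavaMapping: Map[String, Class[_]] = Map(
    "int" -> classOf[java.lang.Integer],
    "double" -> classOf[java.lang.Double])

  // If a boxed class is registered, cast through it; otherwise return the value unchanged.
  // Map.get yields an Option, so no explicit if/else is needed.
  def castResult(typeName: String, ret: AnyRef): Any =
    typeBoxedJavaMapping.get(typeName).map(_.cast(ret)).getOrElse(ret)
}
```

`Class.cast` throws a `ClassCastException` when the value is not an instance of the registered class, so a mismatch still surfaces at the call site.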
Good catch. Once the discussions about `typeJavaMapping`, `typeBoxedJavaMapping`, and the others are settled, I will address this.
    case ObjectType(cls) => cls
    case _ => typeJavaMapping.getOrElse(dt, classOf[java.lang.Object])
  }
}
Should the above be in `CallMethodViaReflection` or `CodeGenerator`?
@@ -95,6 +162,21 @@ class ObjectExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
    checkEvaluation(createExternalRow, Row.fromSeq(Seq(1, "x")), InternalRow.fromSeq(Seq()))
  }

  // This is an alternative version of `checkEvaluation` to compare results
This code is intentionally imported for testing from #20757
If this PR is merged first, I'll remove this in my PR.
ping @hvanhovell
Test build #88656 has finished for PR 20753 at commit
retest this please
Test build #88663 has finished for PR 20753 at commit
retest this please
Test build #88687 has finished for PR 20753 at commit
retest this please
Test build #88822 has finished for PR 20753 at commit
import checkObjectExprEvaluation for test
Test build #88901 has finished for PR 20753 at commit
Test build #88905 has finished for PR 20753 at commit
Test build #88922 has finished for PR 20753 at commit
LGTM - merging to master. Thanks!
## What changes were proposed in this pull request?

This PR adds interpreted execution for `StaticInvoke`.

## How was this patch tested?

Added tests in `ObjectExpressionsSuite`.

Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>

Closes apache#20753 from kiszk/SPARK-23582.