[SPARK-20335] [SQL] Children expressions of Hive UDF impacts the determinism of Hive UDF #17635
@@ -509,6 +509,19 @@ abstract class AggregationQuerySuite extends QueryTest with SQLTestUtils with Te
      Row(null, null, 110.0, null, null, 10.0) :: Nil)
  }

  test("non-deterministic children expressions of UDAF") {
This is just to improve the test case coverage.
@@ -84,6 +85,21 @@ class HiveUDAFSuite extends QueryTest with TestHiveSingleton with SQLTestUtils {
      Row(1, Row(1, 1))
    ))
  }

  test("non-deterministic children expressions of UDAF") {
This is just to improve the test case coverage.
Test build #75793 has started for PR 17635 at commit
Test build #75791 has finished for PR 17635 at commit
  test("non-deterministic children expressions of UDAF") {
    withTempView("view1") {
      spark.range(1).selectExpr("id as x", "id as y").createTempView("view1")
      withUserDefinedFunction("testUDAFPercentile" -> true, "testMock" -> true) {
testMock? Do we use it?
LGTM except a minor comment on the test.
Test build #75824 has finished for PR 17635 at commit
cc @cloud-fan
+1
LGTM, merging to master! @gatorsmile shall we backport this PR?
Maybe, yes. Will do it later. Thank you!
…acts the determinism of Hive UDF

### What changes were proposed in this pull request?

This PR is to backport #17635 to Spark 2.1

---

```JAVA
  /**
   * Certain optimizations should not be applied if UDF is not deterministic.
   * Deterministic UDF returns same result each time it is invoked with a
   * particular input. This determinism just needs to hold within the context of
   * a query.
   *
   * return true if the UDF is deterministic
   */
  boolean deterministic() default true;
```

Based on the definition of [UDFType](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java#L42-L50), when Hive UDF's children are non-deterministic, Hive UDF is also non-deterministic.

### How was this patch tested?

Added test cases.

Author: Xiao Li <gatorsmile@gmail.com>

Closes #17652 from gatorsmile/backport-17635.
…minism of Hive UDF

### What changes were proposed in this pull request?

```JAVA
  /**
   * Certain optimizations should not be applied if UDF is not deterministic.
   * Deterministic UDF returns same result each time it is invoked with a
   * particular input. This determinism just needs to hold within the context of
   * a query.
   *
   * return true if the UDF is deterministic
   */
  boolean deterministic() default true;
```

Based on the definition of [UDFType](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java#L42-L50), when Hive UDF's children are non-deterministic, Hive UDF is also non-deterministic.

### How was this patch tested?

Added test cases.

Author: Xiao Li <gatorsmile@gmail.com>

Closes apache#17635 from gatorsmile/udfDeterministic.
### What changes were proposed in this pull request?

Based on the definition of UDFType, when a Hive UDF's children are non-deterministic, the Hive UDF itself is also non-deterministic.
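
The rule stated here can be illustrated with a minimal, self-contained sketch. It is not Spark's or Hive's actual code: `Expr`, `Column`, `Rand`, `UdfCall`, and the `annotatedDeterministic` flag are hypothetical names chosen for illustration, with the flag standing in for the value of `UDFType.deterministic()`. The sketch assumes a UDF call is treated as deterministic only when its own flag is true and every child expression is deterministic.

```java
import java.util.Arrays;
import java.util.List;

public class UdfDeterminism {

    /** Hypothetical stand-in for an expression node; not Spark's Expression class. */
    interface Expr {
        boolean deterministic();
    }

    /** Leaf that models a column reference: always deterministic. */
    static final class Column implements Expr {
        @Override public boolean deterministic() { return true; }
    }

    /** Leaf that models a call such as rand(): never deterministic. */
    static final class Rand implements Expr {
        @Override public boolean deterministic() { return false; }
    }

    /** Models a UDF call whose determinism depends on its flag and its children. */
    static final class UdfCall implements Expr {
        // Assumption: mirrors the value of the UDFType annotation on the UDF class.
        private final boolean annotatedDeterministic;
        private final List<Expr> children;

        UdfCall(boolean annotatedDeterministic, Expr... children) {
            this.annotatedDeterministic = annotatedDeterministic;
            this.children = Arrays.asList(children);
        }

        @Override public boolean deterministic() {
            // Deterministic only if the UDF itself and all of its children are.
            return annotatedDeterministic
                && children.stream().allMatch(Expr::deterministic);
        }
    }

    public static void main(String[] args) {
        Expr pure = new UdfCall(true, new Column());
        Expr overRand = new UdfCall(true, new Rand());
        Expr nested = new UdfCall(true, new UdfCall(true, new Column(), new Rand()));
        Expr flaggedOff = new UdfCall(false, new Column());

        System.out.println(pure.deterministic());       // true
        System.out.println(overRand.deterministic());   // false
        System.out.println(nested.deterministic());     // false
        System.out.println(flaggedOff.deterministic()); // false
    }
}
```

Because each node asks its children, a single non-deterministic leaf makes every enclosing UDF call non-deterministic, which is the case the new tests exercise.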

### How was this patch tested?

Added test cases.