[Spark-1461] Deferred Expression Evaluation (short-circuit evaluation) #446
Can one of the admins verify this patch? |
A few high-level comments:
Thank you @marmbrus. If we are not planning to support stateful UDFs, the deferred evaluation can be removed too, and we can get short-circuit evaluation simply by changing the order in which the eval method is called on the existing expressions. |
@marmbrus, I've removed the unrelated changes from this PR; it now covers only the deferred expression evaluation. |
@@ -146,7 +158,7 @@ case class If(predicate: Expression, trueValue: Expression, falseValue: Expressi
   extends Expression {

   def children = predicate :: trueValue :: falseValue :: Nil
-  def nullable = trueValue.nullable || falseValue.nullable
+  override def nullable = predicate.nullable || (trueValue.nullable && falseValue.nullable)
I think this was correct before. The nullability of the predicate does not affect the nullability of the output, since a null predicate will just cause the falseValue to be output, not null.
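To make the review comment above concrete, here is a minimal, self-contained sketch of the rule it describes. It does not use the real Catalyst classes: `Expr`, `evalIf`, and `ifNullable` are hypothetical names invented for illustration, and a plain `null` stands in for SQL NULL.

```scala
object IfNullabilitySketch {
  // Hypothetical stand-in for an expression: an already computed value
  // (null models SQL NULL) plus a flag saying whether it may ever be null.
  final case class Expr(value: Any, nullable: Boolean)

  // A null predicate is not "true", so it selects falseValue,
  // exactly like a false predicate would.
  def evalIf(predicate: Expr, trueValue: Expr, falseValue: Expr): Any =
    if (predicate.value == true) trueValue.value else falseValue.value

  // The result can be null only when one of the two branches can be null;
  // the predicate's own nullability does not matter.
  def ifNullable(trueValue: Expr, falseValue: Expr): Boolean =
    trueValue.nullable || falseValue.nullable
}
```

With a null predicate and two non-nullable branches, `evalIf` returns the false branch and `ifNullable` reports `false`, which is the behaviour the reviewer argues for.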
ok to test |
Merged build triggered. |
Merged build started. |
Merged build finished. |
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14285/ |
@andrewor14, any idea what is up with |
It has the same problem as |
#446 faced a connection refused exception from these tests, causing them to time out and fail after a long time. For now, let's disable these tests. (We recently disabled the corresponding test in streaming in 7863ecc. These tests are very similar.)
Author: Andrew Or <andrewor14@gmail.com>
Closes #466 from andrewor14/ignore-ui-tests and squashes the following commits:
6f5a362 [Andrew Or] Ignore org.apache.spark.ui.UISuite tests
(cherry picked from commit af46f1f)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
Merged build triggered. |
Merged build started. |
Merged build finished. |
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14311/ |
Merged build triggered. |
Merged build started. |
Merged build finished. |
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14314/ |
test this please. |
Build triggered. |
Build started. |
Build finished. All automated tests passed. |
LGTM @pwendell can you please merge? |
@chenghao-intel this has some merge conflicts - mind updating it? |
Merged build triggered. |
Merged build started. |
Merged build finished. All automated tests passed. |
false
} else if (l == null || r == null ) {
null
if(l == false) {
mind adding a space here?
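For context, the fragment under review belongs to the short-circuit evaluation of a boolean AND with SQL-style null handling. The following is only a sketch of that idea, not the code from the patch: `Thunk` and `and3` are hypothetical names, each operand is modelled as a function that yields `true`, `false`, or `null`, and the right operand is evaluated only when the left one is not `false`.

```scala
object ShortCircuitAndSketch {
  // Hypothetical stand-in for an unevaluated expression:
  // calling it yields true, false, or null (SQL NULL).
  type Thunk = () => Any

  // Three-valued AND. The right operand is never evaluated
  // when the left operand is already false.
  def and3(left: Thunk, right: Thunk): Any = {
    val l = left()
    if (l == false) {
      false
    } else {
      val r = right()
      if (r == false) {
        false
      } else if (l == null || r == null) {
        null
      } else {
        true
      }
    }
  }
}
```

Note that a null left operand still forces the right operand to be evaluated, because `null AND false` is `false` while `null AND true` is `null`.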
Thanks @rxin. |
Merged build triggered. |
Merged build started. |
LGTM. I will merge it once Jenkins comes back green. |
Merged build finished. All automated tests passed. |
I've merged this. Thanks a lot! |
This patch unifies the foldable and nullable interfaces for Expression.
1) Non-deterministic UDFs (like Rand()) cannot be folded.
2) Short-circuiting significantly improves the performance of expression evaluation; however, a stateful UDF must not be skipped by short-circuit evaluation (e.g. in the expression col1 > 0 and row_sequence() < 1000, row_sequence() cannot be skipped even if col1 > 0 is false).
I borrowed the concept of DeferredObject from Hive, which has two kinds of child classes (EagerResult / DeferredResult): the former requires the evaluation to be triggered before it is created, while the latter triggers the evaluation when its get() method is first called.
Author: Cheng Hao <hao.cheng@intel.com>
Closes #446 from chenghao-intel/expression_deferred_evaluation and squashes the following commits:
d2729de [Cheng Hao] Fix the codestyle issues
a08f09c [Cheng Hao] fix bug in or/and short-circuit evaluation
af2236b [Cheng Hao] revert the short-circuit expression evaluation for IF
b7861d2 [Cheng Hao] Add Support for Deferred Expression Evaluation
(cherry picked from commit a20fea9)
Signed-off-by: Reynold Xin <rxin@apache.org>
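The EagerResult / DeferredResult split described in the commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch or from Hive: the class names and `get()` come from the description, while the constructor shapes and the caching of the deferred value after the first call are assumed here.

```scala
object DeferredObjectSketch {
  // Common interface: callers ask for the value through get().
  trait DeferredObject {
    def get(): Any
  }

  // The value was already computed before this wrapper was created.
  final class EagerResult(value: Any) extends DeferredObject {
    def get(): Any = value
  }

  // The computation runs on the first get() call; the result is
  // cached (an assumption of this sketch) so later calls reuse it.
  final class DeferredResult(compute: () => Any) extends DeferredObject {
    private lazy val value: Any = compute()
    def get(): Any = value
  }
}
```

An operator that may not need one of its operands can wrap that operand in a `DeferredResult` and simply never call `get()` on it, while an operand that has to run regardless (such as a stateful UDF) can be evaluated up front and passed as an `EagerResult`.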