[SPARK-29343][SQL][FOLLOW-UP] Remove floating-point Sum/Average/CentralMomentAgg from order-insensitive aggregates #26534
Test build #113828 has finished for PR 26534 at commit
+1, LGTM (Pending Jenkins.)
The previous failure might occur again due to the recent Python change.
cc @HyukjinKwon
Test build #113836 has finished for PR 26534 at commit
retest this please
Test build #113858 has finished for PR 26534 at commit
retest this please
Test build #113873 has finished for PR 26534 at commit
case _: Average => true
case _: CentralMomentAgg => true
// Arithmetic operations for floating-point values are order-sensitive
// (they are not associative).
Thank you for adding more description.
Seq("float", "double").foreach { typeName =>
  Seq("SUM", "AVG", "KURTOSIS", "SKEWNESS", "STDDEV_POP", "STDDEV_SAMP",
    "VAR_POP", "VAR_SAMP").foreach { aggName =>
    val query1 =
nit. `query1` -> `query` since we remove the other test query.
@maropu, you can ignore this renaming comment.
oh, right.
@@ -1002,12 +1002,13 @@ object EliminateSorts extends Rule[LogicalPlan] {

  private def isOrderIrrelevantAggs(aggs: Seq[NamedExpression]): Boolean = {
    def isOrderIrrelevantAggFunction(func: AggregateFunction): Boolean = func match {
      case _: Sum => true
      case _: Min => true
      case _: Max => true
      case _: Count => true
You can collapse these cases too while you're here, but not necessary
Test build #113882 has finished for PR 26534 at commit
Test build #113897 has finished for PR 26534 at commit
+1, LGTM, again. Thank you all!
Merged to master.
Thanks as always, @dongjoon-hyun!
LGTM too. Thanks for cc'ing me.
What changes were proposed in this pull request?
This PR removes floating-point `Sum`/`Average`/`CentralMomentAgg` from the order-insensitive aggregates in `EliminateSorts`.
This PR comes from @gatorsmile's suggestion: #26011 (comment)
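As a rough illustration of the rule described above, the following self-contained Scala sketch models the check with a few stand-in types. Everything in it (`DataType`, `AggFunc`, `OrderSensitivity`, `inputType`) is a hypothetical simplification written for this example; it is not Spark's Catalyst API and not the code merged in this PR.

```scala
// Simplified model only: these types are hypothetical stand-ins,
// not Spark's Catalyst classes.
sealed trait DataType
case object IntegerType extends DataType
case object LongType extends DataType
case object FloatType extends DataType
case object DoubleType extends DataType

// Each aggregate records the data type of its input column.
sealed trait AggFunc { def inputType: DataType }
case class Min(inputType: DataType) extends AggFunc
case class Max(inputType: DataType) extends AggFunc
case class Count(inputType: DataType) extends AggFunc
case class Sum(inputType: DataType) extends AggFunc
case class Average(inputType: DataType) extends AggFunc
case class CollectList(inputType: DataType) extends AggFunc

object OrderSensitivity {
  private def isFloatingPoint(dt: DataType): Boolean = dt match {
    case FloatType | DoubleType => true
    case _ => false
  }

  // True when the aggregate returns the same result for any input order,
  // so a sort feeding it could be dropped.
  def isOrderIrrelevant(func: AggFunc): Boolean = func match {
    case _: Min | _: Max | _: Count => true
    // Floating-point addition is not associative, so the result of these
    // aggregates can depend on input order for float/double inputs.
    case _: Sum | _: Average => !isFloatingPoint(func.inputType)
    // Anything else (e.g. collecting values into a list) is treated as order-sensitive.
    case _ => false
  }
}
```

In this model, `OrderSensitivity.isOrderIrrelevant(Sum(LongType))` is `true`, while `OrderSensitivity.isOrderIrrelevant(Sum(DoubleType))` is `false`.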
Why are the changes needed?
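Bug fix. Arithmetic on floating-point values is not associative, so eliminating a sort below these aggregates can change their result.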
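To make the order sensitivity concrete, here is a minimal Scala sketch that sums the same three doubles in two different orders. It is an illustration only and not part of the patch; the object name `FloatSumOrder` and the helper `sumInOrder` are made up for this example.

```scala
object FloatSumOrder {
  // Sum the values left to right in double precision,
  // the way a sequential aggregate would consume its input.
  def sumInOrder(values: Seq[Double]): Double = values.foldLeft(0.0)(_ + _)

  def main(args: Array[String]): Unit = {
    // Near 1e16 adjacent doubles are 2 apart, so adding 1.0 is rounded away.
    val asGiven = sumInOrder(Seq(1e16, 1.0, -1e16))   // (1e16 + 1.0) + -1e16
    val reordered = sumInOrder(Seq(1e16, -1e16, 1.0)) // (1e16 + -1e16) + 1.0
    println(s"as given:  $asGiven")   // prints "as given:  0.0"
    println(s"reordered: $reordered") // prints "reordered: 1.0"
  }
}
```

The same multiset of inputs yields 0.0 or 1.0 depending only on the order in which the values arrive, which is why an optimizer cannot treat a floating-point sum as order-insensitive.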
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added tests in `SubquerySuite`.