[SPARK-7462] By default retain group by columns in aggregate #5996
Merged build triggered.
Merged build started.
Test build #32181 has started for PR 5996 at commit
Test build #32181 has finished for PR 5996 at commit
Merged build finished. Test FAILed.
Test FAILed.
Merged build triggered.
Merged build started.
Test build #32192 has started for PR 5996 at commit
Test build #32192 has finished for PR 5996 at commit
Merged build finished. Test FAILed.
Test FAILed.
Merged build triggered.
Merged build started.
Test build #32206 has started for PR 5996 at commit
We can also remove the workaround we used in SparkR (Line 106 in f496bf3).
Do you want to try this in this PR? I can send a pull request to your branch too.
Based on reverting code added in commit amplab-extras@9a6be74
Revert workaround in SparkR to retain grouped cols
Merged build triggered.
Merged build started.
Test build #32219 has started for PR 5996 at commit
Test build #32206 has finished for PR 5996 at commit
Merged build finished. Test PASSed.
Test PASSed.
Test build #32219 has finished for PR 5996 at commit
Merged build finished. Test PASSed.
Test PASSed.
@@ -233,6 +236,9 @@ private[sql] class SQLConf extends Serializable {

  private[spark] def dataFrameSelfJoinAutoResolveAmbiguity: Boolean =
    getConf(DATAFRAME_SELF_JOIN_AUTO_RESOLVE_AMBIGUITY, "true").toBoolean

  private[spark] def dataFrameRetainGroupColumns: Boolean =
    getConf(DATAFRAME_RETAIN_GROUP_COLUMNS, "true").toBoolean
Increasingly wondering if dataframe flags should be scoped (eager analysis affects sql(...) too, not just dataframe DSL functions).
Let's talk more about this. If we want to do it, we should do it in 1.4.
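For readers following the diff, the effect of the `dataFrameRetainGroupColumns` flag (default "true") can be illustrated with a small model. The sketch below is plain Scala and does not use Spark at all: the object `RetainGroupColumnsSketch`, the `Row` alias, and the `sumBy` helper are hypothetical names made up for illustration, and the `sum(...)` output column name is an assumption, not something this PR states.

```scala
// Plain-Scala model of the behaviour discussed in this PR.
// This is NOT Spark's implementation; every name here is hypothetical.
object RetainGroupColumnsSketch {
  // A row is modelled as a map from column name to value.
  type Row = Map[String, Any]

  /** Groups `rows` by `groupCol` and sums the Int column `valueCol` per group.
    * When `retainGroupColumns` is true (mirroring the flag's default),
    * each output row also carries the grouping column.
    */
  def sumBy(
      rows: Seq[Row],
      groupCol: String,
      valueCol: String,
      retainGroupColumns: Boolean = true): Seq[Row] = {
    rows
      .groupBy(_(groupCol))
      .toSeq
      .sortBy(_._1.toString) // fixed order so the example is deterministic
      .map { case (key, group) =>
        val total = group.map(_(valueCol).asInstanceOf[Int]).sum
        val aggregated: Row = Map(s"sum($valueCol)" -> total)
        if (retainGroupColumns) aggregated + (groupCol -> key) else aggregated
      }
  }

  def main(args: Array[String]): Unit = {
    val rows: Seq[Row] = Seq(
      Map("dept" -> "a", "salary" -> 10),
      Map("dept" -> "a", "salary" -> 20),
      Map("dept" -> "b", "salary" -> 5))

    // Flag on: the grouping column appears next to the aggregate.
    println(sumBy(rows, "dept", "salary"))
    // Flag off: only the aggregate column is returned.
    println(sumBy(rows, "dept", "salary", retainGroupColumns = false))
  }
}
```

With the flag off, a caller who needs the key has to add it to the aggregate expressions explicitly, which appears to be the kind of workaround the SparkR revert in this PR removes.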
Conflicts: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
Merged build triggered.
Merged build started.
Test build #32297 has started for PR 5996 at commit
Test build #32297 timed out for PR 5996 at commit
Merged build finished. Test FAILed.
Test FAILed.
Jenkins, retest this please.
Merged build triggered.
Merged build started.
Test build #32375 has started for PR 5996 at commit
Test build #32375 has finished for PR 5996 at commit
Merged build finished. Test PASSed.
Test PASSed.
I'm merging this in branch-1.4. I will submit a followup PR for documentation.
Updated Java, Scala, Python, and R.

Author: Reynold Xin <rxin@databricks.com>
Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #5996 from rxin/groupby-retain and squashes the following commits:

aac7119 [Reynold Xin] Merge branch 'groupby-retain' of github.com:rxin/spark into groupby-retain
f6858f6 [Reynold Xin] Merge branch 'master' into groupby-retain
5f923c0 [Reynold Xin] Merge pull request #15 from shivaram/sparkr-groupby-retrain
c1de670 [Shivaram Venkataraman] Revert workaround in SparkR to retain grouped cols Based on reverting code added in commit amplab-extras@9a6be74
b8b87e1 [Reynold Xin] Fixed DataFrameJoinSuite.
d910141 [Reynold Xin] Updated rest of the files
1e6e666 [Reynold Xin] [SPARK-7462] By default retain group by columns in aggregate

(cherry picked from commit 0a4844f)
Signed-off-by: Reynold Xin <rxin@databricks.com>