Eliminates these repeated computation in multi aggregations query #45

Merged
merged 1 commit into from Oct 30, 2017

4 participants
@RayeRen
Contributor

RayeRen commented Sep 23, 2017

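When more than one aggregation is requested in a single query, TiSpark pushes down some aggregations more than once. In the following example, count(number#0L) appears twice in the coprocessor scan (as #42L and #45L): once for count(number) and again as part of avg(number).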

scala> spark.sql ("select count(number),avg(number) from person").explain
== Physical Plan ==
*HashAggregate(keys=[], functions=[sum(count(number#0L)#42L), sum(sum(number#0L)#43L), sum(count(number#0L)#45L)])
+- Exchange SinglePartition
   +- *HashAggregate(keys=[], functions=[partial_sum(count(number#0L)#42L), partial_sum(sum(number#0L)#43L), partial_sum(count(number#0L)#45L)])
      +- Scan CoprocessorRDD[count(number#0L)#42L,sum(number#0L)#43L,count(number#0L)#45L]

This PR eliminates the repeated computation, so each distinct aggregation is pushed down only once:

scala> spark.sql ("select count(number),avg(number) from person").explain()
== Physical Plan ==
*HashAggregate(keys=[], functions=[sum(count(number#0L)#20L), sum(sum(number#0L)#21L), sum(count(number#0L)#20L)])
+- Exchange SinglePartition
   +- *HashAggregate(keys=[], functions=[partial_sum(count(number#0L)#20L), partial_sum(sum(number#0L)#21L), partial_sum(count(number#0L)#20L)])
      +- Scan CoprocessorRDD[count(number#0L)#20L,sum(number#0L)#21L]
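
The effect of the change can be pictured with a small toy model. This is only a sketch under stated assumptions: Agg, Count, Sum, Avg and PushDownDedup are hypothetical names invented for illustration and are not TiSpark's classes, and the decomposition of avg into sum and count is inferred from the plans above rather than taken from the PR's code.

```scala
// Toy model only: every name here is hypothetical, not TiSpark's real API.
sealed trait Agg
case class Count(col: String) extends Agg
case class Sum(col: String) extends Agg
case class Avg(col: String) extends Agg

object PushDownDedup {
  // Assumption: avg(x) is pushed down as sum(x) and count(x),
  // while count and sum are pushed down unchanged.
  def decompose(agg: Agg): Seq[Agg] = agg match {
    case Avg(c) => Seq(Sum(c), Count(c))
    case other  => Seq(other)
  }

  // Naive push-down list: one entry per decomposed aggregate, duplicates included.
  def naive(query: Seq[Agg]): Seq[Agg] = query.flatMap(decompose)

  // Deduplicated push-down list: first occurrence of each expression, order preserved.
  def deduplicated(query: Seq[Agg]): Seq[Agg] = naive(query).distinct

  // For each decomposed aggregate, the index of the pushed-down column it reads.
  // Repeated aggregates share an index instead of getting a column of their own.
  def slots(query: Seq[Agg]): Seq[Int] = {
    val pushed = deduplicated(query)
    naive(query).map(a => pushed.indexOf(a))
  }
}
```

For the query above, PushDownDedup.naive(Seq(Count("number"), Avg("number"))) yields count, sum, count, which mirrors the three columns in the first plan. PushDownDedup.deduplicated yields only count and sum, and PushDownDedup.slots yields 0, 1, 0, which mirrors how the second plan reads count(number#0L)#20L twice from a single pushed-down column.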

Contributor

RayeRen commented Oct 9, 2017

Changed references.head to expr.children.head so that it should no longer be null.

Member

zhexuany commented Oct 26, 2017

@RayeRen friendly ping. any updates?

Contributor

RayeRen commented Oct 26, 2017

@zhexuany Yeah, I accepted @Novemser's suggestion. Thanks a lot!

Member

zhexuany commented Oct 29, 2017

@ilovesoup PTAL.

@ilovesoup

LGTM

ilovesoup merged commit dcca23b into pingcap:master Oct 30, 2017

@Novemser Novemser referenced this pull request Dec 27, 2017

Merged

Simplify aggregation push-down #153

2 of 2 tasks complete