[GLUTEN-4717][VL] Adapting the bind reference of agg that contains subquery in agg expressions #4705
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/gluten/issues Then could you also rename commit message and pull request title in the following format?
See also: |
Run Gluten Clickhouse CI
@PHILO-HE Could you help review? This bug causes some aggregations to fall back.
3ab5108 to 0872183
0872183 to 6c30e8b
6c30e8b to 4bc0a11
It seems there are some relevant test failures. Please fix them. Thanks!
It seems the aggregate buffer attribute name is not unique, e.g., with two sum aggregate functions in one aggregate: SELECT sum(select c1 from t), sum(select c2 from t) FROM t. How about using the aggregate buffer attribute index?
I just switched to using the index. Can you help review the latest code?
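To make the concern concrete, here is a small sketch of why name-based matching is ambiguous when two sum aggregates appear together, and how matching by buffer position avoids the ambiguity. `BufferAttr`, `candidatesByName`, and `matchByIndex` are hypothetical names used for illustration only; this is a toy model, not the PR's actual code and not Spark's classes.

```scala
// Toy model only: BufferAttr stands in for an aggregate buffer attribute (assumption).
final case class BufferAttr(name: String, exprId: Long)

object NameVsIndexSketch {
  // Buffers produced by the partial aggregation for two sum functions.
  val partialBuffers: Seq[BufferAttr] =
    Seq(BufferAttr("sum", 10L), BufferAttr("sum", 11L))

  // Buffers expected by the final aggregation after cloning (new exprIds).
  val finalBuffers: Seq[BufferAttr] =
    Seq(BufferAttr("sum", 20L), BufferAttr("sum", 21L))

  // Name-based lookup: every partial buffer with the same name is a candidate.
  def candidatesByName(attr: BufferAttr): Seq[BufferAttr] =
    partialBuffers.filter(_.name == attr.name)

  // Index-based lookup: the i-th final buffer maps to the i-th partial buffer.
  def matchByIndex(i: Int): BufferAttr = partialBuffers(i)

  def main(args: Array[String]): Unit = {
    println(candidatesByName(finalBuffers(1)).size) // 2, so the match is ambiguous
    println(matchByIndex(1).exprId)                 // 11
  }
}
```

The positional match relies on the partial and final aggregations listing their buffers in the same order, which is an assumption of this sketch.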
The failed UTs cover some other scenarios that need to be addressed. The current code has been modified to be more comprehensive; these UTs previously fell back.
// exprId.
val attrsWithSameName = originalInputAttributes.collect {
  case a if a.name == attr.name => a
}
What happens if originalInputAttributes (grouping keys) has an attribute named sum, count, etc.? Can we add a test case for that?
This is a bad case; perhaps we should remove the group keys at the front of originalInputAttributes, because the output of the final agg's child is always the group keys ++ inputAggBufferAttributes.
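The layout mentioned here, grouping keys followed by the input aggregate buffer attributes, suggests matching only against the tail of the child's output. A minimal sketch under that assumption follows; `Col`, `bufferOrdinal`, and the sample columns are hypothetical and do not come from Spark or Gluten.

```scala
// Toy model only: Col stands in for an output attribute (assumption).
final case class Col(name: String, exprId: Long)

object SkipGroupingKeysSketch {
  // Child output of the final aggregation: grouping keys ++ buffer attributes.
  // The grouping key is deliberately named "sum" to mimic the bad case.
  val groupingKeys: Seq[Col] = Seq(Col("sum", 1L))
  val buffers: Seq[Col] = Seq(Col("sum", 2L), Col("count", 3L))
  val childOutput: Seq[Col] = groupingKeys ++ buffers

  // Ordinal in childOutput of the i-th buffer attribute: skip the key prefix.
  def bufferOrdinal(i: Int): Int = groupingKeys.length + i

  def main(args: Array[String]): Unit = {
    // A name search over the whole output hits the grouping key first.
    println(childOutput.indexWhere(_.name == "sum")) // 0
    // Offsetting by the number of grouping keys reaches the real buffer.
    println(childOutput(bufferOrdinal(0)))           // Col(sum,2)
  }
}
```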
  |""".stripMargin)(
  df => assert(getExecutedPlan(df).count(_.isInstanceOf[HashAggregateExecTransformer]) == 2))

runQueryAndCompare("""
@ulysses-you I added a new test case.
LGTM if the tests pass.
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
Thanks for your great work! I just created an issue for possible reference in the future.
What changes were proposed in this pull request?
When aggregateExpressions includes a subquery, Spark's PlanAdaptiveSubqueries rule transforms the subquery within the final aggregation. The aggregateFunction in the aggregateExpressions of the final aggregation is cloned, which creates new aggregateFunction objects. The inputAggBufferAttributes then produce new AttributeReference instances with larger exprId values, so binding against the output of the partial aggregation fails. We need to adapt to this situation: when binding fails, allow binding an inputAggBufferAttribute that has the same name but a different exprId.

Fixes #4717.
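The binding failure described in this section can be illustrated with a toy model. This is a minimal sketch that does not use Spark or Gluten classes: `Attr`, `bindByExprId`, and `cloneWithFreshIds` are hypothetical names, and the fresh-ID offset only mimics how cloning an aggregate function yields attributes with new, larger exprId values.

```scala
// Toy model only: Attr stands in for Spark's AttributeReference (assumption).
final case class Attr(name: String, exprId: Long)

object BindingSketch {
  // Hypothetical helper: find the ordinal of `attr` in `input` by exprId,
  // which is conceptually how reference binding resolves an attribute.
  def bindByExprId(attr: Attr, input: Seq[Attr]): Option[Int] = {
    val idx = input.indexWhere(_.exprId == attr.exprId)
    if (idx >= 0) Some(idx) else None
  }

  // Hypothetical helper: cloning keeps the names but assigns new, larger ids.
  def cloneWithFreshIds(attrs: Seq[Attr], firstFreshId: Long): Seq[Attr] =
    attrs.zipWithIndex.map { case (a, i) => a.copy(exprId = firstFreshId + i) }

  // Output of the partial aggregation: one grouping key and one sum buffer.
  val partialOutput: Seq[Attr] = Seq(Attr("k", 1L), Attr("sum", 2L))

  // Buffer attributes the final aggregation expects after its function was cloned.
  val clonedBuffers: Seq[Attr] = cloneWithFreshIds(partialOutput.drop(1), 100L)

  def main(args: Array[String]): Unit = {
    // The original buffer binds; the cloned one does not.
    println(bindByExprId(partialOutput(1), partialOutput))      // Some(1)
    println(bindByExprId(clonedBuffers.head, partialOutput))    // None
  }
}
```

The `None` result corresponds to the bind failure that the patch works around by falling back to a different matching strategy.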
How was this patch tested?
Added a test case in VeloxAggregateFunctionsSuite.