[SPARK-6466][SQL] Remove unnecessary attributes when resolving GroupingSets #5134
Test build #28988 has finished for PR 5134 at commit
Seems more reasonable to me if we do this in |
Good suggestion. I will do that later. Thanks.
Test build #29080 has finished for PR 5134 at commit
/cc @liancheng @marmbrus
val substitution = projections.map { groupExpr =>
  val newExprs = groupExpr.collect {
    case x: NamedExpression if a.references.contains(x) => x
    case l: Literal => l
Why do we need special handling here?
Because the projections contain some constant null values and bitmasks, and we need to keep them.
This change should also be accompanied by test cases. There are several examples in the optimizer section of the tests.
Test build #31045 has started for PR 5134 at commit |
I don't think this is the correct way to do this optimization. Sorry @viirya, I will ping you once the refactoring is finished.
@chenghao-intel |
I mean Specifying the |
@chenghao-intel I originally remove the unnecessary columns from projections in |
Sorry, I didn't make it clear, we definitely should do the column pruning in Optimizer. And also the |
OK. I will update the unit test first.
Test build #31137 has finished for PR 5134 at commit
@chenghao-intel do you think we still need this? If not, I will close it.
@chenghao-intel OK, then I will close this now.
When resolving GroupingSets, we currently list all outputs of GroupingSets's child plan. However, the columns that are not in groupBy expressions and not used by aggregation expressions are unnecessary and can be removed.
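
The idea in the description can be sketched with a small, self-contained toy model. This is not the code from this PR and it does not use Spark's Catalyst classes: Attr, NullLit, IntLit, and prunedProjections are hypothetical names introduced only for illustration, and the bitmask convention (bit i set means the i-th groupBy column belongs to the grouping set) is an assumption. The sketch keeps only the child columns that are grouped on or referenced by an aggregate, while still emitting the constant null values and the bitmask for each grouping set.

```scala
// Toy model for illustration only. None of these types are Spark's own.
object GroupingSetsPruning {
  sealed trait Expr
  final case class Attr(name: String) extends Expr         // a column of the child plan
  final case class NullLit(forColumn: String) extends Expr // constant null for a masked-out column
  final case class IntLit(value: Int) extends Expr         // the grouping-set bitmask

  /** Builds one projection per bitmask, keeping only the child columns that are needed. */
  def prunedProjections(
      childOutput: Seq[Attr],
      groupBy: Seq[Attr],
      aggregateRefs: Set[Attr],
      bitmasks: Seq[Int]): Seq[Seq[Expr]] = {
    // Drop every child column that is neither grouped on nor used by an aggregate.
    val needed = childOutput.filter(a => groupBy.contains(a) || aggregateRefs.contains(a))
    bitmasks.map { mask =>
      val columns: Seq[Expr] = needed.map { a =>
        val idx = groupBy.indexOf(a)
        // Assumed convention: bit idx set means groupBy(idx) is part of this grouping set.
        // A groupBy column outside the set is replaced by a constant null.
        if (idx >= 0 && (mask & (1 << idx)) == 0) NullLit(a.name) else a
      }
      // The bitmask literal is always kept so rows of different sets stay distinguishable.
      columns :+ IntLit(mask)
    }
  }
}
```

For a child plan with columns a, b, c, d, groupBy columns a and b, an aggregate that reads only c, and bitmasks 1, 2, 3, every projection contains a (or its null), b (or its null), c, and the bitmask. Column d never appears. The toy model assumes groupBy columns are not also read by aggregates; handling that overlap would need separate copies of those columns.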