Grouping keys incorrectly reduce where they shouldn't #12047
LakshSingla wants to merge 2 commits into apache:master
kfaraz left a comment
Neat fix, @LakshSingla!
Requested a couple more tests, otherwise LGTM.
// actually want to include a dimension 'dummy'.
final ImmutableBitSet aggregateProjectBits = RelOptUtil.InputFinder.bits(project.getChildExps(), null);
final int[] newDimIndexes = new int[dimensions.size()];
Set<DruidExpression> literalsInInput = new HashSet<>();
If it's only a literal, could we just keep a Set of String here?
}

@Test
public void testRemovalOfRedundantLiteralsInGroupBy() throws Exception
Maybe add some simpler test cases too where
a) the literal is not a part of the projection and should be removed
b) the literal is a part of the projection and should not be removed
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
Description
In Grouping#applyProject, there is an attempt to reduce the dimensions on which the group by is done, if they do not appear in the final result/Project. However, this ends up removing all of the literal dimensions which are present in the group by.
Consider the following query
According to our intentions, 'dummy' shouldn't be removed from the GROUP BY, since it is present in the projection. It gets removed anyway, because RelOptUtil.InputFinder.bits(project.getChildExps(), null) does not set the bit corresponding to dummy to true: dummy is a RexLiteral as opposed to a RexInputRef.
In a more obscure case like:
The dim1 gets reduced to 'dummy' by Calcite (and is treated as a literal), which subsequently gets removed by the current implementation of Grouping#applyProject, and the query returns a row of ('A', 'dummy') when it shouldn't have returned anything.
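The pruning problem described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the class LiteralPruningSketch and its nested Expr type are hypothetical stand-ins for RexInputRef/RexLiteral, and none of this is Druid or Calcite code. It only shows why collecting input-reference bits misses a literal dimension.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Simplified, hypothetical model of the pruning; NOT Druid or Calcite code.
class LiteralPruningSketch {
  // A projection expression is either an input reference (like RexInputRef)
  // or a literal constant (like RexLiteral).
  static final class Expr {
    final Integer inputIndex; // non-null for an input reference
    final String literal;     // non-null for a literal

    private Expr(Integer inputIndex, String literal) {
      this.inputIndex = inputIndex;
      this.literal = literal;
    }

    static Expr ref(int index) { return new Expr(index, null); }
    static Expr lit(String value) { return new Expr(null, value); }
  }

  // Only input references set a bit; literals contribute nothing.
  static BitSet referencedInputs(List<Expr> projection) {
    BitSet bits = new BitSet();
    for (Expr e : projection) {
      if (e.inputIndex != null) {
        bits.set(e.inputIndex);
      }
    }
    return bits;
  }

  // Keeps only the dimensions whose index is referenced by the projection.
  static List<String> pruneDimensions(List<String> dimensions, List<Expr> projection) {
    BitSet bits = referencedInputs(projection);
    List<String> kept = new ArrayList<>();
    for (int i = 0; i < dimensions.size(); i++) {
      if (bits.get(i)) {
        kept.add(dimensions.get(i));
      }
    }
    return kept;
  }
}
```

With dimensions ('dummy', dim4) and a projection made of the literal 'dummy' plus a reference to index 1, this model keeps only dim4, even though 'dummy' appears in the projection.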
This fix changes the visitor which is used to find the literals in the input expression, and directly compares the DruidExpression of each literal in the input expression with the dimensions in the group by.
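The idea behind the fix can be sketched in the same simplified style. This is an illustration, not the code in this PR: the class LiteralAwarePruningSketch is hypothetical, and dimensions and literals are modelled as plain strings here rather than DruidExpression objects. A dimension survives if the projection references its index, or if it matches a literal that the projection contains.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of literal-aware pruning; NOT the actual PR code.
class LiteralAwarePruningSketch {
  // referencedIndexes: input indexes the projection refers to.
  // literalsInProjection: literal expressions found in the projection.
  static List<String> pruneDimensions(
      List<String> dimensions,
      Set<Integer> referencedIndexes,
      Set<String> literalsInProjection
  ) {
    List<String> kept = new ArrayList<>();
    for (int i = 0; i < dimensions.size(); i++) {
      String dimension = dimensions.get(i);
      // Keep a dimension that is referenced by index, or that is a literal
      // also present in the projection.
      if (referencedIndexes.contains(i) || literalsInProjection.contains(dimension)) {
        kept.add(dimension);
      }
    }
    return kept;
  }
}
```

This covers the two simpler cases requested in review: a literal that is not part of the projection is removed, and a literal that is part of the projection is kept.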
Key changed/added classes in this PR
Grouping#applyProject
This PR has: