Skip to content

Comments

Grouping keys incorrectly reduce where they shouldn't#12047

Closed
LakshSingla wants to merge 2 commits intoapache:masterfrom
LakshSingla:grouping-keys-false-reduce-2
Closed

Grouping keys incorrectly reduce where they shouldn't#12047
LakshSingla wants to merge 2 commits intoapache:masterfrom
LakshSingla:grouping-keys-false-reduce-2

Conversation

@LakshSingla
Copy link
Contributor

@LakshSingla LakshSingla commented Dec 9, 2021

Description

In Grouping#applyProject, their is an attempt made to reduce the dimensions on which group by is done, if they are not appearing in the final result/Project.
However, this ends up removing all of the literal dimensions which are present in the group by.
Consider the following query

SELECT dim1, 'dummy' FROM druid.foo GROUP BY dim1, 'dummy'

According to our intentions, the 'dummy' shouldn't be removed in the GROUP BY, since it is present in the projection, but it gets removed (Due to the implementation of the RelOptUtil.InputFinder.bits(project.getChildExps(), null); not setting bit corresponding to dummy to true. It is a RexLiteral as opposed to RexInputRef.

In a more obscure case like:

SELECT 'A', dim1 FROM druid.foo WHERE m1 = 10310313 AND dim1 = 'dummy' GROUP BY dim1

The dim1 gets reduced to 'dummy' by Calcite (and is treated as a literal), which gets subsequently removed by the current implementation of the Groupings#applyProject, and it results a row of ('A', 'dummy') when it shouldn't have returned anything.

This fix changes the visitor which is used to find the literals in the input expression, and directly compares the DruidExpression of the literal in the input expression, with the dimensions in the group by.


Key changed/added classes in this PR
  • Grouping#applyProject

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Contributor

@kfaraz kfaraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat fix, @LakshSingla!
Requested a couple more tests, otherwise LGTM.

// actually want to include a dimension 'dummy'.
final ImmutableBitSet aggregateProjectBits = RelOptUtil.InputFinder.bits(project.getChildExps(), null);
final int[] newDimIndexes = new int[dimensions.size()];
Set<DruidExpression> literalsInInput = new HashSet<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's only a literal, could we just keep a Set of String here?

}

@Test
public void testRemovalOfRedundantLiteralsInGroupBy() throws Exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add some simpler test cases too where
a) the literal is not a part of the projection and should be removed
b) the literal is a part of the projection and should not be removed

@stale
Copy link

stale bot commented Apr 17, 2022

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label Apr 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants