[CALCITE-6749] RelMdUtil#setAggChildKeys may return an incorrect result#4115
[CALCITE-6749] RelMdUtil#setAggChildKeys may return an incorrect result#4115rubenada merged 1 commit intoapache:mainfrom
Conversation
|
As a consequence of the fix, there are a couple of plan changes on *.iq tests, but the result is still correct. |
| /** Test case for | ||
| * <a href="https://issues.apache.org/jira/browse/CALCITE-6749">[CALCITE-6749] | ||
| * RelMdUtil#setAggChildKeys may return an incorrect result</a>. */ | ||
| @Test void testSetAggChildKeys() { |
There was a problem hiding this comment.
Frankly, this is hard to review.
I saw the discussion in the JIRA, but it's not obvious at all that "9" is the right answer.
I wonder whether having the full plan here would actually be easier - at least in a comment.
There was a problem hiding this comment.
Basically we need to convert: "field index X on the Aggregate corresponds to which field index Y on the Aggregate's input?"
If we have this plan
LogicalAggregate(group=[{0}], EXPR$1=[COUNT(DISTINCT $1)])
LogicalProject(DEPTNO=[$9], JOB=[$2])
LogicalJoin(condition=[=($7, $9)], joinType=[right])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
The Aggregate rowType has two fields (0 and 1), the first one (index 0) corresponds to the group 0, and the second one (index 1) corresponds to the argCall COUNT(DISTINCT $1). In this case, if we want to convert into the input's, we'll have 0 => 0, and 1 => 1; there is no problem on the current code.
However, if we have this equivalent plan (same query plan, but without the intermediate Project):
LogicalAggregate(group=[{9}], EXPR$1=[COUNT(DISTINCT $2)])
LogicalJoin(condition=[=($7, $9)], joinType=[right])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
Now the Aggregate still has two fields (0 and 1), the first one (index 0) corresponds to the group 9, and the second one (index 1) corresponds to the argCall COUNT(DISTINCT $2). In this case, if we want to convert into the input's, we'll have 0 => 9, and 1 => 2; the current code will work fine only for the argCall conversion, but not for the group conversion.
Notice I'm just applying here the same conversion that exists already e.g. when propagating the RelMdColumnOrigins computation past an Aggregate (in terms of our second example: "if we want to get origin of the column 0 of the Aggregate, we need to compute the origin of the column 9 of the Aggregate's input"). The auxiliary method RelMdUtil.setAggChildKeys (used by RelMdDistinctRowCount and RelMdPopulationSize for Aggregates) should behave in the same way as RelMdColumnOrigin for this conversion.
|



see https://issues.apache.org/jira/browse/CALCITE-6749