[CALCITE-4376] Materialized view recognition fails when target project different columns with GROUP BY #2277

Closed
yanlin-Lynn wants to merge 1 commit into apache:master from yanlin-Lynn:CALCITE-4376

@yanlin-Lynn

When the target projects different columns on top of a GROUP BY, materialized view recognition fails. See the case below:

@Test void testDifferentGroupBySequence() {
    final String mv = "" +
        "select \"deptno\", \"name\" from ("
        + "select \"name\", \"deptno\", \"commission\"\n"
        + "from \"emps\"\n"
        + " group by \"name\", \"deptno\", \"commission\") t";
    final String query = ""
        + "select \"deptno\", \"name\"\n"
        + "from \"emps\"\n"
        + "group by \"deptno\", \"name\"";
    sql(mv, query).withChecker(
        resultContains(""
            + "EnumerableTableScan(table=[[hr, MV0]])")).ok();
  }

After applying AggregateOnCalcToAggregateUnifyRule, the query becomes:

Holder
  Calc(program: (expr#0..1=[{inputs}], deptno=[$t1], name=[$t0]))
    Aggregate(groupSet: {0, 1}, groupSets: [{0, 1}], calls: []) (no match here)
      Aggregate(groupSet: {0, 1, 2}, groupSets: [{0, 1, 2}], calls: [])
        Calc(program: (expr#0..4=[{inputs}], name=[$t2], deptno=[$t1], commission=[$t4]))
          Scan(table: [hr, emps])

The target is:

Calc(program: (expr#0..2=[{inputs}], deptno=[$t1], name=[$t0]))
  Aggregate(groupSet: {0, 1, 2}, groupSets: [{0, 1, 2}], calls: [])
    Calc(program: (expr#0..4=[{inputs}], name=[$t2], deptno=[$t1], commission=[$t4]))
      Scan(table: [hr, emps])

There is no match between
Aggregate(groupSet: {0, 1}, groupSets: [{0, 1}], calls: []) in the query
and
Calc(program: (expr#0..2=[{inputs}], deptno=[$t1], name=[$t0])) in the MV.

The fix is to always add a Calc between the target aggregate and the rolled-up aggregate.
This Calc projects only the columns used by the rolled-up aggregate.
So we want the query to look like this:

Holder
  Calc(program: (expr#0..1=[{inputs}], deptno=[$t1], name=[$t0]))
    Aggregate(groupSet: {0, 1}, groupSets: [{0, 1}], calls: [])
     Calc (xxx) (Always add a Calc here)
      Aggregate(groupSet: {0, 1, 2}, groupSets: [{0, 1, 2}], calls: [])
        Calc(program: (expr#0..4=[{inputs}], name=[$t2], deptno=[$t1], commission=[$t4]))
          Scan(table: [hr, emps])
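
To illustrate why the extra Calc is safe, here is a minimal sketch in plain Java (not Calcite code) that evaluates the plan shapes above on a few rows. The class `RollUpSketch`, its helpers, and the sample `emps` rows are all hypothetical; the column order (empid, deptno, name, salary, commission) is inferred from the `expr#0..4` projection in the plans.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Toy model of the plan shapes above; hypothetical, not Calcite code. */
final class RollUpSketch {
  /** Stand-in for a Calc that only projects: keeps the given column indexes. */
  static List<List<Object>> project(List<List<Object>> rows, int... cols) {
    List<List<Object>> out = new ArrayList<>();
    for (List<Object> row : rows) {
      List<Object> projected = new ArrayList<>();
      for (int c : cols) {
        projected.add(row.get(c));
      }
      out.add(projected);
    }
    return out;
  }

  /** Stand-in for an Aggregate grouping on every column with no calls: de-duplicates rows. */
  static List<List<Object>> groupByAll(List<List<Object>> rows) {
    return new ArrayList<>(new LinkedHashSet<>(rows));
  }

  /** Made-up sample rows: empid, deptno, name, salary, commission. */
  static List<List<Object>> emps() {
    List<List<Object>> rows = new ArrayList<>();
    rows.add(List.of(100, 10, "Bill", 10000, 1000));
    rows.add(List.of(110, 10, "Bill", 11500, 250));
    rows.add(List.of(150, 20, "Eric", 8000, 500));
    return rows;
  }

  /** The original query: group by deptno, name directly on emps. */
  static List<List<Object>> directQuery() {
    return groupByAll(project(emps(), 1, 2));
  }

  /** The desired shape: Calc / Aggregate{0,1} / Calc / target Aggregate{0,1,2}. */
  static List<List<Object>> rolledUpQuery() {
    // Target aggregate: group by name, deptno, commission.
    List<List<Object>> targetAggregate = groupByAll(project(emps(), 2, 1, 4));
    // The always-added Calc: keep only the columns the rolled-up aggregate uses.
    List<List<Object>> addedCalc = project(targetAggregate, 0, 1);
    // Rolled-up aggregate on (name, deptno), then reorder to (deptno, name).
    return project(groupByAll(addedCalc), 1, 0);
  }

  public static void main(String[] args) {
    System.out.println(directQuery().equals(rolledUpQuery()));
  }
}
```

In this toy model the rolled-up plan returns the same rows as the direct query, which is the property the rewritten query has to preserve.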

…t different columns with GROUP BY (Wang Yanlin)
this.target = MutableRels.toMutable(target_);
if (mutableQuery instanceof MutableCalc) {
this.query = Holder.of(mutableQuery);
} else {

Here we always try to compensate with a Calc in the original RelNode. Why not do this through custom normalization? Otherwise more and more patterns like this will accumulate in materialized view recognition. There should be a unified way to implement it.
What do you think?
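
As a rough illustration of the normalization step being discussed, the sketch below wraps a plan root in an identity Calc unless the root already is a Calc. It uses a hypothetical toy node model (`NormalizeSketch`, `Node`, `Scan`, `Calc`), not Calcite's `MutableRel` classes, and it only assumes that this is the shape of the change in the diff above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy plan model; all names here are hypothetical, not Calcite classes. */
final class NormalizeSketch {
  abstract static class Node {
    abstract int fieldCount();
  }

  static final class Scan extends Node {
    final int fields;

    Scan(int fields) {
      this.fields = fields;
    }

    @Override int fieldCount() {
      return fields;
    }
  }

  static final class Calc extends Node {
    final List<Integer> projects; // input column index for each output column
    final Node input;

    Calc(List<Integer> projects, Node input) {
      this.projects = projects;
      this.input = input;
    }

    @Override int fieldCount() {
      return projects.size();
    }
  }

  /** Returns the root unchanged if it is a Calc, else wraps it in an identity Calc. */
  static Node normalize(Node root) {
    if (root instanceof Calc) {
      return root;
    }
    List<Integer> identity = new ArrayList<>();
    for (int i = 0; i < root.fieldCount(); i++) {
      identity.add(i);
    }
    return new Calc(identity, root);
  }
}
```

Keeping this step in a separate normalization pass, as the comment suggests, would let the same wrapping apply to both query and target before any unify rule runs.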

final String optimized = ""
+ "LogicalProject(deptno=[CAST($0):TINYINT], count_sal=[$1])\n"
+ " LogicalTableScan(table=[[mv0]])\n";
+ "LogicalCalc(expr#0..1=[{inputs}], proj#0..1=[{exprs}])\n"

The LogicalCalc seems unnecessary.

+ " EnumerableTableScan(table=[[hr, MV0]])"))
+ "LogicalCalc(expr#0..1=[{inputs}], expr#2=[1], expr#3=[+($t1, $t2)], C=[$t3], "
+ "deptno=[$t0])\n"
+ " LogicalAggregate(group=[{0}], agg#0=[$SUM0($1)])\n"

Do we have to add the LogicalCalc? Is it possible to stay compatible with the original plan? If a LogicalCalc is added, the result may not be the optimal RelNode, which will increase the cost of computation in physical execution.
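
One possible way to address this cost concern, offered only as an assumption and not something this PR does, is to detect a Calc that merely forwards its input columns in order so that a later step can drop it. The helper below is a hypothetical plain-Java check (`TrivialCalcSketch.isIdentity`), not a Calcite API.

```java
import java.util.List;

/** Hypothetical helper; not part of Calcite. */
final class TrivialCalcSketch {
  /**
   * Returns true when the projection forwards every input column in its
   * original position, so the Calc changes nothing and could be removed.
   */
  static boolean isIdentity(List<Integer> projects, int inputFieldCount) {
    if (projects.size() != inputFieldCount) {
      return false; // the Calc drops or duplicates columns
    }
    for (int i = 0; i < projects.size(); i++) {
      if (projects.get(i) != i) {
        return false; // the Calc reorders columns
      }
    }
    return true;
  }
}
```

For the plans in this PR, a projection such as `deptno=[$t1], name=[$t0]` reorders columns and would not count as an identity, while `proj#0..1` over a two-column input would.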

} else {
List<? extends RexNode> rexNodes =
cluster.getRexBuilder().identityProjects(mutableQuery.rowType);
RexProgram program = RexProgram.create(

There is a PR [1] that strengthens normalization before materialized view recognition by letting users define normalization rules. In its unit test, materialized view recognition is improved by compensating with a Calc operator. That approach is more general.
[1] Add an interface in RelOptMaterializations to allow registering normalization rules
