[CALCITE-3963] Maintains logical properties at RelSet (equivalent gro… by xndai · Pull Request #1992 · apache/calcite

xndai · 2020-05-29T05:18:06Z

…up) instead of RelNode

1. Add a new LogicalNode interface that supports reporting stats estimation confidence.
2. Re-purpose set.rel and rename it to set.originalRel to report the logical properties of the RelSet.
3. When a new RelNode is added to the set, we check the stats confidence of the new node and update set.originalRel if the new node has a higher confidence level.
4. The metadata handler will always report logical properties from set.originalRel for a RelSubset.
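
For readers skimming the description, here is a minimal sketch of the update rule in points 3 and 4, written against toy classes rather than Calcite's. Everything in it is an assumption for illustration: the `Confidence` constants, `ToyNode`, `ToySet`, and the row counts are hypothetical, and the field `statsRel` only plays the role that the description gives to set.originalRel.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical confidence levels, declared from least to most trustworthy. */
enum Confidence { LOW, MEDIUM, HIGH }

/** Toy stand-in for a relational node; this is not Calcite's RelNode. */
final class ToyNode {
  final String name;
  final double rowCount;
  final Confidence confidence;

  ToyNode(String name, double rowCount, Confidence confidence) {
    this.name = name;
    this.rowCount = rowCount;
    this.confidence = confidence;
  }
}

/** Toy stand-in for an equivalence set (the RelSet of the description). */
final class ToySet {
  private final List<ToyNode> members = new ArrayList<>();
  private ToyNode statsRel; // plays the role of set.originalRel

  /** Adds a node and keeps the member with the highest confidence seen so far. */
  void add(ToyNode node) {
    members.add(node);
    if (statsRel == null || node.confidence.compareTo(statsRel.confidence) > 0) {
      statsRel = node;
    }
  }

  /** Logical properties are answered from the chosen member, not from the caller's node. */
  double rowCount() {
    if (statsRel == null) {
      throw new IllegalStateException("empty set");
    }
    return statsRel.rowCount;
  }

  ToyNode statsRel() {
    return statsRel;
  }
}

final class ToySetDemo {
  public static void main(String[] args) {
    ToySet set = new ToySet();
    set.add(new ToyNode("nodeWithDefaultEstimate", 1.0, Confidence.LOW));
    set.add(new ToyNode("nodeWithRealEstimate", 1500.0, Confidence.HIGH));
    set.add(new ToyNode("laterNode", 900.0, Confidence.MEDIUM));
    System.out.println(set.statsRel().name + " -> " + set.rowCount());
  }
}
```

In this sketch a later node with equal or lower confidence never displaces the current one, so every member of the set reports the same row count regardless of insertion order among lower-confidence nodes.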

xndai · 2020-05-29T05:26:22Z

core/src/test/java/org/apache/calcite/tools/PlannerTest.java

-        + "      EnumerableTableScan(table=[[hr, emps]])\n"
-        + "      EnumerableProject(deptno=[$0], name=[$1], employees=[$2], x=[$3.x], y=[$3.y])\n"
-        + "        EnumerableTableScan(table=[[hr, depts]])";
+        + "EnumerableProject(empid=[$0], deptno=[$1], name=[$2], salary=[$3], commission=[$4], deptno0=[$5], name0=[$6], employees=[$7], location=[ROW($8, $9)], empid0=[$10], name1=[$11])\n"


LoptOptimizeRule always swaps inputs so that the smaller input is on the right side, but this increases the cost of the hash join, so we end up picking merge join as the best plan. Previously, the row count of MultiJoin always returned 1 (it used the default estimateRowCount() implementation, which is wrong), so it was incorrectly treated as the smaller input, and a hash join plan was generated. With this change the row count is corrected, but given the rule's behavior and the cost model, the best plan is now the merge join plan. If a hash join is expected, then LoptOptimizeRule needs to be fixed.

The other two plan changes are due to the same issue.
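
To make the effect described above concrete, here is a rough sketch of how a default row count of 1 can flip a "smaller input goes on the right" decision. It does not use Calcite classes or reproduce LoptOptimizeRule; `ToyInput`, `SwapSketch`, and the row counts 5000 and 100 are all made up for illustration.

```java
import java.util.function.ToDoubleFunction;

/** Toy join input; not a Calcite class. All row counts here are invented. */
final class ToyInput {
  final String name;
  final double actualRows;         // what a correct estimate would report
  final boolean overridesEstimate; // false = falls back to the default of 1

  ToyInput(String name, double actualRows, boolean overridesEstimate) {
    this.name = name;
    this.actualRows = actualRows;
    this.overridesEstimate = overridesEstimate;
  }
}

final class SwapSketch {
  /** Row count as seen before the fix: nodes without an estimate report 1. */
  static double rowCountBefore(ToyInput in) {
    return in.overridesEstimate ? in.actualRows : 1.0;
  }

  /** Row count as seen after the fix: the corrected value is always reported. */
  static double rowCountAfter(ToyInput in) {
    return in.actualRows;
  }

  /** Returns the name of the input placed on the right, i.e. the one judged smaller. */
  static String rightSide(ToyInput a, ToyInput b, ToDoubleFunction<ToyInput> rowCount) {
    return rowCount.applyAsDouble(a) <= rowCount.applyAsDouble(b) ? a.name : b.name;
  }

  public static void main(String[] args) {
    ToyInput multiJoin = new ToyInput("MultiJoin", 5000.0, false);
    ToyInput scan = new ToyInput("TableScan", 100.0, true);
    System.out.println("before: " + rightSide(multiJoin, scan, SwapSketch::rowCountBefore));
    System.out.println("after:  " + rightSide(multiJoin, scan, SwapSketch::rowCountAfter));
  }
}
```

With the defaulting estimator the large MultiJoin input is judged smaller and lands on the right; with the corrected estimator the scan does, which is the kind of reordering that then changes which join implementation the cost model prefers.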

hsyuan · 2020-05-29T11:49:26Z

core/src/main/java/org/apache/calcite/plan/volcano/RelSet.java

+   * The logical properties of the RelSet, including row count, uniqueness, etc,
+   * are determined by this RelNode.
+   */
+  RelNode originalRel;


The name is a little misleading. Before this patch it was indeed the original rel, but after this patch it no longer is. We could just call it what it is.

Yep, I find it awkward too. Do you have any suggestions?

How about continuing to use rel?

hsyuan · 2020-05-29T11:56:22Z

core/src/main/java/org/apache/calcite/plan/volcano/RelSet.java

+    assert planner != null;
+
+    for (RelNode rel : getParentRels()) {
+      RelSet set = planner.getSet(rel);


If it is already pruned, can we skip?

core/src/main/java/org/apache/calcite/plan/volcano/RelSubset.java


xndai · 2020-06-15T22:32:44Z

core/src/test/resources/org/apache/calcite/test/TopDownOptTest.xml

-EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{7}])
-  EnumerableSort(sort0=[$1], dir0=[ASC])
+EnumerableSort(sort0=[$1], dir0=[ASC])
+  EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{7}])


The Correlate node doesn't implement a row count estimate, so it always returns 1 from the default implementation, which makes it the best plan with minimal cost. After this change, since we report stats from the RelSet using the Join row count, we get the truly best plan according to the current cost model.

liyafan82 · 2020-06-17T02:48:21Z

core/src/main/java/org/apache/calcite/plan/volcano/RelSet.java

    final RelSubset subset = getOrCreateSubset(
        rel.getCluster(), traitSet, rel.isEnforcer());
    subset.add(rel);
+    checkAndUpdateOriginalRel(rel);


This call seems to duplicate the call in addInternal, since subset.add(rel) will call addInternal?

liyafan82 · 2020-06-17T03:20:05Z

core/src/main/java/org/apache/calcite/rel/LogicalNode.java

+  /**
+   * Confidence levels of statistics estimation
+   */
+  enum StatsEstimateConfidenceLevel {


It seems the elements form a partial order?
If so, is it appropriate to compare them using compareTo?
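
As background for this question: in Java, `Enum.compareTo` compares constants by their ordinal, that is, by declaration order, so it always imposes a total order on the constants whether or not the concepts behind them are totally ordered. A small sketch with hypothetical constants (the PR's actual enum values are not shown in this excerpt):

```java
/** Hypothetical levels; declaration order is the only thing compareTo looks at. */
enum ToyLevel { UNKNOWN, ESTIMATED, EXACT }

final class EnumOrderDemo {
  /** Picks the level that compares higher, i.e. the one declared later. */
  static ToyLevel higher(ToyLevel a, ToyLevel b) {
    return a.compareTo(b) >= 0 ? a : b;
  }

  public static void main(String[] args) {
    // Negative result: ESTIMATED is declared before EXACT.
    System.out.println(ToyLevel.ESTIMATED.compareTo(ToyLevel.EXACT));
    System.out.println(higher(ToyLevel.UNKNOWN, ToyLevel.EXACT));
  }
}
```

If two levels are meant to be incomparable, `compareTo` will still rank one above the other, and reordering the declarations silently changes the result. An explicit comparison method on the enum would make the intended order visible.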


hsyuan changed the title ~~[CACLITE-3963] Maintains logical properties at RelSet (equivalent gro…~~ [CALCITE-3963] Maintains logical properties at RelSet (equivalent gro… May 31, 2020

xndai force-pushed the calcite-3963 branch from b0ef0e1 to 35068ef Compare June 2, 2020 23:58

xndai added 4 commits June 15, 2020 11:15

Update according to Haisheng's feedback

ecdf5ab

Update to use RelSet original rel to report other logical properties

3d1658e

Fix plan diff in TopDownOptTest and CassandraAdapterTest

c9846ed

xndai force-pushed the calcite-3963 branch from 707756f to c9846ed Compare June 15, 2020 22:28


vlsi force-pushed the master branch from e4458fa to 5462be9 Compare July 17, 2020 20:19

vlsi force-pushed the master branch from fd20efd to 3311d45 Compare December 9, 2020 20:10

julianhyde force-pushed the master branch from 52c1284 to d4e1eea Compare March 1, 2021 02:56

vlsi force-pushed the master branch from 7f65cf2 to 4bc9166 Compare March 24, 2021 09:43

zabetak force-pushed the master branch from f14cf4c to dcbc493 Compare March 10, 2022 09:13

julianhyde force-pushed the main branch from fa65a2e to 1226d1a Compare June 20, 2022 20:27

asfgit force-pushed the main branch from 9fc50f2 to e2f949d Compare September 10, 2022 16:37

asfgit force-pushed the main branch from f8f8a51 to a326bd2 Compare January 25, 2023 07:39

julianhyde force-pushed the main branch 2 times, most recently from 8a5cf83 to cf7f71b Compare June 8, 2023 21:21

tanclary force-pushed the main branch from 4804912 to 00db001 Compare September 6, 2023 00:42

libenchao force-pushed the main branch from 47db81a to 0be8eae Compare November 10, 2023 13:18

F21 force-pushed the main branch from 7d38212 to cacf36a Compare February 17, 2025 03:33

asolimando force-pushed the main branch from 19400ab to 2d5ec10 Compare May 28, 2025 13:26

xndai wants to merge 4 commits into apache:main from xndai:calcite-3963
