[CALCITE-4302] Improve cost propagation in volcano to avoid re-propagation #2187

hbtoo · 2020-10-02T21:13:35Z

No description provided.

hsyuan

@hbtoo Thanks for your pull request. It seems like there are some test failures for Druid tests. Can you take a look? I believe most of them are plan diffs that are caused by the BFS strategy change.

hbtoo · 2020-10-06T21:53:43Z

I investigated and found that the old and new best plans have exactly the same cost. The best plan switched because this PR changes the update order. This is expected, so I updated the expected results of the tests.

liupc · 2020-10-09T02:58:20Z

LGTM.

danny0405 · 2020-10-10T09:28:47Z

core/src/main/java/org/apache/calcite/plan/volcano/RelSubset.java

-      // This subset is already in the chain being propagated to. This
-      // means that the graph is cyclic, and therefore the cost of this
-      // relational expression - not this subset - must be infinite.
-      LOGGER.trace("cyclic: {}", this);


The cyclic check has been removed. Does that mean the code is useless now?

The cyclic check was necessary for the old update logic, i.e. DFS. Since this is now a Dijkstra-like algorithm that always propagates the changed relNode with the smallest best cost, the update automatically stops after traveling a full cycle, so no special handling is needed any more.

Is there any other code where we can check for a cyclic path now?

I am not sure if there is any. Note that the cycle detection code I deleted here already did not work in the BFS implementation. It is dead code left over from when we changed from DFS to BFS.
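To illustrate the point about cycles, here is a minimal sketch under a toy assumption: a parent's cost is its input's cost plus a positive edge weight. Calcite's real cost computation is richer, and every class and method name below is hypothetical. Because only a strict improvement is queued, one trip around the cycle A -> B -> C -> A ends without any explicit cycle check.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Toy model, not Calcite code: parent cost = input cost + positive weight. */
final class CyclePropagationSketch {
  /** Directed edge from an input node to one of its parents. */
  static final class Edge {
    final String parent;
    final double weight;
    Edge(String parent, double weight) {
      this.parent = parent;
      this.weight = weight;
    }
  }

  /** Heap entry: a node together with the cost it had when it was queued. */
  static final class Entry {
    final String node;
    final double cost;
    Entry(String node, double cost) {
      this.node = node;
      this.cost = cost;
    }
  }

  /** Lowers the cost of start, propagates upward, returns how many nodes were propagated. */
  static int propagate(Map<String, List<Edge>> parents, Map<String, Double> best,
      String start, double newCost) {
    PriorityQueue<Entry> heap =
        new PriorityQueue<>(Comparator.comparingDouble((Entry e) -> e.cost));
    best.put(start, newCost);
    heap.add(new Entry(start, newCost));
    int propagated = 0;
    while (!heap.isEmpty()) {
      Entry e = heap.poll();
      if (e.cost > best.get(e.node)) {
        continue; // stale entry: a cheaper cost was recorded after this one was queued
      }
      propagated++;
      for (Edge edge : parents.getOrDefault(e.node, Collections.<Edge>emptyList())) {
        double candidate = e.cost + edge.weight;
        // Only a strict improvement is queued, so a trip around a cycle ends here.
        if (candidate < best.getOrDefault(edge.parent, Double.POSITIVE_INFINITY)) {
          best.put(edge.parent, candidate);
          heap.add(new Entry(edge.parent, candidate));
        }
      }
    }
    return propagated;
  }

  /** Builds the cycle A -> B -> C -> A and lowers A from 10 to 5. */
  static Map<String, Double> demo() {
    Map<String, List<Edge>> parents = new HashMap<>();
    parents.put("A", Arrays.asList(new Edge("B", 1.0)));
    parents.put("B", Arrays.asList(new Edge("C", 1.0)));
    parents.put("C", Arrays.asList(new Edge("A", 1.0)));
    Map<String, Double> best = new HashMap<>();
    best.put("A", 10.0);
    best.put("B", 11.0);
    best.put("C", 12.0);
    propagate(parents, best, "A", 5.0);
    return best;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In this toy run the update reaches B and C once each; when C offers A a cost of 8, that is not better than 5, so nothing more is queued and the loop ends.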

hsyuan · 2020-10-10T17:09:42Z

core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java

+   * @param mq        Metadata query
+   * @param rel       Relational expression whose cost has improved
+   */
+  void propagateCostImprovements(VolcanoPlanner planner,


Why do you need planner as a parameter?

good point, removed

hsyuan · 2020-10-10T17:12:17Z

core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java

+   * @param rel       Relational expression whose cost has improved
+   */
+  void propagateCostImprovements(VolcanoPlanner planner,
+      RelMetadataQuery mq, RelNode rel) {


mq can be retrieved from rel.getCluster().getMetadataQuery(), so mq is not needed.

good point, removed

hsyuan · 2020-10-10T17:13:37Z

@vlsi Why are the Druid tests not needed?

vlsi · 2020-10-10T17:37:01Z

Druid tests always execute, so there is no need for an extra label.

vlsi · 2020-10-10T19:01:39Z

core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java

+            // since best was changed, cached metadata for this subset should be removed
+            mq.clearCache(subset);
+
+            subset.getParents().forEach(parent -> {


Should this be a regular for loop?

I thought they are syntactically the same?

forEach ends with }); which is slightly less nice than }

ok, changed to for loop
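For reference, a small sketch of the two loop forms being compared. The names are hypothetical and the integers merely stand in for parents. Both produce the same result; beyond the closing `});`, the lambda form can only capture effectively final locals, which is why it needs the one-element array here.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: the same traversal written with forEach and with a for loop. */
final class ForEachVersusForSketch {
  static int sumWithForEach(List<Integer> parents) {
    int[] total = {0}; // a lambda cannot assign to a plain local variable
    parents.forEach(parent -> {
      total[0] += parent;
    });
    return total[0];
  }

  static int sumWithFor(List<Integer> parents) {
    int total = 0;
    for (int parent : parents) {
      total += parent;
    }
    return total;
  }

  public static void main(String[] args) {
    List<Integer> parents = Arrays.asList(1, 2, 3);
    System.out.println(sumWithForEach(parents) == sumWithFor(parents));
  }
}
```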

vlsi · 2020-10-10T19:05:36Z

core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java

+              mq.clearCache(parent);
+              if (propagateRels.put(parent, getCost(parent, mq)) != null) {
+                // Cost changed, force the heap to adjust its ordering
+                propagateHeap.remove(parent);


This is O(N) :-/

True. However, note that when this code is hit, it means we have found a second update path to the same relNode. This is exactly the case where this patch improves performance. Before this patch, this relNode (and thus everything above it) would be updated twice. With this patch, we only need to re-insert this relNode into the heap with its updated cost. Although the removal is indeed O(N), it is already much faster than before.

Also note that, technically, the heap implementation could be improved to reduce this from O(N) to O(log N). However, that might not be essential here.
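One possible way to get the O(log N) update mentioned above, sketched here as an assumption rather than as what the patch does: replace the binary heap with an ordered set (`java.util.TreeSet`), whose `remove` is logarithmic. The class name and the use of string node names are hypothetical. The one subtlety is that the node must be removed from the set while its old cost is still visible to the comparator.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Sketch of a priority structure with O(log N) cost updates, keyed by node name. */
final class DecreaseKeySketch {
  private final Map<String, Double> costs = new HashMap<>();

  // Ordered by cost, then by name so that distinct nodes with equal cost are both kept.
  private final TreeSet<String> ordered = new TreeSet<>((a, b) -> {
    int byCost = Double.compare(costs.get(a), costs.get(b));
    return byCost != 0 ? byCost : a.compareTo(b);
  });

  /** Inserts a node or changes its cost, in O(log N). */
  void put(String node, double cost) {
    if (costs.containsKey(node)) {
      ordered.remove(node); // must run before the cost in the map changes
    }
    costs.put(node, cost);
    ordered.add(node);
  }

  /** Removes and returns the cheapest node, or null when empty. */
  String poll() {
    String first = ordered.pollFirst();
    if (first != null) {
      costs.remove(first);
    }
    return first;
  }

  public static void main(String[] args) {
    DecreaseKeySketch queue = new DecreaseKeySketch();
    queue.put("x", 5.0);
    queue.put("y", 3.0);
    queue.put("x", 1.0); // second update path found: x becomes the cheapest
    System.out.println(queue.poll());
  }
}
```

Whether this is worth it depends on how often a second update path is found; as noted above, the linear removal only runs in that case.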

vlsi · 2020-10-10T19:11:09Z

core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java

+        if (relNode.getTraitSet().satisfies(subset.getTraitSet())) {
+          // Update subset best cost when we find a cheaper rel or the current
+          // best's cost is changed
+          if (cost.isLt(subset.bestCost)) {


It would be nice to reduce the nesting level with continue;
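A sketch of the suggested shape, with hypothetical stand-ins: `cost >= 0` plays the role of the trait-set check and `cost < best` the role of `cost.isLt(subset.bestCost)`. Both methods select the same elements; the second one drops a nesting level by skipping early.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: nested conditions versus an early continue. */
final class GuardClauseSketch {
  static List<Integer> nested(List<Integer> costs, int best) {
    List<Integer> improved = new ArrayList<>();
    for (int cost : costs) {
      if (cost >= 0) {       // stand-in for the trait-set check
        if (cost < best) {   // stand-in for the cost comparison
          improved.add(cost);
        }
      }
    }
    return improved;
  }

  static List<Integer> flattened(List<Integer> costs, int best) {
    List<Integer> improved = new ArrayList<>();
    for (int cost : costs) {
      if (cost < 0) {
        continue;            // skip early instead of nesting the rest of the body
      }
      if (cost < best) {
        improved.add(cost);
      }
    }
    return improved;
  }

  public static void main(String[] args) {
    List<Integer> costs = Arrays.asList(5, -1, 3, 9);
    System.out.println(nested(costs, 6).equals(flattened(costs, 6)));
  }
}
```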

danny0405 · 2020-10-11T03:20:41Z

core/src/main/java/org/apache/calcite/plan/volcano/RelSet.java

    for (Map.Entry<RelSubset, RelNode> subsetBestPair : changedSubsets.entrySet()) {
-      RelSubset relSubset = subsetBestPair.getKey();


The changedSubsets keys are never used.

danny0405 · 2020-10-11T03:32:10Z

core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java

+    RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
+    Map<RelNode, RelOptCost> propagateRels = new HashMap<>();
+    PriorityQueue<RelNode> propagateHeap = new PriorityQueue<>((o1, o2) -> {
+      RelOptCost c1 = propagateRels.get(o1);


After this change, the propagation is neither BFS nor DFS. Can this cause problems such as a stack overflow when the rel node hierarchy is very deep?

Although it is not exactly BFS, Dijkstra works very similarly to BFS; I'd like to view it as a controlled, special type of BFS. I think the heap size here and the queue size in BFS should be about the same.

It is different from Dijkstra, especially when you put nodes from different levels of the hierarchy into one queue and sort them only by cost.

Or can we make the evidence for the improvement clearer? Is there any possibility of running a benchmark test?

If you consider cost as a kind of distance between relnodes/subsets, this propagation process is basically Dijkstra in a directed graph. Computing the best plan in this directed graph is finding the "shortest" path.

I compared the running time using some big queries; with the patch, the whole volcano phase is about 5% faster.

Thanks, +1 for this change.

hbtoo · 2020-10-14T04:41:10Z

Thanks everyone for the discussion and review!

…ation (Botong Huang)

CALCITE-3330 changed the cost propagation in volcano from DFS to BFS. However, there is still room for improvement: a subset can be updated more than once in a single cost propagation process. For instance, take A -> D and A -> B -> C -> D. When subset A has an update, BFS can update subset D (and thus all subsets above/after D) twice, first via A -> D and then via C -> D.

We can further improve the BFS by always popping the relNode with the smallest cost from the queue, similar to the Dijkstra algorithm. Whenever a relNode is popped from the queue, its current best cost cannot be decreased any further. As a result, each subset is propagated at most once.

close apache#2187
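The A/B/C/D example from the commit message can be replayed with a short sketch. The edge weights are assumptions chosen so that the longer path A -> B -> C -> D ends up cheaper than the direct edge A -> D, and the additive cost model and all names are hypothetical, not Calcite code. The same loop is run once with a FIFO queue (BFS order) and once with a priority queue (smallest cost first), counting how often D is propagated.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Queue;

/** Toy replay of the commit message example; weights are made up for illustration. */
final class PropagationOrderSketch {
  static final class Entry {
    final String node;
    final double cost;
    Entry(String node, double cost) {
      this.node = node;
      this.cost = cost;
    }
  }

  // Edges input -> parent: A -> D, A -> B, B -> C, C -> D, with assumed weights.
  static final String[][] EDGES = {{"A", "D"}, {"A", "B"}, {"B", "C"}, {"C", "D"}};
  static final double[] WEIGHTS = {10, 1, 1, 1};

  /** Propagates an update of A and returns how many times D itself is propagated. */
  static int timesDPropagated(Queue<Entry> queue) {
    Map<String, Double> best = new HashMap<>();
    best.put("A", 0.0);
    queue.add(new Entry("A", 0.0));
    int dCount = 0;
    while (!queue.isEmpty()) {
      Entry e = queue.poll();
      if (e.cost > best.get(e.node)) {
        continue; // stale entry, a cheaper cost is already known
      }
      if (e.node.equals("D")) {
        dCount++;
      }
      for (int i = 0; i < EDGES.length; i++) {
        if (!EDGES[i][0].equals(e.node)) {
          continue;
        }
        String parent = EDGES[i][1];
        double candidate = e.cost + WEIGHTS[i];
        if (candidate < best.getOrDefault(parent, Double.POSITIVE_INFINITY)) {
          best.put(parent, candidate);
          queue.add(new Entry(parent, candidate));
        }
      }
    }
    return dCount;
  }

  static int withBfs() {
    return timesDPropagated(new ArrayDeque<>());
  }

  static int withSmallestCostFirst() {
    return timesDPropagated(
        new PriorityQueue<>(Comparator.comparingDouble((Entry e) -> e.cost)));
  }

  public static void main(String[] args) {
    System.out.println("BFS: " + withBfs() + ", smallest first: " + withSmallestCostFirst());
  }
}
```

With these weights the FIFO order propagates D at cost 10 and again at cost 3, while the smallest-cost-first order only propagates D once, at cost 3.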

vlsi added the slow-tests-needed label Oct 2, 2020

hsyuan reviewed Oct 3, 2020

hsyuan added the druid-tests-needed label Oct 3, 2020

hbtoo force-pushed the CALCITE-4302 branch 3 times, most recently from c4a8bf4 to 0ca875c October 6, 2020 21:50

hbtoo force-pushed the CALCITE-4302 branch from 0ca875c to 0b0b53a October 6, 2020 22:10

vlsi removed the druid-tests-needed label Oct 8, 2020

danny0405 reviewed Oct 10, 2020

hsyuan reviewed Oct 10, 2020

hbtoo force-pushed the CALCITE-4302 branch 2 times, most recently from dd4dd03 to 2295a8f October 10, 2020 18:48

vlsi reviewed Oct 10, 2020

hbtoo force-pushed the CALCITE-4302 branch 2 times, most recently from fb18170 to a306316 October 10, 2020 22:00

danny0405 reviewed Oct 11, 2020

[CALCITE-4302] Improve cost propagation in volcano to avoid re-propag…

093c44f

…ation

hbtoo force-pushed the CALCITE-4302 branch from a306316 to 093c44f October 11, 2020 03:57

danny0405 approved these changes Oct 13, 2020

danny0405 added the LGTM-will-merge-soon label Oct 13, 2020

danny0405 closed this in c7fdae2 Oct 14, 2020
