Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-4302] Improve cost propagation in volcano to avoid re-propagation #2187

Closed
wants to merge 1 commit into from

Conversation

hbtoo
Copy link
Contributor

@hbtoo hbtoo commented Oct 2, 2020

No description provided.

Copy link
Member

@hsyuan hsyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hbtoo Thanks for your pull request. It seems like there are some test failures for Druid tests. Can you take a look? I believe most of them are plan diffs that are caused by the BFS strategy change.

@hbtoo hbtoo force-pushed the CALCITE-4302 branch 3 times, most recently from c4a8bf4 to 0ca875c Compare October 6, 2020 21:50
@hbtoo
Copy link
Contributor Author

hbtoo commented Oct 6, 2020

@hbtoo Thanks for your pull request. It seems like there are some test failures for Druid tests. Can you take a look? I believe most of them are plan diffs that are caused by the BFS strategy change.

I did an investigation and found out that the old and new best plan has exactly the same cost. The best plan switched because of the change in update order in this PR. So this is expected, so I updated the expected result of the tests.

@liupc
Copy link
Contributor

liupc commented Oct 9, 2020

LGTM.

// This subset is already in the chain being propagated to. This
// means that the graph is cyclic, and therefore the cost of this
// relational expression - not this subset - must be infinite.
LOGGER.trace("cyclic: {}", this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cyclic check has been removed, does it mean the code is useless now ?

Copy link
Contributor Author

@hbtoo hbtoo Oct 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cyclic check is necessary for the old update logic, i.e. DFS. Now since it is a Dijkstra like algorithm, always propagating the changed relNode with smallest best cost, the update will automatically stop after traveling a full cycle. So no special handling is needed any more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other code that we can check a cyclic path now ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if there's any. Note that the cycle detection code I deleted here is already not working in this BFS implementation. It is left-over dead code when we changed from DFS to BFS.

* @param mq Metadata query
* @param rel Relational expression whose cost has improved
*/
void propagateCostImprovements(VolcanoPlanner planner,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need planner as a parameter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, removed

* @param rel Relational expression whose cost has improved
*/
void propagateCostImprovements(VolcanoPlanner planner,
RelMetadataQuery mq, RelNode rel) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mq can be retrieved from rel.getCluster().getMetadataQuery(), so mq is not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, removed

@hsyuan
Copy link
Member

hsyuan commented Oct 10, 2020

@vlsi Why druid test is not needed?

@vlsi
Copy link
Contributor

vlsi commented Oct 10, 2020 via email

@hbtoo hbtoo force-pushed the CALCITE-4302 branch 2 times, most recently from dd4dd03 to 2295a8f Compare October 10, 2020 18:48
// since best was changed, cached metadata for this subset should be removed
mq.clearCache(subset);

subset.getParents().forEach(parent -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a regular for loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought they are syntactically the same?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forEach ends with }); which is slightly less nice than }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, changed to for loop

mq.clearCache(parent);
if (propagateRels.put(parent, getCost(parent, mq)) != null) {
// Cost changed, force the heap to adjust its ordering
propagateHeap.remove(parent);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is O(N) :-/

Copy link
Contributor Author

@hbtoo hbtoo Oct 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. However note that when this code is hit, it means that we've found a second update path to this same relNode. This is exactly the case where this patch improves performance. Before this patch, it means that this relNode (and thus everything above it) will be updated twice. With this patch, we only need to repopulate the heap with this relNode with updated cost. Although it is indeed O(N), it is already much faster than before.

Also note that technically, the heap implementation can be improved to reduce this from O(N) to O(logN). However it might not be essential here.

Comment on lines 933 to 936
if (relNode.getTraitSet().satisfies(subset.getTraitSet())) {
// Update subset best cost when we find a cheaper rel or the current
// best's cost is changed
if (cost.isLt(subset.bestCost)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to reduce the nesting level with continue;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@hbtoo hbtoo force-pushed the CALCITE-4302 branch 2 times, most recently from fb18170 to a306316 Compare October 10, 2020 22:00
for (Map.Entry<RelSubset, RelNode> subsetBestPair : changedSubsets.entrySet()) {
RelSubset relSubset = subsetBestPair.getKey();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changedSubsets keys are never used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
Map<RelNode, RelOptCost> propagateRels = new HashMap<>();
PriorityQueue<RelNode> propagateHeap = new PriorityQueue<>((o1, o2) -> {
RelOptCost c1 = propagateRels.get(o1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this change, the propagation is neither BFS nor DFS, can this cause problem such as StackOverFlow when the rel
node hierarchy is very deep ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although not exactly BFS, but it (Dijkstra) works very similar to BFS, I'd like to view it as a controlled special type of BFS. I think the heap size here and the queue size in BFS should be about the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is different with Dijkstra, especially when you put different hierarchy nodes into one queue and only sort them with cost.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, can we make the promotion evidence more clear ? Is there any possibility that we do some benchmark test ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you consider cost as a kind of distance between relnodes/subsets, this propagation process is basically Dijkstra in a directed graph. Computing the best plan in this directed graph is finding the "shortest" path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did compare the running time using some big queries, with the patch the whole volcano phase is about 5% faster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, +1 for this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks :D

@danny0405 danny0405 added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Oct 13, 2020
@danny0405 danny0405 closed this in c7fdae2 Oct 14, 2020
@hbtoo
Copy link
Contributor Author

hbtoo commented Oct 14, 2020

Thanks everyone for the discussion and review!

MalteBellmann pushed a commit to caespes/calcite that referenced this pull request Feb 21, 2021
…ation (Botong Huang)

CALCITE-3330 changed the cost propagation in volcano from DFS to BFS.
However, there is still room for improvement. A subset can be updated
more than once in a cost propagation process. For instance, A -> D, A ->
B -> C -> D. When subset A has an update, using BFS subset D (and thus
all subsets above/after D) can be updated twice, first via A -> D and
then C -> D. We can further improve the BFS by always popping the
relNode with the smallest cost from the queue, similar to the Dijkstra
algorithm. So that whenever a relNode is popped from the queue, its
current best cannot be further deceased any more. As a result, all
subsets will only be propagated at most once.

close apache#2187
XuQianJin-Stars pushed a commit to XuQianJin-Stars/calcite that referenced this pull request Jul 14, 2021
…ation (Botong Huang)

CALCITE-3330 changed the cost propagation in volcano from DFS to BFS.
However, there is still room for improvement. A subset can be updated
more than once in a cost propagation process. For instance, A -> D, A ->
B -> C -> D. When subset A has an update, using BFS subset D (and thus
all subsets above/after D) can be updated twice, first via A -> D and
then C -> D. We can further improve the BFS by always popping the
relNode with the smallest cost from the queue, similar to the Dijkstra
algorithm. So that whenever a relNode is popped from the queue, its
current best cannot be further deceased any more. As a result, all
subsets will only be propagated at most once.

close apache#2187
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left. slow-tests-needed
Projects
None yet
5 participants