New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CALCITE-4302] Improve cost propagation in volcano to avoid re-propagation #2187
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hbtoo Thanks for your pull request. It seems like there are some test failures for Druid tests. Can you take a look? I believe most of them are plan diffs that are caused by the BFS strategy change.
c4a8bf4
to
0ca875c
Compare
I did an investigation and found out that the old and new best plan has exactly the same cost. The best plan switched because of the change in update order in this PR. So this is expected, so I updated the expected result of the tests. |
LGTM. |
// This subset is already in the chain being propagated to. This | ||
// means that the graph is cyclic, and therefore the cost of this | ||
// relational expression - not this subset - must be infinite. | ||
LOGGER.trace("cyclic: {}", this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cyclic check has been removed, does it mean the code is useless now ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cyclic check is necessary for the old update logic, i.e. DFS. Now since it is a Dijkstra like algorithm, always propagating the changed relNode with smallest best cost, the update will automatically stop after traveling a full cycle. So no special handling is needed any more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any other code that we can check a cyclic path now ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if there's any. Note that the cycle detection code I deleted here is already not working in this BFS implementation. It is left-over dead code when we changed from DFS to BFS.
* @param mq Metadata query | ||
* @param rel Relational expression whose cost has improved | ||
*/ | ||
void propagateCostImprovements(VolcanoPlanner planner, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need planner as a parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, removed
* @param rel Relational expression whose cost has improved | ||
*/ | ||
void propagateCostImprovements(VolcanoPlanner planner, | ||
RelMetadataQuery mq, RelNode rel) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mq
can be retrieved from rel.getCluster().getMetadataQuery()
, so mq is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, removed
@vlsi Why druid test is not needed? |
Druid tests execute always, and there's no need in extra label
|
dd4dd03
to
2295a8f
Compare
// since best was changed, cached metadata for this subset should be removed | ||
mq.clearCache(subset); | ||
|
||
subset.getParents().forEach(parent -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a regular for loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought they are syntactically the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forEach
ends with });
which is slightly less nice than }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, changed to for loop
mq.clearCache(parent); | ||
if (propagateRels.put(parent, getCost(parent, mq)) != null) { | ||
// Cost changed, force the heap to adjust its ordering | ||
propagateHeap.remove(parent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is O(N) :-/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. However note that when this code is hit, it means that we've found a second update path to this same relNode. This is exactly the case where this patch improves performance. Before this patch, it means that this relNode (and thus everything above it) will be updated twice. With this patch, we only need to repopulate the heap with this relNode with updated cost. Although it is indeed O(N), it is already much faster than before.
Also note that technically, the heap implementation can be improved to reduce this from O(N) to O(logN). However it might not be essential here.
if (relNode.getTraitSet().satisfies(subset.getTraitSet())) { | ||
// Update subset best cost when we find a cheaper rel or the current | ||
// best's cost is changed | ||
if (cost.isLt(subset.bestCost)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to reduce the nesting level with continue
;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
fb18170
to
a306316
Compare
for (Map.Entry<RelSubset, RelNode> subsetBestPair : changedSubsets.entrySet()) { | ||
RelSubset relSubset = subsetBestPair.getKey(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changedSubsets
keys are never used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
RelMetadataQuery mq = rel.getCluster().getMetadataQuery(); | ||
Map<RelNode, RelOptCost> propagateRels = new HashMap<>(); | ||
PriorityQueue<RelNode> propagateHeap = new PriorityQueue<>((o1, o2) -> { | ||
RelOptCost c1 = propagateRels.get(o1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After this change, the propagation is neither BFS
nor DFS
, can this cause problem such as StackOverFlow when the rel
node hierarchy is very deep ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although not exactly BFS, but it (Dijkstra) works very similar to BFS, I'd like to view it as a controlled special type of BFS. I think the heap size here and the queue size in BFS should be about the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is different with Dijkstra
, especially when you put different hierarchy nodes into one queue and only sort them with cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, can we make the promotion evidence more clear ? Is there any possibility that we do some benchmark test ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you consider cost as a kind of distance between relnodes/subsets, this propagation process is basically Dijkstra in a directed graph. Computing the best plan in this directed graph is finding the "shortest" path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did compare the running time using some big queries, with the patch the whole volcano phase is about 5% faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, +1 for this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks :D
Thanks everyone for the discussion and review! |
…ation (Botong Huang) CALCITE-3330 changed the cost propagation in volcano from DFS to BFS. However, there is still room for improvement. A subset can be updated more than once in a cost propagation process. For instance, A -> D, A -> B -> C -> D. When subset A has an update, using BFS subset D (and thus all subsets above/after D) can be updated twice, first via A -> D and then C -> D. We can further improve the BFS by always popping the relNode with the smallest cost from the queue, similar to the Dijkstra algorithm. So that whenever a relNode is popped from the queue, its current best cannot be further deceased any more. As a result, all subsets will only be propagated at most once. close apache#2187
…ation (Botong Huang) CALCITE-3330 changed the cost propagation in volcano from DFS to BFS. However, there is still room for improvement. A subset can be updated more than once in a cost propagation process. For instance, A -> D, A -> B -> C -> D. When subset A has an update, using BFS subset D (and thus all subsets above/after D) can be updated twice, first via A -> D and then C -> D. We can further improve the BFS by always popping the relNode with the smallest cost from the queue, similar to the Dijkstra algorithm. So that whenever a relNode is popped from the queue, its current best cannot be further deceased any more. As a result, all subsets will only be propagated at most once. close apache#2187
No description provided.