-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CALCITE-2018] Queries failed with AssertionError: rel has lower cost… #552
Conversation
7ebf238
to
f733d24
Compare
f733d24
to
73cb945
Compare
… than best cost of subset. (Second attempt) (cherry picked from commit 1c795d7)
if (otherSubset.bestCost.isLt(subset.bestCost)) { | ||
subset.bestCost = otherSubset.bestCost; | ||
subset.best = otherSubset.best; | ||
changedSubsets.put(subset, otherSubset.best); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should something like max(changedSubsets.get(subset), otherSubset.best)
be used here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good assumption, but when registering new rel nodes into this RelSet
from otherSubset
, the best cost for subset
may be changed, and therefore otherSubset.best
should be passed, since it is better according to the check above. Also, otherSubset
is not changed and therefore there is no need to try to update its best cost.
… than best cost of subset. (Inspired from pr: apache#552 which is committed by vvysotskyi). Summary: 1.Replaced Map usage by Table for query metadata caching. It allows finding cached metadata query for concrete RelNode instance more easily. 2. Cached metadata values are removed for current RelSubset and all parent RelNodes when RelSubset.propagateCostImprovements() is called and best rel node was changed. 3. Made changes in RelSet.mergeWith() method to call RelSubset.propagateCostImprovements() when resulting best is known instead of just assigning RelSubset.best. Note: 1. Does not import following points from pr because it will increase the lifetime of RelMetadataQuery. Made changes to use single RelMetadataQuery instance during planning between the first call of RelOptCluster.getMetadataQuery() and RelOptCluster.invalidateMetadataQuery() call. Test Plan: NO Reviewers: 鲁尼, 晓令, 玉兆 Reviewed By: 晓令 Subscribers: P577102 Differential Revision: https://aone.alibaba-inc.com/code/D642962
RelSubset relSubset = subsetBestPair.getKey(); | ||
relSubset.propagateCostImprovements( | ||
planner, mq, subsetBestPair.getValue(), | ||
activeSet); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okey, this is definitely needed. We should reCompute the best rel based on the new cost instead of replace it directly with the best rel in the merged RelSubSet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, this is one of the key changes in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason of recomputing otherSubset.best? Do you think its input can be changed at this point? Can you please give a scenario? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is required for the case when this RelSet
contains RelSubset
which matches to the trait sets of otherSubset
, so best may be changed (otherSubset.bestCost.isLt(subset.bestCost)
check) and therefore parent rel nodes should be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Now I see why. It looks to me that if we move these codes to be after line 363 (renaming parentRel), then we won't need line 373 to line 378. What do you think?
78b4def
to
dfa8317
Compare
Also I notice under debugger that RelNode tree can have more than one RelOptCluster. I haven't dig into that. But if this is the case, your #2 bullet point in your change description may not be true. |
Hi @xndai if you have two clusters in the same |
@zabetak @vvysotskyi I look closer and find two tests cases in JdbcTest where I see rel nodes have different RelOptCluster instances - testLateralJoin and testExpandViewWithLateralJoin. It looks like during parsing when there's view table and ViewTable.expandView() would create a new SqlToRelConverter instance which then would use a different RelOptCluster instance for rel nodes it creates. Is this expected? |
@xndai thanks for looking into this further. I kind of understand why it happens but I think it should not. |
I would also suggest we turn on isValid() check by default in UT (instead of doing that only under Trace logger level). We would expect all these kinds of issues are gone with this fix (and 2116). Also to prevent any future regressions. |
Based on this change, I made a couple of modifications, now the isValid() check passes for all unit tests. You can find at my changes at - https://github.com/xndai/calcite/tree/2018 The key thing I add is that when a RelNode is renamed and it is subset's best RelNode, we propagate change improvements since the subset's best cost could be changed. I also turn on isValid() check under assert. Also with this change, we run into some CyclicMetadataException when accessing meta. Right now I just catch and ignore, similar to how it is handled in other parts of the code. But in the long run, we will need to figure out better way to handle cyclic. |
Thanks @xndai for the additional patch, the
|
But the stack trace is misleading if we reuse the exception instance. On the other hand, how much time will be wasted if we construct new |
I agree with Haisheng. The exception needs to be fixed. Currently we could get a wrong call stack info when reusing CyclicMetadataException. This is really confusing and creates a lot of pains during debugging. |
… than best cost of subset
The change completes the fix of CALCITE-2018. Below are a few key additions - 1. When a Rel is renamed and it is subset's best RelNode, we need to propagate change improvement since the subset's best cost could be changed. 2. Add codes to handle CyclicMetadataException. Right now we just ignore it. We will revisit once we have figured out a better way to handle CyclicMetadataException. 3. Enable isValid() check under assert. Also check validation after RelSet merge.
dfa8317
to
3920bc2
Compare
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@danny0405 You didn't squash commits? |
Yes, i want to keep the credit of every committer. |
You don't need to. There is a way to keep multiple authors, like 955d4ea |
Thanks, but i want to follow the CALCITE-1935 style because this fix lasts so long time and so many discussions, split it into multiple commits make the fix more clear. |
… than best cost of subset * Replaced Map usage by Table for query metadata caching. It allows finding cached metadata query for concrete RelNode instance more easily. * Made changes to use single RelMetadataQuery instance during planning between the first call of RelOptCluster.getMetadataQuery() and RelOptCluster.invalidateMetadataQuery() call. * Cached metadata values are removed for current RelSubset and all parent RelNodes when RelSubset.propagateCostImprovements() is called and best rel node was changed. * Made changes in RelSet.mergeWith() method to call RelSubset.propagateCostImprovements() when resulting best is known instead of just assigning RelSubset.best. close apache#552
… than best cost of subset * Replaced Map usage by Table for query metadata caching. It allows finding cached metadata query for concrete RelNode instance more easily. * Made changes to use single RelMetadataQuery instance during planning between the first call of RelOptCluster.getMetadataQuery() and RelOptCluster.invalidateMetadataQuery() call. * Cached metadata values are removed for current RelSubset and all parent RelNodes when RelSubset.propagateCostImprovements() is called and best rel node was changed. * Made changes in RelSet.mergeWith() method to call RelSubset.propagateCostImprovements() when resulting best is known instead of just assigning RelSubset.best. close apache#552
… than best cost of subset * Replaced Map usage by Table for query metadata caching. It allows finding cached metadata query for concrete RelNode instance more easily. * Made changes to use single RelMetadataQuery instance during planning between the first call of RelOptCluster.getMetadataQuery() and RelOptCluster.invalidateMetadataQuery() call. * Cached metadata values are removed for current RelSubset and all parent RelNodes when RelSubset.propagateCostImprovements() is called and best rel node was changed. * Made changes in RelSet.mergeWith() method to call RelSubset.propagateCostImprovements() when resulting best is known instead of just assigning RelSubset.best. close apache#552
… than best cost of subset * Replaced Map usage by Table for query metadata caching. It allows finding cached metadata query for concrete RelNode instance more easily. * Made changes to use single RelMetadataQuery instance during planning between the first call of RelOptCluster.getMetadataQuery() and RelOptCluster.invalidateMetadataQuery() call. * Cached metadata values are removed for current RelSubset and all parent RelNodes when RelSubset.propagateCostImprovements() is called and best rel node was changed. * Made changes in RelSet.mergeWith() method to call RelSubset.propagateCostImprovements() when resulting best is known instead of just assigning RelSubset.best. close apache#552
… than best cost of subset * Replaced Map usage by Table for query metadata caching. It allows finding cached metadata query for concrete RelNode instance more easily. * Made changes to use single RelMetadataQuery instance during planning between the first call of RelOptCluster.getMetadataQuery() and RelOptCluster.invalidateMetadataQuery() call. * Cached metadata values are removed for current RelSubset and all parent RelNodes when RelSubset.propagateCostImprovements() is called and best rel node was changed. * Made changes in RelSet.mergeWith() method to call RelSubset.propagateCostImprovements() when resulting best is known instead of just assigning RelSubset.best. close apache#552 Change-Id: I3590638395bde98699980b73a891aef441ec7167
… than best cost of subset.
Map
usage byTable
for query metadata caching. It allows finding cached metadata query for concreteRelNode
instance more easily.RelMetadataQuery
instance during planning between the first call ofRelOptCluster.getMetadataQuery()
andRelOptCluster.invalidateMetadataQuery()
call.RelSubset
and all parentRelNode
s whenRelSubset.propagateCostImprovements()
is called and best rel node was changed.RelSet.mergeWith()
method to callRelSubset.propagateCostImprovements()
when resulting best is known instead of just assigningRelSubset.best
.