Improve outer join lowering #16451
Conversation
(This is a bit related to #4171.) It would be good to add the query in the PR description as an SLT test.
I second @ggevay. We should use every opportunity to add new queries to the SLT if none of the existing queries shows any plan change.
You could add it to
I think it's mostly the startup time that is bad, so if you add it to an existing file, then maybe it's fine.
Item No 1. This only slightly degenerate query has an additional
Item No 2. This slightly more realistic query has one extra
The database schema is the same as in Item No 1.
Item No 3. This query is detected as constant in the prior commit but is a join in this one:
I have also seen the opposite case, where this branch does better constant detection than the prior commit :-( This is all quite disconcerting and arbitrary, made worse by the fact that the impossible condition is not complex at all: it does not involve multiple tables and does not require any propagation or transformations to be detectable as impossible. Also, this is not the first PR where this dance happens for no apparent reason; maybe we should consider a more general solution that somehow guarantees the detection of simple impossible cases regardless of any join context they appear in.
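The simple impossible cases described here (a contradictory conjunction over a single table) can in principle be caught by a purely local simplification pass. A minimal sketch of the idea, using toy hypothetical types rather than Materialize's actual `MirScalarExpr` machinery:

```rust
// Toy sketch: detect trivially impossible conjunctions of equality
// predicates, e.g. `a = 1 AND a = 2`. Hypothetical types; Materialize's
// real transforms operate on MirScalarExpr, not on this.
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ColEqLit {
    col: usize, // column index
    lit: i64,   // literal value the column must equal
}

/// Returns true if the conjunction of `preds` can never be satisfied
/// because some column is required to equal two distinct literals.
fn is_impossible(preds: &[ColEqLit]) -> bool {
    let mut bound: HashMap<usize, i64> = HashMap::new();
    for p in preds {
        match bound.insert(p.col, p.lit) {
            Some(prev) if prev != p.lit => return true, // contradiction found
            _ => {}
        }
    }
    false
}

fn main() {
    // `a = 1 AND a = 2` is impossible; `a = 1 AND b = 2` is not.
    assert!(is_impossible(&[ColEqLit { col: 0, lit: 1 }, ColEqLit { col: 0, lit: 2 }]));
    assert!(!is_impossible(&[ColEqLit { col: 0, lit: 1 }, ColEqLit { col: 1, lit: 2 }]));
    println!("ok");
}
```

Because such a check needs no cross-table propagation, it could run regardless of the join context the predicate appears in, which is what the comment asks for.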
Item No 4. Here is a plan over a TPC-* schema that has the extra join clearly shown:
schema:
I've only looked at the second one in detail, but it seems like it is 50-50 on whether it is a regression. The join that it introduced is a "narrow" join on just the keys of the join:
whereas in the original plan that join does not exist presumably due to CSE, but instead we have a "wide"
Let me think a bit about whether this is intrinsic, but my intuition is that the plan is in fact improved (the join is not as expensive as the arrangements that feed it, and the arrangements are larger in the
If we look at the arrangements instead, we remove a
which will maintain all distinct occurrences of all of these columns. By comparison we add
which include a "narrow"
Yeah, measured by wallclock time, this PR is doing better on Item No 4 as compared to the prior commit. So I guess we are at the stage where I should stop reading plans to divine which one is better than the other based on the diff. @frankmcsherry can you suggest some query against the introspection tables that will, as faithfully as possible, capture the total "effort" expended to evaluate the entire query across the entire dataflow? I could then have the RQG use such a summarization query to flag performance regressions automatically.
No panics or wrong results. The plan differences that I examined all appear to fall into the types listed as Items.
Item 3: I think we are just missing a fundamental transform that looks for incompatible filters. Looking at it, there is no reason we shouldn't be able to take the output and optimize it further; it doesn't seem like the alternate lowering has made this any harder:
Item 2. I thought a fair bit more about this, and I am now much more confident that the new plan is better despite the superficial complexity. The easiest argument I see is that the new plan immediately projects away. I can also explain why the old plan appears "simpler", and it is an interesting quirk! The LEFT JOIN is not on a primary key, but
which .. by virtue of
I think wrt items 1-4, I'm ok on all the plans if we figure out what is going on here in Item 4.
This looks like either a missed opportunity or a regression, depending on what we expected. I think we see similar things in a recent PR of @ggevay and maybe the answer is "we do not intentionally try to prevent this, or do but occasionally miss, and putting that on the todo list is good enough".
Actually, @philip-stoev, I take it back: Item 4 isn't weird right at that moment, which may require that arrangement; but instead, why is that arrangement not shared with the join in
and
in it. Sorta weird that we have both, but it also sort of makes sense how we would get confused, in that you can't really use the latter in place of the former (wrong number of columns) even though all the information is there. What is maybe interesting is that these two arrangements will be identical once we perform "thinning", which is something LIR does to reduce memory requirements. So, there is no reason MIR should be able to realize they would be similar. But I suspect LIR may not realize that there are patterns of arrangement re-use that only it can detect.
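For readers unfamiliar with the "thinning" mentioned above: since each (key, value) pair in an arrangement already carries the key, the values need not repeat the key columns. A hedged toy sketch of the column bookkeeping (hypothetical representation, not Materialize's actual LIR):

```rust
// Toy sketch of arrangement "thinning": drop key columns from the stored
// values, since the key half of each (key, value) pair already has them.
// Hypothetical representation; Materialize's LIR differs in detail.

/// Given the total column count and the key's column indices, return the
/// value columns an arrangement actually stores after thinning.
fn thinned_value_columns(arity: usize, key: &[usize]) -> Vec<usize> {
    (0..arity).filter(|c| !key.contains(c)).collect()
}

fn main() {
    // A 4-column collection arranged by columns [0, 2] stores only [1, 3]
    // in its values; arranged by [0] alone, it stores [1, 2, 3].
    assert_eq!(thinned_value_columns(4, &[0, 2]), vec![1, 3]);
    assert_eq!(thinned_value_columns(4, &[0]), vec![1, 2, 3]);
    println!("ok");
}
```

This illustrates why two arrangements that differ only in which key columns are duplicated into their values become identical after thinning, even though MIR, which reasons before thinning, cannot see that.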
Fair enough, thank you for your patience!
LGTM, just two nits regarding comments. (Caveat: I am just the random reviewer.)
@@ -1727,17 +1727,48 @@ impl AggregateExpr {
}
}

// TODO: move this to the `mz_expr` crate.
/// If the on clause of an outer join is an equijoin, figure out the join keys.
/// Column version of `derive_equijoin_cols`.
- /// Column version of `derive_equijoin_cols`.
+ /// Column version of `derive_equijoin_exprs`.
}
}
}
// Offset columns in right expressions by the columns introduce by the left input. |
- // Offset columns in right expressions by the columns introduce by the left input.
+ // Offset columns in right expressions by the columns introduced by the left input.
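For context on what a function like `derive_equijoin_cols` does, here is a hedged toy sketch of column-level equijoin key extraction. The types are hypothetical; the PR's actual implementation works on Materialize's HIR/MIR expression types:

```rust
// Toy sketch of extracting equijoin key pairs from a conjunction of
// predicates. Hypothetical types; the real `derive_equijoin_cols`
// operates on Materialize's expression representation.

#[derive(Clone, Copy, Debug)]
enum Pred {
    ColEq(usize, usize), // equality between two column references
    Other,               // anything that is not a plain column equality
}

/// If every conjunct equates a left-input column with a right-input column,
/// return the key pairs (left column, right column relative to the right
/// input); otherwise return None, falling back to the general lowering.
fn derive_equijoin_cols(on: &[Pred], left_arity: usize) -> Option<Vec<(usize, usize)>> {
    let mut keys = Vec::new();
    for p in on {
        match *p {
            Pred::ColEq(a, b) if a < left_arity && b >= left_arity => {
                keys.push((a, b - left_arity))
            }
            Pred::ColEq(a, b) if b < left_arity && a >= left_arity => {
                keys.push((b, a - left_arity))
            }
            _ => return None, // not a pure column equijoin
        }
    }
    Some(keys)
}

fn main() {
    // Left input has 2 columns; `#0 = #2 AND #3 = #1` yields keys [(0,0), (1,1)].
    let on = [Pred::ColEq(0, 2), Pred::ColEq(3, 1)];
    assert_eq!(derive_equijoin_cols(&on, 2), Some(vec![(0, 0), (1, 1)]));
    // A non-equality conjunct disqualifies the fast path.
    assert_eq!(derive_equijoin_cols(&[Pred::Other], 2), None);
    println!("ok");
}
```

The expression version suggested by the review (`derive_equijoin_exprs`) generalizes the same pattern from column references to arbitrary expressions over a single input.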
Should I review this now? Or should I wait until it is moved out of draft state?
@aalexandrov my guess is that this is now outdated, but I wanted to check with you about the current state of outer join lowering. |
@frankmcsherry Yes, this was fixed in main with #22560. The current plan coincides with the plan from this PR:
This PR improves the "optimistic" outer join lowering from applying only to exact column equality to equalities of expressions supported by the two inputs. This is most clearly seen on outer joins that appear to be column equality but are in fact type conversions. For example,
On `main` this plans as (plan elided), and in this PR it plans as (plan elided). The two CTEs are identical, but in the `main` case there is a `Distinct` on multiple columns of `foo`, rather than just on the key column. The `main` plan generally will keep full copies of all columns of `foo`, rather than just the distinct keys.

Motivation
Outer join lowering was limited in scope to exact column equalities. In many cases there can be conversions of types, truncation of values, or other transformations that foil the existing lowering but are not incorrect to perform.
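The generalization can be pictured as: if the ON clause equates an expression of the left input's columns with an expression of the right input's columns, materialize each expression as an appended column (`map(keys)`) and then join on plain column equality. A hedged toy sketch of that mapping step, with hypothetical types throughout:

```rust
// Toy sketch of the generalized lowering: equate expressions of each
// input's columns by first materializing them as extra columns, then
// equijoining on the appended columns. Hypothetical types throughout.

type Row = Vec<i64>;
type KeyExpr = fn(&Row) -> i64; // a key expression evaluable on one input

/// Append each key expression's value to every row ("map(keys)").
fn map_keys(rows: &[Row], keys: &[KeyExpr]) -> Vec<Row> {
    rows.iter()
        .map(|r| {
            let mut out = r.clone();
            for k in keys {
                out.push(k(r));
            }
            out
        })
        .collect()
}

fn main() {
    // An ON clause like `CAST(l.a AS bigint) = r.b` becomes: map each side
    // with its key expression, then join on the appended column.
    let left: Vec<Row> = vec![vec![1], vec![2]];
    let key: KeyExpr = |r| r[0]; // a cast or truncation would evaluate here
    let mapped = map_keys(&left, &[key]);
    assert_eq!(mapped, vec![vec![1, 1], vec![2, 2]]);
    println!("ok");
}
```

Under this framing, exact column equality is just the special case where every key expression is a bare column reference.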
Tips for reviewer
This has not been exercised on exotic outer joins, such as those that do not just equate simple transformations of columns. It still appears correct, under the premise that it finds outer joins that could have been column equality if each input had been subjected to `map(keys)`. Tagging in @philip-stoev as a reviewer to point several unpleasant forms of outer joins at this PR.

Checklist
This PR has adequate test coverage / QA involvement has been duly considered.
This PR evolves an existing `$T ⇔ Proto$T` mapping (possibly in a backwards-incompatible way) and therefore is tagged with a `T-proto` label.

This PR includes the following user-facing behavior changes: