-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/sem: fix interaction between framing and peers in window functions #39308
Conversation
152333a
to
d3e844f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, we should aggregate all these weird semantic oddities into a doc somewhere, they're pretty nasty.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, and @yuzefovich)
pkg/sql/distsqlrun/windower.go, line 589 at r1 (raw file):
// for "pure" window functions as well as for all modes of framing // except for ROWS. peerGrouper = allPeers{}
Haven't really looked at this code before—is there a reason this is a special case, and doesn't just have an empty ordering with the regular partitionPeerGrouper
?
pkg/sql/logictest/testdata/logic_test/window, line 3342 at r1 (raw file):
4 10 2 0 4 # Regression test verifying that ROWS mode of framing takes precedence over the
Can you add a test including an example of this not happening for non-aggregate functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @justinj)
pkg/sql/distsqlrun/windower.go, line 589 at r1 (raw file):
Previously, justinj (Justin Jaffray) wrote…
Haven't really looked at this code before—is there a reason this is a special case, and doesn't just have an empty ordering with the regular
partitionPeerGrouper
?
Thanks for pointing this out! Your intuition is correct - with an empty ordering, partitionPeerGrouper
will behave exactly like allPeers
but, at the moment, will be slower due to extra indirections and a copy. I will refactor this and remove allPeers
.
pkg/sql/logictest/testdata/logic_test/window, line 3342 at r1 (raw file):
Previously, justinj (Justin Jaffray) wrote…
Can you add a test including an example of this not happening for non-aggregate functions?
Again, thanks for this suggestion! It actually made me realize that my "fix" was not a real fix. The real issue is that with aggregates we first check whether a current row is the first peer in its peer group, and if not, we return agg.peerRes
. However, if the window frame is such that the first peer is not in the frame, that is an incorrect behavior. I will need to do more thinking about this and will postpone this PR for a little bit. These semantics are, indeed, very nasty :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess when I get back to working on this, I should start writing the doc you mentioned.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @justinj)
d3e844f
to
38b2674
Compare
This commit removes allPeers peerGroupChecker and refactors partitionPeerGrouper to support the case of omitted ORDER BY clause. Release note: None
Previously, whenever we had a result of a window function already computed for a peer of the row, we would return that result. However, the concept of window framing takes precedence over the concept of peers (i.e. if the window frame over which we computed the result for a peer is different from the window frame of the current row, we need to recompute the result). Now this is fixed by tracking the boundaries of the window frame over which peerRes is computed and recomputing the latter when needed. There was also a bug in IsDefaultFrame where we forgot to check whether the mode of framing is RANGE. This caused incorrect results in some cases when ROWS or GROUPS modes of framing were used. Release note (bug fix): CockroachDB previously was returning incorrect results for some aggregate functions when used as window functions with non-default window frame. Now this is fixed. Note that MIN, MAX, SUM, AVG as well as "pure" window functions (i.e. non-aggregates) were not affected.
38b2674
to
88ade52
Compare
@justinj @jordanlewis I pushed an updated fix. This code seems too fragile in general :/ |
Friendly ping @justinj. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @justinj)
TFTR! bors r+ |
39308: sql/sem: fix interaction between framing and peers in window functions r=yuzefovich a=yuzefovich **rowexec: minor refactor of partitionPeerGrouper** This commit removes allPeers peerGroupChecker and refactors partitionPeerGrouper to support the case of omitted ORDER BY clause. Release note: None **sql/sem: fix interaction between framing and peers in window functions** Previously, whenever we had a result of a window function already computed for a peer of the row, we would return that result. However, the concept of window framing takes precedence over the concept of peers (i.e. if the window frame over which we computed the result for a peer is different from the window frame of the current row, we need to recompute the result). Now this is fixed by tracking the boundaries of the window frame over which peerRes is computed and recomputing the latter when needed. There was also a bug in IsDefaultFrame where we forgot to check whether the mode of framing is RANGE. This caused incorrect results in some cases when ROWS or GROUPS modes of framing were used. Fixes: #38901. Fixes: #42935. Release note (bug fix): CockroachDB previously was returning incorrect results for some aggregate functions when used as window functions with non-default window frame. Now this is fixed. Note that MIN, MAX, SUM, AVG as well as "pure" window functions (i.e. non-aggregates) were not affected. 43293: builtins: add extract functionality for intervals r=knz a=otan Resolves #43209 This adds the postgres-esque implementation of extract for intervals. Tested a variety of edge cases and compared against what postgres returns. Currently leaving the `extract_duration` function alone - we could thereoetically get rid of it after this PR. It does behave slightly differently as well. Release note (sql change): Added new ability to call extract on an interval, e.g. `extract(day from interval '254 days')`. This follows the postgres implementation. Furthermore, this deprecates `extract_duration`, which will be removed at a later date. 43834: colexec: speed up ordered sync r=yuzefovich a=yuzefovich **colexec: refactor ordered sync to use assign instead of append** Ordered synchronizer sets the output one row at a time. Previously, we were calling Append on the output vectors, but it is faster to use Sets in such scenario. Release note: None **colexec: speed up ordered sync** This commit speeds up the ordered synchronizer with two modifications: 1. instead of updating the allocator after we copy a single whole row, we will now update the allocator after we copy a full batch of rows 2. to reduce the number of interface conversions we will "strip" output vectors to get to the underlying slices and will use those directly when copying the data. Release note: None Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
Build succeeded |
rowexec: minor refactor of partitionPeerGrouper
This commit removes allPeers peerGroupChecker and refactors
partitionPeerGrouper to support the case of omitted ORDER BY clause.
Release note: None
sql/sem: fix interaction between framing and peers in window functions
Previously, whenever we had a result of a window function already
computed for a peer of the row, we would return that result. However,
the concept of window framing takes precedence over the concept of peers
(i.e. if the window frame over which we computed the result for a peer
is different from the window frame of the current row, we need to
recompute the result). Now this is fixed by tracking the boundaries of
the window frame over which peerRes is computed and recomputing the
latter when needed.
There was also a bug in IsDefaultFrame where we forgot to check whether
the mode of framing is RANGE. This caused incorrect results in some
cases when ROWS or GROUPS modes of framing were used.
Fixes: #38901.
Fixes: #42935.
Release note (bug fix): CockroachDB previously was returning incorrect
results for some aggregate functions when used as window functions with
non-default window frame. Now this is fixed. Note that MIN, MAX, SUM,
AVG as well as "pure" window functions (i.e. non-aggregates) were not
affected.