Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/sem: fix interaction between framing and peers in window functions #39308

Merged
merged 2 commits into from
Jan 9, 2020

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Aug 4, 2019

rowexec: minor refactor of partitionPeerGrouper

This commit removes allPeers peerGroupChecker and refactors
partitionPeerGrouper to support the case of omitted ORDER BY clause.

Release note: None

sql/sem: fix interaction between framing and peers in window functions

Previously, whenever we had a result of a window function already
computed for a peer of the row, we would return that result. However,
the concept of window framing takes precedence over the concept of peers
(i.e. if the window frame over which we computed the result for a peer
is different from the window frame of the current row, we need to
recompute the result). Now this is fixed by tracking the boundaries of
the window frame over which peerRes is computed and recomputing the
latter when needed.

There was also a bug in IsDefaultFrame where we forgot to check whether
the mode of framing is RANGE. This caused incorrect results in some
cases when ROWS or GROUPS modes of framing were used.

Fixes: #38901.
Fixes: #42935.

Release note (bug fix): CockroachDB previously was returning incorrect
results for some aggregate functions when used as window functions with
non-default window frame. Now this is fixed. Note that MIN, MAX, SUM,
AVG as well as "pure" window functions (i.e. non-aggregates) were not
affected.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@justinj justinj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, we should aggregate all these weird semantic oddities into a doc somewhere, they're pretty nasty.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, and @yuzefovich)


pkg/sql/distsqlrun/windower.go, line 589 at r1 (raw file):

				// for "pure" window functions as well as for all modes of framing
				// except for ROWS.
				peerGrouper = allPeers{}

Haven't really looked at this code before—is there a reason this is a special case, and doesn't just have an empty ordering with the regular partitionPeerGrouper?


pkg/sql/logictest/testdata/logic_test/window, line 3342 at r1 (raw file):

4  10  2  0  4

# Regression test verifying that ROWS mode of framing takes precedence over the

Can you add a test including an example of this not happening for non-aggregate functions?

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @justinj)


pkg/sql/distsqlrun/windower.go, line 589 at r1 (raw file):

Previously, justinj (Justin Jaffray) wrote…

Haven't really looked at this code before—is there a reason this is a special case, and doesn't just have an empty ordering with the regular partitionPeerGrouper?

Thanks for pointing this out! Your intuition is correct - with an empty ordering, partitionPeerGrouper will behave exactly like allPeers but, at the moment, will be slower due to extra indirections and a copy. I will refactor this and remove allPeers.


pkg/sql/logictest/testdata/logic_test/window, line 3342 at r1 (raw file):

Previously, justinj (Justin Jaffray) wrote…

Can you add a test including an example of this not happening for non-aggregate functions?

Again, thanks for this suggestion! It actually made me realize that my "fix" was not a real fix. The real issue is that with aggregates we first check whether a current row is the first peer in its peer group, and if not, we return agg.peerRes. However, if the window frame is such that the first peer is not in the frame, that is an incorrect behavior. I will need to do more thinking about this and will postpone this PR for a little bit. These semantics are, indeed, very nasty :/

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess when I get back to working on this, I should start writing the doc you mentioned.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @justinj)

@yuzefovich yuzefovich added the do-not-merge bors won't merge a PR with this label. label Aug 6, 2019
This commit removes allPeers peerGroupChecker and refactors
partitionPeerGrouper to support the case of omitted ORDER BY clause.

Release note: None
Previously, whenever we had a result of a window function already
computed for a peer of the row, we would return that result. However,
the concept of window framing takes precedence over the concept of peers
(i.e. if the window frame over which we computed the result for a peer
is different from the window frame of the current row, we need to
recompute the result). Now this is fixed by tracking the boundaries of
the window frame over which peerRes is computed and recomputing the
latter when needed.

There was also a bug in IsDefaultFrame where we forgot to check whether
the mode of framing is RANGE. This caused incorrect results in some
cases when ROWS or GROUPS modes of framing were used.

Release note (bug fix): CockroachDB previously was returning incorrect
results for some aggregate functions when used as window functions with
non-default window frame. Now this is fixed. Note that MIN, MAX, SUM,
AVG as well as "pure" window functions (i.e. non-aggregates) were not
affected.
@yuzefovich yuzefovich changed the title sql: fix a bug with some aggregate functions used as window functions sql/sem: fix interaction between framing and peers in window functions Dec 4, 2019
@yuzefovich yuzefovich removed the do-not-merge bors won't merge a PR with this label. label Dec 4, 2019
@yuzefovich
Copy link
Member Author

@justinj @jordanlewis I pushed an updated fix. This code seems too fragile in general :/

@yuzefovich
Copy link
Member Author

Friendly ping @justinj.

Copy link
Contributor

@justinj justinj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @justinj)

@yuzefovich
Copy link
Member Author

TFTR!

bors r+

craig bot pushed a commit that referenced this pull request Jan 9, 2020
39308: sql/sem: fix interaction between framing and peers in window functions r=yuzefovich a=yuzefovich

**rowexec: minor refactor of partitionPeerGrouper**

This commit removes allPeers peerGroupChecker and refactors
partitionPeerGrouper to support the case of omitted ORDER BY clause.

Release note: None

**sql/sem: fix interaction between framing and peers in window functions**

Previously, whenever we had a result of a window function already
computed for a peer of the row, we would return that result. However,
the concept of window framing takes precedence over the concept of peers
(i.e. if the window frame over which we computed the result for a peer
is different from the window frame of the current row, we need to
recompute the result). Now this is fixed by tracking the boundaries of
the window frame over which peerRes is computed and recomputing the
latter when needed.

There was also a bug in IsDefaultFrame where we forgot to check whether
the mode of framing is RANGE. This caused incorrect results in some
cases when ROWS or GROUPS modes of framing were used.

Fixes: #38901.
Fixes: #42935.

Release note (bug fix): CockroachDB previously was returning incorrect
results for some aggregate functions when used as window functions with
non-default window frame. Now this is fixed. Note that MIN, MAX, SUM,
AVG as well as "pure" window functions (i.e. non-aggregates) were not
affected.

43293: builtins: add extract functionality for intervals r=knz a=otan

Resolves #43209

This adds the postgres-esque implementation of extract for intervals.
Tested a variety of edge cases and compared against what postgres
returns.

Currently leaving the `extract_duration` function alone - we could
thereoetically get rid of it after this PR. It does behave slightly
differently as well.


Release note (sql change): Added new ability to call extract on an
interval, e.g. `extract(day from interval '254 days')`. This follows
the postgres implementation. Furthermore, this deprecates
`extract_duration`, which will be removed at a later date.

43834: colexec: speed up ordered sync r=yuzefovich a=yuzefovich

**colexec: refactor ordered sync to use assign instead of append**

Ordered synchronizer sets the output one row at a time. Previously, we
were calling Append on the output vectors, but it is faster to use Sets
in such scenario.

Release note: None

**colexec: speed up ordered sync**

This commit speeds up the ordered synchronizer with two modifications:
1. instead of updating the allocator after we copy a single whole row,
we will now update the allocator after we copy a full batch of rows
2. to reduce the number of interface conversions we will "strip" output
vectors to get to the underlying slices and will use those directly when
copying the data.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Jan 9, 2020

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants