[BugFix] Fix SQL aggregate window with ORDER BY defaulting to whole partition #5526
Aggregate window functions with ORDER BY but no explicit frame defaulted to the whole partition, so COUNT(DISTINCT x) OVER(ORDER BY y) returned the total for every row instead of a running value. Default SQL aggregates to RANGE UNBOUNDED PRECEDING .. CURRENT ROW, matching the SQL standard and the non-Calcite engine's peer semantics (ranking functions ignore frames). PPL windows carry no ORDER BY, so they are unaffected.

Signed-off-by: Chen Dai <daichen@amazon.com>
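To make the frame difference concrete, here is a minimal sketch in plain Python that emulates `COUNT(DISTINCT x) OVER (ORDER BY y)` under three frames: the whole partition (the old behaviour), `RANGE UNBOUNDED PRECEDING .. CURRENT ROW` (the default named in the commit message above, which includes peer rows that share the same `y`), and `ROWS UNBOUNDED PRECEDING .. CURRENT ROW` for comparison. The sample rows and the helper name `count_distinct_over` are hypothetical and are not taken from the PR. The sketch does not touch the engine's code and ignores NULL handling.

```python
# Illustrative emulation only; not the engine's implementation.
# `count_distinct_over` and the sample rows are hypothetical.

def count_distinct_over(rows, frame):
    """Emulate COUNT(DISTINCT x) OVER (ORDER BY y) for a single partition.

    rows  -- list of (x, y) tuples
    frame -- "partition": every row sees the whole partition
             "range":     RANGE UNBOUNDED PRECEDING .. CURRENT ROW (peers included)
             "rows":      ROWS UNBOUNDED PRECEDING .. CURRENT ROW
    """
    ordered = sorted(rows, key=lambda r: r[1])  # ORDER BY y (stable sort)
    result = []
    for i, (_, y) in enumerate(ordered):
        if frame == "partition":
            in_frame = ordered
        elif frame == "range":
            # All rows whose sort key is <= the current key, so ties are included.
            in_frame = [r for r in ordered if r[1] <= y]
        elif frame == "rows":
            # Rows up to and including the current physical row.
            in_frame = ordered[: i + 1]
        else:
            raise ValueError(f"unknown frame: {frame}")
        result.append(len({x for x, _ in in_frame}))
    return result


rows = [("a", 1), ("b", 2), ("c", 2), ("a", 3)]

whole_partition = count_distinct_over(rows, "partition")  # [3, 3, 3, 3]
running_range = count_distinct_over(rows, "range")        # [1, 3, 3, 3]
running_rows = count_distinct_over(rows, "rows")          # [1, 2, 3, 3]
```

The two rows with `y = 2` are peers: under the `RANGE` frame both see the same count, while under the `ROWS` frame the count grows row by row.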
Description
In the unified SQL path, an aggregate window function with an `ORDER BY` but no explicit frame (e.g. `COUNT(DISTINCT x) OVER(ORDER BY y)`) defaulted to the whole partition, returning the partition-wide total on every row. This PR defaults such aggregate windows to `ROWS UNBOUNDED PRECEDING .. CURRENT ROW` so each row reflects a running aggregate. PPL window functions carry no `ORDER BY` and are unaffected.
Related Issues
Part of #5248
Check List
`--signoff` or `-s`.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.