Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal #751: Shared Window Partition #9839

Merged
merged 9 commits into from Dec 7, 2023

Conversation

hawkfish
Copy link
Contributor

  • Extend the partitioning and window internals to handle multiple partial sorts.
  • Add CSE and and fix BoundWindowExpression::Equals in the process.
  • Add partition fusion.

Richard Wesley added 4 commits November 27, 2023 15:20
Extend the partitioning and window internals to handle multiple partial sorts.
Add CSE and and fix BoundWindowExpression::Equals in the process.
@hawkfish
Copy link
Contributor Author

fixes: duckdblabs/duckdb-internal#751

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Could we add some tests with shared partitions/orderings? I'm not sure if this is tested enough by the base set of tests. I've tried this myself and found what I believe to be at least one incorrect result:

CALL dbgen(sf=0.01);
select l_extendedprice, l_partkey, l_orderkey, sum(l_extendedprice) over(order by l_partkey), sum(l_extendedprice) over (order by l_partkey, l_orderkey) from lineitem order by l_partkey, l_orderkey;

This PR

┌─────────────────┬───────────┬────────────┬───────────────┬────────────────┐
│ l_extendedprice │ l_partkey │ l_orderkey │   sum_part    │ sum_part_order │
│  decimal(15,2)  │   int32   │   int32    │ decimal(38,2) │ decimal(38,2)  │
├─────────────────┼───────────┼────────────┼───────────────┼────────────────┤
│        29733.00 │         1 │       2883 │      29733.00 │       29733.00 │
│         1802.00 │         1 │       5121 │      31535.00 │       31535.00 │
│         4505.00 │         1 │       6179 │      36040.00 │       36040.00 │
│            ·    │         · │         ·  │          ·    │           ·    │
│            ·    │         · │         ·  │          ·    │           ·    │
│            ·    │         · │         ·  │          ·    │           ·    │
│        20746.00 │      2000 │      58916 │ 2152182544.47 │  2152182544.47 │
│         7216.00 │      2000 │      59269 │ 2152189760.47 │  2152189760.47 │
├─────────────────┴───────────┴────────────┴───────────────┴────────────────┤
│ 60175 rows (5 shown)                                            5 columns │
└───────────────────────────────────────────────────────────────────────────┘

0.9.2

┌─────────────────┬───────────┬────────────┬───────────────┬────────────────┐
│ l_extendedprice │ l_partkey │ l_orderkey │   sum_part    │ sum_part_order │
│  decimal(15,2)  │   int32   │   int32    │ decimal(38,2) │ decimal(38,2)  │
├─────────────────┼───────────┼────────────┼───────────────┼────────────────┤
│        29733.00 │         1 │       2883 │     607274.00 │       29733.00 │
│         1802.00 │         1 │       5121 │     607274.00 │       31535.00 │
│         4505.00 │         1 │       6179 │     607274.00 │       36040.00 │
│            ·    │         · │         ·  │         ·     │           ·    │
│            ·    │         · │         ·  │         ·     │           ·    │
│            ·    │         · │         ·  │         ·     │           ·    │
│        20746.00 │      2000 │      58916 │ 2152189760.47 │  2152182544.47 │
│         7216.00 │      2000 │      59269 │ 2152189760.47 │  2152189760.47 │
├─────────────────┴───────────┴────────────┴───────────────┴────────────────┤
│ 60175 rows (5 shown)                                            5 columns │
└───────────────────────────────────────────────────────────────────────────┘

@hawkfish
Copy link
Contributor Author

hawkfish commented Dec 1, 2023

Even worse, this still gives the right answer:

select 
	l_extendedprice, 
	l_partkey, 
	l_orderkey, 
	sum(l_extendedprice) over (order by l_partkey), 
from lineitem 
order by l_partkey, l_orderkey

Fix incorrect fixed width in SBIterator::Compare.
Add test to verify.
@github-actions github-actions bot marked this pull request as draft December 1, 2023 20:35
@hawkfish
Copy link
Contributor Author

hawkfish commented Dec 1, 2023

Sorry about that! Dumb problem in an unexpected place. Restricting to l_partkey = 1 gives a simple repro.

@hawkfish hawkfish marked this pull request as ready for review December 1, 2023 23:13
Add test for window function CSE
@github-actions github-actions bot marked this pull request as draft December 2, 2023 17:39
@hawkfish hawkfish marked this pull request as ready for review December 4, 2023 16:08
@github-actions github-actions bot marked this pull request as draft December 4, 2023 23:17
@hawkfish hawkfish marked this pull request as ready for review December 4, 2023 23:23
@Mytherin Mytherin merged commit c14535a into duckdb:main Dec 7, 2023
44 of 45 checks passed
@Mytherin
Copy link
Collaborator

Mytherin commented Dec 7, 2023

Thanks! LGTM

@hawkfish hawkfish deleted the window-suborder branch December 7, 2023 18:14
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Dec 14, 2023
Merge pull request duckdb/duckdb#9839 from hawkfish/window-suborder
Merge pull request duckdb/duckdb#9906 from taniabogatsch/disable-32-bit-test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants