-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Fix limit + offset pushdown #2638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Yes. The current design of plan builder tries to mimics the semantics of SQL language; this PR tries to mimics the tree structure of logical operator. We can add another api for the builder to combines |
Before i impl the rule i test in pg Use offset before limit should also return 10 rows. |
In pg, the order of "offset" and "limit" doen't matter. It always means "OFFSET rows are skipped before starting to count the LIMIT rows that are returned". In the logical plan tree, the order of "offset" and “limit” does matter. This PR makes several "assumptions":
|
|
For api of plan builder, I like the way Calcite is doing https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/RelBuilder.html#limit(int,int) |
|
I plan to review this later today or tomorrow if no one else has a chance to do so before |
@alamb I am reviewing this one, I will finish this evening. |
Ted-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Thanks @ming535 👍
i run this in pg
explain VERBOSE select id from users limit 10 offset 1;
QUERY PLAN
------------------------------------------------------------------------
Limit (cost=0.02..0.26 rows=10 width=8)
Output: id
-> Seq Scan on public.users (cost=0.00..239.02 rows=10002 width=8)
Output: id
(4 rows)
For api of plan builder, I like the way Calcite is doing https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/RelBuilder.html#limit(int,int) which combines "limit" and "offset".
Like @ming535 said, I think combines "limit" and "offset" them is a good way, avoid create a new plan struct. @alamb need your opinion on this
| /// it is updated to (None, min(n1, n2))). | ||
| /// 2. Ancestor_Limit(n1) -> .. -> Current_Offset(m1) | ||
| /// it is updated to (m1, n1 + m1). | ||
| /// 3. Ancestor_Offset(m1) -> .. -> Current_Offset(m2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think without subquery, it should not allowed two offset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think without subquery, it should not allowed two
offset.
I think the sql parser should forbids this (two offset without subquery). As of the plan tree building with builder api, I think it is fine that it is not aware of this as long as the semantics looks right.
14b7cf1 to
59d9b1f
Compare
I wonder if we should consider renaming the |
|
I also like the idea of combining limit and offset in one DataFrame / builder method |
andygrove
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM based on the current tests. I left some opinions on possible changes to the builder API.
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through the plan and code changes carefully. I think it looks very nice 👌 @ming535 Thank you very much.
| ancestor: Ancestor, | ||
| ancestor_offset: Option<usize>, | ||
| ancestor_limit: Option<usize>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you wanted to be fancy / more idomatic rust and have the compiler check more of the invariants you could consider encoding the ancestor, offset and limit together.
Something like
enum Ancestor {
/// Limit
FromLimit(usize),
/// Offset
FromOffset(usize),
/// Other nodes that don't affect the adjustment of "Limit"
NotRelevant,
}
|
|
||
| let expected = "Offset: 10\ | ||
| \n Limit: 1010\ | ||
| \n Limit: 1000\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this plan change is due to the change in semantics of LogicalPlanBuiler::offset(), right?
| let expected = "Limit: 1000\ | ||
| \n Offset: 10\ | ||
| \n Projection: #test.a\ | ||
| \n TableScan: test projection=None, limit=1010"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
I merged this PR as it has three reviewers and approvers -- I think any additional work can be done as follow on PRs. Thanks again @ming535 @Ted-Jiang and @andygrove 🏅 |
Which issue does this PR close?
Closes #2624
Rationale for this change
See below.
What changes are included in this PR?
When SqlToRel build the plan,
Limitshould be the parent ofOffset, according to https://www.postgresql.org/docs/current/queries-limit.html(If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting to count the LIMIT rows that are returned.)
fix limit_push_down to consider every combination of
OffsetandLimitwhen traversing down the plan tree.Are there any user-facing changes?
The semantic of LogicalPlan#offset, and LogicalPlan#limit is different.
Previously,
builder.limit(100).offset(10)means ignore the first 10 rows and then starting to count 100 rows. Now itmeans returns 100 rows and then skip 10 rows.
The current semantic is consistent with how data is flowing in the plan tree.
Does this PR break compatibility with Ballista?