optimize limit push for join case #4411
cc @jackwener
    \n Limit: skip=0, fetch=1000\
    \n TableScan: test2, fetch=1000";

    assert_optimized_plan_eq(&plan, expected).unwrap();
Suggested change:
-   assert_optimized_plan_eq(&plan, expected).unwrap();
+   assert_optimized_plan_eq(&plan, expected)?;
    JoinType::Left => push_down_join(join, Some(limit), None),
    JoinType::Right => push_down_join(join, None, Some(limit)),
remove these
I think those are the less restrictive versions of the push down (only to the left/right side).
> remove these
For
    JoinType::Left | JoinType::Right | JoinType::Full
        if is_no_join_condition(join) =>
the on condition is empty, so the rule can be applied. But if the on condition is not empty, only
    JoinType::Left => push_down_join(join, Some(limit), None),
    JoinType::Right => push_down_join(join, None, Some(limit)),
can be applied.
cc @jackwener
Understood.
        // push left and right
        push_down_join(join, Some(limit), Some(limit))
    }
    JoinType::LeftSemi | JoinType::LeftAnti
Did you forget RightSemi/RightAnti?
👍 good catch
A good improvement 👍 Thanks @liukun4515
Related issue to track problems like this: #4413
        // push left
        push_down_join(join, Some(limit), None)
    }
    JoinType::RightSemi | JoinType::RightAnti
Added the right semi and right anti cases.
cc @jackwener
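To summarize the cases discussed in this review, here is a minimal, self-contained sketch of the decision logic. It does not use DataFusion's types: `JoinKind` and `pushed_limits` are hypothetical names introduced only for illustration, and the function returns which side(s) would receive the limit instead of rewriting a plan. Two points are assumptions rather than something shown in the snippets above: that the semi/anti arms carry the same "no join condition" guard as the outer-join arm, and that every other case (including inner joins) pushes nothing.

```rust
/// Hypothetical stand-in for the join types named in this review.
#[derive(Clone, Copy, Debug, PartialEq)]
enum JoinKind {
    Inner,
    Left,
    Right,
    Full,
    LeftSemi,
    LeftAnti,
    RightSemi,
    RightAnti,
}

/// Returns `(left_limit, right_limit)`: the limit pushed to each join input, if any.
/// `no_join_condition` plays the role of `is_no_join_condition(join)`.
fn pushed_limits(
    kind: JoinKind,
    no_join_condition: bool,
    skip: usize,
    fetch: usize,
) -> (Option<usize>, Option<usize>) {
    // Each input must still supply enough rows for the parent to skip `skip`
    // rows and then return `fetch` rows.
    let limit = fetch + skip;
    match kind {
        // Empty ON clause: push to both sides.
        JoinKind::Left | JoinKind::Right | JoinKind::Full if no_join_condition => {
            (Some(limit), Some(limit))
        }
        // Assumption: semi/anti joins are only handled when the ON clause is empty.
        JoinKind::LeftSemi | JoinKind::LeftAnti if no_join_condition => (Some(limit), None),
        JoinKind::RightSemi | JoinKind::RightAnti if no_join_condition => (None, Some(limit)),
        // Non-empty ON clause: only the preserved side of an outer join.
        JoinKind::Left => (Some(limit), None),
        JoinKind::Right => (None, Some(limit)),
        // Assumption: everything else is left untouched.
        _ => (None, None),
    }
}

fn main() {
    println!("{:?}", pushed_limits(JoinKind::Left, true, 0, 1000));
    println!("{:?}", pushed_limits(JoinKind::Right, false, 0, 1000));
    println!("{:?}", pushed_limits(JoinKind::Inner, false, 0, 1000));
}
```

The order of the match arms matters: the guarded arms come first so that a left or right join with an empty ON clause gets the limit on both sides before falling through to the single-sided arms.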
jackwener left a comment
Good improvement.
LGTM
cc @Dandandan if it looks good to you, I will merge this.
    let limit = fetch + skip;
    let new_join = match join.join_type {
        JoinType::Left | JoinType::Right | JoinType::Full
            if is_no_join_condition(join) =>
I wonder how often there will be no join conditions 🤔 that effectively then becomes a CROSS JOIN
To be clear I think the optimization is still correct, I just wonder how often it will appear in real queries
> I wonder how often there will be no join conditions 🤔 that effectively then becomes a CROSS JOIN
> To be clear I think the optimization is still correct, I just wonder how often it will appear in real queries
Good question. In the current optimizer framework in DataFusion, such a join query will be converted to a cross join by another rule.
The limit push down rule just does its best to optimize the plan.
Some SQL is generated by programs, which add the join type and on condition themselves, so they can generate SQL like select * from left_table right join right_table.
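As a sanity check on the reasoning above, the following sketch models a left join with an empty on condition over in-memory rows and compares the result of applying the limit after the join with the result of first pushing `fetch + skip` to both inputs. Everything here is a toy model written for illustration: `left_join_no_condition`, `apply_limit`, and `join_with_pushed_limit` are hypothetical helpers, not DataFusion functions, and the row order is simply the nested-loop order of this model.

```rust
/// Left outer join with an empty ON clause, modeled on in-memory rows.
/// Every left row pairs with every right row; if the right side is empty,
/// each left row is padded with `None`.
fn left_join_no_condition(left: &[i32], right: &[i32]) -> Vec<(i32, Option<i32>)> {
    let mut out = Vec::new();
    for &l in left {
        if right.is_empty() {
            out.push((l, None));
        } else {
            for &r in right {
                out.push((l, Some(r)));
            }
        }
    }
    out
}

/// Models `Limit: skip=<skip>, fetch=<fetch>` over a slice of rows.
fn apply_limit<T: Clone>(rows: &[T], skip: usize, fetch: usize) -> Vec<T> {
    rows.iter().skip(skip).take(fetch).cloned().collect()
}

/// Joins after truncating both inputs to `pushed` rows, then applies the outer limit.
fn join_with_pushed_limit(
    left: &[i32],
    right: &[i32],
    skip: usize,
    fetch: usize,
    pushed: usize,
) -> Vec<(i32, Option<i32>)> {
    let l = apply_limit(left, 0, pushed);
    let r = apply_limit(right, 0, pushed);
    apply_limit(&left_join_no_condition(&l, &r), skip, fetch)
}

fn main() {
    let left: Vec<i32> = (0..50).collect();
    let right: Vec<i32> = (100..130).collect();
    let (skip, fetch) = (5, 10);

    // Limit applied only after the full join.
    let baseline = apply_limit(&left_join_no_condition(&left, &right), skip, fetch);
    // Limit of fetch + skip pushed to both inputs first.
    let pushed = join_with_pushed_limit(&left, &right, skip, fetch, fetch + skip);
    assert_eq!(baseline, pushed);

    // Pushing only `fetch` can lose rows: with a single right row the join
    // yields 10 rows, and skipping 5 of them leaves fewer than `fetch`.
    let too_small = join_with_pushed_limit(&left, &[100], skip, fetch, fetch);
    assert!(too_small.len() < fetch);
}
```

The second check shows why the pushed value is `fetch + skip` and not `fetch` alone: the inputs have to supply the rows that the outer limit will skip as well as the ones it returns.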
Benchmark runs are scheduled for baseline = 3fe542f and contender = 48f0f3a. 48f0f3a is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Adds more join cases that can be optimized by the limit push down rule.
Closes #.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?