-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Remove the PhysicalSortExpr restriction on union get meet #6273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ns of the columns
Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
|
Nice optimization |
|
Can you extend the optimization to non-column cases? For example select a + b from table_2 order by a + b
union all
select x + y from table_2 order by x + y |
My last commit can succeed in the cases of your example. However, for the cases like a+b+c (having two or more depth), the build order of the binary expression is important. We can get false negative results. Actually, I think this is a general problem with binary expressions in Datafusion. |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there and SQL / plan level test that could be written that would cover this code? It seems like a neat optimization but I don't fully understand how to map it to how it would affect queries
c1 and c1a columns are provided as ascending, and this PR optimizes the plan such: However, the main version's result is: |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good improvement to me. Thank you @berkaysynnada
@ozankabak and/ or @mustafasrepo do you have time to review this PR (as I think you authored the code originally and most recently modified it): https://github.com/apache/arrow-datafusion/blame/894beb1d3b2b42ecc58c582f98e17800ac2b9527/datafusion/core/src/physical_plan/common.rs#L333
ozankabak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went over the code and checked in a commit simplifying it a little bit.
It looks good to me at this point, feel free to merge when you like.
|
Thanks @berkaysynnada and @ozankabak |
|
For some reason github claims this PR has conflicts that must be resolved 🤔 since I can't push changes to your branch to resolve this myself, @ozankabak or @berkaysynnada can you please resolve the problem? I will then merge it
|
|
Done, thank you |
mingmwang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM

Which issue does this PR close?
Closes #6272 .
Rationale for this change
Union get meet can only consider sort option and corresponding vector index during comparison. No need to check for PhysicalExpr equality.
What changes are included in this PR?
If the PhysicalSortExpr's can be downcasted to Column, compare the Column's indexes and sort options. Otherwise, the existing comparison is sustained.
Are these changes tested?
Existing tests are sufficient.
Are there any user-facing changes?