fix: Pick correct columns in Sort Merge Equijoin #18772
let mut left_columns = null_joined_batch
    .columns()
    .iter()
    .take(right_columns_length)
I think this should also be changed to:
    .take(left_columns_length)
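To illustrate why the count passed to `.take(...)` matters, here is a minimal sketch, not the DataFusion code itself. It models the columns of a joined batch as plain strings instead of Arrow arrays, and `split_joined_columns` is a hypothetical helper invented for this example. When both sides have the same number of columns the two counts are interchangeable, which is presumably why the mistake went unnoticed; with 2 left columns and 3 right columns the wrong count pulls a right-side column into the left half.

```rust
/// Hypothetical helper: splits the columns of a joined batch into
/// (left, right) halves. Columns are modelled as strings here; a real
/// batch would hold Arrow arrays.
fn split_joined_columns<'a>(
    joined: &[&'a str],
    left_columns_length: usize,
) -> (Vec<&'a str>, Vec<&'a str>) {
    let left = joined.iter().take(left_columns_length).copied().collect();
    let right = joined.iter().skip(left_columns_length).copied().collect();
    (left, right)
}

fn main() {
    // t1 contributes 2 columns and t2 contributes 3, so the joined batch has 5.
    let joined = ["t1.a", "t1.b", "t2.a", "t2.b", "t2.c"];
    let left_columns_length = 2;
    let right_columns_length = 3;

    // Correct: split at the width of the left side.
    let (left, right) = split_joined_columns(&joined, left_columns_length);
    assert_eq!(left, ["t1.a", "t1.b"]);
    assert_eq!(right, ["t2.a", "t2.b", "t2.c"]);

    // Buggy: splitting at the width of the right side leaks `t2.a`
    // into the left half.
    let (wrong_left, _) = split_joined_columns(&joined, right_columns_length);
    assert_eq!(wrong_left, ["t1.a", "t1.b", "t2.a"]);
}
```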
I was able to reproduce the bug by changing sort_merge_join.slt so that one table has 3 columns while t1 has 2.
Can you please update the description and update/add tests? I would update sort_merge_join.slt as in the diff below so that all tests cover this case. Make sure to add a comment explaining why one table has 3 columns and t1 has 2.
Updated the slt file to reproduce the error:
diff --git a/datafusion/sqllogictest/test_files/sort_merge_join.slt b/datafusion/sqllogictest/test_files/sort_merge_join.slt
--- a/datafusion/sqllogictest/test_files/sort_merge_join.slt (revision f3980641660997345af6061dc3b34f365020bd07)
+++ b/datafusion/sqllogictest/test_files/sort_merge_join.slt (date 1763411346050)
@@ -26,7 +26,7 @@
CREATE TABLE t1(a text, b int) AS VALUES ('Alice', 50), ('Alice', 100), ('Bob', 1);
statement ok
-CREATE TABLE t2(a text, b int) AS VALUES ('Alice', 2), ('Alice', 1);
+CREATE TABLE t2(a text, b int, c int) AS VALUES ('Alice', 2, 77), ('Alice', 1, 66);
# inner join query plan with join filter
query TT
@@ -64,83 +64,83 @@
----
# left join without join filter
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a
----
-Alice 100 Alice 1
-Alice 100 Alice 2
-Alice 50 Alice 1
-Alice 50 Alice 2
-Bob 1 NULL NULL
+Alice 100 Alice 1 66
+Alice 100 Alice 2 77
+Alice 50 Alice 1 66
+Alice 50 Alice 2 77
+Bob 1 NULL NULL NULL
# left join with join filter
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a AND t2.b * 50 <= t1.b
----
-Alice 100 Alice 1
-Alice 100 Alice 2
-Alice 50 Alice 1
-Bob 1 NULL NULL
+Alice 100 Alice 1 66
+Alice 100 Alice 2 77
+Alice 50 Alice 1 66
+Bob 1 NULL NULL NULL
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a AND t2.b < t1.b
----
-Alice 100 Alice 1
-Alice 100 Alice 2
-Alice 50 Alice 1
-Alice 50 Alice 2
-Bob 1 NULL NULL
+Alice 100 Alice 1 66
+Alice 100 Alice 2 77
+Alice 50 Alice 1 66
+Alice 50 Alice 2 77
+Bob 1 NULL NULL NULL
# right join without join filter
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 RIGHT JOIN t2 ON t1.a = t2.a
----
-Alice 100 Alice 1
-Alice 100 Alice 2
-Alice 50 Alice 1
-Alice 50 Alice 2
+Alice 100 Alice 1 66
+Alice 100 Alice 2 77
+Alice 50 Alice 1 66
+Alice 50 Alice 2 77
# right join with join filter
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 RIGHT JOIN t2 ON t1.a = t2.a AND t2.b * 50 <= t1.b
----
-Alice 100 Alice 1
-Alice 100 Alice 2
-Alice 50 Alice 1
+Alice 100 Alice 1 66
+Alice 100 Alice 2 77
+Alice 50 Alice 1 66
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 RIGHT JOIN t2 ON t1.a = t2.a AND t1.b > t2.b
----
-Alice 100 Alice 1
-Alice 100 Alice 2
-Alice 50 Alice 1
-Alice 50 Alice 2
+Alice 100 Alice 1 66
+Alice 100 Alice 2 77
+Alice 50 Alice 1 66
+Alice 50 Alice 2 77
# full join without join filter
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 FULL JOIN t2 ON t1.a = t2.a
----
-Alice 100 Alice 1
-Alice 100 Alice 2
-Alice 50 Alice 1
-Alice 50 Alice 2
-Bob 1 NULL NULL
+Alice 100 Alice 1 66
+Alice 100 Alice 2 77
+Alice 50 Alice 1 66
+Alice 50 Alice 2 77
+Bob 1 NULL NULL NULL
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 FULL JOIN t2 ON t1.a = t2.a AND t2.b * 50 > t1.b
----
-Alice 100 NULL NULL
-Alice 50 Alice 2
-Bob 1 NULL NULL
-NULL NULL Alice 1
+Alice 100 NULL NULL NULL
+Alice 50 Alice 2 77
+Bob 1 NULL NULL NULL
+NULL NULL Alice 1 66
-query TITI rowsort
+query TITII rowsort
SELECT * FROM t1 FULL JOIN t2 ON t1.a = t2.a AND t1.b > t2.b + 50
----
-Alice 100 Alice 1
-Alice 100 Alice 2
-Alice 50 NULL NULL
-Bob 1 NULL NULL
+Alice 100 Alice 1 66
+Alice 100 Alice 2 77
+Alice 50 NULL NULL NULL
+Bob 1 NULL NULL NULL
statement ok
DROP TABLE t1;
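As a cross-check on the expected rows in the diff above, the following sketch computes `SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a` with a naive nested loop over the same test data. It only models the reference semantics of a left join (unmatched left rows are padded with NULLs, here `None`); it is not a sort-merge join and not DataFusion code, and `left_join`, `LeftRow` and `RightRow` are names made up for the example.

```rust
type LeftRow = (&'static str, i32); // t1(a, b)
type RightRow = (&'static str, i32, i32); // t2(a, b, c)

/// Naive nested-loop LEFT JOIN on the first column.
/// `None` stands for a right side that is entirely NULL.
fn left_join(left: &[LeftRow], right: &[RightRow]) -> Vec<(LeftRow, Option<RightRow>)> {
    let mut out = Vec::new();
    for &l in left {
        let mut matched = false;
        for &r in right {
            if l.0 == r.0 {
                out.push((l, Some(r)));
                matched = true;
            }
        }
        if !matched {
            // No partner on the right: keep the left row, NULL-pad the rest.
            out.push((l, None));
        }
    }
    out
}

fn main() {
    let t1 = [("Alice", 50), ("Alice", 100), ("Bob", 1)];
    let t2 = [("Alice", 2, 77), ("Alice", 1, 66)];

    let rows = left_join(&t1, &t2);

    // Four Alice/Alice matches plus one NULL-padded Bob row.
    assert_eq!(rows.len(), 5);
    assert_eq!(rows[0], (("Alice", 50), Some(("Alice", 2, 77))));
    assert_eq!(rows[4], (("Bob", 1), None));
}
```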
Thanks @rluvaton, sure.
Thank you @tglanz, LGTM.
Hope to see you contributing again soon.
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Take correct columns
Are these changes tested?
Yes.
Fuzz tests are taken from @rluvaton's #18788, excluding those this PR doesn't fix:
Are there any user-facing changes?