@jonahgao
Member

Which issue does this PR close?

Closes #11621.

Rationale for this change

Given an example query:

SELECT * FROM t1 JOIN t2 ON t1.v1 = t2.v1 WHERE (t1.v1 == t2.v1) OR t1.v1

Removing the join predicate t1.v1 = t2.v1 from the parent filter (t1.v1 == t2.v1) OR t1.v1 can be treated as replacing the join predicate with TRUE. The result should be TRUE OR t1.v1, which simplifies to TRUE, so the filter becomes empty. Currently, however, the result is t1.v1, which leads to incorrect join results.
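The substitution described above can be sketched on a toy expression tree. This is only an illustration of the idea, not DataFusion's implementation: the classes `Col`, `Eq`, `And`, `Or`, `Lit` and the helpers `replace_with_true` and `simplify` are hypothetical names invented for this example, and the sketch treats `=` and `==` as the same equality and ignores NULL semantics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: bool


@dataclass(frozen=True)
class Eq:
    left: object
    right: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


TRUE = Lit(True)


def replace_with_true(expr, target):
    """Replace every occurrence of `target` inside `expr` with TRUE."""
    if expr == target:
        return TRUE
    if isinstance(expr, (And, Or)):
        return type(expr)(
            replace_with_true(expr.left, target),
            replace_with_true(expr.right, target),
        )
    return expr


def simplify(expr):
    """Fold TRUE through AND/OR: TRUE OR x -> TRUE, TRUE AND x -> x."""
    if isinstance(expr, Or):
        left, right = simplify(expr.left), simplify(expr.right)
        if left == TRUE or right == TRUE:
            return TRUE
        return Or(left, right)
    if isinstance(expr, And):
        left, right = simplify(expr.left), simplify(expr.right)
        if left == TRUE:
            return right
        if right == TRUE:
            return left
        return And(left, right)
    return expr


# (t1.v1 = t2.v1) OR t1.v1, with the join predicate t1.v1 = t2.v1 removed
join_pred = Eq(Col("t1.v1"), Col("t2.v1"))
parent_filter = Or(join_pred, Col("t1.v1"))
remaining = simplify(replace_with_true(parent_filter, join_pred))

# The whole filter collapses to TRUE, i.e. nothing is left to filter on.
assert remaining == TRUE
```

Under this model, keeping `t1.v1` as the remaining filter would drop rows that satisfy the join predicate but have a false `t1.v1`, which is the incorrect behavior described above.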

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

No

@github-actions github-actions bot added optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels Jul 28, 2024

# Test issue: https://github.com/apache/datafusion/issues/11621
query BB
SELECT * FROM t1 JOIN t2 ON t1.v1 = t2.v1 WHERE (t1.v1 == t2.v1) OR t1.v1;
Member Author

This query returns an empty result on the current main.

" Inner Join: t3.a = t4.a [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
" Inner Join: t1.a = t3.a [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
" Filter: t2.c < UInt32(15) AND t2.c = UInt32(688) [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
" Filter: t2.c = UInt32(688) [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
Member Author

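Remove t1.a = t2.a from (t1.a = t2.a OR t2.c < 15) AND (t1.a = t2.a AND t2.c = 688), and the result should be t2.c = 688: with the join predicate replaced by TRUE, the filter becomes (TRUE OR t2.c < 15) AND (TRUE AND t2.c = 688), which simplifies to t2.c = 688.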
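One way to sanity-check this reduction is to enumerate rows that already satisfy the join predicate and compare the original filter against the reduced one. The sketch below is a brute-force check over a few hand-picked sample values, not part of the PR's tests; the function names are hypothetical, and NULL values are not modeled.

```python
from itertools import product


def original_filter(t1_a, t2_a, t2_c):
    # (t1.a = t2.a OR t2.c < 15) AND (t1.a = t2.a AND t2.c = 688)
    return (t1_a == t2_a or t2_c < 15) and (t1_a == t2_a and t2_c == 688)


def reduced_filter(t2_c):
    # Expected filter once t1.a = t2.a has been removed
    return t2_c == 688


def count_mismatches():
    """Count joined rows on which the two filters disagree."""
    a_values = range(3)
    c_values = [0, 14, 15, 688, 689]  # values around both boundaries
    mismatches = 0
    for t1_a, t2_a, t2_c in product(a_values, a_values, c_values):
        if t1_a != t2_a:
            continue  # the join itself already rejects this row
        if original_filter(t1_a, t2_a, t2_c) != reduced_filter(t2_c):
            mismatches += 1
    return mismatches


assert count_mismatches() == 0
```

Restricting the comparison to rows where `t1_a == t2_a` mirrors the fact that the filter sits above the join, so the join predicate is known to hold for every row it sees.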

Contributor

@alamb alamb left a comment

This makes sense to me. Thank you @jonahgao for the fix (and @2010YOUY01 for SQLancer that found it 🙏 )

jonahgao and others added 2 commits July 30, 2024 10:04
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@jonahgao jonahgao merged commit 2f5e73c into apache:main Jul 30, 2024
@jonahgao
Member Author

Thanks for the review @alamb

@jonahgao jonahgao deleted the remove_join_expr branch July 30, 2024 03:06

Development

Successfully merging this pull request may close these issues.

Incorrect predicate evaluation result in a query (SQLancer-NoREC)