Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move conditions from JOIN ON to WHERE #18720

Closed
alexey-milovidov opened this issue Jan 4, 2021 · 9 comments
Closed

Move conditions from JOIN ON to WHERE #18720

alexey-milovidov opened this issue Jan 4, 2021 · 9 comments
Assignees

Comments

@alexey-milovidov
Copy link
Member

alexey-milovidov commented Jan 4, 2021

Use case
Conditions that are not related to JOIN:
t1 JOIN t2 ON t1.k = t2.k AND t1.hello = 'world'
should be transformed to:
t1 JOIN t2 ON t1.k = t2.k WHERE t1.hello = 'world'
otherwise it does not work.

@hexiaoting
Copy link
Contributor

hexiaoting commented Jan 7, 2021

assign to me

@alexey-milovidov
Copy link
Member Author

@hexiaoting Ok. Thank you in advance!

@hexiaoting
Copy link
Contributor

hexiaoting commented Jan 18, 2021

@alexey-milovidov I have two questions:
Q1:
t1 JOIN t2 ON t1.k = t2.k AND t1.size > 10 will this query throw exception or translated to: t1 JOIN t2 ON t1.k = t2.k where t1.size > 10

Q2: t1 JOIN t2 ON t1.size == 10 and t2.size == 8 what kind of new sql should this query translated to ?

@hexiaoting
Copy link
Contributor

@alexey-milovidov I need some help, related to above questions. Thank you.

@vdimir
Copy link
Member

vdimir commented Feb 3, 2021

@hexiaoting sorry for long delay.

t1 JOIN t2 ON t1.k = t2.k AND t1.size > 10

It's very close to query in first comment of this issue, so correct behavior is t1 JOIN t2 ON t1.k = t2.k where t1.size > 10

Q2: t1 JOIN t2 ON t1.size == 10 and t2.size == 8 what kind of new sql should this query translated to ?

It should not be allowed as it is now.

@hexiaoting
Copy link
Contributor

hexiaoting commented Feb 3, 2021

@vdimir yeah, I have implemented this optimization and now the behavior is same with your opinion.

@vdimir
Copy link
Member

vdimir commented Mar 11, 2021

Closed via #19685

@vdimir vdimir closed this as completed Mar 11, 2021
@PHaroZ
Copy link

PHaroZ commented Mar 12, 2021

I think this could lead to error with ANY JOIN. A query like

SELECT *
FROM
	(SELECT 1 AS id) AS l
	INNER ANY JOIN (SELECT 1 AS id, number as v FROM system.numbers LIMIT 2) AS r
	          ON l.id = r.id AND r.v = 1

is not the same as

SELECT *
FROM
	(SELECT 1 AS id) AS l
	INNER ANY JOIN (SELECT 1 AS id, number as v FROM system.numbers LIMIT 2) AS r
	          ON l.id = r.id
WHERE
	r.v = 1

The first one should return a line while the second returns anything because it joins on the first machting line with v=0 and then apply v=1.

IMHO the where clause should be added to a subselect like

SELECT *
FROM
	(SELECT 1 AS id) AS l
	INNER ANY JOIN (SELECT * FROM (SELECT 1 AS id, number AS v FROM system.numbers LIMIT 2) WHERE v = 1) AS r
	          ON l.id = r.id;

@vdimir
Copy link
Member

vdimir commented Mar 12, 2021

@PHaroZ makes sense, thank you for your notice. We are going to fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants