Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why ClickHouse didn't allow more than one JOIN in query? #873

Closed
happyharry123 opened this issue Jun 14, 2017 · 14 comments
Closed

Why ClickHouse didn't allow more than one JOIN in query? #873

happyharry123 opened this issue Jun 14, 2017 · 14 comments
Assignees

Comments

@happyharry123
Copy link

@happyharry123 happyharry123 commented Jun 14, 2017

Dear All, I'm just moving MySQL to ClickHouse. However, when I wrote my query with more than one JOIN. ClickHouse is unable to execute. The following is my query.

select hours.hour, IFNULL(data.countError, 0) as countError, IFNULL(data2.countFatal, 0) as countFatal from (
                        SELECT 0 hour UNION SELECT 1 hour UNION SELECT 2 hour UNION SELECT 3 hour UNION SELECT 4 hour UNION SELECT 5 hour UNION 
                        SELECT 6 hour UNION SELECT 7 hour UNION SELECT 8 hour UNION SELECT 9 hour UNION SELECT 10 hour UNION SELECT 11 hour UNION 
                        SELECT 12 hour UNION SELECT 13 hour UNION SELECT 14 hour UNION SELECT 15 hour UNION SELECT 16 hour UNION SELECT 17 hour UNION 
                        SELECT 18 hour UNION SELECT 19 hour UNION SELECT 20 hour UNION SELECT 21 hour UNION SELECT 22 hour UNION SELECT 23 hour 
                        ) as hours left join 
                        ( 
                        select toHour(log_datetime) as hour, count(*) as countError from DB where priority='ERROR' and toDateTime(log_datetime) = '2017-06-05' group by toHour(log_datetime) 
                        ) as data on hours.hour = data.hour
			left join
                        (select toHourour(log_datetime) as hour, count(*) as countFatal from DB where priority='FATAL' and toDateTime(log_datetime) =  '2017-06-05' group by toHour(log_datetime)
                        ) as data2 on hours.hour = data2.hour

Suppose, the result will be

Hour|countError|countFatal
0	0	0
1	0	0
2	1	0
3	0	3
4	0	2
5	0	0
6	0	9
7	0	7
8	2	0
9	0	5
10	2	0
11	1	0
12	2	4
13	4	1
14	4	0
15	2	2
16	6	0
17	1	0
18	3	0
19	0	0
20	2	0
21	0	0
22	12	0
23	0	0

I spend a long time to look at the reference in https://clickhouse.yandex/reference_en.html
Since I'm a layman in database/ClickHouse. I found very hard to convert all MySQL query into ClickHouse's one. Every time I write a query, I have to check the reference and confirm it is right.
Thank you for all your attention. Cheers

@ztlpn
Copy link
Contributor

@ztlpn ztlpn commented Jun 16, 2017

Yes, ClickHouse SQL dialect is pretty non-standard (though we are working on making it more standards-compliant). Specifically, more than one JOIN in a query is currently not allowed. You can simulate multi-way JOIN with pairwise JOINs and subqueries. Here is an example: #532 (comment)

@dveselov
Copy link

@dveselov dveselov commented Jul 4, 2018

Any progress on this?

@filimonov
Copy link
Collaborator

@filimonov filimonov commented Jul 4, 2018

https://clickhouse.yandex/docs/en/roadmap/
In roadmap on Q4 of 2018 (but it's just a roadmap, not a hard schedule).

@sartor
Copy link

@sartor sartor commented Sep 12, 2018

Is there any progress for standard join syntax?

@blinkov
Copy link
Member

@blinkov blinkov commented Sep 12, 2018

@sartor There is some progress like basic ON support, removal of most limitations for right side, better asterisk behaviour, ANY/ALL can be configured to be optional.

Multiple JOINs per SELECT are still not implemented yet, but they are next in queue of SQL compatibility tasks.

@sartor
Copy link

@sartor sartor commented Sep 12, 2018

@simPod
Copy link
Contributor

@simPod simPod commented Sep 12, 2018

Do you know it is possible to multi join already, right? Queries are just a bit ugly but it works.

@zhang2014
Copy link
Member

@zhang2014 zhang2014 commented Sep 13, 2018

By the way, does this task introduce a cost model ? or how do you determine the access path for the base table ?

@blinkov
Copy link
Member

@blinkov blinkov commented Sep 13, 2018

@zhang2014 syntax and execution strategies are separate stories. You can write multi-way join even right now, but it requires explicit additional subqueries with two-way joins of inner subquery and Nth table. Execution improvements are also planned, but in previous comment I meant only syntax.

@xiaods
Copy link

@xiaods xiaods commented Sep 14, 2018

need this feature , please asap. +1

@ed-fruty
Copy link

@ed-fruty ed-fruty commented Jan 18, 2019

2k19. Up ;)

@blinkov
Copy link
Member

@blinkov blinkov commented Jan 18, 2019

This has been partially implemented in #3946, but with some column resolution issues, so it's not announced yet. Stay tuned.

@4ertus2
Copy link
Member

@4ertus2 4ertus2 commented Feb 20, 2019

Some progress #4462 #4474

@4ertus2
Copy link
Member

@4ertus2 4ertus2 commented Mar 6, 2019

Enabled in master with some restrictions:

  • you need to list all selected columns. Asterisks (* / t.*) do not work
  • prewhere does not work
  • complex aliases in JOIN ON section do not work
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet