Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple join seems buggy #1086

Closed
eitsupi opened this issue Oct 27, 2022 · 5 comments
Closed

Multiple join seems buggy #1086

eitsupi opened this issue Oct 27, 2022 · 5 comments
Labels
bug Invalid compiler output or panic compiler

Comments

@eitsupi
Copy link
Member

eitsupi commented Oct 27, 2022

Sorry if this has already been reported.
Queries containing multiple joins do not seem to translate well to SQL?

In the following example, I think it need to use WITH one more time.

from tbl_a
group [id, col_a] (
  sort col_a
  take 1
)
select [
  id_a = id,
  col_b
]
join side:left tbl_b id_a
select [
  id_b = id,
  id_a,
  col_b
]
join side:left tbl_c id_b
select [
  id_a,
  id_b,
  col_b,
  col_c
]
WITH table_0 AS (
  SELECT
    tbl_a.*,
    ROW_NUMBER() OVER (
      PARTITION BY id,
      col_a
      ORDER BY
        col_a
    ) AS _rn_88
  FROM
    tbl_a
),
table_1 AS (
  SELECT
    id AS id_a,
    col_b
  FROM
    table_0
  WHERE
    _rn_88 <= 1
)
SELECT
  id_a,
  id_b,
  table_1.col_b,
  tbl_c.col_c
FROM
  table_1
  LEFT JOIN tbl_b USING(id_a)
  LEFT JOIN tbl_c USING(id_b)
@max-sixty
Copy link
Member

Thanks for the issue @eitsupi

There are a couple of issues on this that we're working through, as part of a broader refactor. #820 is an example.

It's useful to have additional examples. Could you show the query you think it should produce? I wasn't sure what "use WITH one more time" referred to. Thanks!

@aljazerzen
Copy link
Member

I don't think this was reported already.

It seems that the problem is that id_a should be in a SELECT preceeding the SELECT with JOIN. Is this right?

If so, this will probably be solved by #593.

@eitsupi
Copy link
Member Author

eitsupi commented Oct 28, 2022

Thank you for your reply and sorry for the lack of explanation.
I was assuming the following SQL.

WITH table_0 AS (
  SELECT
    tbl_a.*,
    ROW_NUMBER() OVER (
      PARTITION BY id,
      col_a
      ORDER BY
        col_a
    ) AS _rn_88
  FROM
    tbl_a
),
table_1 AS (
  SELECT
    id AS id_a,
    col_b
  FROM
    table_0
  WHERE
    _rn_88 <= 1
),
table_2 AS (
  SELECT
    id AS id_b,
    id_a,
    col_b
  FROM
    table_1
    LEFT JOIN tbl_b USING(id_a)
),
SELECT
  id_a,
  id_b,
  col_b,
  col_c
FROM
  table_2
  LEFT JOIN tbl_c USING(id_b)

I think the generated SQL's last join will fail because the left table does not have id_b.

Sorry the column names are confusing, but the purpose is to join multiple tables by traversing foreign keys.

@aljazerzen
Copy link
Member

Yes, this is a bug. I don't think it a problem with multiple joins, but how USING is implemented. Here is a minimal example:

from tbl_a
select id_b = id
join tbl_b id_b

I'm fairly certain that #593 indeed solves this, let's revisit this issue when it merges in a few weeks.

@aljazerzen aljazerzen added bug Invalid compiler output or panic compiler labels Oct 28, 2022
@aljazerzen
Copy link
Member

This has indeed been solved:

from tbl_a
select id_b = id
join tbl_b [~id_b]
SELECT
  tbl_b.*,
  tbl_a.id AS id_b
FROM
  tbl_a
  JOIN tbl_b ON tbl_a.id = tbl_b.id_b

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Invalid compiler output or panic compiler
Projects
None yet
Development

No branches or pull requests

3 participants