Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect Source Tables recognition when using SqlFluffLineageAnalyzer #398

Closed
Aka-shi opened this issue Jun 22, 2023 · 2 comments
Closed
Labels
bug Something isn't working

Comments

@Aka-shi
Copy link

Aka-shi commented Jun 22, 2023

Sample query

sql = """
Insert into target_table

WITH pl AS (
  Select id, school
From table1)

,fb AS (
Select id, school
FROM table2)

SELECT
table3.id, table3.name, table1.school
FROM table3
LEFT JOIN table1 ON table3.id = table1.id

UNION

SELECT
table4.id, table4.name, table2.school
FROM table4
LEFT JOIN table2 ON table4.id = table2.id
"""

When I run result = LineageRunner(sql) I get the following result which is the correct one.

Result is  Statements(#): 1
Source Tables:
    <default>.table1
    <default>.table2
    <default>.table3
    <default>.table4
Target Tables:
    <default>.target_table

But when I provide a dialect result = LineageRunner(sql, 'ansi'), I get the following result

Result is  Statements(#): 1
Source Tables:
    <default>.table1
    <default>.table2
Target Tables:
    <default>.target_table

It doesn't seem to parse the query after CTEs. This looks like a bug. If anyone can guide me on how I can fix this or take this up it would be really helpful since sqlparse is being deprecated in future versions.

@Aka-shi
Copy link
Author

Aka-shi commented Jun 22, 2023

It gives correct result when I modify the query to this

sql = """
Insert into target_table

WITH pl AS (
  Select id, school
From table1)

,fb AS (
Select id, school
FROM table2)

SELECT
table3.id, table3.name, table1.school
FROM table3
LEFT JOIN table1 ON table3.id = table1.id

@Aka-shi Aka-shi changed the title Union is not recognized when using SqlFluffLineageAnalyzer Incorrect Source Tables recognition when using SqlFluffLineageAnalyzer Jun 22, 2023
@reata
Copy link
Owner

reata commented Jun 24, 2023

Yes, it's a bug. It's because we didn't handle UNION in DmlCteExtractor

UNION is of type set_expression in sqlfluff AST, and we should hanlde set_expression the same way as select_statement. This should be a simple fix, looking forward to your PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants