Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lineage inaccurate when CTE used in subquery #476

Closed
robertmatthewcook opened this issue Oct 23, 2023 · 1 comment · Fixed by #493
Closed

lineage inaccurate when CTE used in subquery #476

robertmatthewcook opened this issue Oct 23, 2023 · 1 comment · Fixed by #493
Labels
bug Something isn't working

Comments

@robertmatthewcook
Copy link

>>> from sqllineage.runner import LineageRunner

>>> sql_1="""

... with unique_nicknames as (select nickname from nicknames group by nickname having count(distinct name) = 1)

... select name, min(replacements) as smallest_step, max(replacements) as largest_step from (

...       select names.fname, [nicknames.name](http://nicknames.name/), count(*) as replacements from unique_nicknames join nicknames on nicknames.nickname = unique_nicknames.nickname join names on names.fname = nicknames.nickname group by names.fname, [nicknames.name](http://nicknames.name/)

... ) as replaceable group by name

... """

>>> sql_2="""

... select name, min(replacements) as smallest_step, max(replacements) as largest_step from (

...       with unique_nicknames as (select nickname from nicknames group by nickname having count(distinct name) = 1)

...       select names.fname, [nicknames.name](http://nicknames.name/), count(*) as replacements from unique_nicknames join nicknames on nicknames.nickname = unique_nicknames.nickname join names on names.fname = nicknames.nickname group by names.fname, [nicknames.name](http://nicknames.name/)

... ) as replaceable group by name

... """

>>> LineageRunner(sql=sql_1,dialect="redshift").source_tables

[Table: <default>.names, Table: <default>.nicknames]

>>> LineageRunner(sql=sql_2,dialect="redshift").source_tables

[Table: <default>.nicknames]

>>>
@robertmatthewcook robertmatthewcook added the bug Something isn't working label Oct 23, 2023
@reata
Copy link
Owner

reata commented Oct 24, 2023

Is this the SQL you're using?

with unique_nicknames as (select nickname from nicknames group by nickname having count(distinct name) = 1)
select name, min(replacements) as smallest_step, max(replacements) as largest_step from (
select names.fname, nicknames.name, count(*) as replacements from unique_nicknames join nicknames on nicknames.nickname = unique_nicknames.nickname join names on names.fname = nicknames.nickname group by names.fname, nicknames.name
) as replaceable group by name;



select name, min(replacements) as smallest_step, max(replacements) as largest_step from (
with unique_nicknames as (select nickname from nicknames group by nickname having count(distinct name) = 1)
select names.fname, nicknames.name, count(*) as replacements from unique_nicknames join nicknames on nicknames.nickname = unique_nicknames.nickname join names on names.fname = nicknames.nickname group by names.fname, nicknames.name
) as replaceable group by name;

Because I see markdown syntax like [nicknames.name](http://nicknames.name/) in your sql text and it won't parse.

If the SQL is as I pasted above, I can confirm this is a bug and ansi also suffers from the same issue. Looks like we are not handling CTE within subquery.

@reata reata changed the title lineage inaccurate when cte part of subquery for redshift dialect lineage inaccurate when CTE used in subquery Dec 9, 2023
@reata reata closed this as completed in #493 Dec 9, 2023
@reata reata changed the title lineage inaccurate when CTE used in subquery lineage inaccurate when CTE used in CTE Dec 10, 2023
@reata reata changed the title lineage inaccurate when CTE used in CTE lineage inaccurate when CTE used in subquery Dec 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants