doesn't support analyzing statement type [set_statement] for SQL: set hive.execution.engine=mr; #504

Open
Light-Towers opened this issue Dec 13, 2023 · 6 comments
Labels
bug: Something isn't working
parser: something that requires a strict/validating SQL parser

Comments

@Light-Towers

Describe the bug

  • error: sqllineage.exceptions.UnsupportedStatementException: SQLLineage doesn't support analyzing statement type [set_statement] for SQL:set hive.auto.convert.join=false;

To Reproduce

# -*- coding: utf-8 -*-

from sqllineage.runner import LineageRunner

def test_create_as():

    sql = """
set hive.auto.convert.join=false;

drop table if exists dw.temp_A;
CREATE TABLE dw.temp_A(
  aaa string COMMENT 'aaa值')
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS ORC
 LOCATION 'hdfs://nameservice1/user/hive/warehouse/dw/temp/temp_A';
    """

    # result = LineageRunner(sql)
    result = LineageRunner(sql, 'hive')

    result.print_column_lineage()
    print(result.source_tables)
    print(result.target_tables)


if __name__ == "__main__":
    test_create_as()

Expected behavior
Our Hive SQL files use set statements to switch the execution engine (Tez or MR) before running a task. SQLLineage should ignore set statements during parsing.
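
Until the parser handles these statements, the requested behavior can be approximated by dropping set statements before the SQL reaches LineageRunner. The sketch below is a minimal illustration under stated assumptions: `strip_set_statements` is a hypothetical helper, not part of SQLLineage, and it splits statements naively on `;`, so it will mis-split SQL that has semicolons inside string literals or comments.

```python
import re

# Hypothetical helper, not part of SQLLineage.
# Assumption: statements are separated by ";" and no ";" appears inside
# string literals or comments.
_STARTS_WITH_SET = re.compile(r"\s*set\s", re.IGNORECASE)


def strip_set_statements(sql: str) -> str:
    """Return sql without the statements that begin with the keyword `set`."""
    kept = []
    for statement in sql.split(";"):
        if not statement.strip():
            continue  # skip empty fragments, e.g. after the final ";"
        if _STARTS_WITH_SET.match(statement):
            continue  # drop `set key=value` statements
        kept.append(statement.strip())
    return ";\n".join(kept) + (";" if kept else "")
```

The check only looks at the start of each statement, so an `update ... set ...` statement is kept. The cleaned string can then be passed to `LineageRunner(cleaned_sql, 'hive')` as in the reproduction script.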

Python version (available via python --version)

  • 3.10.11

SQLLineage version (available via sqllineage --version):

  • 1.4.9
@Light-Towers added the bug label Dec 13, 2023
@reata
Owner

reata commented Dec 23, 2023

This is already supported in master branch via #501

@reata closed this as completed Dec 23, 2023
@Light-Towers
Author

> This is already supported in master branch via #501

@reata Do you mean this will work in the next release?

@reata
Owner

reata commented Dec 25, 2023

Yes, or you can install from master branch if you need it now.

@Light-Towers changed the title from "doesn't support analyzing statement type [set_statement] for SQL:set hive.auto.convert.join=false;" to "doesn't support analyzing statement type [set_statement] for SQL: set hive.execution.engine=mr;" Dec 28, 2023
@Light-Towers
Author

> Yes, or you can install from master branch if you need it now.

Hello, I installed from master, but it still doesn't support set hive.execution.engine=mr;

# -*- coding: utf-8 -*-
from sqllineage.runner import LineageRunner
def test_create_as():
    sql = """
set hive.execution.engine=mr;
insert into db1.table1 select * from db2.table2;
    """
    # result = LineageRunner(sql)
    result = LineageRunner(sql, 'hive')

    result.print_column_lineage()
    print(result.source_tables)
    print(result.target_tables)

if __name__ == "__main__":
    test_create_as()

@Light-Towers
Author

Using LineageRunner(sql, 'sparksql') instead, the same SQL parses successfully.

@reata
Owner

reata commented Dec 28, 2023

Yes, the sqlfluff hive dialect is flawed in parsing set_statement. When the set value is a string, it is only parsable when quoted:

set hive.execution.engine="mr";

We need to fix this in sqlfluff before it can be supported here.
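
As an interim workaround based on the observation above that quoted string values parse, bare string values in set statements could be wrapped in double quotes before analysis. This is only a sketch under assumptions: `quote_set_values` is a hypothetical helper, not part of SQLLineage or sqlfluff; it splits naively on `;`; and it leaves `true`/`false` and numeric values untouched on the assumption that only bare string values are affected. Whether Hive treats the quoted value identically at execution time is not established here, so the rewritten text is meant for lineage analysis only.

```python
import re

# Hypothetical helper, not part of SQLLineage or sqlfluff.
# Matches a whole statement of the form `set some.key = bareword`.
_BARE_SET = re.compile(
    r"(\s*set\s+[\w.]+\s*=\s*)([A-Za-z_][\w.-]*)(\s*)",
    re.IGNORECASE,
)


def quote_set_values(sql: str) -> str:
    """Wrap bare string values of `set key=value` statements in double quotes."""
    rewritten = []
    # Assumption: ";" separates statements and never occurs inside literals.
    for statement in sql.split(";"):
        match = _BARE_SET.fullmatch(statement)
        if match and match.group(2).lower() not in {"true", "false"}:
            prefix, value, suffix = match.groups()
            statement = f'{prefix}"{value}"{suffix}'
        rewritten.append(statement)
    return ";".join(rewritten)
```

Because the pattern must match the entire statement, already-quoted values and `update ... set ...` statements pass through unchanged. The result can be passed to `LineageRunner(quote_set_values(sql), 'hive')`.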

@reata reopened this Dec 28, 2023
@reata added the parser label Feb 4, 2024