Presto to Spark query translation incorrect #3374

Closed
cploonker opened this issue Apr 29, 2024 · 0 comments · Fixed by #3376

Before you file an issue

  • Make sure you specify the "read" dialect, e.g. parse_one(sql, read="spark")
  • Make sure you specify the "write" dialect, e.g. ast.sql(dialect="duckdb")
  • Check if the issue still exists on main

Fully reproducible code snippet
Please include a fully reproducible code snippet or the input sql, dialect, and expected output.

Presto query

SELECT
  JSON_EXTRACT_SCALAR(
    TRY(
      FILTER(
        CAST(JSON_EXTRACT(context, '$.active_sessions') AS ARRAY(MAP(VARCHAR, VARCHAR))),
        x -> x['event_data_schema'] = 'PresentationSession'
      )[1]['event_data']
    ),
    '$.thread_id'
  )

SQLGlot translated it to the following incorrect Spark SQL:

SELECT
  GET_JSON_OBJECT(
    TRY(
      FILTER(
        CAST(GET_JSON_OBJECT(context, '$.active_sessions') AS ARRAY<MAP<STRING, STRING>>),
        x -> x['event_data_schema'] = 'PresentationSession'
      )[0]['event_data']
    ),
    '$.thread_id'
  )
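
Problems with this translation:

  1. TRY is not supported in Spark.
  2. GET_JSON_OBJECT returns a STRING, and casting that string to ARRAY<MAP<STRING, STRING>> does not work in Spark. The JSON has to be parsed with FROM_JSON instead.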

The correct Spark SQL is as follows:

SELECT
  GET_JSON_OBJECT(
    FILTER(
      FROM_JSON(GET_JSON_OBJECT(context, '$.active_sessions'), 'ARRAY<MAP<STRING, STRING>>'),
      x -> x['event_data_schema'] = 'PresentationSession'
    )[0]['event_data'],
    '$.thread_id'
  )
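
To make the intended semantics concrete, here is a rough plain-Python sketch of what the query is meant to compute for one row. The sample `context` value and the function name `extract_thread_id` are hypothetical, and the sketch assumes `event_data` holds a JSON-encoded string. It is only an illustration, not how either engine evaluates the query.

```python
import json
from typing import Optional


def extract_thread_id(context: str) -> Optional[str]:
    """Return thread_id of the first PresentationSession entry, or None."""
    try:
        # JSON_EXTRACT(context, '$.active_sessions') parsed into a list of maps.
        sessions = json.loads(context)["active_sessions"]
        # FILTER(..., x -> x['event_data_schema'] = 'PresentationSession')
        matches = [
            s for s in sessions
            if s.get("event_data_schema") == "PresentationSession"
        ]
        # First element: index [1] in Presto (1-based), [0] in Spark (0-based).
        event_data = matches[0]["event_data"]
        # JSON_EXTRACT_SCALAR(event_data, '$.thread_id')
        value = json.loads(event_data).get("thread_id")
    except (ValueError, KeyError, IndexError, TypeError, AttributeError):
        # Stands in for TRY(...): any failure yields NULL (None here).
        return None
    return None if value is None else str(value)


# Hypothetical sample row; event_data is assumed to be a JSON string.
sample_context = json.dumps({
    "active_sessions": [
        {"event_data_schema": "Other",
         "event_data": json.dumps({"thread_id": "t-0"})},
        {"event_data_schema": "PresentationSession",
         "event_data": json.dumps({"thread_id": "t-42"})},
    ]
})

print(extract_thread_id(sample_context))            # t-42
print(extract_thread_id('{"active_sessions": []}'))  # None
```

The second call shows the case TRY guards against in Presto: no matching session, so the first-element lookup fails and the result is NULL.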


@cploonker cploonker changed the title Presto to hive query translation incorrect Presto to Spark query translation incorrect Apr 30, 2024
@georgesittas georgesittas self-assigned this Apr 30, 2024