Possible regression when parsing complex types with databricks dialect #3540

Closed
crash-g opened this issue May 23, 2024 · 0 comments · Fixed by #3541

crash-g commented May 23, 2024

The following code raises sqlglot.errors.ParseError: Expecting >, even though it parsed without error in earlier versions.

import sqlglot

sql = """
CREATE TABLE IF NOT EXISTS db.table (
  a ARRAY<
    STRUCT<
      b INT COMMENT 'nested comment',
      c: MAP<STRING,STRING>
    >
  > COMMENT 'complex type'
)
USING DELTA
"""

sqlglot.parse_one(sql, read="databricks")

This code works up to and including version 23.16.0 and fails from version 23.17.0 onward, including the latest version, 24.0.0.
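
To compare behavior across installed versions, the check can be wrapped in a small helper. This is only a sketch: parse_error, report, and REGRESSION_SQL are names made up for illustration, and the parser is passed in as an argument so the helper does not depend on a particular sqlglot release.

```python
from typing import Callable, Optional

# The statement from the report above.
REGRESSION_SQL = """
CREATE TABLE IF NOT EXISTS db.table (
  a ARRAY<
    STRUCT<
      b INT COMMENT 'nested comment',
      c: MAP<STRING,STRING>
    >
  > COMMENT 'complex type'
)
USING DELTA
"""


def parse_error(
    sql: str, parse: Callable[..., object], dialect: str = "databricks"
) -> Optional[str]:
    """Return None if `parse` accepts `sql`, otherwise a short error description."""
    try:
        parse(sql, read=dialect)
    except Exception as exc:  # with sqlglot this is sqlglot.errors.ParseError
        return f"{type(exc).__name__}: {exc}"
    return None


def report() -> str:
    """Check whichever sqlglot version is installed (imported lazily)."""
    import sqlglot

    error = parse_error(REGRESSION_SQL, sqlglot.parse_one)
    status = "OK" if error is None else f"FAILED ({error})"
    return f"sqlglot {sqlglot.__version__}: {status}"
```

Calling print(report()) in environments with different sqlglot versions installed should make it easy to see where the behavior changes.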

Additional notes

Version 23.17.0 also changed the result of parsing complex types, even when parsing does not fail:

from sqlglot import exp, parse_one

sql = """
CREATE TABLE db.table (
  area STRUCT<
    base: INT COMMENT 'a nested int',
    height: INT NOT NULL COMMENT 'another nested int'
  > NOT NULL COMMENT 'a struct column'
)
USING DELTA
"""

parsed_sql = parse_one(sql, read="databricks")
exp_schema = parsed_sql.find(exp.Schema, bfs=True)
column_def = exp_schema.find(exp.ColumnDef, bfs=True)
column_type = column_def.find(exp.DataType, bfs=True)

nested_type = column_type.find(exp.ColumnDef, bfs=True)
print([type(x) for x in nested_type.iter_expressions()])

With version 23.16.0, the code above prints

[<class 'sqlglot.expressions.Identifier'>, <class 'sqlglot.expressions.DataType'>, <class 'sqlglot.expressions.ColumnConstraint'>]

With version 23.17.0, the output is instead

[<class 'sqlglot.expressions.JSONExtract'>, <class 'sqlglot.expressions.ColumnConstraint'>]
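
The difference between the two outputs can be turned into an explicit check. The following is a minimal sketch, assuming the 23.16.0 output above is the intended shape for a nested struct field with a single constraint; child_type_names and has_expected_shape are hypothetical helpers, not part of sqlglot, and they rely only on the iter_expressions() method already used above.

```python
from typing import List

# Child node types of a nested struct field as printed by 23.16.0 (see above).
EXPECTED_FIELD_SHAPE = ["Identifier", "DataType", "ColumnConstraint"]


def child_type_names(node) -> List[str]:
    """Class names of the direct child expressions of `node`."""
    return [type(child).__name__ for child in node.iter_expressions()]


def has_expected_shape(node) -> bool:
    """True if the field is made of a name, a type, and one constraint."""
    return child_type_names(node) == EXPECTED_FIELD_SHAPE
```

With nested_type from the snippet above, has_expected_shape(nested_type) would be expected to hold on 23.16.0 and not on 23.17.0, where the field name and type are merged into a single JSONExtract node.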