Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect parse between redshift JSON_EXTRACT_PATH_TEXT and databricks JSON path expression #2839

Closed
FlaviovLeal opened this issue Jan 16, 2024 · 1 comment · Fixed by #2883
Assignees

Comments

@FlaviovLeal
Copy link

FlaviovLeal commented Jan 16, 2024

Hello,

So I tried the following code and got no errors. But the result SQL would not work on Databricks.

import sqlglot

sql = """select JSON_EXTRACT_PATH_TEXT('{ "farm": {"barn": { "color": "red", "feed stocked": true }}}', 'farm', 'barn', 'color');"""
print(sqlglot.transpile(sql, read='redshift', write='databricks')[0])
# The printed results are as follow:
#  SELECT JSON_EXTRACT_PATH_TEXT('{ "farm": {"barn": { "color": "red", "feed stocked": true }}}', 'farm', 'barn', 'color')

There are two possible solutions that would work on Databricks

select '{ "farm": {"barn": { "color": "red", "feed stocked": true }}}':['farm']['barn']['color'];

And

select '{ "farm": {"barn": { "color": "red", "feed stocked": true }}}':farm.barn.color;

I recommend the first solution, as the following would work

select '{ "farm": {"barn": { "color": "red", "feed stocked": true }}}':['farm']['barn']['feed stocked'];

but the following would not

select '{ "farm": {"barn": { "color": "red", "feed stocked": true }}}':farm.barn.feed stocked;

I tried looking into the code but I was unable to find a solution. On the databricks side, it looks like it would work (Transformer, Parser
), as long as no more than one keyword was used and there were no spaces on the keyword ('color' would work, but 'feed stocked' would not). But I'm unsure if it should use JSONExtract as its a child from the binary class and it can take multiple arguments.

Lastly, despite saying it's redshift. The same problem should happen on postgres, as this function exists there as well.

@tobymao tobymao self-assigned this Jan 18, 2024
@tobymao tobymao removed their assignment Jan 21, 2024
@georgesittas georgesittas self-assigned this Jan 21, 2024
@georgesittas
Copy link
Collaborator

Planning to work on improving our parsing / generation logic for json paths this week, so this should be addressed soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants