Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Clickhouse JSON Extract* Functions #2051

Closed
BTheunissen opened this issue Aug 14, 2023 · 7 comments · Fixed by #2925
Closed

Support for Clickhouse JSON Extract* Functions #2051

BTheunissen opened this issue Aug 14, 2023 · 7 comments · Fixed by #2925
Assignees

Comments

@BTheunissen
Copy link

Is your feature request related to a problem? Please describe.
Clickhouse supports a series of JSON string member extraction functions as documented here.

It would be helpful for these to be implemented in SQLGlot in a manner which will allow transpilation of Clickhouse queries into other SQL languages that also support JSON manipulation functions, in my current use-case these functions are used quite often for parsing columns containing unstructured data.

Describe the solution you'd like
Support JSONExtractString function in Clickhouse dialect (and other appropriate JSON functions supported by Clickhouse that make sense to support in SQLGlot).

@georgesittas
Copy link
Collaborator

Hey, thanks for the report.

I'll take a look at these JSON functions soon, but I don't think it's currently in scope to add support for all of them. I can add support for JSONExtractString and if you want you can use that as an example to implement the rest.

@georgesittas georgesittas self-assigned this Aug 14, 2023
@BTheunissen
Copy link
Author

I'd be happy to! Of the functions I believe JSONExtractString is probably the lowest hanging fruit that is used the most broadly.

@georgesittas
Copy link
Collaborator

@BTheunissen I'm a bit skeptical about mapping JSONExtractString into an expression type: this function is used a bit differently in ClickHouse, compared to the equivalent tools in other dialects, so it doesn't seem obvious that doing this will help with the transpilation much.

For instance, the Postgres ->> operator is used on JSON-typed values, whereas the above function expects a string argument. Also, other dialects have different formatting conventions for extracting JSON fields, e.g. we have a test like stream_data ->> '$.data.results' for MySQL.

Perhaps it would help if you shared a bit more context on what you want to transpile it to, but I'm not sure if the ROI of adding this will be worth it.

@BTheunissen
Copy link
Author

I think I was misinterpreting the expression type usage in the codebase as it currently stands. I was hoping that as JSON functions are generally supported across most SQL dialects, we should be able to find a common form of "get an element from this JSON document", and then identify those functions as being interchangeable between SQL dialects, e.g. JSONExtractString in a Clickhouse query could be mapped into JSON_EXTRACT_STRING in DuckDB, etc.

If mapping functions such as these between dialects is currently not in scope within the project then that is totally understandable.

@georgesittas
Copy link
Collaborator

Thanks for clarifying.

I think I was misinterpreting the expression type usage in the codebase as it currently stands.

Your understanding of the Expression type usage actually seems correct: indeed, one goal is to create expression types that form a "common denominator" between different dialects, so that generating SQL back for them becomes easier.

I was hoping that as JSON functions are generally supported across most SQL dialects, we should be able to find a common form of "get an element from this JSON document", and then identify those functions as being interchangeable between SQL dialects

What you're describing makes sense, but it looks like it would require some effort to get right and it's not a high priority at the moment for us. If you want to work on this we could get in touch through Slack.

I'll close this as "not planned" for now, may revisit in the future.

@georgesittas
Copy link
Collaborator

georgesittas commented Feb 7, 2024

@BTheunissen check out the PR that's linked to this issue, it should address the JSONExtractString case as we'd discussed. There have been several improvements lately concerning JSON path transpilation and so it was simpler to tackle.

@BTheunissen
Copy link
Author

@georgesittas This is fantastic, thank you! Testing this out now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants