feat: pyspark support json array operation #11036

Open
1 task done
karta0807913 opened this issue Mar 23, 2025 · 0 comments · May be fixed by #11064
Labels
feature Features or general enhancements

Comments

karta0807913 commented Mar 23, 2025

Is your feature request related to a problem?

No response

What is the motivation behind your request?

I work with JSON data often, but the pyspark backend in Ibis does not support parsing JSON arrays. For example:

import ibis
ibis.pyspark.connect().compile(ibis.literal("[1,2,3]", type="json").array)

The output is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/ibis/backends/sql/__init__.py", line 135, in compile
    query = self.compiler.to_sqlglot(expr, limit=limit, params=params)
  File "/usr/local/lib/python3.10/dist-packages/ibis/backends/sql/compilers/base.py", line 597, in to_sqlglot
    sql = self.translate(table_expr.op(), params=params)
  File "/usr/local/lib/python3.10/dist-packages/ibis/backends/sql/compilers/base.py", line 665, in translate
    results = op.map(fn)
  File "/usr/local/lib/python3.10/dist-packages/ibis/common/graph.py", line 305, in map
    results[node] = fn(node, results, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ibis/backends/sql/compilers/base.py", line 645, in fn
    result = self.visit_node(node, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ibis/backends/sql/compilers/base.py", line 691, in visit_node
    return method(op, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ibis/backends/sql/compilers/base.py", line 739, in visit_Literal
    return self.visit_DefaultLiteral(op, value=value, dtype=dtype)
  File "/usr/local/lib/python3.10/dist-packages/ibis/backends/sql/compilers/base.py", line 835, in visit_DefaultLiteral
    raise NotImplementedError(f"Unsupported type: {dtype!r}")

Describe the solution you'd like

I can open a PR that implements this feature using Spark's FROM_JSON function, which has been available since Spark 2.1.0.
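
To make the proposal concrete, here is a minimal sketch of the kind of Spark SQL expression such a change might generate. It is not the implementation from the linked PR. The helper name `from_json_array_sql`, the column name `payload`, and the `ARRAY<STRING>` schema string are all assumptions made for illustration.

```python
def from_json_array_sql(json_expr: str, element_type: str = "STRING") -> str:
    """Build a Spark SQL FROM_JSON call that parses `json_expr` as an array.

    Hypothetical helper: `json_expr` is any SQL expression that yields a JSON
    string, and `element_type` is the Spark type assumed for each element.
    """
    return f"FROM_JSON({json_expr}, 'ARRAY<{element_type}>')"


# `payload` is an assumed column holding JSON text such as "[1,2,3]".
print(from_json_array_sql("payload"))
```

Keeping the element type as a parameter leaves open whether the elements should stay as JSON strings or be parsed into a concrete Spark type.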

What version of ibis are you running?

10.3.1

What backend(s) are you using, if any?

pyspark

Code of Conduct

  • I agree to follow this project's Code of Conduct
karta0807913 added the feature label Mar 23, 2025
karta0807913 added a commit to karta0807913/ibis that referenced this issue Mar 31, 2025
* use FROM_JSON function to unwrap the json
* compress the casting operation to match user's intention.

fixes ibis-project#11036
karta0807913 linked a pull request Mar 31, 2025 that will close this issue
Projects
Status: backlog