Typed Series for json_path_match #3373

Open
cjermain opened this issue May 12, 2022 · 1 comment
Labels
enhancement New feature or an improvement of an existing feature

Comments

@cjermain
Contributor

Nested JSON fields are unfortunately common in some parquet and JSON files. While polars provides a json_path_match function, the current implementation can only return strings. This limits its usefulness when the nested JSON fields in a Utf8Chunked array hold values of different types. I built a POC locally showing that correct typing can be achieved with a few tweaks to json_path_match and a modification of the deserialization approach in arrow2. I put up a PR on arrow2 (jorgecarleitao/arrow2#989) to add the underlying support for parsing JSON from an arrow2::array::Utf8Array, and will follow up with a PR that uses that support to output a Series with the proper type.

When this is added, I think the API for json_path_match should change as follows:

fn json_path_match(&self, json_path: &str) -> Result<Series> {
    ...
}
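To make the idea of a typed result concrete, here is a minimal, standard-library-only sketch of the type-inference step. It is not the arrow2 or polars implementation: `TypedColumn`, `parse_all`, and `infer_typed_column` are hypothetical names, `TypedColumn` merely stands in for a `Series`, and the sketch assumes the path match has already produced the raw text of each value (or `None` where the path did not match).

```rust
use std::str::FromStr;

/// Hypothetical stand-in for a typed polars `Series` (illustration only).
#[derive(Debug, PartialEq)]
pub enum TypedColumn {
    Boolean(Vec<Option<bool>>),
    Int64(Vec<Option<i64>>),
    Float64(Vec<Option<f64>>),
    Utf8(Vec<Option<String>>),
}

/// Parse every non-null value as `T`; return `None` if any value fails to parse.
fn parse_all<T: FromStr>(values: &[Option<&str>]) -> Option<Vec<Option<T>>> {
    values
        .iter()
        .map(|v| match v {
            None => Some(None), // nulls are compatible with every type
            Some(s) => s.parse::<T>().ok().map(Some),
        })
        .collect()
}

/// Pick the narrowest type that fits all non-null values, falling back to strings.
/// Note: Rust's `FromStr` is more lenient than JSON (for example it accepts
/// "inf" as a float), so a real implementation would rely on a JSON parser.
pub fn infer_typed_column(values: &[Option<&str>]) -> TypedColumn {
    // With no non-null values there is nothing to infer from; keep strings.
    if values.iter().all(|v| v.is_none()) {
        return TypedColumn::Utf8(vec![None; values.len()]);
    }
    if let Some(col) = parse_all::<bool>(values) {
        return TypedColumn::Boolean(col);
    }
    if let Some(col) = parse_all::<i64>(values) {
        return TypedColumn::Int64(col);
    }
    if let Some(col) = parse_all::<f64>(values) {
        return TypedColumn::Float64(col);
    }
    TypedColumn::Utf8(values.iter().map(|v| v.map(str::to_string)).collect())
}
```

Under these assumptions, a column whose matches are all integers comes back as `Int64`, a mix of integers and decimals widens to `Float64`, and anything else stays `Utf8`, which matches what the string-only output provides today.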
@ritchie46
Member

That sounds like a good idea to me!

@stinodego stinodego added enhancement New feature or an improvement of an existing feature and removed feature labels Jul 14, 2023