Create DataFrame from json have garbled code #8424

Closed
2 tasks done
zemelLeong opened this issue Apr 22, 2023 · 2 comments · Fixed by #8922
Labels
bug Something isn't working rust Related to Rust Polars

Comments

@zemelLeong

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

As the title indicates: if the JSON string contains both `\n` and Chinese characters, the resulting DataFrame shows garbled text. This affects both string values and column names.

Reproducible example

use serde_json::json;
use std::io::Cursor;

use polars::prelude::*;

fn main() {
    let json_data = json!([{
        "id": 8,
        "name": "小P",
        "exception": "你好,polars。\n",
        "normal": "你好,polars。",
        "姓\n名": "小P",
    }]);
    let json_data_str = json_data.to_string();
    let json_cursor = Cursor::new(json_data_str);
    let df = JsonReader::new(json_cursor).finish().unwrap();
    println!("{:?}", df);
}

Run result

shape: (1, 5)
┌─────────────────┬─────┬──────┬────────────────┬────────────┐
│ exception       ┆ id  ┆ name ┆ normal         ┆ å§         │
│ ---             ┆ --- ┆ ---  ┆ ---            ┆ å          │
│ str             ┆ i64 ┆ str  ┆ str            ┆ ---        │
│                 ┆     ┆      ┆                ┆ str        │
╞═════════════════╪═════╪══════╪════════════════╪════════════╡
│ 你好ï¼polarsã ┆ 8   ┆ 小P  ┆ 你好,polars。 ┆ 小P        │
│                 ┆     ┆      ┆                ┆            │
└─────────────────┴─────┴──────┴────────────────┴────────────┘
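The garbled cells above look like UTF-8 bytes that were each turned into a separate character. The following is a minimal sketch of that effect using only the standard library. The mechanism is an assumption made for illustration, not something confirmed in this thread, and `bytes_as_chars` is a hypothetical helper, not part of Polars or its JSON parser.

```rust
// Sketch: reinterpret each UTF-8 byte as its own char (Latin-1 style).
// Assumed mechanism for illustration only; not taken from the parser's source.
fn bytes_as_chars(s: &str) -> String {
    s.bytes().map(|b| b as char).collect()
}

fn main() {
    // "姓" is U+59D3, encoded in UTF-8 as the bytes E5 A7 93.
    let garbled = bytes_as_chars("姓");
    // E5 -> 'å', A7 -> '§', 93 -> an invisible C1 control character.
    assert_eq!(garbled, "å§\u{93}");
    // ASCII bytes map to themselves, so "polars" stays readable.
    assert_eq!(bytes_as_chars("polars"), "polars");
    println!("{:?}", garbled);
}
```

The visible part of the result, `å§`, matches the first line of the garbled column header in the run result.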

Expected behavior

The DataFrame should display the Chinese characters correctly in both the column name and the values, with no garbling.

Installed versions

serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0.68"
polars = { version = "0.28.0", features = ["json", "lazy", "serde"] }
polars-sql = "0.28.0"
@universalmind303
Collaborator

This seems to be a bug in the upstream parser. I opened an issue in that repo with a minimal example:
jorgecarleitao/json-deserializer#22

@zemelLeong
Author

When can a fix be released? Is there a plan?
