ENH: json_normalize should work with JSON #61006

Closed
2 of 3 tasks
jessekv opened this issue Feb 25, 2025 · 3 comments
Labels: Enhancement, Needs Triage

jessekv commented Feb 25, 2025

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I wish pd.json_normalize accepted JSON (as str or bytes), and not just dict.

Or, as a joke, there could be a pd.dict_normalize that only accepts JSON ;)

Feature Description

Given a Series with JSON as str or bytes:

>>> df["data"]
0                  {"value":0.0}
1          {"value":0.005787037}
2         {"value":0.0115740741}
3         {"value":0.0173611111}

It should be possible to parse the JSON with pd.json_normalize, e.g.

>>> pd.json_normalize(df["data"])
            value
0        0.000000
1        0.005787
2        0.011574
3        0.017361

Pandas already has good JSON integration, so I don't see why it can't be done.

Alternative Solutions

From what I understand, right now the JSON must first be parsed with some other library, e.g. via apply, before it can be passed to pd.json_normalize.

>>> import json
>>> pd.json_normalize(df["data"].apply(json.loads))
            value
0        0.000000
1        0.005787
2        0.011574
3        0.017361
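
The workaround above can be wrapped in a small helper. The following is only a minimal sketch and not part of pandas: `normalize_json_strings` is a hypothetical name, and it assumes each element is either a dict or a str/bytes value holding one JSON document.

```python
import json

import pandas as pd


def normalize_json_strings(data, **kwargs):
    """Hypothetical helper (not a pandas API).

    Decodes str/bytes elements with json.loads, passes other elements
    through unchanged, then delegates to pd.json_normalize.
    """
    records = [
        json.loads(item) if isinstance(item, (str, bytes, bytearray)) else item
        for item in data
    ]
    return pd.json_normalize(records, **kwargs)


# Mixed input: str, bytes, and an already-parsed dict.
s = pd.Series(['{"value":0.0}', b'{"value":0.005787037}', {"value": 0.0115740741}])
result = normalize_json_strings(s)
```

Because the helper builds a plain list before normalizing, this sketch does not carry over the index of the input Series.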

Additional Context

With better JSON/JSONB support in databases like PostgreSQL and SQLite, encountering this sort of data is becoming more common, and the intermediate apply step is both a performance and a usability issue:

>>> import json
>>> df = pd.read_sql(sql=query, con=conn)
>>> pd.json_normalize(df["data"].apply(json.loads))
            value
0        0.000000
1        0.005787
2        0.011574
3        0.017361
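
One possible way to avoid calling `json.loads` once per row is to join the column into a single JSON array and decode it in one call. This is a sketch under stated assumptions: every value in the column is a valid JSON document held as `str` (no nulls, no bytes), and the sample `df` below merely stands in for the `read_sql` result. No timing was measured here, so any speedup is unverified.

```python
import json

import pandas as pd

# Stand-in for the DataFrame returned by pd.read_sql (assumed sample data).
df = pd.DataFrame(
    {"data": ['{"value":0.0}', '{"value":0.005787037}', '{"value":0.0115740741}']}
)

# Build one JSON array "[doc0,doc1,...]" and decode it with a single call.
records = json.loads("[" + ",".join(df["data"]) + "]")

# records is now a list of dicts, which pd.json_normalize accepts.
normalized = pd.json_normalize(records)
```

A single malformed row makes the whole decode fail, so the per-row `apply(json.loads)` form may still be preferable when the column can contain invalid or missing values.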

jessekv added the Enhancement and Needs Triage labels on Feb 25, 2025

@arthurlw
Contributor

take

Abhibhav2003 commented Mar 1, 2025

Can you please clarify, @vdwees?

Is this what you are expecting?

df = pd.json_normalize([
    '{"value": 0.0}',
    '{"value": 0.005787037}',
    '{"value": 0.0115740741}',
    '{"value": 0.0173611111}'
])

print(df)

Output:

value
0.0
0.005787037
0.0115740741
0.0173611111

@mroeschke
Member

As explained in #61056 (review), I would be -1 on the feature request.

Thanks for the suggestion, but I'm going to close this issue.
