ENH: When running json_normalize over a pandas dataframe, be able to insert/keep an index/Id series/column #60272

@frbelotto

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Currently I am running pandas json_normalize on a dataframe in which one of the columns contains nested JSON. The output dataframe is the unnested JSON with all its fields and values, but without the other columns of the original dataframe.

After running into this need, I asked about it on Stack Overflow, but I realized that the solution is not really "user friendly".

Feature Description

So I would like to suggest that, when running json_normalize over a dataframe, I could specify another column/series to be kept in the resulting dataframe.
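
To make the request concrete, here is a minimal sketch of the behaviour being asked for. The helper name `normalize_keeping` and its `keep` parameter are hypothetical and not part of pandas; the sketch only combines existing calls (`pd.json_normalize` and `DataFrame.join`), and the small frame is made-up illustration data.

```python
import pandas as pd


def normalize_keeping(df, column, keep):
    """Hypothetical helper: flatten `column` and keep the `keep` columns alongside it."""
    # json_normalize accepts a list of dicts; nested keys are joined with "."
    flat = pd.json_normalize(df[column].tolist())
    # Reuse the original index so the join aligns row by row
    flat.index = df.index
    return df[keep].join(flat)


# Made-up example data with a non-default index
df = pd.DataFrame(
    {
        "id": [10, 20],
        "payload": [{"a": 1, "b": {"c": 2}}, {"a": 3, "b": {"c": 4}}],
    },
    index=["x", "y"],
)
out = normalize_keeping(df, "payload", keep=["id"])
print(out)
```

A built-in option would presumably look different; the point is only that the kept columns and the original index survive the normalization.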

Alternative Solutions

Chaining commands is a solution, but not really "easy" (and I don't know what the performance impact of this is).

import pandas as pd

# Sample frame: two flat columns plus one column of nested dicts
data = {
    "id de transação": [1, 2, 3, 4, 5],
    "nome": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "dados": [
        {"data": "2024-01-01", "local": "São Paulo", "valor": 100.50},
        {"data": "2024-01-02", "local": "Rio de Janeiro", "valor": 200.75},
        {"data": "2024-01-03", "local": "Belo Horizonte", "valor": 300.00},
        {"data": "2024-01-04", "local": "Curitiba", "valor": 400.25},
        {"data": "2024-01-05", "local": "Porto Alegre", "valor": 500.50},
    ],
}
df = pd.DataFrame(data)

# Flatten the nested column, realign it to df's index, then join it onto the kept columns
out = df[["id de transação", "nome"]].join(
    pd.json_normalize(data=df["dados"], record_path=None).set_axis(df.index)
)
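
Another option, offered as a sketch rather than a recommendation: since `pd.json_normalize` accepts a list of dicts and flattens nested dicts, the whole frame can be converted to records first. The nested keys then come back prefixed with the column name (`dados.data` and so on), and the original index is replaced by a default RangeIndex. The sample data is a shortened copy of the data above.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id de transação": [1, 2],
        "nome": ["Alice", "Bob"],
        "dados": [
            {"data": "2024-01-01", "local": "São Paulo", "valor": 100.50},
            {"data": "2024-01-02", "local": "Rio de Janeiro", "valor": 200.75},
        ],
    }
)

# Each row becomes a dict; the nested "dados" dict is flattened with a "dados." prefix
out = pd.json_normalize(df.to_dict(orient="records"))
print(out.columns.tolist())
```

This avoids the separate join, at the cost of round-tripping every row through Python dicts and losing any custom index.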

Additional Context

No response

Labels: Closing Candidate (May be closeable, needs more eyeballs), Enhancement, IO JSON (read_json, to_json, json_normalize)
