Feature suggestion: flexible hierarchical data (json) importer (will implement if interest exists)

Hi there,

I recently wrote a very flexible module for flattening hierarchical (JSON) data into CSV: https://github.com/tkluck/Text-CSV-Flatten.

It encodes the exact semantics of the flattening in a pattern string. For example, the pattern `.<index>.*` flattens in the same way as `read_json` with `orient="records"`. The pattern `.*.<index>` flattens in the same way as `orient="columns"`.
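To make the difference between the two layouts concrete, here is a small pure-Python sketch. It is not the module's implementation and it does not parse pattern strings; the helper names `rows_from_records` and `rows_from_columns` are made up for this illustration. It only shows which nesting level plays the role of the index and which one names the columns in each layout.

```python
def rows_from_records(data):
    """Outer level enumerates rows, inner level names columns
    (the layout that orient="records" expects)."""
    return {index: dict(record) for index, record in enumerate(data)}


def rows_from_columns(data):
    """Outer level names columns, inner level enumerates rows
    (the layout that orient="columns" expects)."""
    rows = {}
    for column, values in data.items():
        for index, value in values.items():
            rows.setdefault(index, {})[column] = value
    return rows


# The same table written in the two layouts (Python literals, not JSON text).
records = [{"name": "a", "size": 1}, {"name": "b", "size": 2}]
columns = {"name": {0: "a", 1: "b"}, "size": {0: 1, 1: 2}}

# Both helpers produce the same index -> {column: value} mapping.
assert rows_from_records(records) == rows_from_columns(columns)
```

In both cases the result maps each index value to a row of named columns; the pattern string is what tells the flattener which level of the hierarchy is which.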

The module is quite new but has already been very useful to me: I maintain an internal reporting tool for my employer, in which we expose hierarchical data. This module allows my users to download it as CSV with a minimum of effort on my side (no boilerplate) and on theirs.

I realized that the same thing might be very useful for pandas. Not only does the pattern flatten the hierarchical data into rows and columns, it also encodes which columns are supposed to form the index. In CSV, this information is lost in the output format, but in a DataFrame it remains meaningful.

This module fulfils a very similar function to what `io.json.read_json`, `io.json.json_normalize` and `io.json.nested_to_record` do. In fact, I think it can reproduce any of their semantics by just tweaking the pattern string.

I would be happy to spend some time adding this to pandas. If I do that, would you be interested in merging it into your next version?


Feature suggestion: flexible hierarchical data (json) importer (will implement if interest exists) #12286
