True, I wasn't the most efficient in my example. Still need another package either way.
attack68 added the IO JSON (read_json, to_json, json_normalize) and Styler (conditional formatting using DataFrame.style) labels and removed the Needs Triage (issue that has not been reviewed by a pandas team member) label on Feb 19, 2021.
Is your feature request related to a problem?
I see many APIs return results in the form:
pandas does not support this format directly. The data is in the "records" orient, but wrapped in an extra outer layer. To load it I currently use the requests module to fetch over https, the json module to strip the outer layer, and then feed the result to pd.read_json as text. This feels like overkill: pandas can already read from https, yet for this format (which is common for APIs) I need extra packages and extra lines of code.
This would turn a three-import, multi-line workaround into a single-import, single-line solution.
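To illustrate the problem, here is a small sketch of what reading such a payload directly looks like. The payload and its field names (`id`, `name`) are hypothetical stand-ins for a real API response, with `results` as the outer layer.

```python
from io import StringIO

import pandas as pd

# Hypothetical API response: records wrapped in an outer "results" layer.
wrapped = '{"results": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}'

# Read directly, the outer key becomes the only column; the records are
# not expanded into separate "id" and "name" columns.
direct = pd.read_json(StringIO(wrapped))
print(direct.columns.tolist())
print(direct.shape)
```

Each row ends up holding one nested record rather than a flat set of columns, which is why the outer layer has to be removed before pandas sees the data.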
Describe the solution you'd like
While I initially described this as a new orient, I don't think that is the best way to implement it. I believe the read_json function should have a new parameter (such as "strip_layer") whose value is the key of that outer layer. In the example above that would be "results". I suggest this because what is inside the outer layer could be any of several orients, so that needs to remain possible. The stripping happens first, then the data is processed as usual.
API breaking implications
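Need to consider what this means for chunking, since chunksize currently only works with line-delimited JSON (lines=True) and a wrapped payload is a single JSON document.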
Additional context
My current code:
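A minimal sketch of the workaround described above, not the exact code. The payload and field names are hypothetical, and the response text is inlined so the snippet runs offline; in practice it would come from something like `requests.get(url).text`.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical response text; a real one would be fetched with requests.
SAMPLE_TEXT = '{"results": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}'


def load_wrapped_records(text: str, key: str = "results") -> pd.DataFrame:
    """Strip the outer layer with json, then pass the rest to pandas as text."""
    payload = json.loads(text)        # parse the whole response
    inner = json.dumps(payload[key])  # re-serialize only the inner records
    return pd.read_json(StringIO(inner), orient="records")


df = load_wrapped_records(SAMPLE_TEXT)
print(df)
```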
versus my desired code with this improvement:
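`strip_layer` does not exist in pandas today, so the proposed call cannot be run as such. The sketch below emulates it with a hypothetical wrapper, `read_json_stripped`, that removes the named layer first and then forwards everything else to `pd.read_json`. It works on text rather than a URL to stay self-contained, and the payloads are made up for illustration.

```python
import json
from io import StringIO

import pandas as pd


def read_json_stripped(text: str, strip_layer: str, **kwargs) -> pd.DataFrame:
    """Hypothetical stand-in for pd.read_json(..., strip_layer=...)."""
    inner = json.loads(text)[strip_layer]  # drop the outer layer first
    # Everything else (orient, dtype, ...) is handled by read_json as usual.
    return pd.read_json(StringIO(json.dumps(inner)), **kwargs)


# The same outer key can wrap different orients.
records_text = '{"results": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}'
split_text = (
    '{"results": {"columns": ["id", "name"], "index": [0, 1], '
    '"data": [[1, "a"], [2, "b"]]}}'
)

df_records = read_json_stripped(records_text, "results", orient="records")
df_split = read_json_stripped(split_text, "results", orient="split")
```

Keeping the stripping step separate from `orient` is what lets the inner data be "records", "split", or any other supported layout.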
Might I suggest this gets added to the IO Method Robustness/Input Types Project?