Description
How can I use ijson to parse a large gzipped JSON file?
Background: I have a 3 GB compressed JSON file that will expand to 32 GB. I'd rather process it record by record.
Detailed description
It's actually handled completely transparently. Just open the file with the gzip module and pass the resulting file object to ijson:
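import gzip
import ijson

# gzip.open returns a file-like object, so ijson can read from it directly.
with gzip.open("very_large_file.json.gz", 'rb') as f:
    # 'item' selects each element of the top-level JSON array.
    parser = ijson.items(f, 'item')
    for item in parser:
        print(item)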
Why is this not clear from the documentation
It's not mentioned, so I created this pseudo issue, so it's obvious to other people in the future.
@davies-w thanks for submitting the question/answer combo. I'll close this issue and won't take any further action, although it will remain in the system in perpetuity and will eventually be indexed by search engines, so it can be found by whoever runs into this same issue.
As pointed out by @jpmckinney though, this is the expected behaviour of the gzip module (i.e., its open function returns an object with a file-like interface). I do agree that the ijson documentation could do better in specifying its requirements, for example by providing signatures of the methods expected in the input file-like objects, or by having a more elaborate "Examples" section, but that's further work that I'm not planning to do at the moment (I'm open to discussing PRs though).
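For anyone who wants to see the "file-like interface" point for themselves, here is a minimal sketch using only the standard library. It does not call ijson at all; it just shows that the object returned by gzip.open has a read() method that hands back decompressed bytes on demand, which is the kind of object the snippet above passes to ijson.items. The in-memory buffer and the sample records are made up for illustration and stand in for a real .json.gz file.

```python
import gzip
import io
import json

# Hypothetical sample data: a small JSON array standing in for a large file.
records = [{"id": i} for i in range(3)]

# Write the gzipped JSON into an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
with gzip.open(buffer, "wb") as gz:
    gz.write(json.dumps(records).encode("utf-8"))
buffer.seek(0)

# In binary mode gzip.open returns a gzip.GzipFile, which behaves like any
# other binary file object: read(n) returns up to n decompressed bytes.
with gzip.open(buffer, "rb") as f:
    has_read = callable(getattr(f, "read", None))
    first_chunk = f.read(8)  # a small piece of the decompressed stream
    rest = f.read()          # the remainder of the decompressed stream

# Reassembling the chunks gives back the original JSON document.
decoded = json.loads(first_chunk + rest)
print(has_read, decoded)
```

Because decompression happens as read() is called, a consumer that reads in chunks never needs the whole expanded file in memory at once.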