Add support for user-specified mapping type [was: Parsing into OrderedDict] #7
Here is some sample code:

    echo "{}{}" > test.json

    import ijson.backends.yajl2_cffi as ijson

    with open('test.json', 'rb') as f:
        for item in ijson.common.items(ijson.parse(f, multiple_values=True), ''):
            print(type(item))

Output is:
|
In your example you are selecting the top-level element, which is an object; thus you get dictionaries. Have you had a look at the examples in https://github.com/ICRAR/ijson/blob/master/README.rst? I think you are basically after the lower-level |
I essentially want the option for this line to be I can re-implement |
There are a couple of gotchas with modifying the object builder directly:
So again, I think such a change would be a bit of overkill. On the other hand, I think you basically want a modified version of this: isagalaev#62 (comment)

    from collections import OrderedDict

    import ijson
    from ijson.common import ObjectBuilder

    def objects(data):
        key = '-'
        builder = None
        for prefix, event, value in ijson.parse(data):
            if not prefix and event == 'map_key':
                # A new top-level key starts: emit the previous member, if any.
                if builder:
                    yield key, builder.value
                key = value
                builder = ObjectBuilder()
            elif prefix.startswith(key):
                builder.event(event, value)
        # Emit the last top-level member.
        if builder:
            yield key, builder.value

    with open('json.json', 'rb') as data:
        result = OrderedDict(objects(data))

    for key, value in result.items():
        print(key, value)

|
I do want it to affect all levels :) (like with the |
Right now, I need to do something like this (won't work with all backends):

    from collections import OrderedDict

    import ijson.backends.yajl2_cffi as ijson

    # Copy of ijson.common.items, using different builder.
    def items(prefixed_events, prefix):
        prefixed_events = iter(prefixed_events)
        try:
            while True:
                current, event, value = next(prefixed_events)
                if current == prefix:
                    if event in ('start_map', 'start_array'):
                        builder = OrderedObjectBuilder()
                        end_event = event.replace('start', 'end')
                        while (current, event) != (prefix, end_event):
                            builder.event(event, value)
                            current, event, value = next(prefixed_events)
                        del builder.containers[:]
                        yield builder.value
                    else:
                        yield value
        except StopIteration:
            pass

    # Copy of ObjectBuilder, using OrderedDict instead of dict.
    class OrderedObjectBuilder(ijson.common.ObjectBuilder):
        def event(self, event, value):
            if event == 'start_map':
                map = OrderedDict()
                self.containers[-1](map)
                def setter(value):
                    map[self.key] = value
                self.containers.append(setter)
            else:
                super().event(event, value)

Later, my code calls |
@jpmckinney thanks for the pointer to the mechanism used by the standard lib, I actually didn't know about it. Such a generic solution sounds good actually i.e., provide a |
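For readers following the thread, here is a minimal sketch of what a generic "user-specified mapping type" option could look like in an event-driven builder. Everything in it is hypothetical: `MiniBuilder` is a simplified stand-in and not ijson's `ObjectBuilder`, and the `map_type` parameter name is an assumption made for illustration. Only the event names (`start_map`, `map_key`, `start_array`, and the matching `end_*` events) follow the code posted above; any other event is treated as a scalar value.

```python
from collections import OrderedDict


class MiniBuilder:
    """Hypothetical, simplified builder; not ijson's ObjectBuilder."""

    def __init__(self, map_type=dict):
        # map_type is an assumed parameter name: any callable that returns
        # an empty mutable mapping.
        self.map_type = map_type
        self.value = None
        self._stack = []  # containers that are currently open
        self._keys = []   # pending key per open container (None for arrays)

    def _add(self, obj):
        # Attach obj to the innermost open container, or make it the root.
        if not self._stack:
            self.value = obj
        elif isinstance(self._stack[-1], list):
            self._stack[-1].append(obj)
        else:
            self._stack[-1][self._keys[-1]] = obj

    def event(self, event, value=None):
        if event in ('start_map', 'start_array'):
            container = self.map_type() if event == 'start_map' else []
            self._add(container)
            self._stack.append(container)
            self._keys.append(None)
        elif event == 'map_key':
            self._keys[-1] = value
        elif event in ('end_map', 'end_array'):
            self._stack.pop()
            self._keys.pop()
        else:
            # Scalar events (numbers, strings, null, booleans).
            self._add(value)


# Hand-written events for {"b": 1, "a": {"d": [1, 2], "c": null}}
events = [
    ('start_map', None), ('map_key', 'b'), ('number', 1),
    ('map_key', 'a'), ('start_map', None),
    ('map_key', 'd'), ('start_array', None),
    ('number', 1), ('number', 2), ('end_array', None),
    ('map_key', 'c'), ('null', None),
    ('end_map', None), ('end_map', None),
]

builder = MiniBuilder(map_type=OrderedDict)
for name, val in events:
    builder.event(name, val)
```

Because the mapping constructor is injected once, every nesting level gets the same type without subclassing the builder or copying the `items` loop.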
@jpmckinney while we are at this, maybe offering an option for using something other than lists could also be a possibility worth considering. |
I'm not sure that I can get to it this week – and I'm not familiar with Python in C. Using something other than lists sounds interesting; however, it hasn't come up as an option in the standard library. I think it's fine to start with The standard library does allow alternative constructors through |
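For comparison, a small sketch of how the standard library's `json` module handles this, assuming its `object_pairs_hook` parameter is the mechanism being discussed here (the comment above is cut off, so that is a guess). The hook applies to JSON objects at every nesting level, while arrays always come back as plain lists.

```python
import json
from collections import OrderedDict

text = '{"b": {"z": 1, "a": 2}, "a": [{"y": 0, "x": 1}]}'

# object_pairs_hook is called with the list of (key, value) pairs of each
# decoded JSON object, at every nesting level.
data = json.loads(text, object_pairs_hook=OrderedDict)

top_is_ordered = isinstance(data, OrderedDict)
nested_is_ordered = isinstance(data["b"], OrderedDict)
in_array_is_ordered = isinstance(data["a"][0], OrderedDict)

# There is no equivalent hook for arrays; they are decoded as lists.
array_type = type(data["a"])

# The nested mappings support OrderedDict-only methods right away.
data["b"].move_to_end("z")
nested_keys = list(data["b"])
```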
I already implemented the a new |
It works! Thanks |
Great! I'll close then for now, and will also adjust the title for future reference. |
I have code that re-orders JSON keys into a standardized order using OrderedDict.move_to_end. I want to use ijson to read the input iteratively. Presently, I think I would need to convert the dict that ijson returns into an OrderedDict, but my data has deep JSON objects, so this would be a fairly expensive operation. It would be faster to parse the data into an OrderedDict directly.

Is there an interest in adding this feature?
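To make the cost concrete, here is a rough sketch of the workaround described above: a recursive copy of an already-parsed structure into OrderedDict, followed by the re-ordering pass. The helper names `to_ordered` and `standardize`, and the sample data, are made up for illustration and are not taken from the reporter's code.

```python
from collections import OrderedDict


def to_ordered(node):
    """Recursively copy plain dicts into OrderedDicts (the extra pass)."""
    if isinstance(node, dict):
        return OrderedDict((k, to_ordered(v)) for k, v in node.items())
    if isinstance(node, list):
        return [to_ordered(v) for v in node]
    return node


def standardize(node, last_keys):
    """Move the given keys to the end of every OrderedDict, in place."""
    if isinstance(node, OrderedDict):
        for key in last_keys:
            if key in node:
                node.move_to_end(key)
        for child in node.values():
            standardize(child, last_keys)
    elif isinstance(node, list):
        for child in node:
            standardize(child, last_keys)


# Stand-in for a value that a parser returned as plain dicts.
parsed = {"id": 1, "meta": {"id": 2, "tags": ["x"]}, "name": "n"}

ordered = to_ordered(parsed)           # full extra traversal and copy
standardize(ordered, last_keys=("id",))
```

The `to_ordered` pass visits and copies every nested container a second time, which is the overhead that parsing directly into the desired mapping type would avoid.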