Description
I'm struggling to combine the concepts laid out in the docs into a solution that streams in some JSON, updates some values, and streams the result back out.
I have this toy example JSON:
{
"0": {"foo": "bar"},
"1": {"foo": "bar"},
"2": {"foo": "bar"},
"3": {"foo": "bar"},
"4": {"foo": "bar"},
"5": {"foo": "bar"},
"6": {"foo": "bar"},
"7": {"foo": "bar"},
"8": {"foo": "bar"},
"9": {"foo": "bar"}
}

I want to update the value for every odd key (after converting the key to an int) to {"foo": "BAR"}:
{
"0": {"foo": "bar"},
"1": {"foo": "BAR"},
"2": {"foo": "bar"},
"3": {"foo": "BAR"},
"4": {"foo": "bar"},
"5": {"foo": "BAR"},
"6": {"foo": "bar"},
"7": {"foo": "BAR"},
"8": {"foo": "bar"},
"9": {"foo": "BAR"}
}

The only thing I've managed to get working is:
import json

import json_stream
from json_stream import streamable_dict


@streamable_dict
def update(data):
    # Yield (key, value) pairs lazily so json.dump can write them one at a time.
    for key, value in data.items():
        if int(key) % 2 == 1:
            value = {"foo": "BAR"}
        else:
            value = dict(value)
        yield key, value


with open("input.json") as f_in:
    data = json_stream.load(f_in, persistent=True)
    updated_data = update(data)
    with open("output.json", "w") as f_out:
        json.dump(updated_data, f_out, indent=1)

However, I have to use persistent=True to make that work, and that uses 2x more memory than the standard library's load and dump functions.
I've looked at the Encoding json-stream objects section, but cannot figure out what it'd take to make either json-stream's default function or JSONStreamEncoder class work for me. I've also tried to figure out if the visitor pattern is applicable.
Generally, my stumbling block seems to be getting the worker/processor in between json-stream's decoder and the standard library's encoder.
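To make the shape of what I'm after concrete, here is a minimal stdlib-only sketch of a worker sitting between a lazy source of (key, value) pairs and json.dump. It does not use json-stream at all: `fake_decoder` is a hypothetical stand-in for the streaming decoder, and `LazyDict` is a hypothetical helper that only illustrates the general idea of a generator-backed dict. It relies on CPython's json.dump checking the dict's truthiness and then iterating `items()`, which I'm treating as an assumption about the implementation rather than a documented guarantee.

```python
import io
import json


class LazyDict(dict):
    """Hypothetical dict subclass that serves items from a one-shot iterator."""

    def __init__(self, pairs):
        super().__init__()
        self._pairs = pairs

    def __bool__(self):
        # The underlying dict is empty, so claim to be truthy; otherwise the
        # encoder would treat it as an empty dict and never call items().
        return True

    def items(self):
        # Hand the encoder the lazy iterator instead of a materialized view.
        return self._pairs


def transform(pairs):
    """The worker: replace the value of every odd key, pass the rest through."""
    for key, value in pairs:
        if int(key) % 2 == 1:
            value = {"foo": "BAR"}
        yield key, value


def fake_decoder():
    """Stand-in for a streaming decoder: yields one (key, value) pair at a time."""
    for i in range(10):
        yield str(i), {"foo": "bar"}


out = io.StringIO()
json.dump(LazyDict(transform(fake_decoder())), out, indent=1)
result = json.loads(out.getvalue())
```

Only one pair is alive at a time between the source and the encoder here. What I can't work out is how to get the same shape when the source is json-stream's non-persistent decoder.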
Do you have a concrete example of doing a "streaming transform" of some JSON?