Write large strings in bounded memory #30

chkno · 2020-12-03T07:46:45Z

jsonstreams is a big win over the built-in json for bounding memory usage when encoding JSON documents that are large because they contain many elements, but it doesn't help for JSON documents that are large because they contain one large element -- the current implementation requires that each element be entirely loaded into memory for encoding.

I sketched a method of overcoming this limitation in this string-streams branch. The key thing there is the test_memory_usage test, which verifies that memory usage does not scale with element size. The changes currently in that branch to make that test pass are inelegant.

Thoughts?

dcbaker · 2020-12-03T19:09:37Z

Hmmm. I'm just thinking out loud, but if jsonstreams used iterencode() instead of encode(), you could probably (at least for values) just use a custom JSONEncoder class that knows how to handle very large objects, I think. If that works, it would be a more generic solution.
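
A rough sketch of the idea, using only the standard library's `json.JSONEncoder.iterencode()`. The `write_value` helper is a hypothetical name for illustration and is not part of jsonstreams:

```python
import io
import json


def write_value(fp, value, encoder=None):
    """Write one JSON value to fp chunk by chunk rather than as one string."""
    # Hypothetical helper: any json.JSONEncoder subclass instance can be passed in.
    encoder = encoder or json.JSONEncoder()
    for chunk in encoder.iterencode(value):
        # Each chunk is a small str; it is written and then dropped.
        fp.write(chunk)


buf = io.StringIO()
write_value(buf, {"a": [1, 2, 3]})
```

Note that the stock encoder still emits a single str value as one chunk, so this alone bounds memory for large containers but not for one huge string; that is where the custom encoder would come in.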

chkno · 2020-12-03T19:33:19Z

Yea, that sounds promising, as long as jsonstreams only holds a bounded number of iterencode-output chunks at a time (probably just one chunk). This would mostly affect the pretty printer, which is the only part of jsonstreams that looks at the encoded data before writing it.
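
For illustration, a minimal sketch of what "one chunk at a time" could look like for a single large string. `StreamedString`, `StreamingEncoder`, and `write_value` are made-up names, not the code in the branch or in #32, and `json.encoder.encode_basestring` is an undocumented stdlib helper, so treat this as an assumption-laden sketch:

```python
import io
import json
from json.encoder import encode_basestring


class StreamedString:
    """Hypothetical wrapper: a string supplied as an iterable of str pieces."""

    def __init__(self, pieces):
        self.pieces = pieces


class StreamingEncoder(json.JSONEncoder):
    """Streams a top-level StreamedString; defers everything else to json."""

    def iterencode(self, o, _one_shot=False):
        if isinstance(o, StreamedString):
            yield '"'
            for piece in o.pieces:
                # encode_basestring returns a quoted, escaped string;
                # strip the surrounding quotes to get just the escaped body.
                yield encode_basestring(piece)[1:-1]
            yield '"'
        else:
            yield from super().iterencode(o, _one_shot)


def write_value(fp, value):
    # Only one encoded chunk is alive at any time.
    for chunk in StreamingEncoder(ensure_ascii=False).iterencode(value):
        fp.write(chunk)


buf = io.StringIO()
write_value(buf, StreamedString(iter(['he"l', 'lo\n'])))
```

This only handles a streamed string at the top level of the value being written; nested cases and the pretty printer would need more work.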

chkno · 2020-12-04T05:29:55Z

Thanks for your help with this!

#32 serves my use case, so I'm going to close this now.

I feel kinda bad leaving the pretty-printing code still using encode() rather than iterencode(), and so not getting the memory efficiency benefit. On the other hand, folks using pretty=True are probably using it with human consumption in mind, and so probably not using it on enormous elements that cause memory consumption problems.

Looking forward to the next release!

chkno mentioned this issue Dec 3, 2020

Use iterencode #32

Merged

chkno closed this as completed Dec 4, 2020
