
Conversation


@stafak stafak commented Aug 3, 2020

Hi!

I'm dealing with dataclasses that contain lists and numpy arrays that are very large but are excluded when encoding to JSON. Unfortunately, these lists and arrays make the serialization take a very long time despite being excluded. This appears to be due to _asdict running recursively on each list/array item before the exclude predicate is checked.

Unless I'm missing something, it's not possible to control this from outside the library, so I moved the exclude-predicate check into _asdict, so that it runs before the function recurses into the items of the excluded field.

With a dataclass that contains a few simple fields as well as an excluded List[int] of length 10 000 000, this reduced the encoding time from over 30 seconds to a small fraction of a second on my machine.

This makes encoding of dataclasses that contain large excluded lists much faster.
@george-zubrienko
Collaborator

@stafak hello :) could you resolve conflicts and add unit tests to back your statement?
