Speed up results serialization #46

Open
mkardas opened this issue Dec 6, 2022 · 0 comments · May be fixed by #47
Labels
enhancement New feature or request

Comments

mkardas commented Dec 6, 2022

Describe a requested feature

I was running some performance tests and noticed that checking whether an object is picklable:

outputs = self.check_picklable(outputs)
takes a lot of time when the output is big (e.g., when a model returns a large logits tensor), because the whole object is serialized into memory and then deserialized. I wonder in which cases check_picklable actually helps, since dataclasses and ModelOutput should be as picklable as their dictionary representations.
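To make the cost concrete, here is a minimal sketch of a round-trip check and a way to time it. `round_trip_check` and `time_call` are hypothetical stand-ins written for illustration, not the project's actual `check_picklable`, and the list of floats only approximates a large logits tensor.

```python
import pickle
import time


def round_trip_check(obj):
    # Hypothetical stand-in: serialize the whole object into memory,
    # then deserialize it again, as the check is described above.
    return pickle.loads(pickle.dumps(obj))


def time_call(fn, *args):
    # Return the function result together with the elapsed wall-clock seconds.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Stand-in for a large model output; the real payload would be a tensor.
payload = [float(i) for i in range(200_000)]
copy, seconds = time_call(round_trip_check, payload)
print(f"round trip over {len(payload)} floats took {seconds:.4f}s")
```

Both the serialization and the deserialization walk the full payload, so the check's cost grows with output size even when the object was picklable all along.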

If the check is still needed, I guess the code could still be sped up by modifying an object only on pickle failure. That would require some workarounds (perhaps overriding https://github.com/python/cpython/blob/9dc787ea96916552695e79397588fdfa68f22024/Lib/multiprocessing/queues.py#L275), so I want to make sure the check is still necessary before giving it a shot. Another option is to always check for

if _is_dataclass_instance(obj) or isinstance(obj, ModelOutput):
    _obj = asdict(obj)
    _obj["orig_dataclass_type"] = obj.__class__
    obj = _obj
and modify the object even if it's picklable, but that would drop custom fields added outside the class definition.
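The first idea, modifying an object only on pickle failure, could look roughly like the sketch below. It serializes once and falls back to the dictionary form only if that attempt raises, so the common case pays for a single `pickle.dumps`. `dumps_with_fallback` and `Output` are hypothetical names, the `ModelOutput` branch is omitted to keep the example self-contained, and wiring this into the queue's serialization step is not shown.

```python
import pickle
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


def _is_dataclass_instance(obj: Any) -> bool:
    # is_dataclass() is also true for dataclass types, so exclude classes.
    return is_dataclass(obj) and not isinstance(obj, type)


def dumps_with_fallback(obj: Any) -> bytes:
    """Serialize once; convert to a dict only if pickling fails."""
    try:
        return pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        if not _is_dataclass_instance(obj):
            raise  # nothing to fall back to for other types
        _obj = asdict(obj)  # keeps only the declared fields
        _obj["orig_dataclass_type"] = obj.__class__
        return pickle.dumps(_obj)


@dataclass
class Output:  # hypothetical model output type
    logits: list


out = Output(logits=[0.1, 0.9])
out.callback = lambda x: x  # unpicklable attribute added outside the class
restored = pickle.loads(dumps_with_fallback(out))
print(restored["logits"], restored["orig_dataclass_type"].__name__)
```

With this shape, custom fields survive whenever the object pickles cleanly and are dropped only in the fallback path, which is the trade-off the second option would otherwise impose on every object.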

mkardas added the enhancement label Dec 6, 2022
mkardas linked a pull request Dec 6, 2022 that will close this issue