ENH use cloudpickle wrapper for initializer #147

Merged
merged 6 commits into from
Aug 24, 2018

tomMoral
Collaborator

Consistently serialize functions defined in `__main__` with cloudpickle when it is available.

Fix #146

Collaborator

@ogrisel ogrisel left a comment

LGTM. I will fix the following comments I made in the review myself and then merge.

docs/API.rst Outdated
Task & results serialization
----------------------------

To share function definitions across multiple Python processes, it is necessary to rely on a serialization protocol. The standard protocol in Python is :mod:`pickle` but this protocol has several limitations. For instance, it cannot serialize functions which are defined interactively or in the :code:`__main__` module.
Collaborator

Actually, cloudpickle respects the pickle protocol: you can use `pickle.load` to unpickle something generated by `cloudpickle.dump`.

So instead I would write:

The standard protocol in python is :mod:`pickle` but its default implementation in the standard library has several limitations.
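
For illustration only, here is a minimal sketch of the two points made in this thread: the standard library pickler refuses a function it cannot look up by name, while a payload produced by cloudpickle can be read back with plain `pickle.loads`. The names `square`, `stdlib_failed` and `restored` are made up for this example, and the cloudpickle part only runs when that package is installed.

```python
import pickle

try:
    import cloudpickle  # optional third-party dependency
except ImportError:
    cloudpickle = None

# A lambda has no importable name, so the stdlib pickler cannot
# serialize it by reference.
square = lambda x: x * x

try:
    pickle.dumps(square)
    stdlib_failed = False
except (pickle.PicklingError, AttributeError):
    stdlib_failed = True

restored = None
if cloudpickle is not None:
    # cloudpickle serializes the function by value, using the regular
    # pickle format, so the standard loader can read the payload as long
    # as cloudpickle is importable in the loading process.
    payload = cloudpickle.dumps(square)
    restored = pickle.loads(payload)
```

When cloudpickle is available, `restored` is a working copy of `square` rebuilt from bytes rather than looked up by name.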

docs/API.rst Outdated

To share function definitions across multiple Python processes, it is necessary to rely on a serialization protocol. The standard protocol in Python is :mod:`pickle` but this protocol has several limitations. For instance, it cannot serialize functions which are defined interactively or in the :code:`__main__` module.

To avoid this limitation, :mod:`loky` relies on |cloudpickle| when it is present. |cloudpickle| is a fork of the pickle protocol which allows the serialization of a greater number of objects and it can be installed using :code:`pip install cloudpickle`. As this library is slower than the standard protocol, by default, :mod:`loky` uses it only to serialize objects which are detected to be in the :code:`__main__` module.
Collaborator

As this library is slower than the pickle module in the standard library, ...

docs/API.rst Outdated
There are two ways to tamper with the serialization in :mod:`loky`:

- Using the arguments :code:`job_reducers` and :code:`result_reducers`, it is possible to register custom reducers for the serialization process.
- Setting the variable :code:`LOKY_PICKLER` to an available and valid serialization module. This module must present a valid :code:`Pickler` object. For instance, setting :code:`LOKY_PICKLER=cloudpickle` will force :mod:`loky` to serialize everything with |cloudpickle| instead of just serializing the objects detected to be in the :code:`__main__` module.
Collaborator

Setting the environment variable :code:`LOKY_PICKLER`
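
To make the notion of a custom reducer in the reviewed paragraph concrete, here is a standard-library sketch of the general mechanism: a mapping from a type to a reduce function, installed on a `Pickler` through its `dispatch_table`. This is not loky's code. `Connection`, `reduce_connection` and `dumps_with_reducers` are hypothetical names, and how loky consumes :code:`job_reducers` and :code:`result_reducers` internally is an assumption not covered here.

```python
import copyreg
import io
import pickle
import threading


class Connection:
    """Hypothetical resource holding a member that cannot be pickled."""

    def __init__(self, address):
        self.address = address
        self.lock = threading.Lock()  # locks are not picklable


def reduce_connection(conn):
    # A reducer returns (callable, args); unpickling calls callable(*args),
    # so the lock is rebuilt by __init__ instead of being serialized.
    return Connection, (conn.address,)


def dumps_with_reducers(obj, reducers):
    """Pickle obj with extra {type: reduce_function} entries."""
    buffer = io.BytesIO()
    pickler = pickle.Pickler(buffer)
    # Start from the global copyreg table, then add the custom reducers.
    pickler.dispatch_table = copyreg.dispatch_table.copy()
    pickler.dispatch_table.update(reducers)
    pickler.dump(obj)
    return buffer.getvalue()


payload = dumps_with_reducers(
    Connection("localhost:8000"), {Connection: reduce_connection}
)
clone = pickle.loads(payload)
```

Without the reducer, `pickle.dumps(Connection("localhost:8000"))` would fail on the lock. With it, only the address travels and the receiving side rebuilds the object.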

@tomMoral tomMoral force-pushed the FIX_pickling_initializer branch 2 times, most recently from 3a9f6b3 to a115f17 Compare August 24, 2018 11:41
@ogrisel ogrisel merged commit 8002023 into master Aug 24, 2018
@ogrisel ogrisel deleted the FIX_pickling_initializer branch August 24, 2018 12:54
Collaborator

ogrisel commented Aug 24, 2018

Merged!
