Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support method_descriptor #46

Merged
merged 2 commits into from
Jan 6, 2016
Merged

Conversation

mrocklin
Copy link
Contributor

@mrocklin mrocklin commented Jan 4, 2016

Previously serializing bare class methods would fail in Python 2
Now it works fine.

>>> cloudpickle.dumps(str.upper)

We accomplish this by extending pickle's reduce function registry.

Previously serializing bare class methods would fail in Python 2
Now it works fine.

    >>> cloudpickle.dumps(str.upper)

We accomplish this by extending pickle's reduce function registry.
In cases like ``dumps(pd.Series.sum)`` we used to depend on
`obj.__self__`, which in this case is None.  In these cases we should
instead depend on `obj.__class__`
@rgbkrk
Copy link
Member

rgbkrk commented Jan 6, 2016

With a test too! Thanks @mrocklin.

I'm going to add you to the org for cloudpipe, though traffic is low.

rgbkrk added a commit that referenced this pull request Jan 6, 2016
@rgbkrk rgbkrk merged commit 7f1ff09 into cloudpipe:master Jan 6, 2016
@ogrisel
Copy link
Contributor

ogrisel commented Jan 6, 2016

@mrocklin do you have other bugfixes like that in your pipe? If not I think those 2 fixes alone deserve a quick bugfix release + upload to PyPI. WDYT?

@@ -698,3 +701,18 @@ def _make_skel_func(code, closures, base_globals = None):
def _getobject(modname, attribute):
mod = __import__(modname, fromlist=[attribute])
return mod.__dict__[attribute]


""" Use copy_reg to extend global pickle definitions """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: better use # instead of inline string objects for commenting the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. Will do this in the future.

@mrocklin mrocklin deleted the method-descriptor branch January 6, 2016 15:27
@mrocklin
Copy link
Contributor Author

mrocklin commented Jan 6, 2016

Regarding timeline, I hope to provide a fix for #45 a few weeks from now. I'm not sure how expensive a release is; I'll leave that judgement to you all

@rgbkrk
Copy link
Member

rgbkrk commented Jan 6, 2016

Release is simple enough. I'm not reliant on this currently, it will be months before I touch cloudpipe again.

@ogrisel
Copy link
Contributor

ogrisel commented Jan 7, 2016

+1 for making a quick bugfix release now then. I can give it a shot.

@ogrisel
Copy link
Contributor

ogrisel commented Jan 7, 2016

Actually we had not done any release since 0.1.0. git log says that changes where mostly related to test improvements but as there are many of them and a couple of new features like #36 and support for class methods. So next release should be 0.2.0. We can wait a few weeks.

rgbkrk added a commit to rgbkrk/spark that referenced this pull request Jun 12, 2017
This brings in fixes and upgrades from the [cloudpickle](https://github.com/cloudpipe/cloudpickle) module, notably:

* Import submodules accessed by pickled functions (cloudpipe/cloudpickle#80)
* Support recursive functions inside closures (cloudpipe/cloudpickle#89, cloudpipe/cloudpickle#90)
* Fix ResourceWarnings and DeprecationWarnings (cloudpipe/cloudpickle#88)
* Assume modules with __file__ attribute are not dynamic (cloudpipe/cloudpickle#85)
* Make cloudpickle Python 3.6 compatible (cloudpipe/cloudpickle#72)
* Allow pickling of builtin methods (cloudpipe/cloudpickle#57)
* Add ability to pickle dynamically created modules (cloudpipe/cloudpickle#52)
* Support method descriptor (cloudpipe/cloudpickle#46)
* No more pickling of closed files, was broken on Python 3 (cloudpipe/cloudpickle#32)
ghost pushed a commit to dbtsai/spark that referenced this pull request Aug 22, 2017
## What changes were proposed in this pull request?

Based on apache#18282 by rgbkrk this PR attempts to update to the current released cloudpickle and minimize the difference between Spark cloudpickle and "stock" cloud pickle with the goal of eventually using the stock cloud pickle.

Some notable changes:
* Import submodules accessed by pickled functions (cloudpipe/cloudpickle#80)
* Support recursive functions inside closures (cloudpipe/cloudpickle#89, cloudpipe/cloudpickle#90)
* Fix ResourceWarnings and DeprecationWarnings (cloudpipe/cloudpickle#88)
* Assume modules with __file__ attribute are not dynamic (cloudpipe/cloudpickle#85)
* Make cloudpickle Python 3.6 compatible (cloudpipe/cloudpickle#72)
* Allow pickling of builtin methods (cloudpipe/cloudpickle#57)
* Add ability to pickle dynamically created modules (cloudpipe/cloudpickle#52)
* Support method descriptor (cloudpipe/cloudpickle#46)
* No more pickling of closed files, was broken on Python 3 (cloudpipe/cloudpickle#32)
* ** Remove non-standard __transient__check (cloudpipe/cloudpickle#110)** -- while we don't use this internally, and have no tests or documentation for its use, downstream code may use __transient__, although it has never been part of the API, if we merge this we should include a note about this in the release notes.
* Support for pickling loggers (yay!) (cloudpipe/cloudpickle#96)
* BUG: Fix crash when pickling dynamic class cycles. (cloudpipe/cloudpickle#102)

## How was this patch tested?

Existing PySpark unit tests + the unit tests from the cloudpickle project on their own.

Author: Holden Karau <holden@us.ibm.com>
Author: Kyle Kelley <rgbkrk@gmail.com>

Closes apache#18734 from holdenk/holden-rgbkrk-cloudpickle-upgrades.
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants