Cannot pickle path objects #223

Closed
ringohoffman opened this issue May 9, 2022 · 4 comments · Fixed by #224

Comments

@ringohoffman
Contributor

Python 3.9.12 (main, Apr  5 2022, 06:56:58) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
>>> import pickle
>>> pickle.format_version
'4.0'
>>> import cloudpathlib
>>> cloudpathlib.__version__
'0.7.1'
>>> with open("/home/matthew/s3.pkl", "wb") as f:
...     pickle.dump(cloudpathlib.S3Path("s3://bucket/key"), f)
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: Can't pickle local object 'lazy_call.<locals>._handler'

pickle docs: What can be pickled and unpickled?
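For reference, the same class of error can be reproduced without cloudpathlib or boto3. This is only a minimal sketch: `lazy_call_sketch` and `try_pickle` are hypothetical names made up for illustration, not functions from either library. It shows that a function defined inside another function cannot be pickled, because pickle stores functions by their importable name.

```python
import pickle


def lazy_call_sketch():
    # Hypothetical stand-in for a factory that returns a nested function.
    def _handler():
        return "called"

    return _handler


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the exception raised."""
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError) as exc:
        # The exact exception type and message vary across Python versions.
        return exc
    return None


error = try_pickle(lazy_call_sketch())
print(type(error).__name__, error)
```

Any object that holds a reference to such a function, directly or through an attribute, will fail to pickle in the same way.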

@pjbull
Member

pjbull commented May 9, 2022

Not sure about other providers, but for S3 I believe boto3 session objects not being pickleable is the root cause of the issue you're seeing.

We'd have to implement a workaround to make objects pickleable. At least *Client if not *Path would need to override __getstate__ and __setstate__ so things work: https://docs.python.org/3/library/pickle.html#handling-stateful-objects

Did a quick spike and adding this to CloudPath works (at least for S3). I think setting client to the default client likely handles at least 90% of cases.

    def __getstate__(self):
        state = self.__dict__.copy()

        # don't pickle client
        del state["client"]

        return state

    def __setstate__(self, state):
        # the client was not pickled, so fall back to the default client
        client = self._cloud_meta.client_class.get_default_client()
        state["client"] = client
        self.__dict__.update(state)

This needs tests and documentation.
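To make the idea easier to try in isolation, here is a self-contained sketch of the same `__getstate__` / `__setstate__` pattern. `FakeClient` and `FakePath` are hypothetical stand-ins, not cloudpathlib classes, and the `threading.Lock` attribute is only an assumed example of unpicklable state; the real client internals may differ.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for a cloud client that cannot be pickled."""

    _default = None

    def __init__(self):
        # A lock is a simple example of state that pickle rejects.
        self._lock = threading.Lock()

    @classmethod
    def get_default_client(cls):
        # Lazily create and reuse a single default client.
        if cls._default is None:
            cls._default = cls()
        return cls._default


class FakePath:
    """Hypothetical path type that holds a reference to its client."""

    def __init__(self, url, client=None):
        self.url = url
        self.client = client or FakeClient.get_default_client()

    def __getstate__(self):
        state = self.__dict__.copy()
        # Drop the client so only picklable state is serialized.
        del state["client"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Reattach the default client, since the original was not stored.
        self.client = FakeClient.get_default_client()


original = FakePath("s3://bucket/key")
restored = pickle.loads(pickle.dumps(original))
print(restored.url, restored.client is FakeClient.get_default_client())
```

One consequence of this design is that a path created with a non-default client comes back attached to the default client after unpickling, which is the trade-off behind the "90% of cases" estimate above.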

@pjbull
Member

pjbull commented May 16, 2022

@ringohoffman this has been merged into the main branch. Would you mind installing the development version and testing your use case to confirm it works as expected?

Thanks!

@pjbull
Member

pjbull commented May 19, 2022

@ringohoffman this was released in 0.8.0, which is on PyPI now.

@ringohoffman
Contributor Author

ringohoffman commented May 19, 2022

Verified it! @pjbull

>>> import cloudpathlib
>>> import torch
>>> 
>>> 
>>> p = cloudpathlib.S3Path("s3://bucket/key")
>>> torch.save(obj=p, f="./s3.txt")
>>> 
>>> torch.load(f="./s3.txt")
S3Path('s3://bucket/key')
