Make resources pickleable/serializable #678

maxrothman · 2016-06-11T03:57:39Z

Boto3 resources (e.g. instances, s3 objects, etc.) are not pickleable and have no to_json() method or similar. Therefore, there's currently no way to cache resources retrieved via boto3. This is problematic when retrieving a large number of resources that change infrequently. Even a cache of 30s or so can greatly increase the performance of certain programs and drastically reduce the number of necessary API calls to AWS.

Would it be possible to have some way to serialize resources?

The text was updated successfully, but these errors were encountered:

jamesls · 2016-06-15T19:12:09Z

In general, I'd like to improve things with regard to pickling/serializing objects. However, this is going to be challenging to implement giving the dynamic nature of resources/instances.

Marking as a feature request. If anyone has any ideas/suggestions they want to share, feel free to chime in.

maxrothman · 2016-06-15T21:33:28Z

This should be possible by giving the classes a __reduce__ method. See this StackOverflow question and the Python docs for more info.

maxrothman · 2016-06-19T04:49:36Z

I might be interested in contributing a patch for this issue if someone could help orient me in the codebase so I can find the callable that generates resources. @jamesls do you have thoughts on potential challenges in making said patch?

maxrothman · 2016-06-29T14:24:15Z

@jamesls ping. Any update on this?

maxrothman · 2016-07-18T14:11:58Z

@jamesls ping

maxrothman · 2016-07-21T17:31:02Z

I'm working on a patch for this. Is there any way currently that given a resource object you can get a reference to the ServiceContext that was used to create it? Or alternatively, is there a way to create a resource object from raw response JSON?

maxrothman · 2016-08-03T15:34:51Z

@jamesls any insight on the above question?

maxrothman · 2016-09-30T22:24:20Z

Ping. Is there any way I can get some support on this? I've expressed interest in submitting a patch, but I have some questions (above).

dazza-codes · 2019-02-28T01:52:35Z

Wow, this is old! The factory pattern (as implemented) screws up easy exception handling and pickling. Looks like this one is thrown in the too hard basket.

E       _pickle.PicklingError: Can't pickle <class 'boto3.resources.factory.s3.ObjectSummary'>: attribute lookup s3.ObjectSummary on boto3.resources.factory failed

A work-around that might somehow find it's way into boto3:

# substitute a namedtuple if necessary for py 2.x or earlier 3.x
# https://docs.python.org/3/library/collections.html#collections.namedtuple
@dataclass(frozen=True)
class S3Object:
    """Just the bucket_name and key for an s3.ObjectSummary
    This simple data class should work around problems with Pickle
    for an s3.ObjectSummary, so if obj is an s3.ObjectSummary, then:
    S3Object(bucket=obj.bucket_name, key=obj.key)
    """
    bucket: str
    key: str

bucket_name = 'example'
s3 = boto3.resource('s3')
s3_bucket = s3.Bucket(bucket_name)

objects = (
    S3Object(bucket=obj.bucket_name, key=obj.key)
    for obj in s3_bucket.objects.filter(Prefix='example_prefix')
)

with multiprocessing.Pool() as pool:
    processed_objects = pool.map(YourProcessor, objects)

dazza-codes · 2019-02-28T20:21:45Z

Symptoms of the factory patterns gone wrong?

>>> type(objects)
<class 'boto3.resources.collection.s3.Bucket.objectsCollection'>
>>> isinstance(objects, boto3.resources.collection.s3.Bucket.objectsCollection)
E       AttributeError: module 'boto3.resources.collection' has no attribute 's3'

luiscastillocr · 2019-05-08T22:46:59Z

 File "/home/vagrant/home/vagrant/wikirealty/lib/python3.4/site-packages/memoize/__init__.py", line 339, in decorated_function
    timeout=decorated_function.cache_timeout
  File "/home/vagrant/home/vagrant/wikirealty/lib/python3.4/site-packages/memoize/__init__.py", line 82, in set
    self.cache.set(key=key, value=value, timeout=timeout)
  File "/home/vagrant/home/vagrant/wikirealty/lib/python3.4/site-packages/django/core/cache/backends/memcached.py", line 86, in set
    if not self._cache.set(key, value, self.get_backend_timeout(timeout)):
  File "/home/vagrant/home/vagrant/wikirealty/lib/python3.4/site-packages/memcache.py", line 727, in set
    return self._set("set", key, val, time, min_compress_len, noreply)
  File "/home/vagrant/home/vagrant/wikirealty/lib/python3.4/site-packages/memcache.py", line 1055, in _set
    return _unsafe_set()
  File "/home/vagrant/home/vagrant/wikirealty/lib/python3.4/site-packages/memcache.py", line 1030, in _unsafe_set
    store_info = self._val_to_store_info(val, min_compress_len)
  File "/home/vagrant/home/vagrant/wikirealty/lib/python3.4/site-packages/memcache.py", line 994, in _val_to_store_info
    pickler.dump(val)
_pickle.PicklingError: Can't pickle <class 'boto3.resources.factory.s3.Bucket'>: attribute lookup s3.Bucket on boto3.resources.factory failed

I am getting this error using django-memoize with memcached, looks like it is still an issue!

dazza-codes · 2020-04-10T20:05:37Z

A similar issue was resolved in:

It might help to add a test suite with pickle tests like:

import pickle
import boto3.session
import botocore.session

def test_pickle_botocore_session():
    session = botocore.session.get_session()
    assert pickle.loads(pickle.dumps(session))

def test_pickle_boto3_session():
    session = boto3.session.Session()
    assert pickle.loads(pickle.dumps(session))

Unfortunately they fail:


    def test_pickle_botocore_session():
        session = botocore.session.get_session()
>       assert pickle.loads(pickle.dumps(session))
E       AttributeError: Can't pickle local object '_createenviron.<locals>.encode'

tests/test_clients.py:52: AttributeError
___________________________________________________________________________________ test_pickle_boto3_session ____________________________________________________________________________________

    def test_pickle_boto3_session():
        session = boto3.session.Session()
>       assert pickle.loads(pickle.dumps(session))
E       AttributeError: Can't pickle local object 'lazy_call.<locals>._handler'

SoraDevin · 2020-10-12T05:10:44Z

Any update on this?

cygniv404 · 2021-05-20T10:50:30Z

Any new updates coming this year?

Excludes the boto3 client from the S3CloudInterface state so that it is not pickled by multiprocessing. This fixes barman-cloud-backup with Python >= 3.8. Previously this would fail with the following error: ERROR: Backup failed uploading data (Can't pickle <class 'boto3.resources.factory.s3.ServiceResource'>: attribute lookup s3.ServiceResource on boto3.resources.factory failed) This is because boto3 cannot be pickled using the default pickle protocol in Python >= 3.8. See the following boto3 issue: boto/boto3#678 The workaround of forcing pickle to use an older version of the pickle protocol is not available because it is multiprocessing which invokes pickle and it does not allow us to specify the protocol version. We therefore exclude the boto3 client from the pickle operation by implementing custom `__getstate__` and `__setstate__` methods as documented here: https://docs.python.org/3/library/pickle.html#handling-stateful-objects This works because the worker processes create their own boto3 session anyway due to race conditions around re-using the boto3 session from the parent process. It is also necessary to defer the assignment of the `worker_processes` list until after all worker processes have been spawned as the references to those worker processes also cannot be pickled with the default pickle protocol in Python >= 3.8. As with the boto3 client, the `worker_processes` list was not being used by the worker processes anyway.

iainelder · 2021-08-17T13:43:46Z

This would make it much easier to use the multiprocessing library with boto3. For example, I would like to pass a session object and a list of organization accounts to the Pool.starmap function that calls a function that gets gets the tags on the account and merges them into the existing account objects.

alexandrosandre · 2021-09-17T14:17:55Z

Would have used this to avoid globals in multiprocessing.

MrBeeMovie · 2022-05-11T14:25:14Z

Is this issue still being looked at? Would greatly appreciate this being added if at all possible.

Xezed · 2022-08-24T14:37:24Z

Just reminding that this feature would be useful.

dhkim0225 · 2022-12-22T05:10:49Z

Any update on this?

RyanFitzSimmonsAK · 2023-01-18T22:37:02Z

The boto3 team has recently announced that the Resource interface has entered a feature freeze and won’t be accepting new changes at this time: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html. We’ll be closing existing feature requests, such as this issue, to avoid any confusion on current implementation status. We do appreciate your feedback and will ensure it’s considered in future feature decisions.

We’d like to highlight that all existing code using resources is supported and will continue to work in Boto3. No action is needed from users of the library.

jamesls added feature-request This issue requests a feature. others-chime-in labels Jun 15, 2016

swetashre mentioned this issue Jan 8, 2020

{PicklingError}Can't pickle <class 'boto3.resources.factory.s3.ServiceResource'>: attribute lookup boto3.resources.factory.s3.ServiceResource failed #2255

Closed

swetashre added the needs-discussion label Feb 27, 2020

kyleknap removed the others-chime-in label Feb 27, 2020

swetashre mentioned this issue Feb 28, 2020

while reading s3 files parallel processing "_pickle.PicklingError: Can't pickle <class 'boto3.resources.factory.s3.Object'>: attribute lookup s3.Object on boto3.resources.factory failed " #2311

Closed

swetashre mentioned this issue Jan 28, 2021

_pickle.PicklingError: Can't pickle <class 'boto3.resources.factory.s3.ServiceResource'>: attribute lookup s3.ServiceResource on boto3.resources.factory failed #2741

Closed

kdaily added needs-review and removed needs-discussion labels Sep 1, 2021

pjbull mentioned this issue May 9, 2022

Cannot pickle path objects drivendataorg/cloudpathlib#223

Closed

aBurmeseDev added the p2 This is a standard priority issue label Nov 10, 2022

tim-finnigan added the resources label Nov 23, 2022

RyanFitzSimmonsAK added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jan 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make resources pickleable/serializable #678

Make resources pickleable/serializable #678

maxrothman commented Jun 11, 2016

jamesls commented Jun 15, 2016

maxrothman commented Jun 15, 2016

maxrothman commented Jun 19, 2016

maxrothman commented Jun 29, 2016

maxrothman commented Jul 18, 2016

maxrothman commented Jul 21, 2016

maxrothman commented Aug 3, 2016

maxrothman commented Sep 30, 2016

dazza-codes commented Feb 28, 2019 •

edited

dazza-codes commented Feb 28, 2019

luiscastillocr commented May 8, 2019 •

edited

dazza-codes commented Apr 10, 2020

SoraDevin commented Oct 12, 2020

cygniv404 commented May 20, 2021

iainelder commented Aug 17, 2021

alexandrosandre commented Sep 17, 2021

MrBeeMovie commented May 11, 2022

Xezed commented Aug 24, 2022

dhkim0225 commented Dec 22, 2022

RyanFitzSimmonsAK commented Jan 18, 2023 •

edited by aBurmeseDev

Make resources pickleable/serializable #678

Make resources pickleable/serializable #678

Comments

maxrothman commented Jun 11, 2016

jamesls commented Jun 15, 2016

maxrothman commented Jun 15, 2016

maxrothman commented Jun 19, 2016

maxrothman commented Jun 29, 2016

maxrothman commented Jul 18, 2016

maxrothman commented Jul 21, 2016

maxrothman commented Aug 3, 2016

maxrothman commented Sep 30, 2016

dazza-codes commented Feb 28, 2019 • edited

dazza-codes commented Feb 28, 2019

luiscastillocr commented May 8, 2019 • edited

dazza-codes commented Apr 10, 2020

SoraDevin commented Oct 12, 2020

cygniv404 commented May 20, 2021

iainelder commented Aug 17, 2021

alexandrosandre commented Sep 17, 2021

MrBeeMovie commented May 11, 2022

Xezed commented Aug 24, 2022

dhkim0225 commented Dec 22, 2022

RyanFitzSimmonsAK commented Jan 18, 2023 • edited by aBurmeseDev

dazza-codes commented Feb 28, 2019 •

edited

luiscastillocr commented May 8, 2019 •

edited

RyanFitzSimmonsAK commented Jan 18, 2023 •

edited by aBurmeseDev