NumPy arrays and instances of datetime, date and time pickled in Python 2 cannot be deserialized in Python 3 #264

Open
LaughInJar opened this issue Sep 16, 2020 · 1 comment

Comments

@LaughInJar

I realize that I am very late with the Python 3 migration, so I understand if this ticket is closed right away. However, others running into the same issue can hopefully benefit from this report.

According to the docs for pickle in Python 3:

Using encoding='latin1' is required for unpickling NumPy arrays and instances of datetime, date and time pickled by Python 2.

However, the default encoding for unpickling is ASCII, so datetime, date and time objects as well as NumPy arrays cannot be deserialized in Python 3 if they were serialized in Python 2. Many projects migrate from Python 2 to Python 3 along a path where both versions run in parallel for some time and read from the same cache.
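The failure can be reproduced without memcached or pylibmc. The following is a minimal sketch, not taken from pylibmc itself: the byte string is hand-assembled to resemble the protocol 2 pickle that Python 2 emits for a datetime (memo opcodes left out), and the names `PY2_DATETIME_PICKLE`, `load_with_default_encoding` and `load_with_latin1` are made up for illustration. The byte 0xe4 at position 1 of the packed state is the low byte of the year 2020, which would match the error message reported in this issue.

```python
import pickle
from datetime import datetime

# Hand-assembled protocol 2 pickle modelled on what Python 2 writes for
# datetime(2020, 9, 16, 12, 0, 0). This is an illustration, not captured output.
PY2_DATETIME_PICKLE = (
    b"\x80\x02"                                   # PROTO 2
    b"cdatetime\ndatetime\n"                      # GLOBAL datetime.datetime
    b"U\x0a"                                      # SHORT_BINSTRING, 10 bytes follow
    b"\x07\xe4\x09\x10\x0c\x00\x00\x00\x00\x00"   # packed state; 0x07e4 == 2020
    b"\x85"                                       # TUPLE1
    b"R"                                          # REDUCE: call datetime(state)
    b"."                                          # STOP
)


def load_with_default_encoding(data):
    """Unpickle with Python 3's default encoding ('ASCII')."""
    return pickle.loads(data)


def load_with_latin1(data):
    """Unpickle while decoding Python 2 str objects as latin1."""
    return pickle.loads(data, encoding="latin1")


if __name__ == "__main__":
    try:
        load_with_default_encoding(PY2_DATETIME_PICKLE)
    except UnicodeDecodeError as exc:
        print("default encoding failed:", exc)
    print("latin1 result:", load_with_latin1(PY2_DATETIME_PICKLE))
```

Plain ASCII strings and dicts of ASCII keys and integers decode fine either way, which is why only the datetime value fails in the steps to reproduce.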

Making the encoding used by pickle.load()/_PylibMC_pickle_loads configurable would resolve the issue. A workaround is included below.

Environment

pylibmc==1.6.1
memcached 1.6.6
python 2.7.18 and 3.6.11

Steps to reproduce

In the Python 2 environment:

import pylibmc
from datetime import datetime

client = pylibmc.Client(["localhost:11211"], behaviors={"pickle_protocol": 2}, binary=True)

client.set("testText", "asdf")
client.set("testDict", {"a": 123})
client.set("testDatetime", datetime.now())

In the Python 3 environment:

import pylibmc

client = pylibmc.Client(["localhost:11211"], behaviors={"pickle_protocol": 2}, binary=True)

# this works
client.get("testText")
client.get("testDict")

# this fails with: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 1: ordinal not in range(128)
client.get("testDatetime")

Expected behaviour

The datetime object (as well as the others) is correctly retrieved from the cache

Actual behaviour

The datetime object cannot be deserialized

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 1: ordinal not in range(128)

Workaround

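import sys
import pylibmc
from pickle import load as pickle_load
from io import BytesIO

class MemcachedClient(pylibmc.Client):

    # Only override deserialization on Python 3; Python 2 keeps pylibmc's default.
    if sys.version_info.major >= 3:
        # Flag bit that marks a cached value as pickled.
        PYLIBMC_FLAG_PICKLE = 1

        def deserialize(self, value, flags):
            if flags & self.PYLIBMC_FLAG_PICKLE:
                # latin1 is required for datetime, date, time and NumPy arrays pickled by Python 2.
                return pickle_load(BytesIO(value), encoding=u"latin1")
            return super(MemcachedClient, self).deserialize(value, flags)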

client = MemcachedClient(["localhost:11211"], behaviors={"pickle_protocol": 2}, binary=True)
client.get("testText")
client.get("testDict")
client.get("testDatetime")
@LaughInJar LaughInJar changed the title NumPy arrays and instances of datetime, date and time pickled by Python 2 cannot be deserialized in Python 3 NumPy arrays and instances of datetime, date and time pickled in Python 2 cannot be deserialized in Python 3 Sep 16, 2020
@lericson
Owner

I will leave this up as your workaround seems good (though a bit unorthodox to put class-level function definitions inside if blocks 😆). I realize we should probably be exposing all the constants such as PYLIBMC_FLAG_PICKLE on the _pylibmc module.
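One way to avoid defining methods inside an if block is to move the override into a mixin and pick the class at module level. This is only a sketch under assumptions: `Latin1PickleMixin` and `PICKLE_FLAG` are hypothetical names, the flag value 1 is copied from the workaround above rather than read from pylibmc, and the mixin relies on the base client offering an overridable `deserialize(value, flags)` method, as the workaround does.

```python
import pickle

# Assumption: same value the workaround above uses for PYLIBMC_FLAG_PICKLE.
PICKLE_FLAG = 1


class Latin1PickleMixin(object):
    """Hypothetical mixin: unpickle cached values with encoding='latin1'.

    Python 3 only; pickle.loads() has no `encoding` argument on Python 2.
    """

    def deserialize(self, value, flags):
        if flags & PICKLE_FLAG:
            return pickle.loads(value, encoding="latin1")
        # Values that are not pickled keep the base class behaviour.
        return super(Latin1PickleMixin, self).deserialize(value, flags)


# Intended use, assuming pylibmc is installed (not executed here):
#
#     import sys
#     import pylibmc
#
#     if sys.version_info[0] >= 3:
#         class MemcachedClient(Latin1PickleMixin, pylibmc.Client):
#             pass
#     else:
#         MemcachedClient = pylibmc.Client
```

Because the mixin does not import pylibmc, it can be exercised against a stub base class in tests; the version check then lives in one place instead of inside the class body.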
