BUG: Unpacking PY2 msgpack in PY3 #12142

Closed
kawochen opened this Issue Jan 26, 2016 · 4 comments

2 participants
Contributor

kawochen commented Jan 26, 2016

In #10686, I should have made all the strings in encode Unicode strings. Now 'abc' packed in PY2 becomes (or rather remains as) b'abc' when unpacked in PY3. I think this is the desired behavior (bytes remain bytes and text remains text), but it causes errors in decode because, for example, 'typ' (== u'typ' in PY2) is expected while b'typ' (== 'typ' in PY2) is the key.

Reading in the other direction is fine because PY2 is more tolerant of these things.
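
To make the mismatch concrete, here is a minimal sketch run under PY3. It does not call pandas or msgpack; the dict below is only a hypothetical stand-in for what a dict packed in PY2 with native string keys looks like after unpacking, and the field values are made up for illustration.

```python
# Hypothetical stand-in for a dict packed in PY2 (native str keys, i.e. bytes)
# and then unpacked in PY3: keys and values come back as bytes.
unpacked = {b'typ': b'series', b'name': b'abc'}

# PY3 never treats bytes and text as equal, so a text key does not match.
print(b'typ' == 'typ')     # False
print('typ' in unpacked)   # False
print(b'typ' in unpacked)  # True

# A lookup with the text key, as decode would do it, raises KeyError.
try:
    unpacked['typ']
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)       # True
```

In PY2 the same lookup succeeds, because 'typ' and u'typ' compare and hash equal there for ASCII text.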

To reproduce this,

(P2) python generate_legacy_storage_files.py your_dir msgpack
(P3) pandas.read_msgpack(the_file_just_created)

jreback added this to the Next Major Release milestone Jan 26, 2016

Contributor

jreback commented Jan 26, 2016

OK, I don't see a versioning scheme in the actual packed file. Is it possible to add one, so we can then do things conditionally?
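
A rough sketch of what such a version marker could look like, in plain Python. The field name `fmt_version`, the constant, and both helper functions are hypothetical and are not part of pandas; files written before the marker existed are assumed to report version 0.

```python
FORMAT_VERSION = 1  # hypothetical version number for newly packed files


def add_version(payload):
    """Return a copy of the dict to be packed, tagged with a version field."""
    out = dict(payload)
    out[u'fmt_version'] = FORMAT_VERSION
    return out


def read_version(obj):
    """Return the stored version, or 0 for files written without one."""
    # Check the text key first, then the bytes form a PY2 writer could produce.
    for key in (u'fmt_version', b'fmt_version'):
        if key in obj:
            return obj[key]
    return 0


tagged = add_version({u'typ': u'series'})
legacy = {b'typ': b'series'}

print(read_version(tagged))  # 1
print(read_version(legacy))  # 0
```

A reader could then branch on the returned number, for example applying bytes-tolerant key handling only when the version is 0.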

Contributor

kawochen commented Jan 26, 2016

I think the cleanest solution would be to sprinkle encode and decode with u prefixes. That wouldn't save the files packed with 0.17 in PY2 (they still couldn't be unpacked in PY3). I think the end result would be the same as adding some version info.

packed           unpacked
pre-0.17 PY2     any
pre-0.17 PY3     any
0.17 PY2         PY2
0.17 PY3         any

The other option would be to test for bytes vs strings in decode (which is called for all dicts), in which case files packed with 0.17 in PY2 can be unpacked by future pandas in PY3. This would be a bit unwieldy as you would need to test for both string 'abc' and bytes b'abc' pretty much everywhere.

packed           unpacked
pre-0.17 PY2     any
pre-0.17 PY3     any
0.17 PY2         PY2 (<=0.17), any (>=0.18.?)
0.17 PY3         any
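
The second option (accepting both bytes and text in decode) could be sketched roughly as below. Both helpers are hypothetical and not taken from pandas; they assume the keys are UTF-8 encodable and only handle the keys, so a value such as b'series' would still need the same treatment wherever it is compared.

```python
def get_either(obj, key):
    """Look up a text key, falling back to its bytes form (hypothetical helper)."""
    if key in obj:
        return obj[key]
    # Raises KeyError if neither form of the key is present.
    return obj[key.encode('utf-8')]


def keys_to_text(obj):
    """Return a copy of obj with bytes keys decoded to text; values are untouched."""
    return {
        (k.decode('utf-8') if isinstance(k, bytes) else k): v
        for k, v in obj.items()
    }


packed_in_py2 = {b'typ': b'series', b'name': b'abc'}
packed_in_py3 = {'typ': 'series', 'name': 'abc'}

print(get_either(packed_in_py2, 'typ'))  # b'series'
print(get_either(packed_in_py3, 'typ'))  # series
print(sorted(keys_to_text(packed_in_py2)))  # ['name', 'typ']
```

Normalizing the keys once at the top of decode keeps the rest of the function unchanged, whereas a per-lookup helper like `get_either` has to be used at every access, which is the unwieldy part.
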
Contributor

kawochen commented Feb 1, 2016

What do people think? Would like to fix in 0.18.0.

jreback modified the milestone: 0.18.0, Next Major Release Feb 12, 2016

Contributor

jreback commented Feb 17, 2016

closed by #12129

jreback closed this Feb 17, 2016
