Array pickling exposes internal memory representation of elements #46642
Comments
It would seem that pickling arrays directly exposes the underlying

    if (array->ob_size > 0) {
        result = Py_BuildValue("O(cs#)O",
                               array->ob_type,
                               array->ob_descr->typecode,
                               array->ob_item,
                               array->ob_size * array->ob_descr->itemsize,
                               dict);
    }

The byte string that is pickled is directly created from the array's
As far as I can tell, array pickles created on one platform cannot be
Maybe the "typecode" field when used with the constructor could be |
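As a rough illustration of the point made in this comment, the following Python 3 sketch (not taken from the thread, which used Python 2.5) shows that the raw buffer of an `'l'` array has a length that depends on the platform's C `long` size. The variable names are illustrative only.

```python
import array
import sys

values = array.array("l", [1, 2, 3])

# itemsize is the size of a C long on this build (commonly 4 or 8 bytes).
raw = values.tobytes()
print(values.itemsize, sys.byteorder, len(raw))

# The raw buffer length is element count times itemsize, mirroring
# array->ob_size * array->ob_descr->itemsize in the C code quoted above.
assert len(raw) == len(values) * values.itemsize
```

A consumer that only sees the typecode `'l'` and this byte string has no way to tell which item size or byte order produced it.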
Here is an example that directly demonstrates the bug. Pickling on x86_64:
Python 2.5.1 (r251:54863, Mar 21 2008, 13:06:31)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import array, cPickle as pickle
>>> pickle.dumps(array.array('l', [1, 2, 3]))
"carray\narray\np1\n(S'l'\nS'\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x03\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp2\n."
Unpickling on ia32:
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import cPickle as pickle
>>> pickle.loads("carray\narray\np1\n(S'l'\nS'\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x03\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp2\n.")
array('l', [1, 0, 2, 0, 3, 0]) |
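The `[1, 0, 2, 0, 3, 0]` result can be reproduced on any machine with the `struct` module. This is a minimal Python 3 sketch, assuming both platforms are little-endian and that C `long` is 8 bytes on x86_64 and 4 bytes on ia32, as in the sessions above.

```python
import struct

# Bytes as the x86_64 side stores array('l', [1, 2, 3]):
# three little-endian 8-byte signed integers.
raw = struct.pack("<3q", 1, 2, 3)

# The ia32 side treats 'l' as 4 bytes wide, so the same 24 bytes
# are read as six little-endian 4-byte signed integers.
as_32bit = list(struct.unpack("<6i", raw))
print(as_32bit)  # [1, 0, 2, 0, 3, 0]
```

Each 64-bit value splits into a low word holding the number and a high word holding zero, which is exactly the interleaving seen in the unpickled array.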
This does indeed look wrong. Unfortunately it also looks hard to fix in a |
Ping. |
At this point, I think it is better to wait until Py2.7/3.1. Changing it
Guido, do you agree? |
Agreed, this has been broken for a long time, and few people have |
I guess it went unnoticed due to the prevalence of little-endian 32-bit
Raymond, why do you think fixing this bug would complicate porting to |
I don't see why this cannot be fixed easily. All we need to do is fix
And here is a patch against the trunk. |
Wouldn't that be lots and lots slower? I believe speed is one of the |
The slowdown depends on the array type. The patch makes array unpickling
Although since most 64-bit compilers use the LP64 model, I think we |
Unfortunately dumping the internal representation of non-long arrays
I believe pickling arrays to compact strings is the right approach on
Pickling arrays as lists is probably a decent workaround for the pending |
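The list-based workaround mentioned here could look roughly like the following Python 3 sketch. It is not the patch from this thread: `reduce_array_as_list` and `dumps_as_list` are hypothetical helper names, and the sketch relies on the documented per-pickler `dispatch_table` attribute to override how `array.array` is reduced.

```python
import array
import copyreg
import io
import pickle


def reduce_array_as_list(a):
    # Hypothetical reducer: rebuild the array from its typecode and a plain
    # list of Python values instead of from its raw memory buffer.
    return array.array, (a.typecode, a.tolist())


def dumps_as_list(obj):
    # Use a private dispatch table so the global copyreg state is untouched.
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf)
    table = copyreg.dispatch_table.copy()
    table[array.array] = reduce_array_as_list
    pickler.dispatch_table = table
    pickler.dump(obj)
    return buf.getvalue()


original = array.array("l", [1, 2, 3])
restored = pickle.loads(dumps_as_list(original))
```

The values are pickled as ordinary Python integers, so the result no longer depends on the item size or byte order of the pickling machine, at the cost of a larger and slower pickle.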
I'd like to challenge the view of what "correct" behavior is here. If I
IMO, correct behavior would preserve the width as much as possible. For
When pickling, the pickle should always use network byte order; |
I think preserving integer width is a good idea because it saves us from
Instead of sticking to network byte order, I propose to include byte
Preserving widths and including endianness information would allow |
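One way to picture this proposal is a payload that carries the item width and byte order next to the raw bytes. The Python 3 sketch below is only an illustration under that assumption: `encode_int_array` and `decode_int_array` are hypothetical names, only integer typecodes are handled, and the layout is not the format that was eventually committed.

```python
import array
import sys

_SIGNED = "bhilq"
_UNSIGNED = "BHILQ"


def encode_int_array(a):
    # Hypothetical portable form: (signed?, width in bytes, byte order, raw bytes).
    if a.typecode not in _SIGNED + _UNSIGNED:
        raise TypeError("this sketch only handles integer typecodes")
    return (a.typecode in _SIGNED, a.itemsize, sys.byteorder, a.tobytes())


def decode_int_array(signed, width, byteorder, raw):
    # Decode using the recorded width and byte order, not the local platform's.
    return [
        int.from_bytes(raw[i:i + width], byteorder, signed=signed)
        for i in range(0, len(raw), width)
    ]


payload = encode_int_array(array.array("l", [1, -2, 3]))
print(decode_int_array(*payload))  # [1, -2, 3]
```

Because the decoder never consults the local `itemsize`, an 8-byte payload produced on an LP64 machine decodes to the same values on a 32-bit one.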
This sounds like the best approach yet -- it can be made backwards |
I'm all in for a standardized representation of array's pickles (with |
I think changing the array constructor is fairly easy: just pick a set
char, signed-byte, unsigned byte: c, b, B
In the above scheme, even codes are little-endian, odd codes are big-endian. |
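The even/odd convention can be sketched with a small lookup table. The numbering below is hypothetical (the full list of codes is cut off in this comment), and `make_code` and `decode_with_code` are illustrative names; only the parity rule is taken from the text.

```python
import struct

# Hypothetical ordering of multi-byte integer formats (struct format chars).
_FORMATS = ["h", "H", "i", "I", "q", "Q"]


def make_code(fmt_char, big_endian):
    # Even code means little-endian, odd code means big-endian.
    return 2 * _FORMATS.index(fmt_char) + int(big_endian)


def decode_with_code(code, raw):
    fmt_char = _FORMATS[code // 2]
    prefix = ">" if code % 2 else "<"
    # With an explicit prefix, struct uses standard sizes (h=2, i=4, q=8).
    count = len(raw) // struct.calcsize(prefix + fmt_char)
    return list(struct.unpack(f"{prefix}{count}{fmt_char}", raw))


code = make_code("q", big_endian=False)
print(decode_with_code(code, struct.pack("<3q", 1, 2, 3)))  # [1, 2, 3]
```

A single small integer then tells the unpickler both the item width and the byte order, so it can convert the data to whatever the local typecode requires.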
Ah, I just remembered the smart way I had devised some time ago to
Now, the only thing I am not sure about is whether this would work well |
Here's a patch that implements the solution I described in msg85298. Please give it a good review: |
I would like to commit my patch later this week. So if you see any issue |
I now believe that arrays should be pickled as a list of values on
However, we can still use the compact memory representation on Python 3.x.
So, I propose that we change the array module on Python 2.x to emit a |
Committed fix for 3.x in r74013 and for 2.x in r74014. |