Description
Bug report
The array.array docs state for the initializer:
If given a list or string, the initializer is passed to the new array’s fromlist(), frombytes(), or fromunicode() method (see below) to add initial items to the array. Otherwise, the iterable initializer is passed to the extend() method.
code:
import array
start = array.array("Q", [1, 2, 3])
untyped_buf = memoryview(start).cast("B") # Unsigned bytes is the default buffer type IIRC
result = array.array("Q", untyped_buf)
# Expected: array('Q', [1, 2, 3])
# Actual:   array('Q', [1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0])
The memoryview is not recognized as a buffer and passed to the frombytes() method. Instead, it is treated as a generic iterable, so every byte becomes a separate array item.
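For comparison, a minimal sketch of a workaround, assuming the goal is only to copy the raw bytes: create an empty array and call frombytes() explicitly, which accepts bytes-like objects such as a memoryview. The variable names follow the snippet above.

```python
import array

start = array.array("Q", [1, 2, 3])
untyped_buf = memoryview(start).cast("B")

# Workaround: build an empty array first, then append the raw bytes.
# frombytes() interprets the buffer as machine values of the array's typecode.
result = array.array("Q")
result.frombytes(untyped_buf)

print(result)
```

This avoids the per-byte iteration, but it requires the caller to know that the constructor does not do this on its own.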
I discovered this bug while working on a C extension where I expose almost 7_000_000 64-bit integers to Python using PyMemoryView_FromMemory. Since this is roughly 56 MB, it should be a breeze to memcpy this into an array.array. Instead, I get 56 million unsigned byte values through a Python iterator, which causes a massive bug in my application while also being incredibly slow.
Looking at the code, the array module checks for array, bytes, and bytearray before delegating to the frombytes() method. I think it would be much more appropriate to use PyObject_CheckBuffer. This would still handle those types appropriately while also catching memoryview.
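To illustrate the intended behavior, here is a rough pure-Python sketch of that dispatch. It is not the actual arraymodule code: array_from_initializer is a hypothetical helper, and wrapping the object in memoryview() stands in for the C-level buffer check. Existing arrays are still sent through the normal constructor so that their handling is unchanged.

```python
import array


def array_from_initializer(typecode, initializer):
    """Hypothetical helper: copy any buffer-supporting initializer as raw bytes."""
    # Keep the constructor's existing handling of array initializers.
    if isinstance(initializer, array.array):
        return array.array(typecode, initializer)
    try:
        # memoryview() raises TypeError for objects without buffer support
        # (lists, str, generic iterables).
        view = memoryview(initializer)
    except TypeError:
        return array.array(typecode, initializer)
    result = array.array(typecode)
    with view:
        # Raw byte copy; the byte length must be a multiple of the itemsize.
        result.frombytes(view)
    return result


start = array.array("Q", [1, 2, 3])
from_view = array_from_initializer("Q", memoryview(start).cast("B"))
from_list = array_from_initializer("Q", [4, 5])
print(from_view, from_list)
```

With this dispatch, bytes, bytearray, and memoryview all take the same fast path, while lists and other iterables fall back to the existing constructor behavior.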