[Python] Immutability of bytes is ignored #19572

@asfimport

Description

Creating a pyarrow.Buffer from a Python bytes object allows the immutable bytes object to be modified in place:

>>> import pyarrow as pa
>>> import numpy as np
>>> a = b'123456'
>>> a[0] = 77 # bytes are immutable, so TypeError is expected
TypeError: 'bytes' object does not support item assignment
>>> b = pa.py_buffer(a)  # but with pyarrow bytes can be changed in-place
>>> arr = np.frombuffer(b, dtype=np.uint8)
>>> arr[0] = 66 # change 'a' in-place, would expect error
>>> a
b'B23456'
>>> hash(a)
-4581532003987476523
>>> arr[0] = 77
>>> a
b'M23456'
>>> hash(a) # hash value stays constant while changing 'a'
-4581532003987476523

Notice that numpy.frombuffer respects immutability of bytes:

>>> arr2 = np.frombuffer(a, dtype=np.uint8)
>>> arr2
array([77, 50, 51, 52, 53, 54], dtype=uint8)
>>> arr2[0] = 88 # expecting error
ValueError: assignment destination is read-only
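
The read-only behavior of numpy.frombuffer above follows from the buffer protocol: a bytes object exports a buffer flagged as read-only, and consumers are expected to honor that flag. A minimal sketch of this using only the standard library memoryview (not pyarrow itself; the variable names are illustrative):

```python
# Standard library only: inspect the buffer that a bytes object exports.
a = b'123456'
view = memoryview(a)

# bytes exports a read-only buffer, so consumers that honor the flag
# (such as numpy.frombuffer) produce non-writable views.
is_readonly = view.readonly

# Writing through the read-only view is rejected by the interpreter.
try:
    view[0] = 66
    write_rejected = False
except TypeError:
    write_rejected = True

# For contrast, a bytearray exports a writable buffer, so the same
# assignment succeeds and modifies the bytearray's contents.
mutable_view = memoryview(bytearray(a))
mutable_view[0] = 66
```

Here `is_readonly` is True and the assignment into `view` is rejected, so `a` is left untouched. The behavior reported in this issue would correspond to a consumer that ignores the read-only flag when wrapping the memory.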

Reporter: Pearu Peterson / @pearu
Assignee: Wes McKinney / @wesm

Note: This issue was originally created as ARROW-3228. Please see the migration documentation for further details.
