# Low-level hackery

One function I've used without much comment is `numpy.frombuffer`, which lets us wrap arbitrary regions of memory as Numpy arrays. We can "peek" at any memory we want; we can also "poke" it, changing values, byte by byte.

Consider, for instance, a byte string. These are immutable (cannot be changed) in Python:

In [13]:
hello = b"Hello, world!"

In [14]:
try:
    hello[4:8] = b"????"
except TypeError as err:
    print("Nope: " + str(err))

Nope: 'bytes' object does not support item assignment


In [15]:
import numpy
a = numpy.frombuffer(hello, dtype=numpy.uint8)
a

array([ 72, 101, 108, 108, 111,  44,  32, 119, 111, 114, 108, 100,  33],
      dtype=uint8)

In [16]:
a.view("S1")

array([b'H', b'e', b'l', b'l', b'o', b',', b' ', b'w', b'o', b'r', b'l',
       b'd', b'!'], dtype='|S1')

By default, Numpy tries to protect you from doing evil things: because the underlying `bytes` object is immutable, the array that wraps it is marked read-only.

In [17]:
try:
    a[4:8] = [69, 86, 73, 76]
except ValueError as err:
    print("Nope: " + str(err))

Nope: assignment destination is read-only
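The read-only flag follows the buffer that is being wrapped, not `numpy.frombuffer` itself. As a minimal sketch (the variable names `immutable`, `mutable`, `ro`, and `rw` are mine, not from the cells above), wrapping a `bytearray` instead of a `bytes` object gives a writable array, and poking it is perfectly legitimate:

```python
import numpy

immutable = b"Hello, world!"
mutable = bytearray(immutable)  # a separate, writable copy of the same bytes

ro = numpy.frombuffer(immutable, dtype=numpy.uint8)
rw = numpy.frombuffer(mutable, dtype=numpy.uint8)

# The writeable flag mirrors whether the wrapped buffer is itself writable.
print(ro.flags.writeable, rw.flags.writeable)

# Writing through the array changes the bytearray it wraps, with no copy made.
rw[4:8] = [69, 86, 73, 76]  # ASCII codes for "EVIL"
print(mutable)
print(immutable)  # the original bytes object is untouched
```

Only the `bytearray` changes here, since it was copied from the `bytes` object before being wrapped.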


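But this is Python: we can shoot our foot if we want to. (The cells below were run on an older Numpy; newer releases may refuse to set this flag on an array backed by an immutable buffer and raise a `ValueError` instead.)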

In [18]:
a.flags.writeable = True

In [19]:
a[4:8] = [69, 86, 73, 76]

In [20]:
hello

b'HellEVILorld!'

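This messes with Python's internal data model: the interpreter assumes that a `bytes` object never changes, so it is free to share and reuse one wherever it likes.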

In [27]:
hello = b"Hello, world!"
a = numpy.frombuffer(hello, dtype=numpy.uint8)
a.flags.writeable = True
a[4:8] = [69, 86, 73, 76]
print(hello == b"Hello, world!")

False


In [28]:
exec("""
hello = b"Hello, world!"
a = numpy.frombuffer(hello, dtype=numpy.uint8)
a.flags.writeable = True
a[4:8] = [69, 86, 73, 76]
print(hello == b"Hello, world!")
""")

True


(The second example was compiled as a single unit, the way a script is compiled into a `.pyc` file, and in that unit all instances of the literal `b"Hello, world!"` were replaced by a single object: modifying that object in line 4 of the snippet changed it in line 5!)
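The sharing of literals can be observed without corrupting anything. The following is a small sketch that relies on CPython behavior (constant merging is an implementation detail, not a language guarantee); the names `source`, `literal_count`, and `shared` are made up for illustration:

```python
# Two textually separate occurrences of the same bytes literal.
source = '''
first = b"Hello, world!"
second = b"Hello, world!"
'''

# Compile the whole snippet as one unit, as exec does with a string.
code = compile(source, "<example>", "exec")

# Count how many times the literal is stored among the code object's constants.
literal_count = sum(1 for const in code.co_consts if const == b"Hello, world!")

namespace = {}
exec(code, namespace)

# Both names are bound to the very same object, not to two equal objects.
shared = namespace["first"] is namespace["second"]
print(literal_count, shared)
```

On CPython the literal is stored once and both names refer to it, which is why writing through one of them is visible through the other.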

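With the help of `ctypes`, a module in Python's standard library, Numpy can wrap any address at all. (Some will cause segmentation faults, so be careful!) The examples below rely on the fact that, in CPython, `id(x)` is the memory address of the object `x`.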

In [31]:
x = 12345

In [47]:
import ctypes
import sys

ptr = ctypes.cast(id(x), ctypes.POINTER(ctypes.c_uint8))
a = numpy.ctypeslib.as_array(ptr, (sys.getsizeof(x),))
a

array([  1,   0,   0,   0,   0,   0,   0,   0,  32, 117, 138, 165,   1,
        95,   0,   0,   1,   0,   0,   0,   0,   0,   0,   0,  57,  48,
         0,   0], dtype=uint8)

Do you see it? We're looking at a Python object header (a reference count followed by a pointer to the `int` type, which is also a Python object), then a size field, and then the number itself. Here's a hint: it's the last four bytes.

In [50]:
a[-4:].view(numpy.int32)

array([12345], dtype=int32)
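The same fields can be picked apart with the standard library alone. This is a rough sketch that assumes a standard 64-bit CPython build, where the header starts with a reference count and a type pointer and the value fits in one 4-byte digit; the names `raw`, `refcount`, `type_address`, and `value` are illustrative, and the layout is an implementation detail that differs between builds and versions:

```python
import ctypes
import struct
import sys

x = 12345

# Copy the bytes of the object out of memory (read-only, so nothing is harmed).
raw = ctypes.string_at(id(x), sys.getsizeof(x))

# Assumed header layout: a pointer-sized reference count, then a type pointer.
refcount, type_address = struct.unpack_from("nP", raw, 0)

# The type pointer should be the address of the int type object itself.
print(type_address == id(int))

# The value sits in the last four bytes, in the machine's native byte order.
value = int.from_bytes(raw[-4:], sys.byteorder)
print(value)
```

Using `ctypes.string_at` copies the memory rather than wrapping it, so there is no risk of writing to the object by accident.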

Let's try a string.

In [65]:
y = "Hey there."
ptr = ctypes.cast(id(y), ctypes.POINTER(ctypes.c_uint8))
a = numpy.ctypeslib.as_array(ptr, (sys.getsizeof(y),))
a

array([  2,   0,   0,   0,   0,   0,   0,   0,  32,  65, 138, 165,   1,
        95,   0,   0,  10,   0,   0,   0,   0,   0,   0,   0,  16,  24,
       172,  81, 105, 242,  27,  49, 228, 195,  85, 180, 181, 114,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,  72, 101, 121,  32,
       116, 104, 101, 114, 101,  46,   0], dtype=uint8)

In [69]:
a[-11:].tobytes()

b'Hey there.\x00'
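The `-11` above is the length of the string plus one byte for the terminating zero. As a hedged sketch (assuming CPython, where a pure-ASCII `str` stores its characters one byte each at the end of the object; `raw` and `tail` are names chosen for this example), the slice can be computed from the string instead of hard-coded:

```python
import ctypes
import sys

y = "Hey there."

# Copy the whole object, header included, out of memory.
raw = ctypes.string_at(id(y), sys.getsizeof(y))

# For an ASCII-only string, the text is the last len(y) bytes before a NUL.
tail = raw[-(len(y) + 1):]
print(tail)
```

This only holds for ASCII-only strings; strings with wider characters use two or four bytes per character, so the slice would need to change.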