# Chapter 39: Array, Memoryview, and Struct

This notebook covers Python's `array` module for efficient typed arrays, the `memoryview`
object for zero-copy buffer access, and the `struct` module for packing and unpacking
binary data. `array` and `memoryview` build on the buffer protocol, and `struct` can read from or write into any object that exposes it.

## Key Concepts
- **`array.array`**: Typed, compact arrays of numeric values
- **`memoryview`**: Zero-copy slicing and modification of buffer objects
- **Buffer protocol**: The interface that allows objects to share memory
- **`struct.pack` / `struct.unpack`**: Convert between Python values and C-style binary data
- **Format strings**: Specify byte order and data types for binary serialization

## Section 1: Creating Arrays

The `array` module provides a compact, typed array that stores elements more efficiently
than a Python list. Each array has a typecode that determines the C type of its elements.
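
One consequence of the fixed typecode is that an array rejects values its C type cannot hold. The following is a minimal sketch, assuming CPython's usual behaviour for the `'B'` typecode; the names `levels` and `errors` are illustrative only.

```python
import array

# 'B' stores unsigned chars, so only integers in the range 0..255 fit.
levels = array.array("B", [0, 127, 255])

errors = []
for bad_value in (256, 1.5):
    try:
        levels.append(bad_value)
    except (OverflowError, TypeError) as exc:
        # 256 is out of range for 'B'; 1.5 is not an integer at all.
        errors.append(type(exc).__name__)

print(errors)
print(list(levels))  # the failed appends stored nothing
```

A plain list would accept both values silently, so the typecode also acts as a cheap validity check.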

In [None]:
import array

# Create an array of signed integers ('i' = C int)
a: array.array = array.array("i", [1, 2, 3, 4, 5])

print(f"Array: {a}")
print(f"Length: {len(a)}")
print(f"First element: {a[0]}")
print(f"Last element: {a[-1]}")
print(f"Typecode: {a.typecode}")

# Common typecodes
typecodes: list[tuple[str, str]] = [
    ("b", "signed char (1 byte)"),
    ("B", "unsigned char (1 byte)"),
    ("h", "signed short (2 bytes)"),
    ("i", "signed int (2-4 bytes)"),
    ("l", "signed long (4-8 bytes)"),
    ("f", "float (4 bytes)"),
    ("d", "double (8 bytes)"),
]
print(f"\n{'Code':<6} {'Description'}")
print("-" * 35)
for code, desc in typecodes:
    print(f"{code:<6} {desc}")

In [None]:
import array

# Arrays of different types
int_arr: array.array = array.array("i", [10, 20, 30])
float_arr: array.array = array.array("f", [1.1, 2.2, 3.3])
double_arr: array.array = array.array("d", [1.0, 2.0, 3.0])

print(f"int array:    {list(int_arr)}")
print(f"float array:  {list(float_arr)}")
print(f"double array: {list(double_arr)}")

# Arrays support standard sequence operations
int_arr.append(40)
int_arr.extend([50, 60])
print(f"\nAfter append/extend: {list(int_arr)}")

# Slicing
sliced: array.array = int_arr[1:4]
print(f"Slice [1:4]: {list(sliced)}")

## Section 2: Array Properties -- itemsize and buffer_info

The `itemsize` attribute tells you how many bytes each element occupies. The
`buffer_info()` method returns the memory address and number of elements.
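
To put `itemsize` in context, the sketch below compares the raw data held by an array with a rough estimate of the equivalent list. It assumes CPython, where `sys.getsizeof` is available; the list figure is only an approximation (the pointer table plus one float object per element), and the exact numbers vary by platform.

```python
import array
import sys

values = [float(i) for i in range(1000)]
packed = array.array("d", values)

# Raw data held by the array: one C double per element.
array_data_bytes = len(packed) * packed.itemsize

# Rough list footprint: the list's pointer table plus each float object.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

print(f"array data:    {array_data_bytes} bytes")
print(f"list estimate: {list_bytes} bytes")
```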

In [None]:
import array
import struct

# itemsize shows bytes per element
a: array.array = array.array("i")
print(f"Typecode 'i' itemsize: {a.itemsize} bytes")
print(f"Matches struct.calcsize('i'): {a.itemsize == struct.calcsize('i')}")

# Compare itemsize across types
for code in ["b", "h", "i", "l", "f", "d"]:
    arr: array.array = array.array(code)
    print(f"  typecode '{code}': {arr.itemsize} bytes per element")

In [None]:
import array

# buffer_info returns (address, length)
a: array.array = array.array("d", [1.0, 2.0, 3.0])
address, length = a.buffer_info()

print(f"Buffer address: {address}")
print(f"Number of elements: {length}")
print(f"Address is positive: {address > 0}")
print(f"Length matches len(): {length == len(a)}")

# Total memory used by the data
total_bytes: int = length * a.itemsize
print(f"\nTotal data bytes: {total_bytes}")
print(f"Expected (3 * 8): {3 * 8}")

## Section 3: Array Bytes Conversion

Arrays can be serialized to bytes with `tobytes()` and reconstructed from bytes
with `frombytes()`. This is useful for saving arrays to files or sending them
over a network.
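
For the file case, `array` also offers `tofile()` and `fromfile()`, which write and read the same raw bytes directly. This is a small sketch using a temporary directory; the file name `samples.bin` is made up for the example. The bytes are in the machine's native layout, so the file is only portable between machines that share byte order and element size.

```python
import array
import os
import tempfile

samples = array.array("d", [0.5, 1.5, 2.5])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "samples.bin")  # hypothetical file name

    # Write the raw machine values to disk.
    with open(path, "wb") as f:
        samples.tofile(f)

    # Read them back; fromfile needs the number of elements to load.
    loaded = array.array("d")
    with open(path, "rb") as f:
        loaded.fromfile(f, len(samples))

print(list(loaded))
```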

In [None]:
import array

# Convert array to bytes
a: array.array = array.array("i", [1, 2, 3])
b: bytes = a.tobytes()

print(f"Original array: {list(a)}")
print(f"As bytes: {b!r}")
print(f"Byte length: {len(b)}")
print(f"Expected ({len(a)} * {a.itemsize}): {len(a) * a.itemsize}")

# Reconstruct from bytes
a2: array.array = array.array("i")
a2.frombytes(b)

print(f"\nReconstructed: {list(a2)}")
print(f"Round-trip match: {list(a) == list(a2)}")

In [None]:
import array

# tolist and fromlist for Python list conversion
original: array.array = array.array("d", [1.5, 2.5, 3.5])
as_list: list[float] = original.tolist()
print(f"tolist(): {as_list}")
print(f"Type: {type(as_list)}")

# fromlist adds elements from a Python list
a: array.array = array.array("d")
a.fromlist([10.0, 20.0, 30.0])
print(f"\nfromlist(): {list(a)}")

# Byte-swapping for endianness conversion
a: array.array = array.array("i", [1, 256])
print(f"\nBefore byteswap: {list(a)}")
a.byteswap()
print(f"After byteswap:  {list(a)}")

## Section 4: Memoryview Basics

A `memoryview` provides zero-copy access to the internal buffer of an object
that supports the buffer protocol (like `bytes`, `bytearray`, and `array.array`).
Slicing a memoryview does not copy data -- it creates a view into the same memory.
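
The difference between a copying slice and a view can be seen by changing the original after both slices exist. A minimal sketch with illustrative variable names:

```python
data = bytearray(b"abcdef")

copied = data[0:3]            # slicing a bytearray copies the three bytes
view = memoryview(data)[0:3]  # slicing a memoryview shares them

data[0] = ord("X")            # change the original afterwards

copied_bytes = bytes(copied)  # still the old contents
view_bytes = bytes(view)      # reflects the change

print(copied_bytes)
print(view_bytes)
print(view.obj is data)       # the view still refers to the original object
```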

In [None]:
# Create a memoryview from a bytearray
data: bytearray = bytearray(b"Hello, World!")
mv: memoryview = memoryview(data)

# Slicing returns a new memoryview (zero-copy)
hello: memoryview = mv[0:5]
print(f"Full data: {bytes(mv)}")
print(f"Slice [0:5]: {bytes(hello)}")
print(f"Slice equals b'Hello': {bytes(hello) == b'Hello'}")

# Memoryview properties
print(f"\nFormat: {mv.format}")
print(f"Itemsize: {mv.itemsize}")
print(f"Length: {len(mv)}")
print(f"Readonly: {mv.readonly}")

In [None]:
# Memoryview from bytes (read-only) vs bytearray (writable)
immutable: memoryview = memoryview(b"readonly")
mutable: memoryview = memoryview(bytearray(b"writable"))

print(f"From bytes - readonly: {immutable.readonly}")
print(f"From bytearray - readonly: {mutable.readonly}")

# Read-only memoryview cannot be modified
try:
    immutable[0] = ord("R")
except TypeError as e:
    print(f"\nCannot modify bytes memoryview: {e}")

# Indexing returns an integer (the byte value)
print(f"\nmutable[0] = {mutable[0]} (ord('w') = {ord('w')})")

## Section 5: Modifying Data Through Memoryview

When a memoryview is created from a mutable buffer (like `bytearray`), modifications
through the memoryview directly change the underlying data without copying.
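
Because the view writes into existing memory, it can change bytes but not the size of the buffer. The sketch below, which assumes CPython's current exception types, shows two related limits: a slice assignment must keep the same length, and the `bytearray` cannot be resized while a view is exported.

```python
data = bytearray(b"Hello")
mv = memoryview(data)

outcomes = []

try:
    mv[0:2] = b"xyz"       # three bytes into a two-byte slot
except ValueError as exc:
    outcomes.append(type(exc).__name__)

try:
    data.append(ord("!"))  # resizing while a view is exported
except BufferError as exc:
    outcomes.append(type(exc).__name__)

mv.release()               # drop the export
data.append(ord("!"))      # resizing works again

print(outcomes)
print(data)
```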

In [None]:
# Modify underlying buffer through memoryview
data: bytearray = bytearray(b"Hello")
mv: memoryview = memoryview(data)

print(f"Before: {data}")

# Change first byte from 'H' to 'J'
mv[0] = ord("J")
print(f"After mv[0] = ord('J'): {data}")
print(f"Data is now 'Jello': {data == bytearray(b'Jello')}")

# Slice assignment
data2: bytearray = bytearray(b"abcdefgh")
mv2: memoryview = memoryview(data2)
mv2[2:5] = b"CDE"
print(f"\nAfter slice assignment: {data2}")

In [None]:
# Zero-copy demonstration: memoryview shares memory with the original
original: bytearray = bytearray(b"shared memory demo")
view1: memoryview = memoryview(original)
view2: memoryview = view1[7:13]  # "memory"

print(f"Original: {original}")
print(f"view2: {bytes(view2)}")

# Modify through view2 -- affects original
view2[0] = ord("M")
print("\nAfter view2[0] = ord('M'):")
print(f"Original: {original}")
print(f"view2:    {bytes(view2)}")

# Release the views so the bytearray is no longer exported and can be resized
view1.release()
view2.release()
print("\nMemoryviews released")

## Section 6: Memoryview with Typed Arrays

When you create a memoryview from a typed `array.array`, the memoryview
inherits the format and itemsize of the array's element type.
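
A quick way to check what a typed view inherited is to inspect its attributes and round-trip through a byte view. This is a sketch with an illustrative `'h'` (signed short) array; `cast()` only converts between a byte format and another format, so going from `'h'` to `'B'` and back is allowed.

```python
import array

readings = array.array("h", [1, 2, 3, 4])  # 'h' = signed short
mv = memoryview(readings)

# The view reports the array's element format and a 1-D shape.
print(mv.format, mv.itemsize, mv.ndim, mv.shape)
print(mv.nbytes == len(readings) * readings.itemsize)

raw = mv.cast("B")           # same memory, seen as unsigned bytes
typed_again = raw.cast("h")  # and back to shorts, still without copying

typed_again[0] = -1          # writes through to the array
print(readings[0])
```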

In [None]:
import array
import struct

# Memoryview from a typed array
a: array.array = array.array("i", [1, 2, 3])
mv: memoryview = memoryview(a)

print(f"Array: {list(a)}")
print(f"Format: {mv.format}")
print(f"Itemsize: {mv.itemsize}")
print(f"Itemsize matches struct.calcsize('i'): {mv.itemsize == struct.calcsize('i')}")
print(f"Length: {len(mv)}")

# Indexing returns typed values (not raw bytes)
print(f"\nmv[0] = {mv[0]}")
print(f"mv[1] = {mv[1]}")
print(f"mv[2] = {mv[2]}")

# Convert to list
print(f"\nAs list: {mv.tolist()}")

In [None]:
import array

# Modifying array elements through memoryview
a: array.array = array.array("i", [10, 20, 30, 40, 50])
mv: memoryview = memoryview(a)

print(f"Before: {list(a)}")

# Modify through memoryview
mv[0] = 100
mv[4] = 500
print(f"After:  {list(a)}")

# Cast to a byte-level view of the same memory ('b' = signed char)
byte_view: memoryview = mv.cast("b")
print(f"\nAs bytes view length: {len(byte_view)}")
print(f"Expected ({len(a)} * {a.itemsize}): {len(a) * a.itemsize}")
# Byte order is native: a little-endian machine shows the 100 first
print(f"First {a.itemsize} bytes (int 100): {list(byte_view[:a.itemsize])}")

## Section 7: struct -- Packing and Unpacking Binary Data

The `struct` module converts between Python values and C-style packed binary data.
A format string specifies the byte order and types of values to pack or unpack.
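
The prefix affects both the order of the bytes and whether padding is inserted. The sketch below packs the same value with two byte orders and compares the size of a `BI` record with and without native alignment. The native size depends on the platform, so the code does not assume a particular value for it.

```python
import struct

little = struct.pack("<H", 1)  # low byte first
big = struct.pack(">H", 1)     # high byte first
network = struct.pack("!H", 1) # network order is big-endian

# Standard sizes, no padding: 1 byte for B plus 4 bytes for I.
packed_size = struct.calcsize("<BI")

# Native alignment may insert padding before the int (platform dependent).
native_size = struct.calcsize("@BI")

print(little, big, network)
print(packed_size, native_size)
```

For data that leaves the process (files, sockets), an explicit `<`, `>`, or `!` prefix keeps the layout independent of the machine that wrote it.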

In [None]:
import struct

# Pack two unsigned ints in network byte order
packed: bytes = struct.pack("!2I", 1, 2)
print(f"Packed bytes: {packed!r}")
print(f"Packed length: {len(packed)} bytes")

# Unpack them back
a, b = struct.unpack("!2I", packed)
print(f"\nUnpacked: a={a}, b={b}")
print(f"a == 1: {a == 1}")
print(f"b == 2: {b == 2}")

# Byte order prefixes
prefixes: list[tuple[str, str]] = [
    ("!", "network (big-endian)"),
    (">", "big-endian"),
    ("<", "little-endian"),
    ("=", "native order, standard sizes"),
    ("@", "native order, sizes, alignment"),
]
print(f"\n{'Prefix':<8} {'Byte Order'}")
print("-" * 35)
for prefix, desc in prefixes:
    print(f"{prefix:<8} {desc}")

In [None]:
import struct

# Format characters for different types
# Pack a mixed record: an int, a float, and a 5-byte string
record: bytes = struct.pack("!if5s", 42, 3.14, b"Hello")
print(f"Packed record: {record!r}")
print(f"Record length: {len(record)} bytes")

# Unpack the record
num, fval, text = struct.unpack("!if5s", record)
print(f"\nUnpacked: num={num}, fval={fval:.4f}, text={text}")

# calcsize tells you the packed size for a format
print("\nFormat sizes:")
formats: list[tuple[str, str]] = [
    ("!I", "1 unsigned int (network)"),
    ("!2I", "2 unsigned ints (network)"),
    ("!if", "int + float (network)"),
    ("!d", "1 double (network)"),
    ("!10s", "10-byte string"),
]
for fmt, desc in formats:
    size: int = struct.calcsize(fmt)
    print(f"  {fmt:<8} ({desc}): {size} bytes")

In [None]:
import struct

# Practical: pack/unpack a network packet header
def pack_header(version: int, msg_type: int, length: int, seq: int) -> bytes:
    """Pack an 8-byte packet header: B (1) + B (1) + H (2) + I (4)."""
    return struct.pack("!BBHI", version, msg_type, length, seq)


def unpack_header(data: bytes) -> tuple[int, int, int, int]:
    """Unpack an 8-byte packet header."""
    return struct.unpack("!BBHI", data)


# Pack a header
header: bytes = pack_header(version=1, msg_type=3, length=256, seq=12345)
print(f"Packed header: {header!r}")
print(f"Header size: {len(header)} bytes")

# Unpack it
ver, mtype, length, seq = unpack_header(header)
print(f"\nVersion:  {ver}")
print(f"Type:     {mtype}")
print(f"Length:   {length}")
print(f"Sequence: {seq}")

In [None]:
import struct

# struct.Struct for repeated packing/unpacking with the same format
point_fmt: struct.Struct = struct.Struct("!2d")  # Two doubles

print(f"Format: {point_fmt.format}")
print(f"Size: {point_fmt.size} bytes")

# Pack several points
points: list[tuple[float, float]] = [(1.0, 2.0), (3.5, 4.5), (5.0, 6.0)]
packed_points: list[bytes] = [point_fmt.pack(x, y) for x, y in points]

print(f"\nPacked {len(packed_points)} points")
for i, data in enumerate(packed_points):
    x, y = point_fmt.unpack(data)
    print(f"  Point {i}: ({x}, {y})")

# pack_into / unpack_from for working with buffers
buffer: bytearray = bytearray(point_fmt.size * 2)
point_fmt.pack_into(buffer, 0, 10.0, 20.0)
point_fmt.pack_into(buffer, point_fmt.size, 30.0, 40.0)

p1_x, p1_y = point_fmt.unpack_from(buffer, 0)
p2_x, p2_y = point_fmt.unpack_from(buffer, point_fmt.size)
print("\nFrom buffer:")
print(f"  Point 0: ({p1_x}, {p1_y})")
print(f"  Point 1: ({p2_x}, {p2_y})")

## Section 8: Combining Array, Memoryview, and Struct

These three tools work together through the buffer protocol. Arrays provide
typed storage, memoryviews provide zero-copy access, and struct handles
serialization.
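
As one illustration of how the pieces fit, the sketch below builds a small message with a `struct` header followed by an array payload, then reads it back through a `memoryview` without copying the payload. The header layout (`<II`: a version number and an element count) is a made-up example, not a standard format, and the payload is stored in the machine's native byte order.

```python
import array
import struct

# Hypothetical header layout: format version, then element count.
HEADER = struct.Struct("<II")

payload = array.array("d", [1.0, 2.0, 3.0])

# Build the message: header first, then the raw payload bytes.
message = bytearray(HEADER.size)
HEADER.pack_into(message, 0, 1, len(payload))
message += payload.tobytes()

# Parse it back: struct reads the header, memoryview exposes the body.
view = memoryview(message)
version, count = HEADER.unpack_from(view, 0)
body = view[HEADER.size:].cast("d")  # typed view of the payload, no copy
decoded = body.tolist()

print(version, count, decoded)
```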

In [None]:
import array
import struct

# Round-trip: array -> bytes -> array
original: array.array = array.array("i", [100, 200, 300, 400])
print(f"Original: {list(original)}")

# Serialize to bytes
raw: bytes = original.tobytes()
print(f"As bytes: {raw!r}")

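# Interpret the same bytes with struct; with no prefix, struct uses native
# byte order and sizes, which matches the array's in-memory layout
count: int = len(raw) // struct.calcsize("i")
values: tuple[int, ...] = struct.unpack(f"{count}i", raw)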
print(f"Via struct.unpack: {values}")

# Reconstruct the array
restored: array.array = array.array("i")
restored.frombytes(raw)
print(f"Restored: {list(restored)}")
print(f"Match: {list(original) == list(restored)}")

In [None]:
import array

# Memoryview as a bridge between array and raw bytes
a: array.array = array.array("i", [1, 2, 3, 4, 5])
mv: memoryview = memoryview(a)

# Typed view: each element is an int
print(f"Typed view: format={mv.format}, len={len(mv)}")
print(f"Elements: {mv.tolist()}")

# Cast to raw bytes view
byte_mv: memoryview = mv.cast("B")  # unsigned char
print(f"\nByte view: format={byte_mv.format}, len={len(byte_mv)}")
print(f"First 8 bytes: {list(byte_mv[:8])}")

# Modify through the typed memoryview
mv[2] = 999
print("\nAfter mv[2] = 999:")
print(f"Array: {list(a)}")
print(f"Byte view at [8:12]: {list(byte_mv[8:12])}")

## Summary

### array Module
- **`array.array(typecode, iterable)`**: Create a typed, compact array
- **`.typecode`**: The character code for the element type (e.g., `'i'` for int)
- **`.itemsize`**: Bytes per element
- **`.buffer_info()`**: Returns `(address, element_count)`
- **`.tobytes()` / `.frombytes()`**: Serialize to and from raw bytes

### memoryview
- **`memoryview(buffer)`**: Zero-copy view into a buffer object
- Slicing creates a new view (no data copied)
- **`.format`**: Element format character
- **`.itemsize`**: Bytes per element
- **`.readonly`**: `True` if the view cannot be written to (e.g., a view of `bytes`)
- **`.cast(format)`**: Reinterpret the buffer with a different element type
- Modifications through a memoryview affect the underlying buffer directly

### struct Module
- **`struct.pack(fmt, v1, v2, ...)`**: Pack values into bytes
- **`struct.unpack(fmt, buffer)`**: Unpack bytes into a tuple of values
- **`struct.calcsize(fmt)`**: Size in bytes of the packed format
- Byte order prefixes: `!` (network), `>` (big), `<` (little), `=` (native order, standard sizes), `@` (native order, sizes, and alignment; the default)
- **`struct.Struct(fmt)`**: Precompiled format for repeated use