## Python - Allocations

---

CPython stores every object on a private heap that the interpreter manages itself. Small objects are served by the pymalloc allocator, which is tuned for the many tiny, short-lived allocations Python programs make; larger requests fall through to the system allocator. Every value (scalars, collections, functions, classes, and custom instances) is a heap object that follows the same reference-counted lifecycle. Conceptually, Python keeps names and stack frames (local variables, call stack) separate from the objects on the heap (lists, dicts, your `Tick` instances).

When you do `x = Tick(...)`, the name `x` is just a reference; the actual `Tick` object lives on the heap and can be pointed to by many names or containers. Every heap object carries a reference count; when this count drops to zero (no references anywhere), CPython immediately deallocates the object and returns its memory to the allocator.
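
A minimal sketch of that immediate-free behaviour, assuming a standard CPython build. The `Node` class is a hypothetical stand-in for `Tick`, and `weakref.ref` is used as a probe because a weak reference does not keep its target alive.

```python
import weakref


class Node:
    """Hypothetical stand-in for Tick."""

    def __init__(self, value):
        self.value = value


def last_reference_demo():
    obj = Node(1)
    alias = obj                  # second strong reference to the same object
    probe = weakref.ref(obj)     # weak reference: does not raise the refcount

    del obj                      # one strong reference (alias) still exists
    alive_with_alias = probe() is not None

    del alias                    # last strong reference dropped
    alive_after_last_del = probe() is not None
    return alive_with_alias, alive_after_last_del


print(last_reference_demo())
```

The first flag should be true and the second false: the object survives as long as any strong reference exists and disappears as soon as the last one is removed, without any call to the garbage collector.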

Python also runs a cyclic garbage collector to clean up groups of objects that reference each other in cycles and would otherwise never reach a reference count of zero. All objects follow the same reference-counting rules regardless of type, while pymalloc's arenas reduce system-call overhead and fragmentation for the millions of tiny allocations typical of data-intensive applications.
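
The difference between the two mechanisms can be sketched as follows, assuming a standard CPython build. `Node` is a hypothetical class, and automatic collection is paused during the demo so that only the explicit `gc.collect()` call can break the cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only to build a reference cycle."""

    def __init__(self):
        self.other = None


def cycle_demo():
    was_enabled = gc.isenabled()
    gc.disable()                       # avoid an automatic collection mid-demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a        # a -> b -> a: a reference cycle
        probe = weakref.ref(a)

        del a, b                       # names are gone, but the cycle keeps both alive
        alive_before = probe() is not None

        collected = gc.collect()       # the cycle detector finds and frees them
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, collected, alive_after


print(cycle_demo())
```

The count returned by `gc.collect()` can include other cyclic garbage that happened to exist in the process, so only the before/after liveness flags are meaningful here.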

## pymalloc 

Calling the OS allocator (`malloc`/`free` or `mmap`) for every small object would be slow and would fragment memory badly, so CPython uses pymalloc for small objects (requests of 512 bytes or less in current versions).
pymalloc grabs memory from the OS in larger chunks, then subdivides and reuses it, so most small-object allocations never reach the OS at all.

Conceptually, pymalloc uses three levels:
* Arenas: big chunks from the OS (256 KiB or 1 MiB, depending on version and platform)
* Pools: fixed-size regions inside an arena (4 KiB, or 16 KiB in recent 64-bit builds), each pool dedicated to one size class
* Blocks: fixed-size slots inside a pool, each block holding exactly one small object
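
These levels can be made visible from Python with plain address arithmetic, since on CPython `id()` returns the object's address. The sketch below assumes 16 KiB pools and 1 MiB arenas (older builds use 4 KiB and 256 KiB); `Point`, `pool_base`, and `summarize` are hypothetical names. Arenas are not guaranteed to be aligned to their own size, so the 1 MiB buckets only approximate them.

```python
from collections import Counter

POOL_SIZE = 16 * 1024    # assumption: 16 KiB pools; use 4096 for older builds
ARENA_SIZE = 1 << 20     # assumption: 1 MiB arenas; 256 KiB in older builds


class Point:
    """Hypothetical small object that fits in a pymalloc block."""

    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x


def pool_base(addr, pool_size=POOL_SIZE):
    # Pools are aligned to their size, so masking the low bits gives the pool start.
    return addr & ~(pool_size - 1)


def summarize(objs):
    pools = Counter(pool_base(id(o)) for o in objs)
    buckets = Counter(id(o) // ARENA_SIZE for o in objs)   # approximate arenas
    return pools, buckets


objs = [Point(i) for i in range(10_000)]
pools, buckets = summarize(objs)
print(f"{len(objs)} blocks in {len(pools)} pools across ~{len(buckets)} arena-sized buckets")
print("most blocks seen in one pool:", max(pools.values()))
```

Many blocks share each pool and many pools share each bucket, which is the nesting described in the list above.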

Each pool keeps bookkeeping information: which size class it serves, how many blocks are currently used, and a pointer to a free list of blocks that have been freed and are ready to be reused. Arenas track which of their pools are still available; when no existing pool of the needed size class has a free block and no arena has an unused pool left, pymalloc requests another arena from the OS and carves new pools from it.
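
That bookkeeping lives in a small header at the start of each pool. The `ctypes` model below is a sketch whose field order is an assumption based on the `pool_header` struct in CPython's obmalloc source (roughly versions 3.8 to 3.12); it is not a stable interface, and `PoolHeaderModel` and `blocks_per_pool` are hypothetical names. It is used here only to estimate how many blocks fit in a pool, not to read live memory.

```python
import ctypes


class PoolRef(ctypes.Union):
    # The C struct overlays a pointer-sized pad with the in-use block count.
    _fields_ = [("_padding", ctypes.c_void_p), ("count", ctypes.c_uint)]


class PoolHeaderModel(ctypes.Structure):
    """Assumed layout of a pymalloc pool header (version dependent)."""

    _fields_ = [
        ("ref", PoolRef),                  # number of blocks currently allocated
        ("freeblock", ctypes.c_void_p),    # head of this pool's free list
        ("nextpool", ctypes.c_void_p),     # next pool of the same size class
        ("prevpool", ctypes.c_void_p),     # previous pool of the same size class
        ("arenaindex", ctypes.c_uint),     # which arena the pool belongs to
        ("szidx", ctypes.c_uint),          # size class index
        ("nextoffset", ctypes.c_uint),     # byte offset of the next never-used block
        ("maxnextoffset", ctypes.c_uint),  # largest valid nextoffset
    ]


def blocks_per_pool(block_size, pool_size=4096, header_size=None):
    """Estimate how many fixed-size blocks fit in a pool after its header."""
    if header_size is None:
        header_size = ctypes.sizeof(PoolHeaderModel)
    return (pool_size - header_size) // block_size


print("modelled header size:", ctypes.sizeof(PoolHeaderModel))
print(blocks_per_pool(32, pool_size=4096, header_size=48))
print(blocks_per_pool(32, pool_size=16 * 1024, header_size=48))
```

With a 48-byte header, a 4 KiB pool holds 126 blocks of 32 bytes and a 16 KiB pool holds 510.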

## Free lists

When an object like a `Tick` is freed (its reference count reaches zero), its block does not go back to the OS immediately; instead, it is pushed onto the free list of the pool it belongs to. The next time Python needs a block of the same size class, pymalloc first looks at a pool of that size with free blocks and simply pops one off its list, which is extremely fast and cache-friendly. An arena is handed back to the OS only once every pool inside it is empty.
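
One way to observe this reuse is to compare addresses across two allocation waves. This is a rough sketch assuming CPython, where `id()` is the object's address; `Item` and `reuse_fraction` are hypothetical names, and the fraction it reports depends on the interpreter version and on what else has been allocated in the process.

```python
class Item:
    """Hypothetical small object served by pymalloc."""

    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x


def reuse_fraction(n=10_000):
    first = [Item(i) for i in range(n)]
    old_addrs = {id(o) for o in first}    # remember where wave 1 lived
    del first                             # blocks return to their pools' free lists

    second = [Item(i) for i in range(n)]  # wave 2: same size class
    reused = sum(1 for o in second if id(o) in old_addrs)
    return reused / n


print(f"fraction of wave-2 objects at a wave-1 address: {reuse_fraction():.2%}")
```

A high fraction suggests that freed blocks were handed straight back out; a low one does not disprove reuse, since other allocations of the same size class may have taken those blocks first.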


## Demo

In [None]:
import ctypes
import gc
import logging
import os
import sys
from collections import defaultdict
from ctypes import c_uint, c_ulonglong, c_void_p

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
)

log = logging.getLogger(__name__)

In [25]:
class Tick:
    def __init__(self, price: float, size: int, ts: float):
        self.price = price
        self.size = size
        self.ts = ts

First we show that Python variables are just references to heap objects, and that CPython tracks how many references each object has via a reference count. We create a `Tick` object, show that multiple variables point to the same heap object, and log the reference count before and after deleting one reference. The values reported by `sys.getrefcount` are one higher than the number of names, because the call itself holds a temporary reference to its argument.

In [26]:
def names_and_refcounts():
    log.info("=== Step 1: Names → objects → refcounts ===")

    t = Tick(100.5, 10, 0.0)
    log.info("Created Tick object %r", t)
    log.info("Heap-ish address (id): %s", hex(id(t)))
    # sys.getsizeof is shallow: the attribute dict and its values are not counted
    log.info("Approx heap size of Tick: %d bytes", sys.getsizeof(t))

    # Multiple names pointing to the same object
    a = t
    b = t
    same_id = id(a) == id(b) == id(t)
    log.info("id(a) == id(b) == id(t)? %s", same_id)

    rc_before = sys.getrefcount(t)
    log.info("Reference count for t (sys.getrefcount): %d", rc_before)

    # Drop one reference
    del a
    rc_after = sys.getrefcount(t)
    log.info("After del a, refcount(t): %d", rc_after)


if __name__ == "__main__":
    names_and_refcounts()

INFO    === Step 1: Names → objects → refcounts ===
INFO    Created Tick object <__main__.Tick object at 0x769e24529c10>
INFO    Heap-ish address (id): 0x769e24529c10
INFO    Approx heap size of Tick: 48 bytes
INFO    id(a) == id(b) == id(t)? True
INFO    Reference count for t (sys.getrefcount): 4
INFO    After del a, refcount(t): 3
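
The 48 bytes logged above are a shallow size: `sys.getsizeof` does not follow references, so the instance's attribute dictionary and the values it points to are not included. The sketch below gives a slightly deeper estimate, assuming a plain class with a `__dict__`. `shallow_and_attr_size` is a hypothetical helper; it can double-count objects shared between attributes and ignores interpreter-level sharing such as cached small integers.

```python
import sys


class Tick:
    def __init__(self, price: float, size: int, ts: float):
        self.price = price
        self.size = size
        self.ts = ts


def shallow_and_attr_size(obj):
    """Return (shallow size, shallow size + attribute dict + attribute values)."""
    shallow = sys.getsizeof(obj)
    attrs = vars(obj)                                   # the instance __dict__
    total = shallow + sys.getsizeof(attrs)
    total += sum(sys.getsizeof(v) for v in attrs.values())
    return shallow, total


shallow, total = shallow_and_attr_size(Tick(100.5, 10, 0.0))
print(f"shallow: {shallow} bytes, with attributes: {total} bytes")
```

The exact numbers vary between Python versions, but the second figure is always larger, which is worth keeping in mind when estimating how much heap a large collection of instances occupies.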


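We now allocate many `Tick` objects to stress the allocator, then drop them and force a garbage collection, logging how many unreachable objects were collected. The `Tick` instances themselves are freed by reference counting as soon as the list is deleted; the number returned by `gc.collect()` counts only objects found in unreachable cycles, so it reflects whatever cyclic garbage happened to exist in the process rather than the 10,000 ticks.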

In [29]:
def many_objects_and_gc():
    log.info("=== Step 2: Many small objects & GC ===")

    def make_ticks(n: int):
        return [Tick(100.0 + i, i, float(i)) for i in range(n)]

    ticks = make_ticks(10_000)
    log.info("Created %d Tick objects", len(ticks))
    log.info("Example Tick size: %d bytes", sys.getsizeof(ticks[0]))

    # Drop the list so all Tick instances become unreachable
    del ticks
    collected = gc.collect()
    log.info("Forced GC, unreachable objects collected: %d", collected)


many_objects_and_gc()

INFO    === Step 2: Many small objects & GC ===
INFO    Created 10000 Tick objects
INFO    Example Tick size: 48 bytes
INFO    Forced GC, unreachable objects collected: 18
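
Because the collector count says little about the ticks themselves, a complementary view is to trace the allocations directly. This sketch uses the standard-library `tracemalloc` module and never calls `gc.collect()`; `traced_alloc_and_free` is a hypothetical helper, and the byte counts it reports vary by Python version.

```python
import tracemalloc


class Tick:
    def __init__(self, price: float, size: int, ts: float):
        self.price = price
        self.size = size
        self.ts = ts


def traced_alloc_and_free(n=10_000):
    """Return traced bytes (relative to a baseline) after allocating and after freeing."""
    tracemalloc.start()
    try:
        base, _ = tracemalloc.get_traced_memory()
        ticks = [Tick(100.0 + i, i, float(i)) for i in range(n)]
        after_alloc, _ = tracemalloc.get_traced_memory()

        del ticks                       # refcounts hit zero; no gc.collect() needed
        after_free, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after_alloc - base, after_free - base


allocated, remaining = traced_alloc_and_free()
print(f"traced after allocation: {allocated} bytes, after del: {remaining} bytes")
```

The traced total should fall back close to the baseline right after `del ticks`, showing that reference counting alone released the objects.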


In [None]:
# Small object class to hit pymalloc small-object allocator
class Small:
    __slots__ = ("x",)

    def __init__(self, x: int):
        self.x = x

In [None]:
def bucket(addr):
    # 1 MiB bucket (approx arena)
    return addr // (1 << 20)


def alloc_wave(n: int):
    objs = [Small(i) for i in range(n)]
    addrs = [id(o) for o in objs]
    return objs, addrs


def log_bucket_counts(label, addrs):
    counts = defaultdict(int)
    for a in addrs:
        counts[bucket(a)] += 1
    log.info("%s: %d objects, %d buckets", label, len(addrs), len(counts))
    for b, c in sorted(counts.items(), key=lambda kv: -kv[1])[:5]:
        log.info("  bucket %d: %d objects", b, c)


def demo_alloc_and_reuse():
    log.info("=== Alloc & reuse demo (per 1MiB bucket) ===")

    # Wave 1: allocations
    objs1, addrs1 = alloc_wave(10_000)
    log_bucket_counts("Wave 1 (fresh allocations)", addrs1)

    # Drop them so their blocks go to free lists
    del objs1, addrs1
    gc.collect()

    # Wave 2: likely reuses same arenas / blocks (freelists)
    objs2, addrs2 = alloc_wave(10_000)
    log_bucket_counts("Wave 2 (after freeing wave 1)", addrs2)

    # Keep alive if you want to inspect further
    return objs2


if __name__ == "__main__":
    _objs2 = demo_alloc_and_reuse()

INFO    === Alloc & reuse demo (per 1MiB bucket) ===
INFO    Wave 1 (fresh allocations): 10000 objects, 8 buckets
INFO      bucket 124379717: 4436 objects
INFO      bucket 124379716: 2243 objects
INFO      bucket 124379714: 1352 objects
INFO      bucket 124379715: 1346 objects
INFO      bucket 124379718: 526 objects
INFO    Wave 2 (after freeing wave 1): 10000 objects, 8 buckets
INFO      bucket 124379717: 4434 objects
INFO      bucket 124379716: 2243 objects
INFO      bucket 124379714: 1352 objects
INFO      bucket 124379715: 1347 objects
INFO      bucket 124379718: 530 objects


In [None]:
# --- Low-level structs (simplified pool header model) ----------------


class PoolHeader(ctypes.Structure):
    """
    Simplified pymalloc pool header view.
    Real layout is CPython-version dependent; this is a heuristic model.
    """

    _fields_ = [
        ("free", c_void_p),  # head of free-list (pointer to first free block)
        ("arenaindex", c_uint),
        ("szidx", c_uint),  # size class index (0,1,2,...)
        ("refcount", c_ulonglong),  # number of blocks currently in use
        ("nextoffset", c_ulonglong),
        ("maxnextoffset", c_ulonglong),
    ]


# --- Helpers to find arenas and inspect a pool size class -------------


def find_candidate_arenas():
    """
    Scan /proc/self/maps for anonymous ~1MiB RW regions that are likely pymalloc arenas.
    """
    arenas = []
    pid = os.getpid()
    maps_path = f"/proc/{pid}/maps"
    if not os.path.exists(maps_path):
        log.warning("/proc not available; low-level arena scan not possible.")
        return arenas

    with open(maps_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 5:
                continue
            perms = parts[1]
            dev = parts[3]
            if "rw-p" not in perms or dev != "00:00":
                continue
            if "[heap]" in line:
                continue

            addr_range = parts[0].split("-")
            try:
                start = int(addr_range[0], 16)
                end = int(addr_range[1], 16)
            except ValueError:
                continue
            size_kb = (end - start) / 1024
            # Look for regions roughly 1 MiB in size
            if 900 < size_kb < 1100:
                arenas.append(start)

    log.info("Found %d candidate arenas (~1MiB RW anonymous regions)", len(arenas))
    return arenas


def inspect_one_size_class(arenas, szidx=5):
    """
    For each candidate arena, interpret part of it as a PoolHeader for the given size class.
    szidx=5 is typically a small size class (e.g. a few dozen bytes) in many builds.
    """
    if not arenas:
        log.info("No candidate arenas to inspect.")
        return

    log.info(
        "Inspecting pool header for szidx=%d in up to %d arenas", szidx, len(arenas)
    )

    for arena_start in arenas[:5]:  # limit to first few arenas
        arena_id = arena_start // (1 << 20)

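        # Heuristic guess at the layout (it does not match CPython's real one):
        #   - first 64 bytes: arena header
        #   - next 64 * N bytes: pool headers for each size class index 0..N-1
        # In CPython each pool header sits at the start of its own pool, so the
        # fields read below may come back as garbage.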
        pool_header_addr = arena_start + 64 + szidx * 64

        try:
            pool = PoolHeader.from_address(pool_header_addr)
        except (OSError, ValueError):
            log.warning(
                "Arena %d: could not read pool header at %s",
                arena_id,
                hex(pool_header_addr),
            )
            continue

        # Rough block-size mapping: this is only an approximation.
        approx_block_size = 16 + szidx * 8
        approx_total_blocks = (4096 - 64) // max(1, approx_block_size)
        used = pool.refcount
        free_est = max(0, approx_total_blocks - used)

        log.info("Arena %d:", arena_id)
        log.info("  pool header @ %s", hex(pool_header_addr))
        log.info("  szidx: %d", pool.szidx)
        log.info("  used blocks (refcount): %d", used)
        log.info("  approx free blocks:     %d", free_est)
        log.info(
            "  free-list head ptr:     %s", hex(pool.free) if pool.free else "NULL"
        )


# --- Demo driver ------------------------------------------------------


def low_level_pymalloc_demo():
    log.info("=== Low-level pymalloc arena/pool demo ===")
    # 1) Create many small objects to force arenas/pools to be allocated
    objs = [Small(i) for i in range(50_000)]
    log.info("Allocated %d Small objects to populate arenas/pools", len(objs))
    gc.collect()

    # 2) Find candidate arenas in /proc/self/maps
    arenas = find_candidate_arenas()

    # 3) Inspect a single small size class (szidx=5 is a typical small size)
    inspect_one_size_class(arenas, szidx=5)

    # Keep objs alive so pools stay populated while we inspect
    return objs


if __name__ == "__main__":
    _objs = low_level_pymalloc_demo()

INFO    === Low-level pymalloc arena/pool demo ===
INFO    Allocated 50000 Small objects to populate arenas/pools
INFO    Found 5 candidate arenas (~1MiB RW anonymous regions)
INFO    Inspecting pool header for szidx=5 in up to 5 arenas
INFO    Arena 124379968:
INFO      pool header @ 0x769e34013180
INFO      szidx: 30366
INFO      used blocks (refcount): 0
INFO      approx free blocks:     72
INFO      free-list head ptr:     0x769e38053ca0
INFO    Arena 124379970:
INFO      pool header @ 0x769e3423a180
INFO      szidx: 30366
INFO      used blocks (refcount): 130421852285776
INFO      approx free blocks:     0
INFO      free-list head ptr:     0xb16be0
INFO    Arena 124380039:
INFO      pool header @ 0x769e3872a180
INFO      szidx: 0
INFO      used blocks (refcount): 0
INFO      approx free blocks:     72
INFO      free-list head ptr:     NULL
INFO    Arena 124380040:
INFO      pool header @ 0x769e3889a180
INFO      szidx: 0
INFO      used blocks (refcount): 0
INFO      approx free bl