v0.9.0 - Main Pinned Cowns

@matajoh released this 06 Jun 11:21
· 4 commits to main since this release
6b9e1af

Main-pinned cowns — a new PinnedCown subclass holds its
value as a plain PyObject * on the main interpreter, never
round-tripped through XIData. Behaviors whose request set contains
any pinned cown are routed by the scheduler to a single-consumer
main-thread queue and drained by the new pump entry point
(or implicitly by wait, which auto-pumps when pinned cowns
exist). Designed for objects that cannot survive cross-interpreter
shipping — pyglet shapes, Tk widgets, GPU contexts, open file
handles, ctypes pointers. The companion examples/boids.py
rewrite demonstrates the coarse-grained pinned-dispatch pattern:
per-cell physics stays on workers, and one @when(PinnedCown)
per frame batches the write-back into main-thread matrices.
Also in this release: quiesce, a non-tearing-down
checkpoint primitive.

New Features

  • quiesce(timeout=None, *, stats=False, noticeboard=False)
    blocks until every in-flight behavior completes, without tearing
    down workers or the noticeboard thread. Implemented via a new
    terminator_seed_inc peer of terminator_seed_dec
    (Pyrona-style seed-up / seed-down pairing) so quiescence becomes
    a checkpoint rather than a shutdown. Useful for parallel-search
    patterns that need to inspect a best-so-far cown between rounds
    and for tests that must read a worker-produced send queue
    before its producer interpreter is destroyed. The stats and
    noticeboard flags mirror wait: returns None by
    default, a per-worker stats list[dict] when stats=True,
    a noticeboard dict[str, Any] when noticeboard=True, or a
    WaitResult when both are set. Raises TimeoutError
    if quiescence is not reached within timeout. Exported from
    bocpy.__all__.
  • PinnedCown(Cown[T]) — a cown whose value lives
    permanently on the main interpreter. Constructible only from the
    main interpreter (raises RuntimeError from workers);
    the value is never picklable, never reified twice, and never
    reconstructed in a worker. The capsule handle remains a
    first-class cross-interpreter shareable — workers may hold it,
    embed it in a regular Cown value graph, and place it in
    noticeboard entries, but only the main thread may acquire the
    value. See the new pinned_cowns page for the full
    contract and the coarse-grained-dispatch pattern.
  • pump(deadline_ms=None, max_behaviors=None, raise_on_error=False)
    — drains the main-thread queue of behaviors whose request sets
    contain a PinnedCown. Call from your event loop's
    idle / on-tick hook (pyglet schedule_interval, Tk after,
    asyncio task, …); script-mode programs need not call it
    explicitly because wait pumps internally. Non-preemptive:
    deadline_ms gates starting the next behavior, not
    interrupting one already running. Body exceptions default to
    landing on the result cown's .exception;
    raise_on_error=True re-raises the first body exception after
    drain. Returns a new PumpResult NamedTuple
    (executed, deadline_reached, raised).
  • set_pump_watchdog(warn_ms=1000, raise_ms=None, on_starve=None)
    — configure the pinned-queue starvation watchdog. Both thresholds
    gate on queue-non-empty time, not raw last-pump time, so
    programs running only unpinned work never trip them. Default is
    warn-only; users opt into fail-fast via an explicit raise_ms
    so interactive debugger sessions are not wedged by a breakpoint.
  • set_wait_pump_poll(ms=50) — set the poll cadence for
    wait's auto-pump loop. Re-read every iteration so a
    concurrent call updates the active wait immediately.
  • bocpy.PumpResult — three-field NamedTuple returned by
    pump. executed counts pinned behaviors whose lifecycle
    completed (including acquire-failure paths whose MCS chain still
    drained). deadline_reached is True only when the
    deadline_ms budget tripped before the queue drained.
    raised counts only body exceptions captured to a result cown
    (cleanup-path failures use PyErr_WriteUnraisable and do not
    count). Exported from bocpy.__all__.
  • Coarse-grained pinned-dispatch examples/boids.py — the
    per-cell send("update") / main-thread receive("update")
    barrier is replaced by per-cell physics on workers plus one
    pinned @when per frame that captures every per-cell result
    cown together with the two main-thread PinnedCown matrices
    and performs the batched write-back. Same visual output, fully
    worker-parallel per-cell work, single main-thread touchpoint.
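
The pump semantics described above (a non-preemptive deadline, a max_behaviors cap, and deferred re-raise) can be sketched with a toy model. This is a minimal sketch in pure Python and not bocpy's implementation: toy_pump and ToyPumpResult are hypothetical names, the queue is an ordinary deque of callables, and exceptions are only counted rather than stored on a result cown.

```python
import time
from collections import deque
from typing import NamedTuple


class ToyPumpResult(NamedTuple):
    """Hypothetical stand-in mirroring the three PumpResult fields."""
    executed: int
    deadline_reached: bool
    raised: int


def toy_pump(queue, deadline_ms=None, max_behaviors=None, raise_on_error=False):
    """Drain a deque of zero-argument callables on the calling thread."""
    start = time.monotonic()
    executed = 0
    raised = 0
    deadline_reached = False
    first_error = None

    while queue:
        if max_behaviors is not None and executed >= max_behaviors:
            break
        # Non-preemptive: the budget is checked only before starting the
        # next behavior, never while one is running.
        elapsed_ms = (time.monotonic() - start) * 1000.0
        if deadline_ms is not None and elapsed_ms >= deadline_ms:
            deadline_reached = True
            break
        behavior = queue.popleft()
        try:
            behavior()
        except Exception as exc:
            # Assumption: the real runtime records this on the result cown;
            # the toy model only counts it and remembers the first one.
            raised += 1
            if first_error is None:
                first_error = exc
        executed += 1

    # Re-raise only after the drain has finished.
    if raise_on_error and first_error is not None:
        raise first_error
    return ToyPumpResult(executed, deadline_reached, raised)


def slow():
    time.sleep(0.05)


# The first behavior overruns the 10 ms budget but is not interrupted;
# the second one is simply never started.
pending = deque([slow, lambda: None])
result = toy_pump(pending, deadline_ms=10)
```

In this model a long-running behavior can overshoot deadline_ms by its own duration, which is why the budget is best read as a limit on starting work rather than on total time spent in the pump.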

Public C ABI

  • bocpy_main_interpid() — new static inline helper in
    <bocpy/bocpy.h> returning PyInterpreterState_GetID(PyInterpreterState_Main()) pre-typed as int_least64_t to
    match bocpy_interpid for owner-field equality checks.
    Safe to call from a worker sub-interpreter for diagnostic /
    assert use. Additive — existing consumers recompile unchanged;
    BOCPY_ABI is unchanged at 1. The
    templates/c_abi_consumer bocpy~= pin moves to
    ~=0.9 to signal the new ABI surface it was authored against.

Improvements

  • @when loop-variable snapshot via default arg — the
    transpiler now accepts def b(c, i=i) as an explicit
    loop-snapshot idiom in addition to the existing implicit form
    (just reference the loop variable in the body). Trailing
    positional parameters beyond the cown count are also
    auto-captured by name (def b(c, factor) captures
    factor).
  • @when alias decorators — the transpiler now recognises
    from bocpy import when as boc_when and import bocpy [as alias] followed by @bocpy.when(...) or
    @alias.when(...), provided the aliasing import is at module
    level. Previously only the bare @when form was detected.
  • Behaviors.start() compiles the export module on main:
    the transpiler's rewritten module is now also instantiated as an
    in-memory types.ModuleType on the main thread (plus a
    linecache entry for traceback fidelity) so pump can
    resolve __behavior__N the same way workers do via their
    bootstrap.
  • Scheduler-owned behavior pre-header: bq_node and the
    new pinned OR-fold byte moved out of the opaque
    BOCBehavior into a scheduler-owned boc_behavior_prehdr_t
    allocated immediately before each behavior (CPython
    _PyGC_Head style). boc_sched.c no longer needs any
    knowledge of BOCBehavior's internal layout; layout drift
    between the scheduler and its users is impossible by
    construction.
  • terminator_wait_pumpable — new entry in
    boc_terminator.{c,h} lets the auto-pump loop wake on either
    count-zero or main-pinned-depth-becoming-non-zero, both wired
    through the existing single condition variable. Single-pumper
    enforcement on free-threaded builds (Py_GIL_DISABLED) lives
    alongside via a MAIN_PUMP_THREAD CAS that raises
    RuntimeError if a second thread tries to pump
    concurrently, cleared on every exit path including
    BaseException.
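
The def b(c, i=i) idiom builds on ordinary Python closure semantics, which the following sketch shows without using bocpy at all. The function names are hypothetical and the callbacks are plain functions rather than behaviors; how the transpiler implements its implicit capture form is not shown here.

```python
def late_bound_results():
    """Closures read the loop variable when called, not when defined."""
    callbacks = []
    for i in range(3):
        def cb():
            return i  # looked up at call time, after the loop has finished
        callbacks.append(cb)
    return [cb() for cb in callbacks]


def snapshotted_results():
    """A default argument is evaluated once, when the def statement runs."""
    callbacks = []
    for i in range(3):
        def cb(i=i):
            return i  # bound to the value i had on this iteration
        callbacks.append(cb)
    return [cb() for cb in callbacks]


print(late_bound_results())   # [2, 2, 2]
print(snapshotted_results())  # [0, 1, 2]
```

Writing the snapshot as a default argument makes the capture point visible at the definition site, which is presumably why it is accepted as an explicit alternative to the implicit form.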

Bug Fixes

  • CWE-401: inheriting INCREF leak in cown_decref_inline.
    CownCapsule_reduce packs an encoded XIData payload by
    taking an inheriting COWN_INCREF per embedded
    CownCapsule, normally balanced when the bytes are
    unpickled inside a worker. On the orphan-death path (the
    consumer side never deserialised the payload) the matching
    COWN_DECREFs never fired and every embedded cown leaked.
    cown_decref_inline now feeds the encoded bytes through
    pickle.loads and immediately drops the result, which lets
    CPython's GC fire the matching COWN_DECREFs recursively.
    Gated on the pickled flag so native XIData round-trips
    (e.g. Matrix) skip the work entirely.
  • Main-pump behavior reference leak — both
    _core_main_pump_bounded and _core_main_pump_drain_all
    popped a BehaviorCapsule from MAIN_PINNED_QUEUE but
    never released the strong reference the capsule held on the
    underlying BOCBehavior. Each pinned behavior leaked
    one reference until the runtime was torn down. The pump
    helpers now BEHAVIOR_DECREF the behavior immediately after
    the worker-equivalent cleanup runs.
  • MSVC <stdatomic.h> compatibility — Microsoft's
    <stdatomic.h> (used by CPython's headers on Windows) does
    not expose the unsigned atomic_uint_least64_t or
    atomic_uintptr_t forms that the pinned-pump bookkeeping
    used. MAIN_PINNED_DEPTH, MAIN_PINNED_NONEMPTY_SINCE_NS,
    LAST_PUMP_NS, WATCHDOG_WARN_MS, WATCHDOG_LAST_WARN_NS,
    WATCHDOG_ON_STARVE and MAIN_PUMP_THREAD are now
    atomic_int_least64_t / atomic_intptr_t. Depth never
    goes negative; pointer bits round-trip losslessly through the
    signed atomic boundary.
  • CPython 3.10/3.11 PyErr_SetRaisedException polyfill
    added to include/bocpy/xidata.h alongside the existing
    PyErr_GetRaisedException polyfill so the public C ABI's
    exception-stash pattern compiles on Python versions before
    3.12. BOCPY_ABI is unchanged.
  • Portable boc_max_align_t — added to boc_compat.h as
    a union of the most-strictly-aligned fundamental types
    (long long, long double, void *, function pointer).
    MSVC exposes the C11 max_align_t only under /std:c11,
    which the CPython build does not pass; the
    boc_behavior_prehdr_t size assertion now uses
    alignof(boc_max_align_t) so the alignment contract holds on
    every supported toolchain.
  • PEP 678 add_note 3.10 fallback — the new
    Behaviors.quiesce exception-context shim attaches a note
    describing the seed-inc / seed-dec balance on failure. CPython
    3.10 predates BaseException.add_note; the shim now
    writes to BaseException.__notes__ directly when add_note
    is missing.
  • Transpiler except ... as X mis-classification:
    ExceptHandler binds X on the handler node
    itself rather than via Name Store, so the
    transpiler's free-variable walker mis-classified any read of
    X inside the handler body as a free variable, appended it
    as a behavior parameter, and emitted a call site that
    referenced an out-of-scope name. Fixed by a new
    visit_ExceptHandler hook that registers X as a local
    before recursing into the handler. Regression locked by
    TestCapturedLocals::test_except_as_name_excluded.
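
The except ... as X failure mode can be reproduced with the standard ast module. The sketch below is a simplified illustration, not the transpiler's actual walker: NaiveWalker and FixedWalker are hypothetical classes, the sample source is invented, and scoping is flattened to a single set of bound names.

```python
import ast
import builtins

SOURCE = """
def body(c):
    try:
        c.value = compute(c.value)
    except ValueError as err:
        log(err)
"""


class NaiveWalker(ast.NodeVisitor):
    """Collects bound names only from parameters and Name nodes in Store context."""

    def __init__(self):
        self.bound = set()
        self.loaded = set()

    def visit_FunctionDef(self, node):
        for arg in node.args.args:
            self.bound.add(arg.arg)
        self.generic_visit(node)

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Store):
            self.bound.add(node.id)
        else:
            self.loaded.add(node.id)

    def free_names(self):
        # Names that are read, never bound, and not builtins.
        unbound = self.loaded - self.bound
        return {name for name in unbound if not hasattr(builtins, name)}


class FixedWalker(NaiveWalker):
    """Also registers the handler's name, which is a plain str on the node."""

    def visit_ExceptHandler(self, node):
        if node.name is not None:
            self.bound.add(node.name)
        self.generic_visit(node)


tree = ast.parse(SOURCE)
handler = next(n for n in ast.walk(tree) if isinstance(n, ast.ExceptHandler))
print(type(handler.name).__name__)  # str, so no Name(ctx=Store) node is created

naive = NaiveWalker()
naive.visit(tree)
fixed = FixedWalker()
fixed.visit(tree)

print(sorted(naive.free_names()))  # ['compute', 'err', 'log']
print(sorted(fixed.free_names()))  # ['compute', 'log']
```

The naive walker reports err as a free variable because the binding never passes through visit_Name; registering the handler name before recursing removes it from the captured set.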

Documentation

  • New pinned_cowns page — concept and when to use,
    PinnedCown / pump / PumpResult / set_pump_watchdog
    / set_wait_pump_poll API, coarse-grained pinned-dispatch
    pattern, event-loop integration recipes (pyglet, Tk, asyncio),
    the queue-non-empty-time watchdog contract, free-threaded
    single-pumper rule, and free-threaded support trajectory.
    Linked from the root toctree.
  • api expanded with the new PinnedCown / pump /
    PumpResult / set_pump_watchdog / set_wait_pump_poll
    entries.
  • New "Talking to main-thread objects" subsection in the root
    README.md's "A taste of BOC" with a 10-line pyglet snippet
    illustrating the coarse-grained pattern; the public-API list
    picks up the five new symbols.
  • examples/README.md calls out the rewritten boids.py and
    the new examples/benchmark.py --pinned-spinner flag.

Tests

  • test/test_pinned_pump.py — new module covering the
    full PinnedCown / pump matrix: pure-pinned, mixed
    request sets, off-main construction rejection, locked
    error-string smoke tests, deadline_ms / max_behaviors
    bounding, body exceptions under default and
    raise_on_error=True, wait() auto-pump, shutdown drain
    via drop-exceptions, the watchdog warn-only and explicit-raise
    paths, the QUEUE_NONEMPTY_SINCE regression for unpinned-only
    workloads, hypothesis fuzz over mixed request sets,
    PinnedCown-handle round-trip through closure capture and
    through the noticeboard, Cown(PinnedCown) interop, and an
    acquire-failure fault-injection test that proves
    IN_PUMP_BODY / terminator_dec / MAIN_PUMP_THREAD
    cleanup runs on every exit path.
  • test/test_transpiler.py — 192 new lines covering the
    def b(c, i=i) loop-snapshot form, @when alias decorators,
    and the except ... as X regression.
  • test_main_pump_drain_all_marks_result_cowns flaky-shutdown rewrite
    — the original version scheduled eight pinned
    behaviors, called wait(timeout=0) to force shutdown, then
    asserted on the result cowns. The timeout=0 propagated
    through every stage of Behaviors.stop (quiescence,
    noticeboard drain) and raised TimeoutError from one of
    them under load before the post-wait assertions could run.
    The rewritten test calls _core.main_pump_drain_all directly
    to exercise the shutdown drain in isolation and asserts every
    drained result cown carries the shutdown RuntimeError.

Internal

  • examples/benchmark.py --pinned-spinner — high-rate
    pinned-dispatch overlay that adds one tail-recursing
    @when(PinnedCown) driven by pump(max_behaviors=1) on the
    main thread at a configurable rate while the existing chain-ring
    workload runs on workers. Used during development to verify
    worker-throughput regression under high-rate pinned dispatch;
    on CPython 3.14 at 4 workers / 10 s / 3 repeats the measured
    delta with the spinner active was −0.38%.
  • Noticeboard read contract tightened: noticeboard
    now explicitly documents that calling noticeboard or
    notice_read from the main thread outside a behavior is
    undefined behavior; the supported main-thread read path is
    wait(noticeboard=True). Seeding the noticeboard with
    notice_write from the main thread before scheduling any
    behavior remains supported.
  • test_matrix.TestVectorMethodsInCown migrated to the
    send("assert", ...) pattern
    — the in-cown Matrix vector
    tests previously asserted on result.value directly from the
    test thread, which violates the cown ownership contract. They now
    ship assertions out of each behavior via send("assert", ...)
    and collect on the test thread via a receive_asserts(count)
    helper, matching the project's BOC testing convention.
  • CI: ASAN detect_leaks=1 — the pinned-pump leak hunt
    cleared the last masking leak; the ASAN job in
    .github/workflows/pr_gate.yml now sets
    ASAN_OPTIONS=detect_leaks=1:halt_on_error=1 so any new
    reachable leak fails the build at the source instead of
    silently accumulating under detect_leaks=0.