|
36 | 36 | py3.14. |
37 | 37 |
|
38 | 38 | This submodule lifts the validated primitives out of the |
39 | | -smoke-test and into tractor proper, so they can eventually be |
40 | | -wired into a real "subint forkserver" spawn backend — where: |
| 39 | +smoke-test and into tractor proper as the |
| 40 | +`subint_forkserver` spawn backend. |
| 41 | +
|
| 42 | +Design rationale — why a forkserver, and why in-process |
| 43 | +------------------------------------------------------- |
| 44 | +
|
| 45 | +There are two design questions worth pinning down up front, |
| 46 | +since the name "subint_forkserver" intentionally evokes the |
| 47 | +stdlib `multiprocessing.forkserver` for comparison: |
| 48 | +
|
| 49 | +**(1) Why a forkserver pattern at all, vs. forking directly |
| 50 | +from the trio task?** |
| 51 | +
|
| 52 | +`os.fork()` is fundamentally hostile to trio: trio owns |
| 53 | +file descriptors, signal-wakeup-fds, threadpools, and an |
| 54 | +event loop with non-trivial post-fork lifecycle invariants |
| 55 | +(see python-trio/trio#1614 et al.). Forking a trio-running |
| 56 | +thread duplicates all that state into the child, which then |
| 57 | +either needs surgical reset (fragile) or has to immediately |
| 58 | +`exec()` (defeats the point of fork-without-exec). The |
| 59 | +*forkserver* sidesteps this by isolating the `os.fork()` |
| 60 | +call in a worker that has provably never entered trio — so |
| 61 | +the child inherits a clean, trio-free image. |
| 62 | +
|
| 63 | +**(2) Why an in-process forkserver, vs. stdlib |
| 64 | +`multiprocessing.forkserver`?** |
| 65 | +
|
| 66 | +The stdlib design solves the same "fork from clean state" |
| 67 | +problem by spinning up a **separate sidecar process** when |
| 68 | +the first child is started under the `forkserver` start method. |
| 69 | +The parent then sends each spawn request to that sidecar over a |
| 70 | +unix socket; the sidecar is the process that actually calls |
| 71 | +`os.fork()`. This works but pays for cleanliness with three |
| 72 | +costs: |
| 73 | +
|
| 74 | +- **Sidecar lifecycle**: a second long-lived process per |
| 75 | + parent, with its own start/stop/health-check semantics. |
| 76 | +- **IPC overhead per spawn**: every actor-spawn round-trips |
| 77 | + an `mp` request message through a unix socket before any |
| 78 | + child code runs. |
| 79 | +- **State isolation by process boundary**: the sidecar shares |
| 80 | + none of the parent's live state, so every spawn is a "cold" |
| 81 | + child that re-imports anything not preloaded into the sidecar. |
| 82 | +
|
| 83 | +The subint architecture lets us keep the forkserver |
| 84 | +**in-process** because subints already provide the |
| 85 | +state-isolation guarantee that `mp.forkserver`'s sidecar |
| 86 | +buys via the process boundary. Concretely: in the envisioned |
| 87 | +arch (currently partially landed — see "Status" below), |
| 88 | +
|
| 89 | +- the **main interpreter** stays trio-free and hosts the |
| 90 | + forkserver worker thread that owns `os.fork()`, |
| 91 | +- the parent actor's **`trio.run()`** lives in a separate |
| 92 | + *sub-interpreter* (a different worker thread) — fully |
| 93 | + isolated `sys.modules` / `__main__` / globals from main, |
| 94 | +- when a spawn is requested, the trio task signals the |
| 95 | + forkserver thread (intra-process, ~free) and the |
| 96 | + forkserver forks; the child inherits the parent's full |
| 97 | + in-memory state cheaply. |
| 98 | +
|
| 99 | +That collapses the three costs above: |
| 100 | +
|
| 101 | +- no sidecar — the forkserver is just another thread, |
| 102 | +- spawn signal is an in-process event/condition, not IPC, |
| 103 | +- child inherits the warm parent state (loaded modules, |
| 104 | + populated caches, etc.) for free. |
| 105 | +
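A minimal, stdlib-only sketch of the in-process pattern described above: one dedicated thread owns every `os.fork()` call and other threads hand it requests. `ForkWorker`, `spawn`, and `close` are hypothetical names for illustration, not tractor's API, and the `queue.Queue` + `Future` hand-off stands in for whatever signalling the real backend uses from its trio task.

```python
import os
import queue
import threading
from concurrent.futures import Future


class ForkWorker:
    """Hypothetical stand-in for a forkserver worker thread."""

    def __init__(self) -> None:
        self._requests: queue.Queue = queue.Queue()
        self._thread = threading.Thread(
            target=self._serve, name="fork-worker", daemon=True
        )
        self._thread.start()

    def _serve(self) -> None:
        # This loop is the only place in the process that calls os.fork().
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                return
            child_target, fut = item
            try:
                pid = os.fork()
            except OSError as err:
                fut.set_exception(err)
                continue
            if pid == 0:
                # Child: this thread is the only one that survived the fork.
                code = 1
                try:
                    code = int(child_target() or 0)
                finally:
                    # Skip interpreter teardown inherited from the parent.
                    os._exit(code)
            fut.set_result(pid)

    def spawn(self, child_target) -> int:
        """Ask the worker to fork; block until the child's pid is known."""
        fut: Future = Future()
        self._requests.put((child_target, fut))
        return fut.result()

    def close(self) -> None:
        self._requests.put(None)
        self._thread.join()
```

A caller would do something like `pid = ForkWorker().spawn(lambda: 0)` and later reap the child with `os.waitpid(pid, 0)`. The request never leaves the process, which is the point of the comparison with the sidecar design.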
|
| 106 | +The tradeoffs we accept in exchange: this design is |
| 107 | +3.14-only, and legacy-config subints still share the GIL, |
| 108 | +so the parent's trio loop and the forkserver worker contend |
| 109 | +on it (this relaxes once PEP 684 isolated-mode + msgspec |
| 110 | +[jcrist/msgspec#1026](https://github.com/jcrist/msgspec/issues/1026) |
| 111 | +land). The dedicated worker threads here are also heavier |
| 112 | +than `trio.to_thread.run_sync` calls; see the "TODO" |
| 113 | +section further down for the audit plan once those upstream |
| 114 | +pieces land. |
| 115 | +
|
| 116 | +Implementation status — what's wired today |
| 117 | +----------------------------------------- |
| 118 | +
|
| 119 | +The "envisioned arch" above is the eventual target; the |
| 120 | +**currently-landed** flow is a partial step toward it: |
41 | 121 |
|
42 | 122 | - A dedicated main-interp worker thread owns all `os.fork()` |
43 | | - calls (never enters a subint). |
44 | | -- The tractor parent-actor's `trio.run()` lives in a |
45 | | - sub-interpreter on a different worker thread. |
46 | | -- When a spawn is requested, the trio-task signals the |
47 | | - forkserver thread; the forkserver forks; child re-enters |
48 | | - the same pattern (trio in a subint + forkserver on main). |
49 | | -
|
50 | | -This mirrors the stdlib `multiprocessing.forkserver` design |
51 | | -but keeps the forkserver in-process for faster spawn latency |
52 | | -and inherited parent state. |
| 123 | + calls (never enters a subint). ✓ landed. |
| 124 | +- Parent actor's `trio.run()` lives **on the main interp** |
| 125 | + for now (not a subint yet). The subint-hosted root |
| 126 | + runtime is gated on jcrist/msgspec#1026 (see |
| 127 | + `_subint.py` docstring). |
| 128 | +- Spawn-request signal: trio task → `to_thread.run_sync` |
| 129 | + to the forkserver-worker thread. ✓ landed. |
| 130 | +- Forked child: runs `_actor_child_main` against a normal |
| 131 | + trio runtime. ✓ landed. |
| 132 | +
|
| 133 | +The "subint" in the backend name refers to the *family* — |
| 134 | +this backend ships in the same PR series as `_subint.py` |
| 135 | +(in-thread subint backend) and `_subint_fork.py` (the RFC |
| 136 | +stub for fork-from-non-main-subint, blocked upstream). |
| 137 | +Once the parent's trio also lives in a subint we'll have |
| 138 | +the full envisioned arch; until then the forkserver |
| 139 | +half is independently useful and shippable. |
| 140 | +
|
| 141 | +What survives the fork? — POSIX semantics |
| 142 | +----------------------------------------- |
| 143 | +
|
| 144 | +A natural worry when forking from a parent that's running |
| 145 | +`trio.run()` on another thread: does that trio thread (and |
| 146 | +any other threads in the parent) keep running in the child? |
| 147 | +
|
| 148 | +**No.** POSIX `fork()` only preserves the *calling* thread |
| 149 | +in the child. Every other thread in the parent — trio's |
| 150 | +runner thread, any `to_thread` cache threads, anything else |
| 151 | +— is gone the instant `fork()` returns in the child. |
| 152 | +
|
| 153 | +Concretely, after the forkserver worker calls `os.fork()`: |
| 154 | +
|
| 155 | +| thread | parent | child | |
| 156 | +|-----------------------|-----------|---------------| |
| 157 | +| forkserver worker | continues | sole survivor | |
| 158 | +| `trio.run()` thread | continues | gone | |
| 159 | +| any other thread | continues | gone | |
| 160 | +
|
| 161 | +The forkserver worker becomes the new "main" execution |
| 162 | +context in the child; `trio.run()` and every other |
| 163 | +parent thread never executes a single instruction |
| 164 | +post-fork in the child. |
| 165 | +
|
| 166 | +This is exactly *why* `os.fork()` is delegated to a |
| 167 | +dedicated worker thread that has provably never entered |
| 168 | +trio: we want that trio-free thread to be the surviving |
| 169 | +one in the child. |
| 170 | +
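The "only the calling thread survives" rule can be checked with a small stdlib experiment. This is a sketch, not code from this PR: `threads_seen_in_child` and the thread names are made up for illustration. A helper thread forks while a second "bystander" thread is alive, and the child reports back over a pipe which threads it can see.

```python
import os
import threading


def threads_seen_in_child() -> list[str]:
    """Fork from a helper thread; return the thread names the child reports."""
    stop = threading.Event()
    # A parent-side thread that is alive at fork time but does not call fork().
    bystander = threading.Thread(target=stop.wait, name="bystander", daemon=True)
    bystander.start()
    result: list[str] = []

    def fork_and_report() -> None:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: report every thread Python still knows about, then exit.
            code = 1
            try:
                os.close(read_fd)
                names = ",".join(t.name for t in threading.enumerate())
                os.write(write_fd, names.encode())
                code = 0
            finally:
                os._exit(code)
        # Parent: read the child's report until EOF, then reap it.
        os.close(write_fd)
        with os.fdopen(read_fd, "rb") as reader:
            data = reader.read()
        os.waitpid(pid, 0)
        result.extend(data.decode().split(","))

    forker = threading.Thread(target=fork_and_report, name="forker")
    forker.start()
    forker.join()
    stop.set()
    bystander.join()
    return result
```

The returned list should contain a single entry, the thread that called `os.fork()`, with no trace of the bystander. On Python 3.12+ the parent may also emit a `DeprecationWarning` about forking a multi-threaded process, which is the same hazard this section discusses.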
|
| 171 | +That said, dead-thread *artifacts* still cross the fork |
| 172 | +boundary (the canonical "fork in a multithreaded program |
| 173 | +is dangerous" hazard; see `man pthread_atfork`). What |
| 174 | +persists, and how we handle each: |
| 175 | +
|
| 176 | +- **Inherited file descriptors** — the dead trio thread's |
| 177 | + epoll fd, signal-wakeup-fd, eventfds, sockets, IPC |
| 178 | + pipes, pytest's capture-fds, etc. are all still in the |
| 179 | + child's fd table (kernel-level inheritance). Handled by |
| 180 | + `_close_inherited_fds()` in the child prelude, which walks |
| 181 | + `/proc/self/fd` and closes everything except stdio + |
| 182 | + the channel pipe to the forkserver. |
| 183 | +- **Memory image** — trio's internal data structures |
| 184 | + (scheduler, task queues, runner state) sit in COW |
| 185 | + memory but nobody's executing them. They get GC'd / |
| 186 | + overwritten when the child's fresh `trio.run()` boots. |
| 187 | +- **Python thread state** — handled automatically by |
| 188 | + CPython. `PyOS_AfterFork_Child()` calls |
| 189 | + `_PyThreadState_DeleteExceptCurrent()`, so dead |
| 190 | + `PyThreadState` objects are cleaned and |
| 191 | + `threading.enumerate()` returns just the surviving |
| 192 | + thread. |
| 193 | +- **User-level locks (`threading.Lock`)** — |
| 194 | + held-by-dead-thread state is the canonical fork hazard. |
| 195 | + Not an issue in practice for tractor: trio doesn't hold |
| 196 | + cross-thread locks across fork (its synchronization is |
| 197 | + within the trio task system, which doesn't survive in |
| 198 | + either direction). CPython's GIL is auto-reset by the |
| 199 | + fork callback. |
| 200 | +
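For the inherited-fd item above, a rough sketch of what a "close everything except a keep-set" child prelude could look like. The body and signature of the real `_close_inherited_fds()` are not shown in this diff, so `close_inherited_fds_sketch`, its `keep` parameter, and the `/dev/fd` fallback are all assumptions made for illustration.

```python
import os


def close_inherited_fds_sketch(
    keep: frozenset[int] = frozenset({0, 1, 2}),
) -> set[int]:
    """Close every open fd not in `keep`; return the fds actually closed.

    Intended to run in a freshly forked child only: calling it in a
    long-lived parent would close fds that are still in use.
    """
    # Assumption: Linux exposes the fd table at /proc/self/fd; /dev/fd is
    # used as a fallback on platforms that lack it.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    candidates = {int(name) for name in os.listdir(fd_dir)}
    closed: set[int] = set()
    for fd in sorted(candidates - keep):
        try:
            os.close(fd)
        except OSError:
            # e.g. the temporary fd that os.listdir() itself had open.
            continue
        closed.add(fd)
    return closed
```

In a child, the keep-set would be stdio plus whichever pipe connects back to the forkserver, e.g. `close_inherited_fds_sketch(keep=frozenset({0, 1, 2, channel_fd}))`, where `channel_fd` is a placeholder name.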
|
| 201 | +FYI: how this dodges the `trio.run()` × `fork()` hazards |
| 202 | +-------------------------------------------------------- |
| 203 | +
|
| 204 | +`os.fork()` is famously hostile to `trio` (see |
| 205 | +python-trio/trio#1614 et al.) because trio owns several |
| 206 | +classes of process-global state that all break across the |
| 207 | +fork boundary in different ways. The forkserver-thread |
| 208 | +design dodges each class explicitly: |
| 209 | +
|
| 210 | +- **Signal-wakeup-fd**: trio installs a wakeup-fd via |
| 211 | + `signal.set_wakeup_fd()` on `trio.run()` startup so |
| 212 | + signals can interrupt `epoll_wait`. The child inherits |
| 213 | + this fd, but trio's runner that owns it is gone — so |
| 214 | + any signal delivery in the child writes to a dead |
| 215 | + reader. *Dodge*: the inherited wakeup-fd is closed by |
| 216 | + `_close_inherited_fds()`, then the child's own |
| 217 | + `trio.run()` installs a fresh one. |
| 218 | +- **`epoll`/`kqueue` instance**: trio's I/O backend holds |
| 219 | + one. Inherited as a dead fd; same fix as above. |
| 220 | +- **Threadpool cache threads** (`trio.to_thread`): worker |
| 221 | + threads with cached tstate. Don't exist in the child |
| 222 | + (POSIX); cache state is meaningless garbage that gets |
| 223 | + reset when the child's `trio.run()` initializes its own |
| 224 | + thread cache. |
| 225 | +- **Cancel scopes / nurseries / open `trio.Process` / |
| 226 | + open sockets**: these are trio-runtime objects, not |
| 227 | + kernel objects. The runtime that owns them is gone in |
| 228 | + the child, so the Python objects exist as zombie data |
| 229 | + in COW memory and get overwritten as the child runs. |
| 230 | + Inherited *kernel* fds those objects wrapped (sockets, |
| 231 | + proc pipes) are caught by `_close_inherited_fds()`. |
| 232 | +- **`atexit` handlers**: trio doesn't register any that |
| 233 | + would mis-fire post-fork; trio's lifetime-stack is |
| 234 | + all `with`-block-scoped and dies with the runner. |
| 235 | +- **Foreign-language I/O state** (libcurl, OpenSSL session |
| 236 | + caches, etc.): out of scope — same hazard as any |
| 237 | + fork-without-exec; users layering those on top of |
| 238 | + tractor need their own `pthread_atfork` handlers. |
| 239 | +
|
| 240 | +Net effect: for the runtime surface tractor controls |
| 241 | +(trio + IPC layer + msgspec), the forkserver-thread |
| 242 | +isolation + `_close_inherited_fds()` cleanup gives the |
| 243 | +forked child a clean trio environment. Everything else |
| 244 | +falls under the standard fork-without-exec disclaimer. |
53 | 245 |
|
54 | 246 | Status |
55 | 247 | ------ |
|
100 | 292 | Full analysis + audit plan for when we can revisit is in |
101 | 293 | `ai/conc-anal/subint_forkserver_thread_constraints_on_pep684_issue.md`. |
102 | 294 | Intent: file a follow-up GH issue linked to #379 once |
103 | | -[jcrist/msgspec#563](https://github.com/jcrist/msgspec/issues/563) |
| 295 | +[jcrist/msgspec#1026](https://github.com/jcrist/msgspec/issues/1026) |
104 | 296 | unblocks isolated-mode subints in tractor. |
105 | 297 |
|
106 | 298 | See also |
|