Commit 3ab99d5

Doc _subint_forkserver design + fork semantics
Major expansion of the module docstring. Code is unchanged; this lands the architectural reasoning that was previously implicit, plus the POSIX/trio fork mechanics the design relies on.

New sections:

- "Design rationale": answers two implicit questions, (1) why a forkserver pattern at all (vs. forking directly from a trio task) and (2) why in-process (vs. stdlib `mp.forkserver`'s sidecar process). Documents the three costs the in-process design avoids (sidecar lifecycle, per-spawn IPC, cold-start child) and the tradeoffs we accept in exchange (3.14-only, heavier than `to_thread.run_sync`).
- "Implementation status": clarifies what's actually landed today vs. the envisioned arch: parent's `trio.run()` still lives on main interp (subint-hosted root gated on jcrist/msgspec#1026). Names why the "subint" prefix is correct anyway: same PR series as `_subint.py` / `_subint_fork.py`.
- "What survives the fork? — POSIX semantics": POSIX preserves only the calling thread, so the `trio.run()` thread is gone in the child. Includes a small parent/child thread-survival table and covers the four artifact classes that DO cross the fork boundary (inherited fds, COW memory, Python thread state, user-level locks) and how each is handled.
- "FYI: how this dodges the `trio.run()` × `fork()` hazards": itemizes each class of trio process-global state (wakeup-fd, `epoll`/`kqueue`, threadpool, cancel scopes / nurseries, `atexit`, foreign-language I/O) and explains how the forkserver-thread design avoids each.

Also,

- bump the gated msgspec issue link from `jcrist/msgspec#563` to `jcrist/msgspec#1026` (the PEP 684 isolated-mode tracker).

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])

[claude-code-gh]: https://github.com/anthropics/claude-code
1 parent 5456195 commit 3ab99d5

1 file changed

Lines changed: 205 additions & 13 deletions

tractor/spawn/_subint_forkserver.py

Lines changed: 205 additions & 13 deletions
@@ -36,20 +36,212 @@
36 36
py3.14.
37 37
38 38
This submodule lifts the validated primitives out of the
39-
smoke-test and into tractor proper, so they can eventually be
40-
wired into a real "subint forkserver" spawn backend — where:
39+
smoke-test and into tractor proper as the
40+
`subint_forkserver` spawn backend.
41+
42+
Design rationale — why a forkserver, and why in-process
43+
-------------------------------------------------------
44+
45+
There are two design questions worth pinning down up front,
46+
since the name "subint_forkserver" intentionally evokes the
47+
stdlib `multiprocessing.forkserver` for comparison:
48+
49+
**(1) Why a forkserver pattern at all, vs. forking directly
50+
from the trio task?**
51+
52+
`os.fork()` is fundamentally hostile to trio: trio owns
53+
file descriptors, signal-wakeup-fds, threadpools, and an
54+
event loop with non-trivial post-fork lifecycle invariants
55+
(see python-trio/trio#1614 et al.). Forking a trio-running
56+
thread duplicates all that state into the child, which then
57+
either needs surgical reset (fragile) or has to immediately
58+
`exec()` (defeats the point of fork-without-exec). The
59+
*forkserver* sidesteps this by isolating the `os.fork()`
60+
call in a worker that has provably never entered trio — so
61+
the child inherits a clean, trio-free image.
62+
63+
**(2) Why an in-process forkserver, vs. stdlib
64+
`multiprocessing.forkserver`?**
65+
66+
The stdlib design solves the same "fork from clean state"
67+
problem by spinning up a **separate sidecar process** at
68+
first use of `mp.set_start_method('forkserver')`. The parent
69+
then IPC's each spawn request to that sidecar over a unix
70+
socket; the sidecar is the process that actually calls
71+
`os.fork()`. This works but pays for cleanliness with three
72+
costs:
73+
74+
- **Sidecar lifecycle**: a second long-lived process per
75+
parent, with its own start/stop/health-check semantics.
76+
- **IPC overhead per spawn**: every actor-spawn round-trips
77+
an `mp` request message through a unix socket before any
78+
child code runs.
79+
- **State isolation by process boundary**: the sidecar can't
80+
share parent state at all — every spawn is a "cold" child
81+
re-importing modules from disk.
82+
83+
The subint architecture lets us keep the forkserver
84+
**in-process** because subints already provide the
85+
state-isolation guarantee that `mp.forkserver`'s sidecar
86+
buys via the process boundary. Concretely: in the envisioned
87+
arch (currently partially landed — see "Status" below),
88+
89+
- the **main interpreter** stays trio-free and hosts the
90+
forkserver worker thread that owns `os.fork()`,
91+
- the parent actor's **`trio.run()`** lives in a separate
92+
*sub-interpreter* (a different worker thread) — fully
93+
isolated `sys.modules` / `__main__` / globals from main,
94+
- when a spawn is requested, the trio task signals the
95+
forkserver thread (intra-process, ~free) and the
96+
forkserver forks; the child inherits the parent's full
97+
in-memory state cheaply.
98+
99+
That collapses the three costs above:
100+
101+
- no sidecar — the forkserver is just another thread,
102+
- spawn signal is a thread-local event/condition, not IPC,
103+
- child inherits the warm parent state (loaded modules,
104+
populated caches, etc.) for free.
105+
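A minimal stdlib-only sketch of the in-process forkserver idea described above, not tractor's actual implementation. The `ForkWorker` class, its `spawn()`/`close()` methods, and the queue-plus-`Future` handoff are hypothetical names chosen for illustration; the docstring itself only states that a dedicated thread owns `os.fork()` and is signalled from the trio side.

```python
import os
import queue
import threading
import warnings
from concurrent.futures import Future


class ForkWorker:
    """Hypothetical: a dedicated thread that owns every `os.fork()` call."""

    def __init__(self) -> None:
        self._requests: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self) -> None:
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                return
            child_main, fut = item
            try:
                with warnings.catch_warnings():
                    # py3.12+ warns when forking a multi-threaded process.
                    warnings.simplefilter("ignore", DeprecationWarning)
                    pid = os.fork()
            except OSError as exc:
                fut.set_exception(exc)
                continue
            if pid == 0:
                # Child: this worker thread is the only survivor.
                code = 1
                try:
                    code = int(child_main() or 0)
                finally:
                    os._exit(code)
            fut.set_result(pid)  # parent: hand the pid back to the caller

    def spawn(self, child_main) -> int:
        """Ask the worker thread to fork; block until the child pid is known."""
        fut: Future = Future()
        self._requests.put((child_main, fut))
        return fut.result()

    def close(self) -> None:
        self._requests.put(None)
        self._thread.join()
```

In this sketch the request is an in-process queue put rather than a socket round-trip, which is the "no IPC per spawn" point made above. A caller would do something like `pid = worker.spawn(lambda: 0)` followed by `os.waitpid(pid, 0)`.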
106+
The tradeoff we accept in exchange: this design is
107+
3.14-only (legacy-config subints still share the GIL, so
108+
the parent's trio loop and the forkserver worker contend
109+
on it; once PEP 684 isolated-mode + msgspec
110+
[jcrist/msgspec#1026](https://github.com/jcrist/msgspec/issues/1026)
111+
land, this constraint relaxes). And the dedicated worker
112+
threads here are heavier than `trio.to_thread.run_sync`
113+
calls — see the "TODO" section further down for the audit
114+
plan once those upstream pieces land.
115+
116+
Implementation status — what's wired today
117+
-----------------------------------------
118+
119+
The "envisioned arch" above is the eventual target; the
120+
**currently-landed** flow is a partial step toward it:
41 121
42 122
- A dedicated main-interp worker thread owns all `os.fork()`
43-
calls (never enters a subint).
44-
- The tractor parent-actor's `trio.run()` lives in a
45-
sub-interpreter on a different worker thread.
46-
- When a spawn is requested, the trio-task signals the
47-
forkserver thread; the forkserver forks; child re-enters
48-
the same pattern (trio in a subint + forkserver on main).
49-
50-
This mirrors the stdlib `multiprocessing.forkserver` design
51-
but keeps the forkserver in-process for faster spawn latency
52-
and inherited parent state.
123+
calls (never enters a subint). ✓ landed.
124+
- Parent actor's `trio.run()` lives **on the main interp**
125+
for now (not a subint yet). The subint-hosted root
126+
runtime is gated on jcrist/msgspec#1026 (see
127+
`_subint.py` docstring).
128+
- Spawn-request signal: trio task → `to_thread.run_sync`
129+
to the forkserver-worker thread. ✓ landed.
130+
- Forked child: runs `_actor_child_main` against a normal
131+
trio runtime. ✓ landed.
132+
133+
The "subint" in the backend name refers to the *family* —
134+
this backend ships in the same PR series as `_subint.py`
135+
(in-thread subint backend) and `_subint_fork.py` (the RFC
136+
stub for fork-from-non-main-subint, blocked upstream).
137+
Once the parent's trio also lives in a subint we'll have
138+
the full envisioned arch; until then the forkserver
139+
half is independently useful and ship-able.
140+
141+
What survives the fork? — POSIX semantics
142+
-----------------------------------------
143+
144+
A natural worry when forking from a parent that's running
145+
`trio.run()` on another thread: does that trio thread (and
146+
any other threads in the parent) keep running in the child?
147+
148+
**No.** POSIX `fork()` only preserves the *calling* thread
149+
in the child. Every other thread in the parent — trio's
150+
runner thread, any `to_thread` cache threads, anything else
151+
— is gone the instant `fork()` returns in the child.
152+
153+
Concretely, after the forkserver worker calls `os.fork()`:
154+
155+
| thread | parent | child |
156+
|-----------------------|-----------|---------------|
157+
| forkserver worker | continues | sole survivor |
158+
| `trio.run()` thread | continues | gone |
159+
| any other thread | continues | gone |
160+
161+
The forkserver worker becomes the new "main" execution
162+
context in the child; `trio.run()` and every other
163+
parent thread never executes a single instruction
164+
post-fork in the child.
165+
166+
This is exactly *why* `os.fork()` is delegated to a
167+
dedicated worker thread that has provably never entered
168+
trio: we want that trio-free thread to be the surviving
169+
one in the child.
170+
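The thread-survival table above can be checked with a small stdlib-only experiment. This is an illustrative sketch, not code from the module: `fork_and_count_threads` is a hypothetical helper, and the idle "bystander" thread merely stands in for the parent's `trio.run()` thread.

```python
import os
import threading
import warnings


def fork_and_count_threads() -> tuple[int, int]:
    """Fork from a worker thread; return (parent_count, child_count)."""
    stop = threading.Event()
    # Stand-in for the parent's `trio.run()` thread: idles until released.
    bystander = threading.Thread(target=stop.wait, daemon=True)
    bystander.start()
    counts: dict[str, int] = {}

    def worker() -> None:
        r, w = os.pipe()
        counts["parent"] = threading.active_count()
        with warnings.catch_warnings():
            # py3.12+ warns when forking a multi-threaded process.
            warnings.simplefilter("ignore", DeprecationWarning)
            pid = os.fork()
        if pid == 0:
            # Child: report how many Python threads are still alive.
            try:
                os.write(w, bytes([threading.active_count()]))
            finally:
                os._exit(0)
        os.close(w)
        counts["child"] = os.read(r, 1)[0]
        os.close(r)
        os.waitpid(pid, 0)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    stop.set()
    bystander.join()
    return counts["parent"], counts["child"]
```

The parent count includes at least the main thread, the bystander, and the worker, while the child should see only the thread that called `os.fork()`.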
171+
That said, dead-thread *artifacts* still cross the fork
172+
boundary (the canonical "fork in a multithreaded program is
173+
dangerous" hazard; see `man pthread_atfork`). What persists, and
174+
how we handle each:
175+
176+
- **Inherited file descriptors** — the dead trio thread's
177+
epoll fd, signal-wakeup-fd, eventfds, sockets, IPC
178+
pipes, pytest's capture-fds, etc. are all still in the
179+
child's fd table (kernel-level inheritance). Handled by
180+
`_close_inherited_fds()` in the child prelude — walks
181+
`/proc/self/fd` and closes everything except stdio +
182+
the channel pipe to the forkserver.
183+
- **Memory image** — trio's internal data structures
184+
(scheduler, task queues, runner state) sit in COW
185+
memory but nobody's executing them. They get GC'd /
186+
overwritten when the child's fresh `trio.run()` boots.
187+
- **Python thread state** — handled automatically by
188+
CPython. `PyOS_AfterFork_Child()` calls
189+
`_PyThreadState_DeleteExceptCurrent()`, so dead
190+
`PyThreadState` objects are cleaned and
191+
`threading.enumerate()` returns just the surviving
192+
thread.
193+
- **User-level locks (`threading.Lock`)** —
194+
held-by-dead-thread state is the canonical fork hazard.
195+
Not an issue in practice for tractor: trio doesn't hold
196+
cross-thread locks across fork (its synchronization is
197+
within the trio task system, which doesn't survive in
198+
either direction). CPython's GIL is auto-reset by the
199+
fork callback.
200+
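As a rough illustration of the fd cleanup described in the list above, here is a Linux-only sketch that walks `/proc/self/fd` and closes everything outside a keep-set. It is an assumption about the general shape of such a helper, not the body of tractor's `_close_inherited_fds()`; the names `close_fds_except` and `demo`, and the `keep` parameter, are hypothetical.

```python
import os


def close_fds_except(keep: frozenset[int] = frozenset({0, 1, 2})) -> list[int]:
    """Close every open fd not in `keep`; return the fds actually closed."""
    # Snapshot first: listing the directory opens a temporary fd of its
    # own, which is already closed (EBADF) by the time we reach it below.
    candidates = [int(name) for name in os.listdir("/proc/self/fd")]
    closed = []
    for fd in candidates:
        if fd in keep:
            continue
        try:
            os.close(fd)
        except OSError:
            continue
        closed.append(fd)
    return closed


def demo() -> bool:
    """Fork, clean the child's fd table, report whether a stray fd was closed."""
    stray_r, stray_w = os.pipe()  # stands in for an inherited trio/IPC fd
    chan_r, chan_w = os.pipe()    # stands in for the channel pipe we keep
    pid = os.fork()
    if pid == 0:
        ok = False
        try:
            close_fds_except(keep=frozenset({0, 1, 2, chan_w}))
            try:
                os.fstat(stray_w)
            except OSError:
                ok = True  # the stray fd is no longer valid in the child
            os.write(chan_w, b"\x01" if ok else b"\x00")
        finally:
            os._exit(0)
    os.close(chan_w)
    flag = os.read(chan_r, 1)
    os.waitpid(pid, 0)
    for fd in (stray_r, stray_w, chan_r):
        os.close(fd)
    return flag == b"\x01"
```

The cleanup runs only inside the forked child here, so the parent's descriptors are untouched; the kept pipe plays the role of the channel back to the forkserver.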
201+
FYI: how this dodges the `trio.run()` × `fork()` hazards
202+
--------------------------------------------------------
203+
204+
`os.fork()` is famously hostile to `trio` (see
205+
python-trio/trio#1614 et al.) because trio owns several
206+
classes of process-global state that all break across the
207+
fork boundary in different ways. The forkserver-thread
208+
design dodges each class explicitly:
209+
210+
- **Signal-wakeup-fd**: trio installs a wakeup-fd via
211+
`signal.set_wakeup_fd()` on `trio.run()` startup so
212+
signals can interrupt `epoll_wait`. The child inherits
213+
this fd, but trio's runner that owns it is gone — so
214+
any signal delivery in the child writes to a dead
215+
reader. *Dodge*: the inherited wakeup-fd is closed by
216+
`_close_inherited_fds()`, then the child's own
217+
`trio.run()` installs a fresh one.
218+
- **`epoll`/`kqueue` instance**: trio's I/O backend holds
219+
one. Inherited as a dead fd; same fix as above.
220+
- **Threadpool cache threads** (`trio.to_thread`): worker
221+
threads with cached tstate. Don't exist in the child
222+
(POSIX); cache state is meaningless garbage that gets
223+
reset when the child's `trio.run()` initializes its own
224+
thread cache.
225+
- **Cancel scopes / nurseries / open `trio.Process` /
226+
open sockets**: these are trio-runtime objects, not
227+
kernel objects. The runtime that owns them is gone in
228+
the child, so the Python objects exist as zombie data
229+
in COW memory and get overwritten as the child runs.
230+
Inherited *kernel* fds those objects wrapped (sockets,
231+
proc pipes) are caught by `_close_inherited_fds()`.
232+
- **`atexit` handlers**: trio doesn't register any that
233+
would mis-fire post-fork; trio's lifetime-stack is
234+
all `with`-block-scoped and dies with the runner.
235+
- **Foreign-language I/O state** (libcurl, OpenSSL session
236+
caches, etc.): out of scope — same hazard as any
237+
fork-without-exec; users layering those on top of
238+
tractor need their own `pthread_atfork` handlers.
239+
240+
Net effect: for the runtime surface tractor controls
241+
(trio + IPC layer + msgspec), the forkserver-thread
242+
isolation + `_close_inherited_fds()` cleanup gives the
243+
forked child a clean trio environment. Everything else
244+
falls under the standard fork-without-exec disclaimer.
53 245
54 246
Status
55 247
------
@@ -100,7 +292,7 @@
100 292
Full analysis + audit plan for when we can revisit is in
101 293
`ai/conc-anal/subint_forkserver_thread_constraints_on_pep684_issue.md`.
102 294
Intent: file a follow-up GH issue linked to #379 once
103-
[jcrist/msgspec#563](https://github.com/jcrist/msgspec/issues/563)
295+
[jcrist/msgspec#1026](https://github.com/jcrist/msgspec/issues/1026)
104 296
unblocks isolated-mode subints in tractor.
105 297
106 298
See also
