Background
Today the actor lifecycle is split across two tightly coupled components:
- atelet: a node-level DaemonSet that the control plane talks to
- ateom: the worker component inside the worker pod that actually runs, checkpoints, and restores actors
To create a single worker, the control plane today coordinates two RPCs:
one to atelet and one to ateom-gvisor. The two processes also share
state through a host bind mount at /run/ateom-gvisor so they can hand
off snapshot files.
Problem
This split presents three structural issues:
- Two-component coordination. Every worker-lifecycle operation is a
distributed transaction across atelet and ateom. Failures and
partial states have to be reconciled by callers, and upgrades have to
keep the two binaries version-compatible. Debugging means reading two
sets of logs and reasoning about the handoff between them.
- Backend lock-in. The split assumes the gVisor model (a node agent
plus an in-sandbox helper). Adding a different worker backend
(Firecracker, for example) is harder than it should be, because support
for it has to be built into both components.
- The shared host /run mount widens the blast radius. The atelet ↔ ateom
handoff requires a host bind mount on /run/ateom-gvisor. A
misbehaving sandbox that fills that directory can exhaust /run on
the node and take down every other pod on it. With per-pod state
(no host mount), one bad sandbox only takes itself down.
Proposal
Remove atelet and consolidate its responsibilities into ateom
(running per-worker-pod), exposing a single control-plane-facing
interface. Concretely:
- Worker lifecycle RPCs (create / start / suspend / restore / destroy)
become a single call to the per-pod agent.
- The backend (gVisor today, others later) lives behind an interface
inside ateom; new backends plug in there.
- Snapshot/restore state stays inside the worker pod's own filesystem —
no host mount needed.
- In the future, we could potentially standardize the API that
ateom exposes, which would allow out-of-tree ateom implementations.
This consolidation is not possible today because of how ateom does networking, but once #110 is in, we can implement this proposal.