forkd-jupyter-spawner: JupyterHub Spawner adapter #21

@WaylandYang

Goal

Ship a forkd-jupyter-spawner Python package that implements
JupyterHub's Spawner interface, so JupyterHub admins can use forkd
as the per-user kernel backend instead of Docker, Kubernetes, or
local processes.

Why

The spawners in JupyterHub's current ecosystem (DockerSpawner, KubeSpawner,
SystemdSpawner, …) all pay a container or process cold-start cost per user.
forkd's recipes/jupyter-kernel/ parent already proves the SciPy
stack can be warmed once and shared across N children. The missing
piece is a Spawner class that talks to the forkd-controller daemon
on the host.

This is the natural production path for the "Where forkd fits" /
Jupyter-kernel use case currently surfaced in the README.

Scope

  • In scope. A standalone Python package forkd-jupyter-spawner
    (separate repo or sdk/python-spawner/ here) implementing
    a jupyterhub.spawner.Spawner subclass. Methods to wire:
    • start() → POST /v1/sandboxes on the forkd controller with
      a configured snapshot tag (e.g. jupyter-kernel).
    • poll() → GET /v1/sandboxes/:id to check liveness.
    • stop() → DELETE /v1/sandboxes/:id.
    • State persistence: get_state / load_state for surviving
      hub restarts.
    • Network: forward the kernel's ZMQ ports through the child's
      netns to the host so JupyterHub clients can connect.
  • Out of scope. Multi-host scheduling (the controller is single-
    host today); a custom Jupyter UI; replacing JupyterHub itself.
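
The method-to-endpoint mapping above could be sketched as follows. This is a minimal illustration using only the standard library, not the package's actual design: the class name ForkdLifecycle, the request field "snapshot_tag", the response field "id", and the treatment of any non-200 GET as "gone" are all assumptions, since the controller's payload schema is not specified here. A real implementation would subclass jupyterhub.spawner.Spawner and make these methods coroutines; the return convention for poll() (None while running, an integer exit status otherwise) follows JupyterHub's documented Spawner contract.

```python
import json
import urllib.error
import urllib.request


def http_json(method, url, body=None, token=None):
    """Send one JSON request; return (status_code, decoded_body_or_None)."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Content-Type", "application/json")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            raw = resp.read()
            return resp.status, (json.loads(raw) if raw else None)
    except urllib.error.HTTPError as err:
        return err.code, None


class ForkdLifecycle:
    """Hypothetical stand-in for the Spawner methods listed in scope."""

    def __init__(self, base_url, snapshot_tag="jupyter-kernel",
                 token=None, request=http_json):
        self.base_url = base_url.rstrip("/")
        self.snapshot_tag = snapshot_tag
        self.token = token
        self._request = request  # injectable so tests need no controller
        self.sandbox_id = None

    def start(self):
        # start() -> POST /v1/sandboxes ("snapshot_tag" and "id" are assumed names)
        status, body = self._request(
            "POST", f"{self.base_url}/v1/sandboxes",
            {"snapshot_tag": self.snapshot_tag}, self.token)
        if status not in (200, 201) or not body:
            raise RuntimeError(f"sandbox creation failed: HTTP {status}")
        self.sandbox_id = body["id"]
        return self.sandbox_id

    def poll(self):
        # poll() -> GET /v1/sandboxes/:id; None means "still running"
        if self.sandbox_id is None:
            return 0
        status, _ = self._request(
            "GET", f"{self.base_url}/v1/sandboxes/{self.sandbox_id}",
            None, self.token)
        return None if status == 200 else 1

    def stop(self):
        # stop() -> DELETE /v1/sandboxes/:id
        if self.sandbox_id is None:
            return
        self._request(
            "DELETE", f"{self.base_url}/v1/sandboxes/{self.sandbox_id}",
            None, self.token)
        self.sandbox_id = None

    def get_state(self):
        # Persist only what is needed to find the sandbox after a hub restart.
        return {"sandbox_id": self.sandbox_id} if self.sandbox_id else {}

    def load_state(self, state):
        self.sandbox_id = state.get("sandbox_id")
```

Keeping the HTTP call injectable means the lifecycle logic can be unit-tested with a fake transport before a controller is available; in a real Spawner subclass, get_state and load_state would also call the parent implementations.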

Known unknowns

  • Kernel readiness signal. The forkd-controller doesn't currently
    expose "the in-VM kernel is listening on ZMQ ports X/Y/Z."
    Either (a) extend the controller to forward arbitrary TCP ports
    through netns + return their host-side addresses, or (b) use the
    existing forkd-agent on :8888 as a control plane and have it
    start ipykernel and report ports back.
  • Resource limits. JupyterHub admins expect per-user memory /
    cpu quotas. forkd has cgroup memory.max today; cpu/io/pids quotas
    are still TODO.
  • Authentication. The Hub authenticates the user, but forkd-controller
    expects a bearer token, so the spawner needs to supply one from its
    configuration.
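
Until the readiness question above is settled, one stopgap a spawner could use is to probe the forwarded ports from the host side. The sketch below assumes option (a), i.e. that the controller somehow returns host-side (host, port) pairs; the function names and the timeout values are illustrative only. Note that a successful TCP connect only shows that something is accepting connections on the port, not that the kernel has finished initializing.

```python
import socket
import time


def port_is_open(host, port, connect_timeout=0.2):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def wait_for_ports(addresses, timeout=1.0, interval=0.02):
    """Poll until every (host, port) accepts connections or the timeout elapses.

    Returns True if all ports became reachable, False otherwise.
    """
    deadline = time.monotonic() + timeout
    pending = list(addresses)
    while True:
        # Keep only the addresses that still refuse connections.
        pending = [addr for addr in pending if not port_is_open(*addr)]
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

If option (b) is chosen instead, the same polling loop could wrap a request to the forkd-agent rather than a raw connect.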

Acceptance criteria

  • pip install forkd-jupyter-spawner works end-to-end.
  • A minimal JupyterHub config,
    c.JupyterHub.spawner_class = 'forkd_jupyter_spawner.ForkdSpawner',
    spawns a user kernel against a forkd snapshot tag.
  • Per-user kernel cold-start (hub start → cell-1 ready) is < 1 s
    given a warmed parent on the same host.
  • Tests: one integration test that brings up a local JupyterHub,
    spawns 5 kernels in parallel, asserts each can execute a numpy
    expression.
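
The parallel-spawn test and the < 1 s budget could be structured roughly as below. This sketches only the concurrency harness: spawn_and_run is a hypothetical coroutine that would spawn a kernel for one user and return the result of a numpy expression, and fake_spawn_and_run is a stub standing in for it so the harness runs without JupyterHub or forkd.

```python
import asyncio
import time


async def spawn_all(spawn_and_run, usernames, budget_s=1.0):
    """Run spawn_and_run for every user concurrently.

    Returns (results_by_user, users_that_exceeded_the_budget).
    """
    async def timed(name):
        t0 = time.perf_counter()
        result = await spawn_and_run(name)
        return name, result, time.perf_counter() - t0

    rows = await asyncio.gather(*(timed(name) for name in usernames))
    slow = [name for name, _, elapsed in rows if elapsed >= budget_s]
    return {name: result for name, result, _ in rows}, slow


async def fake_spawn_and_run(name):
    """Stub: pretends a kernel evaluated int(numpy.arange(4).sum())."""
    await asyncio.sleep(0.01)
    return 6


if __name__ == "__main__":
    users = [f"user{i}" for i in range(5)]
    results, slow = asyncio.run(spawn_all(fake_spawn_and_run, users))
    assert results == {u: 6 for u in users}
    assert not slow, f"kernels over budget: {slow}"
```

In the real integration test, the stub would be replaced by code that requests a spawn through the hub and executes the expression in the resulting kernel; the per-user timing then doubles as a check on the cold-start criterion.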

Related

  • Recipe: recipes/jupyter-kernel/ (already shipped — proves the
    parent-rootfs story; spawner is the JupyterHub-side glue.)
  • README "Where forkd fits" → first bullet now explicitly names
    Jupyter-kernel sandboxes as the design point.

Labels: enhancement, help wanted
