
feat(api): rootless Podman OS sandbox for untrusted compiles#21

Merged
frederikbeimgraben merged 4 commits into main from feat/pytex-api-sandbox-podman
Jun 4, 2026

@frederikbeimgraben
Owner

Summary

Stacked on #19. Implements the rootless Podman option from the sandbox proposal: non-trusted (UNTRUSTED/SANDBOXED) tectonic compiles run inside an ephemeral, network-less, read-only container. render_blob stays sandbox-agnostic — the wrapper is chosen by the TrustPolicy in compile_to_pdf, and TRUSTED builds run un-containerised as before.

Base = feat/pytex-api-module (PR #19), not main. Review/merge #19 first.

What's implemented

New module pytex_api._sandbox:

  • build_podman_cmd — pure, unit-tested argv builder. Hardening flags:
    • --network none (no network at all; untrusted tectonic also runs --only-cached)
    • --read-only rootfs + size-capped /tmp tmpfs for scratch
    • --cap-drop ALL, --security-opt no-new-privileges, Podman's default seccomp profile (never weakened to unconfined)
    • --memory / --pids-limit / --cpus — cgroups v2 resource caps
    • Mounts: pre-warmed bundle cache as an ephemeral overlay (:O — writable view, host cache never mutated), the per-request workdir as the only read-write mount (SELinux-relabelled :Z), host font dirs read-only.
  • SandboxConfig — image, cache dir, font dirs, caps, seccomp profile, tectonic_in_image.
  • build_sandbox_image / sandbox_image_present — the privileged warm-up. Builds a Fedora image carrying tectonic (installed via its official binary + the libs it dynamically links — graphite2, openssl — since Fedora has no tectonic package) plus fontconfig. tectonic runs from the image, so no dynamically-linked host binary is smuggled into a minimal base. The image is never pulled/built at request time.
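
The flag set above can be sketched as a pure argv builder. This is a minimal illustration, not the code from `pytex_api._sandbox`: the names `SandboxSketch` and `sketch_podman_cmd`, the field names, the default caps, and the in-container paths `/cache` and `/work` are all assumptions made for the example. Only the Podman flags themselves are taken from the list above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SandboxSketch:
    """Hypothetical stand-in for SandboxConfig; field names are assumptions."""

    image: str
    cache_dir: Path
    font_dirs: tuple[Path, ...] = ()
    memory_bytes: int = 1 << 30       # assumed default: 1 GiB
    pids_limit: int = 256             # assumed default
    cpus: float = 1.0                 # assumed default
    tmp_size_bytes: int = 256 << 20   # assumed default: 256 MiB scratch


def sketch_podman_cmd(
    cfg: SandboxSketch, workdir: Path, name: str, inner: list[str]
) -> list[str]:
    """Build the `podman run` argv without executing anything (pure)."""
    argv = [
        "podman", "run", "--rm", "--name", name,
        "--network", "none",                        # no network at all
        "--read-only",                              # read-only rootfs
        "--tmpfs", f"/tmp:rw,size={cfg.tmp_size_bytes}",
        "--cap-drop", "ALL",
        "--security-opt", "no-new-privileges",
        "--memory", str(cfg.memory_bytes),          # cgroup caps
        "--pids-limit", str(cfg.pids_limit),
        "--cpus", str(cfg.cpus),
        # :O = ephemeral overlay, so the host cache is never mutated.
        "--volume", f"{cfg.cache_dir}:/cache:O",
        # :Z = private SELinux relabel; the only read-write mount.
        "--volume", f"{workdir}:/work:Z",
        "--workdir", "/work",
    ]
    for font_dir in cfg.font_dirs:
        argv += ["--volume", f"{font_dir}:{font_dir}:ro"]
    return [*argv, cfg.image, *inner]
```

Because the builder only returns a list, the flag and mount assertions can run without Podman installed, which is presumably what makes the "pure, unit-tested" split attractive.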

compile_to_pdf wiring:

  • Sandbox is used only when podman is present and the image already exists locally (never pulls on the request path). Otherwise the initial commit fell back to the in-process setrlimit/timeout floor with an explicit console.warn, so the weaker confinement was never silent; the review follow-up commit makes UNTRUSTED/SANDBOXED PDF builds fail closed with CompileError instead.
  • cgroup flags replace the in-process rlimits on the Podman path (an rlimit on the podman client would not reach the build inside).
  • On the wall-clock kill, the named container is force-removed (podman rm -f) so the build cannot outlive the request.
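
The wall-clock kill can be sketched as follows. This is an illustration under assumptions, not the `compile_to_pdf` code: `run_with_cleanup` and its injectable `run` parameter are hypothetical names, and the 30-second cleanup timeout is an arbitrary choice. The point it shows is that a timeout on `subprocess.run` only kills the `podman` client, so the named container is removed explicitly afterwards.

```python
import subprocess
from collections.abc import Callable, Sequence


def run_with_cleanup(
    argv: Sequence[str],
    name: str,
    timeout_s: float,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> subprocess.CompletedProcess:
    """Run a `podman run --name <name> ...` argv under a wall-clock limit."""
    try:
        return run(list(argv), capture_output=True, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        # The timeout killed the client process only; force-remove the
        # container by name so the build cannot outlive the request.
        run(
            ["podman", "rm", "-f", name],
            capture_output=True,
            timeout=30,  # assumed bound for the cleanup call
            check=False,
        )
        raise
```

Passing `run` as a parameter is only there so the timeout branch can be exercised with a fake runner in a unit test.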

Podman test proof (rootless, no sudo)

Podman 5.8.1 was already installed — the sudo exception was not needed (nothing installed via sudo, no system paths touched). All container work ran rootless as the user.

$ podman info  -> Rootless: true
# image built rootless from the in-code Containerfile (microdnf + official tectonic installer)
$ podman run --rm --network none localhost/pytex-tectonic:latest tectonic --version
Tectonic 0.16.9

# live untrusted build through the sandbox (render_blob, TrustLevel.UNTRUSTED, PDF):
should_sandbox(UNTRUSTED): True
SANDBOX PDF: b'%PDF-1.5' bytes: 7776 dur: 3.7s

# confinement spot-checks with the production flags:
no network        -> network blocked (good)   # curl in container fails
--read-only       -> touch /pwned -> Read-only file system (good)
--tmpfs /tmp      -> touch /tmp/ok -> writable (good)

The opt-in integration test (PYTEX_TEST_PODMAN=1) runs the same end-to-end: warm-up (download tectonic + bundle), then compile untrusted input fully offline through the sandbox — 15 passed.

Quality gates

  • Tests: 15 new in tests/pytex_api/test_sandbox.py (14 pure flag/mount asserts + 1 opt-in live build). Full suite 811 passed, 2 skipped (skips = the real-tectonic PDF build from #19, "feat(api): blob-in/blob-out render module with trust levels", and the opt-in Podman build; both gated on env/availability).
  • basedpyright src: 0 errors, 0 warnings (the CI gate).
  • ruff: ruff format --check and ruff check on src tests both clean.

Notes / limitations

  • The sandbox image is a privileged warm-up (build_sandbox_image, needs network for dnf + the tectonic installer). Production must build it out of band; an untrusted request never triggers a build or pull.
  • Bundle cache is mounted overlay (:O) so it must be pre-warmed too (a TRUSTED build, or any prior compile, populates ~/.cache/Tectonic).
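  • setrlimit+timeout remain the fallback floor when podman/image are absent and the policy does not require a sandbox; the warning makes the downgrade visible. After the review follow-up commit, UNTRUSTED/SANDBOXED PDF builds set require_sandbox and raise CompileError rather than downgrading.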
  • No new docs/ dir added.

🤖 Generated with Claude Code

Frederik Beimgraben and others added 4 commits June 4, 2026 11:06
Wrap the tectonic compile of non-trusted builds in a rootless `podman run`
container, selected by the TrustPolicy. `render_blob` stays sandbox-agnostic;
`compile_to_pdf` chooses the Podman path for UNTRUSTED/SANDBOXED when podman
is available and the pre-built image is present locally, and falls back to the
in-process setrlimit/timeout floor otherwise (with a warning, never silently).

New `pytex_api._sandbox`:

* `build_podman_cmd` - pure, unit-tested argv builder. Flags: `--network none`,
  `--read-only` rootfs + size-capped `/tmp` tmpfs, `--cap-drop ALL`,
  `--security-opt no-new-privileges`, Podman's default seccomp profile (never
  weakened to unconfined), `--memory`/`--pids-limit`/`--cpus` cgroup caps.
  Mounts: pre-warmed bundle cache as an ephemeral overlay (`:O`, writable view,
  host cache never mutated), the per-request workdir as the only read-write
  mount (SELinux-relabelled `:Z`), and existing host font dirs read-only.
* `SandboxConfig` - image, cache dir, font dirs, resource caps, seccomp profile,
  `tectonic_in_image`.
* `build_sandbox_image` / `sandbox_image_present` - privileged warm-up that
  builds a Fedora image carrying tectonic (installed via its official binary
  plus the libs it links: graphite2, openssl) + fontconfig. tectonic runs from
  the image, so no dynamically-linked host binary is smuggled into a minimal
  base. The image is never pulled/built at request time.

The container has no network (`--network none`) and untrusted tectonic also
runs `--only-cached`, so a build cannot fetch anything. cgroup caps replace the
in-process rlimits on this path (an rlimit on the podman client would not reach
the container); a wall-clock kill force-removes the container so the build
cannot outlive the request.

Tests: 14 pure flag/mount assertions + 1 opt-in live build
(`PYTEX_TEST_PODMAN=1`) that pre-warms then compiles untrusted input fully
offline through the sandbox. Full suite green; basedpyright 0/0; ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… image

Address review of the rootless-Podman sandbox:

BLOCKERS
- Fail-closed: TrustPolicy gains `require_sandbox` (on for UNTRUSTED/SANDBOXED).
  When no usable sandbox is available, a non-trusted PDF build now raises
  CompileError instead of downgrading to the in-process floor — that floor does
  NOT block \input/\include/\openin of host files, so a silent fallback let
  untrusted LaTeX read host files into the PDF.
- Disk/OOM: cap the PDF via stat() *before* reading it back (a huge output no
  longer loads into the process and OOMs), and restore the per-file write cap
  lost on the container path via `--ulimit fsize` (RLIMIT_FSIZE was gone there);
  /tmp tmpfs stays size-capped.
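
The fail-closed rule can be read as a small truth table. The sketch below is an assumption-laden illustration: `PolicySketch`, `default_policy`, `choose_backend`, and the backend labels are hypothetical names, and only `require_sandbox`, `CompileError`, and the trust levels come from the message above.

```python
from dataclasses import dataclass
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    SANDBOXED = "sandboxed"
    UNTRUSTED = "untrusted"


class CompileError(RuntimeError):
    """Stand-in for the project's CompileError."""


@dataclass(frozen=True)
class PolicySketch:
    """Hypothetical reduction of TrustPolicy to the two fields used here."""

    level: TrustLevel
    require_sandbox: bool


def default_policy(level: TrustLevel) -> PolicySketch:
    # require_sandbox is on for every non-trusted level.
    return PolicySketch(level, require_sandbox=level is not TrustLevel.TRUSTED)


def choose_backend(policy: PolicySketch, podman_present: bool, image_present: bool) -> str:
    """Return which confinement a PDF build would run under."""
    if policy.level is TrustLevel.TRUSTED:
        return "host"            # un-containerised, as before
    if podman_present and image_present:
        return "podman"          # usable sandbox
    if policy.require_sandbox:
        # Fail closed: the in-process floor does not stop \input of host files.
        raise CompileError("no usable sandbox for a non-trusted build")
    return "rlimit-floor"        # explicit opt-out only; caller should warn
```

With the defaults, the `rlimit-floor` branch is reachable only when a caller constructs a policy that turns `require_sandbox` off on purpose.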

SHOULD
- Supply chain: base image pinned by @sha256 digest (not :latest); tectonic
  installed from a pinned release URL with sha256 verification instead of
  piping the latest installer to a shell.
- Memory floor: `--memory` is always emitted and floored at MEMORY_FLOOR_BYTES,
  so a 0/negative limit can no longer drop the cap and leave the container
  unbounded.
- Fonts: documented that system font dirs are mounted plain `:ro` — a `:z`
  relabel of shared system dirs is rejected rootless; default (lmodern) fonts
  ship in the bundle, fontspec system fonts are a documented caveat.
- Version-matched cache: `warm_sandbox_cache` populates the bundle cache with
  the image's OWN tectonic (using the real render pipeline's LaTeX), so the
  offline (--network none + --only-cached) build gets a cache hit instead of a
  failed fetch.
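
The memory floor amounts to clamping before emitting the flag. A minimal sketch, assuming a floor of 64 MiB (the real value of `MEMORY_FLOOR_BYTES` is not stated here) and a hypothetical helper name `memory_flag`:

```python
MEMORY_FLOOR_BYTES = 64 * 1024 * 1024  # assumed value, for illustration only


def memory_flag(limit_bytes: int) -> list[str]:
    """Always emit --memory, never below the floor.

    A zero or negative limit is clamped up instead of dropping the flag,
    which would leave the container without a memory cap.
    """
    return ["--memory", str(max(limit_bytes, MEMORY_FLOOR_BYTES))]
```

For example, `memory_flag(0)` and `memory_flag(-1)` both yield the floor, while a limit above the floor passes through unchanged.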

NICE-TO-HAVE
- /tmp mounted noexec,nosuid,nodev; XDG_CACHE_HOME pinned into the tmpfs.
  (--user/seccomp/Landlock left as future hooks.)

Tests: +15 (truth table incl. fail-closed + host-binary branch, fail-closed
raise, timeout->`podman rm -f` cleanup, fallback warning, stat-before-read,
new hardening flags). Live opt-in build verified end-to-end offline. Full suite
826 passed / 2 skipped; basedpyright 0/0; ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…k guard

Address round-2 review of the Podman sandbox:

S1 (SHOULD): warm_sandbox_cache relabelled the shared bundle cache with :Z
(private MCS category). The cache is the shared lower layer that later request
containers read via :O and that the host-side TRUSTED build reads directly, so
a private category would deny those readers under enforcing SELinux - defeating
the warm-up. Use :z (shared) instead; allowed because the cache lives in the
user's $HOME. Verified end-to-end on an enforcing-SELinux host (MLS/MCS on):
build image -> warm (:z) -> UNTRUSTED render fully offline -> PDF.

N1: refuse a symlinked output before stat()/read (defense-in-depth; not
reachable today without shell-escape, but cheap).
N3: floor --ulimit fsize at FSIZE_FLOOR_BYTES and always emit it, mirroring
--memory, so a 0/negative limit cannot silently drop the per-file write cap.
N4: bump pinned tectonic 0.15.0 -> 0.16.9 (still sha256-verified).
N5: confirmed --ulimit fsize takes raw bytes on Podman (fsize=1048576 caps a
file at exactly 1 MiB), so the existing value is correct.
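
The output guard (symlink refusal from N1, combined with the stat-before-read cap from the previous round) might look roughly like this. It is a sketch for POSIX hosts: `read_pdf_capped` and `OutputRejected` are hypothetical names, and the real code may order or report the checks differently.

```python
import os
import stat
from pathlib import Path


class OutputRejected(RuntimeError):
    """Hypothetical error type for an output file that must not be read."""


def read_pdf_capped(path: Path, max_bytes: int) -> bytes:
    """Read the build output only if it is a regular, size-capped file."""
    info = os.lstat(path)  # lstat: inspect the link itself, do not follow it
    if stat.S_ISLNK(info.st_mode):
        raise OutputRejected("output is a symlink")
    if not stat.S_ISREG(info.st_mode):
        raise OutputRejected("output is not a regular file")
    if info.st_size > max_bytes:
        # Checked before any read, so a huge file never enters memory.
        raise OutputRejected("output exceeds the size cap")
    # O_NOFOLLOW makes open fail if the path was swapped for a symlink
    # between the lstat above and this call.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    with os.fdopen(fd, "rb") as fh:
        data = fh.read(max_bytes + 1)  # bounded read, even if the file grew
    if len(data) > max_bytes:
        raise OutputRejected("output grew past the size cap")
    return data
```

The bounded `read` is a second line of defence in case the file changes size after the stat.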

Also: warm_sandbox_cache now uses --network host for the one-time privileged
warm-up so the bundle can be fetched where rootless slirp/pasta networking is
unavailable; the untrusted path is unchanged (always --network none).

Tests: fsize-floor cases, symlink-output rejection; existing fsize test updated
for the floor. Full suite 828 passed / 2 skipped; basedpyright 0/0; ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Round-3 close-out. Add a guard test asserting the untrusted compile argv is
always offline (--network none) and never carries the `host` netns that only
the one-time privileged warm-up uses, for both tectonic_in_image True/False.
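
A guard of this kind can be written against the argv alone. The sketch below is illustrative: `network_modes` and `assert_offline` are hypothetical helpers, not the test added in this commit, and checking the `--net` spelling as well is a defensive assumption.

```python
def network_modes(argv: list[str]) -> list[str]:
    """Collect every network mode requested in a podman argv."""
    modes: list[str] = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in ("--network", "--net") and i + 1 < len(argv):
            modes.append(argv[i + 1])  # "--network none" form
            i += 2
            continue
        if arg.startswith(("--network=", "--net=")):
            modes.append(arg.split("=", 1)[1])  # "--network=none" form
        i += 1
    return modes


def assert_offline(argv: list[str]) -> None:
    """Fail unless the argv requests exactly one network mode: none."""
    modes = network_modes(argv)
    if modes != ["none"]:
        raise AssertionError(f"untrusted argv is not offline: {modes!r}")
```

Requiring exactly `["none"]` also rejects an argv with no network flag at all, and one where a later `--network=host` is appended after `--network none`.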

Verified live on this enforcing-SELinux host (getenforce=Enforcing, MLS/MCS on):
- build image (tectonic 0.16.9, sha-pinned) -> warm_sandbox_cache (:z shared
  relabel, --network host) on the $HOME cache -> UNTRUSTED render fully offline
  (--network none, :O overlay) -> valid PDF. No SELinux denial, no cache miss.
- S1: warm-populated cache is labelled container_file_t:s0 (shared, NO private
  MCS category), so it is readable by later request containers via the :O lower
  layer (untrusted render succeeds) AND by the host-side path (host tectonic
  --only-cached reads ~/.cache/Tectonic offline -> PDF).
- Confinement intact: network blocked, rootfs read-only, /tmp (noexec) writable.

Full suite 829 passed / 2 skipped; basedpyright 0/0; ruff format+check clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@frederikbeimgraben frederikbeimgraben changed the base branch from feat/pytex-api-module to main June 4, 2026 10:23
@frederikbeimgraben frederikbeimgraben merged commit c406bc1 into main Jun 4, 2026
1 check passed