
fix(pytex_api): typed errors for malformed source + harden review LOW findings #22

Merged
frederikbeimgraben merged 1 commit into main from fix/api-hardening-and-review on Jun 4, 2026



@frederikbeimgraben
Owner

Summary

Red-team hardening of the pytex_api blob server library plus the main-review LOW findings. No behavioural change for valid input; all changes are error-surface / consistency / metadata.

Part A — pytex_api (this PR)

1. Malformed source → typed CompileError (Red-Team O1–O3)

Previously render_to_latex / get_tex_node let raw Exceptions escape on broken user input, so a caller could only map them to a blanket 500. render_blob now wraps the render step (_render_or_compile_error):

  • non-ApiError exceptions (Python SyntaxError, non-node __pytex__ → BuildError, eval failure in a pytex(...) replacement, Markdown parse error) → re-raised as CompileError with a generic message;
  • the underlying exception is chained (from exc) for server-side logs but never embedded in the message → no temp path / stacktrace leaks to the client;
  • our own typed ApiErrors (trust gate, limits, allowlist) pass through unchanged.

Callers can now cleanly return 400 on O1–O3. New tests in test_render_blob.py cover each vector + the message-cleanliness invariant + that typed errors still propagate.
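The wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the actual pytex_api code: the class bodies, the generic message text, the TrustError subclass, the helper signature and the 403 mapping are all assumptions made for the example. Only the names ApiError and CompileError and the chaining behaviour come from this PR.

```python
class ApiError(Exception):
    """Stand-in for the library's base class of typed, client-safe errors."""


class CompileError(ApiError):
    """Stand-in for the typed error raised when user source cannot be rendered."""


class TrustError(ApiError):
    """Hypothetical typed error, e.g. raised by a trust gate."""


# Hypothetical wording; the real message is not shown in this PR.
GENERIC_MESSAGE = "source could not be rendered"


def render_or_compile_error(render, source):
    """Run render(source), mapping untyped failures to CompileError."""
    try:
        return render(source)
    except ApiError:
        # Typed errors (trust gate, limits, allowlist) pass through unchanged.
        raise
    except Exception as exc:
        # Chain the cause for server-side logs, but keep it out of the message
        # so no temp path or traceback text reaches the client.
        raise CompileError(GENERIC_MESSAGE) from exc


def status_for(render, source):
    """Hypothetical caller-side mapping from error type to HTTP status."""
    try:
        render_or_compile_error(render, source)
    except CompileError:
        return 400  # malformed user input
    except ApiError:
        return 403  # assumption: other typed errors get their own status
    return 200
```

Because CompileError is itself an ApiError in this sketch, the caller has to catch it before the broader ApiError clause.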

2. main-review LOW findings

  • _security.__all__ completed: enforce_input_size, enforce_output_size, enforce_output_file_size, filter_assets, truncate_log.
  • PDF warnings: tectonic warning: lines from the compile log are now folded into result.warnings (_collect_warnings(stream + compile_log)), consistent with the TEX path.
  • SANDBOXED_EXTRA_PACKAGES re-exported from pytex_api.__all__ (consistency with DANGEROUS_PACKAGES / PACKAGE_ALLOWLIST); module docstring now documents all three trust levels.
  • The dict validated by filter_assets is now threaded into compile_to_pdf, which writes that dict instead of re-iterating req.assets with raw names → the workdir-escape guarantee no longer depends on call order.
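Two of the findings above, folding warning lines into the result and writing only the validated asset dict, can be illustrated with a short sketch. The function names carry a _sketch suffix because they are hypothetical stand-ins. The validation rules, the choice to drop unsafe names rather than raise, and the signatures are assumptions and may differ from the real filter_assets, _collect_warnings and compile_to_pdf.

```python
import tempfile
from pathlib import Path, PurePosixPath


def collect_warnings_sketch(log_text: str) -> list[str]:
    """Return every log line that starts with 'warning:' (after stripping)."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith("warning:")
    ]


def filter_assets_sketch(assets: dict[str, bytes]) -> dict[str, bytes]:
    """Keep only assets whose names are plain relative paths.

    Assumption: unsafe names are dropped here; the real helper may raise.
    """
    safe: dict[str, bytes] = {}
    for name, data in assets.items():
        path = PurePosixPath(name)
        unsafe = (
            not path.parts          # empty name or "."
            or path.is_absolute()   # e.g. "/etc/passwd"
            or ".." in path.parts   # parent-directory traversal
            or "\\" in name         # backslash separators
        )
        if not unsafe:
            safe[str(path)] = data
    return safe


def write_assets_sketch(workdir: Path, validated: dict[str, bytes]) -> list[Path]:
    """Write the already-validated dict; the raw request is never re-read."""
    written = []
    for name, data in validated.items():
        target = workdir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        written.append(target)
    return written


# Usage: validate once, then pass the validated dict on to the writer.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp) / "work"
    workdir.mkdir()
    raw = {"fig/a.png": b"ok", "../evil.tex": b"no", "/etc/passwd": b"no"}
    validated = filter_assets_sketch(raw)
    written = write_assets_sketch(workdir, validated)
    escaped = (Path(tmp) / "evil.tex").exists()
```

Since the writer only ever sees the validated dict, it cannot reintroduce a raw name even if the call order changes later.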

3. Version bump

0.4.7 → 0.5.0. 0.4.7 is the tag before any of these features; main now carries the whole pytex_api library + the Podman sandbox. Backward-compatible additions → minor bump.

5. examples/ linting

Added examples to ruff extend-exclude. It is outside the CI lint scope (ruff {check,format} src tests) and some files use 3.14 t-string syntax the py313 target cannot parse — so a bare ruff check now matches CI instead of choking. Mirrors the existing test_template.py exclusion.

Deliberately left as follow-up

4. Fill registry collision

pytex/commands/lengths.py:199 (Fill() length factory) and pytex_tikz/tikz.py:131 (Fill tikz class) register under the same Registry key (obj.__name__ == "Fill") → the winner in the pytex(...) eval namespace is import-order-dependent.

Not fixed here, by design. Both Fills are public API (direct imports, pytex_tikz.__init__ re-export, and tests on each). A clean fix requires either (a) renaming a public symbol — breaks downstream imports — or (b) changing Registry.add's key scheme to namespaced keys, which changes the eval-namespace key for every registered name (itself a public surface of the pytex(...) / Markdown-eval hatch). Both exceed a risk-free fix and warrant their own design decision. Flagging per instructions rather than forcing.

Verification

  • basedpyright src/pytex_api → 0 errors, 0 warnings.
  • ruff format --check + ruff check (whole tree) → clean.
  • pytest -q → 847 passed, 2 skipped.

Note: under local Python 3.14, basedpyright emits one pre-existing warning at src/pytex/__init__.py:96 (an unchanged file) — the # pyright: ignore[reportUnreachable] is required under CI's 3.13 target (branch unreachable there) and only looks redundant on 3.14. Removing it would break CI on 3.13, so it is intentionally untouched.

🤖 Generated with Claude Code

… findings

Map malformed/hostile source to a typed CompileError so callers can return
400 instead of a blanket 500. render_blob now wraps the render step: any
non-ApiError (Python SyntaxError, non-node __pytex__ surfaced as BuildError,
eval failure in a pytex(...) replacement, Markdown parse error) is re-raised as
CompileError with a generic message. The underlying exception is chained for
server-side logs but never embedded in the message, so no temp path or
stacktrace leaks to the client; our own ApiError subclasses pass through
unchanged. Covers Red-Team findings O1-O3.

Also address the main-review LOW findings:
- _security.__all__: add the public helpers it omitted (enforce_input_size,
  enforce_output_size, enforce_output_file_size, filter_assets, truncate_log).
- PDF path: collect tectonic `warning:` lines from the compile log into
  result.warnings, consistent with the TEX path.
- Re-export SANDBOXED_EXTRA_PACKAGES from pytex_api (alongside
  DANGEROUS_PACKAGES / PACKAGE_ALLOWLIST) and document the third (SANDBOXED)
  trust level in the module docstring.
- filter_assets now feeds the validated dict straight into compile_to_pdf,
  which writes *that* instead of re-iterating req.assets with raw names, so the
  workdir-escape guarantee no longer depends on call order.

Bump version 0.4.7 -> 0.5.0 (minor: main now carries the pytex_api library and
the Podman sandbox on top of the 0.4.7 tag).

Exclude examples/ from ruff (it is outside the CI lint scope and some files use
3.14 t-string syntax the py313 target cannot parse), so a bare `ruff check`
matches CI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@frederikbeimgraben frederikbeimgraben merged commit c090a5c into main Jun 4, 2026
1 check passed