fix(pytex_api): typed errors for malformed source + harden review LOW findings #22
Merged
… findings

Map malformed/hostile source to a typed CompileError so callers can return 400 instead of a blanket 500. render_blob now wraps the render step: any non-ApiError (Python SyntaxError, non-node __pytex__ surfaced as BuildError, eval failure in a pytex(...) replacement, Markdown parse error) is re-raised as CompileError with a generic message. The underlying exception is chained for server-side logs but never embedded in the message, so no temp path or stacktrace leaks to the client; our own ApiError subclasses pass through unchanged. Covers Red-Team findings O1-O3.

Also address the main-review LOW findings:

- _security.__all__: add the public helpers it omitted (enforce_input_size, enforce_output_size, enforce_output_file_size, filter_assets, truncate_log).
- PDF path: collect tectonic `warning:` lines from the compile log into result.warnings, consistent with the TEX path.
- Re-export SANDBOXED_EXTRA_PACKAGES from pytex_api (alongside DANGEROUS_PACKAGES / PACKAGE_ALLOWLIST) and document the third (SANDBOXED) trust level in the module docstring.
- filter_assets now feeds the validated dict straight into compile_to_pdf, which writes *that* instead of re-iterating req.assets with raw names, so the workdir-escape guarantee no longer depends on call order.

Bump version 0.4.7 -> 0.5.0 (minor: main now carries the pytex_api library and the Podman sandbox on top of the 0.4.7 tag).

Exclude examples/ from ruff (it is outside the CI lint scope and some files use 3.14 t-string syntax the py313 target cannot parse), so a bare `ruff check` matches CI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
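
The wrapping behaviour described in the commit message can be illustrated with a minimal sketch. This is not the library's code: `ApiError` and `CompileError` here are local stand-ins for the real classes, and `render_or_compile_error`, `hostile_render`, the generic message text, and the temp path are all hypothetical names and values chosen for illustration.

```python
class ApiError(Exception):
    """Stand-in for the library's typed error base class (assumption)."""


class CompileError(ApiError):
    """Stand-in for the typed error a caller can map to HTTP 400 (assumption)."""


def render_or_compile_error(render, source):
    """Run render(source); convert any non-ApiError into CompileError."""
    try:
        return render(source)
    except ApiError:
        # Typed errors (trust gate, limits, allowlist) pass through unchanged.
        raise
    except Exception as exc:
        # Generic message only; the original exception is chained via
        # `from exc` so it is available to server-side logging.
        raise CompileError("source could not be compiled") from exc


def hostile_render(source):
    # Hypothetical render step that fails with a path-bearing message.
    raise SyntaxError("invalid syntax in /tmp/pytex-abc123/input.py")


try:
    render_or_compile_error(hostile_render, "def (:")
except CompileError as err:
    client_message = str(err)    # what a caller might return with a 400
    logged_cause = err.__cause__  # what a server-side log could record
```

The `except ApiError: raise` clause comes first so that typed errors are never rewrapped; only the catch-all branch replaces the message.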
## Summary

Red-team hardening of the `pytex_api` blob server library plus the `main`-review LOW findings. No behavioural change for valid input; all changes are error-surface / consistency / metadata.

## Part A: `pytex_api` (this PR)

### 1. Malformed source → typed `CompileError` (Red-Team O1–O3)

Previously `render_to_latex` / `get_tex_node` let raw `Exception`s escape on broken user input, so a caller could only map them to a blanket 500. `render_blob` now wraps the render step (`_render_or_compile_error`):

- non-`ApiError` exceptions (Python `SyntaxError`, non-node `__pytex__` → `BuildError`, `eval` failure in a `pytex(...)` replacement, Markdown parse error) → re-raised as `CompileError` with a generic message;
- the underlying exception is chained (`from exc`) for server-side logs but never embedded in the message → no temp path / stacktrace leaks to the client;
- our own `ApiError`s (trust gate, limits, allowlist) pass through unchanged.

Callers can now cleanly return 400 on O1–O3. New tests in `test_render_blob.py` cover each vector + the message-cleanliness invariant + that typed errors still propagate.

### 2. `main`-review LOW findings

- `_security.__all__` completed: `enforce_input_size`, `enforce_output_size`, `enforce_output_file_size`, `filter_assets`, `truncate_log`.
- PDF path: tectonic `warning:` lines from the compile log are now folded into `result.warnings` (`_collect_warnings(stream + compile_log)`), consistent with the TEX path.
- `SANDBOXED_EXTRA_PACKAGES` re-exported from `pytex_api.__all__` (consistency with `DANGEROUS_PACKAGES` / `PACKAGE_ALLOWLIST`); module docstring now documents all three trust levels.
- The `filter_assets` validated dict is now threaded into `compile_to_pdf`, which writes that dict instead of re-iterating `req.assets` with raw names → the workdir-escape guarantee no longer depends on call order.

### 3. Version bump

`0.4.7` → `0.5.0`. `0.4.7` is the tag before any of these features; `main` now carries the whole `pytex_api` library + the Podman sandbox. Backward-compatible additions → minor bump.

### 5. `examples/` linting

Added `examples` to ruff `extend-exclude`. It is outside the CI lint scope (`ruff {check,format} src tests`) and some files use 3.14 t-string syntax the `py313` target cannot parse, so a bare `ruff check` now matches CI instead of choking. Mirrors the existing `test_template.py` exclusion.

## Deliberately left as follow-up

### 4. `Fill` registry collision

`pytex/commands/lengths.py:199` (`Fill()` length factory) and `pytex_tikz/tikz.py:131` (`Fill` tikz class) register under the same `Registry` key (`obj.__name__ == "Fill"`) → the winner in the `pytex(...)` eval namespace is import-order-dependent.

**Not fixed here, by design.** Both `Fill`s are public API (direct imports, `pytex_tikz.__init__` re-export, and tests on each). A clean fix requires either (a) renaming a public symbol, which breaks downstream imports, or (b) changing `Registry.add`'s key scheme to namespaced keys, which changes the eval-namespace key for every registered name (itself a public surface of the `pytex(...)` / Markdown-eval hatch). Both exceed a risk-free fix and warrant their own design decision. Flagging per instructions rather than forcing.

## Verification

- `basedpyright src/pytex_api` → 0 errors, 0 warnings.
- `ruff format --check` + `ruff check` (whole tree) → clean.
- `pytest -q` → 847 passed, 2 skipped.

🤖 Generated with Claude Code
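
For readers unfamiliar with the `Fill` collision left as follow-up, the following toy sketch shows why keying a registry on `obj.__name__` makes the winner depend on registration order. The `Registry` class here is a hypothetical stand-in, not the project's implementation; in particular, the last-registration-wins behaviour is an assumption, and the two `Fill` objects are simplified placeholders for the length factory and the tikz class.

```python
class Registry:
    """Hypothetical registry keyed by obj.__name__ (assumed behaviour)."""

    def __init__(self):
        self.items = {}

    def add(self, obj):
        # Assumption: a later registration under the same name silently
        # replaces the earlier one.
        self.items[obj.__name__] = obj
        return obj


def make_lengths_fill():
    # Placeholder for a function-style length factory named Fill.
    def Fill():
        return "length"
    return Fill


def make_tikz_fill():
    # Placeholder for a class named Fill.
    class Fill:
        kind = "tikz"
    return Fill


def winner(order):
    """Register objects in the given order and return whoever owns 'Fill'."""
    reg = Registry()
    for obj in order:
        reg.add(obj)
    return reg.items["Fill"]


lengths_fill = make_lengths_fill()
tikz_fill = make_tikz_fill()

# Same two objects, different registration order, different winner.
first_order = winner([lengths_fill, tikz_fill])
second_order = winner([tikz_fill, lengths_fill])
```

Under these assumptions, a namespaced key (option (b) in the description) would keep both entries, at the cost of changing every key that is visible in the eval namespace.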