chore: polish — tooling, rustdoc, CI, changelog#71

Merged
pratyush618 merged 9 commits into main from chore/polish on Apr 24, 2026

@pratyush618
Collaborator

Summary

Bundle of low-risk polish items from the audit. All mechanical, no behaviour changes.

  • chore: rust-toolchain.toml pins stable + rustfmt/clippy/wasm target so CI and contributors share an identical toolchain. justfile makes the commands documented in CLAUDE.md executable.
  • chore(async): paperjam-async only uses paperjam_core::render, but previously force-enabled signatures and validation on paperjam-core for every consumer. Now only render is enabled at the async layer; paperjam-py continues to enable the full feature set explicitly. Async-only consumers no longer compile the x509 / cms / rsa / p256 / sha1 / pkcs8 / spki / ureq / rustls / roxmltree tree.
  • docs: Crate-level //! rustdoc on every library crate in the workspace. Uniform plain-prose style; no intra-doc links in summaries. Also fixes two pre-existing rustdoc warnings ([OPTIONAL] literal in TSA parser; bare URL in annotations). cargo doc --workspace --no-deps is now warning-clean.
  • chore(ci): docs workflow now runs on pull requests (without deploying) so docs regressions surface pre-merge. Installs binaryen so wasm-pack auto-invokes wasm-opt and release WASM bundles shrink ~20-30%.
  • docs(changelog): [Unreleased] section records everything from #68 "chore: audit-driven cleanup (stubs, metadata, docs, release profile)", #69 "Security hardening: ZIP entry caps + MCP path sandbox", #70 "fix: eliminate panic surfaces on untrusted PDF input", and this polish branch.
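
The chore(async) item in the list above amounts to a narrower dependency declaration. A minimal sketch follows. The file path and the path-dependency layout are hypothetical, and the sketch assumes signatures and validation are not part of paperjam-core's default feature set. Only the feature names render, signatures, and validation come from this PR.

```toml
# paperjam-async/Cargo.toml (path and layout are assumptions)
[dependencies]
# Request only the render feature. Per the PR description, the list
# previously also forced "signatures" and "validation" on every consumer.
paperjam-core = { path = "../paperjam-core", features = ["render"] }
```

Cargo unifies features across the dependency graph, so a consumer such as paperjam-py that lists signatures and validation itself still gets them, while a crate that depends only on paperjam-async no longer pulls in their dependency tree.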

What's still outstanding from the audit

Items that need your input (out of scope for mechanical polish):

  • Layer discipline (paperjam-epub and paperjam-html): amend CLAUDE.md or extract a shared helper?
  • paperjam-studio scope: wire to engine or update CLAUDE.md to reflect "static file server for now"?
  • Rust test coverage strategy: 13/15 crates have zero tests; proptest/fuzz adoption?

Low-priority remnants that can be picked up any time:

  • Rust CI matrix (currently Ubuntu-only; macOS/Windows would increase CI time)
  • MSRV verification job (rust-version = "1.75" is declared but not verified in CI)
  • Full SafeZip (total-bytes budget, entry count cap, compression-ratio cap — incremental on the per-entry cap that's already landed)
  • calamine / docx-rs pin refresh (both sit on old minors)
  • md-5 / digest / block-buffer duplicate-version collapse

Test plan

  • cargo test --workspace — 11 existing tests pass (4 xlsx, 2 epub, 5 mcp)
  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo fmt --all --check
  • cargo doc --workspace --no-deps — zero warnings
  • uv run pytest tests/python/ — 88 passed, 4 skipped
  • pre-commit run --all-files — every hook passes

rust-toolchain.toml pins every contributor and CI invocation to the
same stable toolchain with rustfmt, clippy, and the
wasm32-unknown-unknown target. Previously CI used
dtolnay/rust-toolchain@stable while contributors installed their own;
minor version drift between them could produce clippy lint
discrepancies at merge time.
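
For reference, a rust-toolchain.toml matching that description could look like the sketch below. The three keys are standard rustup settings, but the committed file is not shown in this PR, so the exact values are an assumption.

```toml
# rust-toolchain.toml at the repository root (contents assumed, not copied from the PR)
[toolchain]
# Release channel that rustup selects for this repository.
channel = "stable"
# Extra components installed alongside rustc and cargo.
components = ["rustfmt", "clippy"]
# Additional compilation target needed for the wasm build.
targets = ["wasm32-unknown-unknown"]
```

rustup picks this file up automatically for any cargo or rustc invocation inside the repository. The channel key also accepts an explicit version such as "1.87.0" if a fixed release is wanted rather than whichever stable is currently installed.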

justfile captures the common build / test / lint commands documented
in CLAUDE.md as executable recipes. `just` (no args) prints the full
list, and the common flows (build, test, check, fmt, clean-all) are
one step each so local iteration matches the pre-commit chain.

paperjam-async currently only reaches into paperjam_core::render, yet
its manifest force-enabled the signatures and validation features on
paperjam-core for every consumer. Downstream crates that need those
features (paperjam-py does, explicitly) keep working unchanged;
lightweight async consumers no longer drag in the x509-parser / cms /
rsa / p256 / sha1 / pkcs8 / spki / ureq / rustls / roxmltree tree.

Every library crate now has a `//!` summary describing its scope,
its entry points, and how it fits into the broader paperjam
ecosystem. Uniform style: plain prose, no intra-doc links in
crate-level summaries (simpler to maintain, no rustdoc link
warnings to manage).

Also fixes two pre-existing rustdoc warnings uncovered along the
way: an `[OPTIONAL]` literal in signature/tsa.rs that rustdoc was
parsing as an intra-doc link, and a bare URL in model/annotations.rs
flagged for auto-linking. The PyO3 `PyDocument` and `PyPage` classes
get class-level docs that clarify they are the native layer beneath
the pure-Python `paperjam.Document` / `paperjam.Page` wrappers.

After this commit `cargo doc --workspace --no-deps` produces zero
warnings.

The docs workflow previously fired only on pushes to main, so docs
regressions (broken wasm builds, Docusaurus compile errors, bad
links) were invisible until after merge. Now PRs with matching
paths run the full build (without deploying) so problems surface in
the PR check run.

Also installs binaryen, whose wasm-opt binary wasm-pack invokes
automatically when present on PATH. Release-mode WASM bundles
shrink by 20-30% with no code changes.

Concurrency group is keyed on ref so PR builds and deploy builds
don't cancel each other; the deploy job is skipped on pull_request
events to preserve production pages behaviour.

Document the audit-driven work that has landed on main but hasn't
been cut into a release yet: the ZIP-entry and MCP sandbox security
hardening (#69), the panic-surface cleanup in the PDF engine (#70),
the form-bindings stub sync and metadata / docs refresh (#68), plus
the tooling, docs, and paperjam-async feature adjustments from this
polish branch.

@github-actions bot added the documentation, github_actions, rust, mcp, and wasm labels on Apr 24, 2026

Ubuntu's apt-shipped binaryen is ~v108, which predates the default
enablement of bulk-memory and sign-extension instructions in rustc
output. The result is wasm-pack invoking /usr/bin/wasm-opt on a
valid modern wasm module and wasm-opt rejecting it with
"[wasm-validator error] Bulk memory operation (bulk memory is
disabled)" — observed on the PR #71 run.

Download and install a pinned binaryen release tarball from the
upstream GitHub releases page. version_119 is known-good against
the current rustc and supports all default features. Future bumps
change one env var.

Harden the binaryen install step that landed in the previous commit:

- SHA256-pin the downloaded tarball (value verified against a local
  download of version_119). Guards against upstream tampering or an
  accidental silent swap.
- Split the version-check into a dedicated Verify step so the log
  shows the installed wasm-opt version unambiguously.
- Wrap the install in actions/cache keyed on the pinned version so
  subsequent runs skip the download. Saves ~3-5s per run.

rustc 1.82+ emits bulk-memory and sign-extension instructions in its
default wasm output. wasm-pack's baseline wasm-opt invocation ("-O")
does not pass --enable-bulk-memory / --enable-sign-ext, so even a
modern binaryen rejects the module with "Bulk memory operations
require bulk memory [--enable-bulk-memory]" during validation.

Configure the flags in paperjam-wasm's Cargo.toml metadata block so
wasm-pack invokes wasm-opt with the right feature set. This is what
was blocking CI #71 even after installing a modern binaryen.

Rust 1.87 / LLVM 20 enabled bulk-memory and nontrapping-fptoint in
the default wasm32-unknown-unknown feature set, alongside the
previously-defaulted multivalue, mutable-globals, reference-types,
and sign-ext. wasm-pack's baseline "-O" invocation of wasm-opt does
not pass any of them, so the optimiser rejects a perfectly valid
rustc-emitted module.

The previous commit only enabled bulk-memory and sign-ext, which
exposed a follow-on validator error on `i32.trunc_sat_f64_s`
(nontrapping-fptoint). Rather than re-play whack-a-mole for each
feature, pass the full list that matches the rustc default set
documented in the wasm32-unknown-unknown platform-support page.
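
The configuration described above goes in wasm-pack's metadata table in Cargo.toml. A sketch of what the full flag list might look like, assuming the release profile is the one being tuned and that "-O" stays as the optimisation level. The flags are binaryen's spellings of the six features named above. The file path is hypothetical, and the exact list committed in this PR is not shown here.

```toml
# paperjam-wasm/Cargo.toml (hypothetical path)
[package.metadata.wasm-pack.profile.release]
# wasm-pack passes these arguments to wasm-opt in release builds.
wasm-opt = [
    "-O",
    "--enable-bulk-memory",
    "--enable-nontrapping-float-to-int",
    "--enable-multivalue",
    "--enable-mutable-globals",
    "--enable-reference-types",
    "--enable-sign-ext",
]
```

Listing every default feature at once means the validator accepts whatever subset rustc actually emits, so the list should only need revisiting when the rustc default set changes again.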

Ref: https://doc.rust-lang.org/rustc/platform-support/wasm32-unknown-unknown.html

@pratyush618 merged commit ddd8909 into main on Apr 24, 2026 (15 checks passed)
@pratyush618 deleted the chore/polish branch on Apr 24, 2026 at 11:24
@pratyush618 mentioned this pull request on Apr 24, 2026 (3 tasks)