
v0.3.49 | Off-byte-0 PDF header recovery, sparse-trailer Catalog discovery, a render-path thread-safety fix, and release-automation hardening.

@github-actions released this 16 May 08:20 · 174 commits to main since this release · 18ad69e

Fixed

  • Linearized PDFs with a non-zero %PDF- header offset (#509) — files
    whose %PDF- header is preceded by leading bytes (e.g. a
    captive-portal HTML redirect injected ahead of a Linearized PDF) are
    now read instead of rejected with Trailer missing /Root entry. Three
    changes make this work: the xref-offset shift for header-offset PDFs
    no longer requires the final trailer to carry /Root; xref
    reconstruction now rejects a parsed-but-/Root-less trailer and falls
    through to Catalog discovery; and catalog() scans for /Type /Catalog
    when the trailer omits /Root (matching Poppler / PDFium behaviour,
    ISO 32000-2 §7.5.2 / 1.7 Implementation Note G.6).
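
As a rough illustration of the recovery path described for #509, the Python sketch below locates a %PDF- marker that is not at byte 0, shifts a recorded xref offset by that amount, and falls back to scanning for a /Type /Catalog object when no /Root is available. It is not pdf_oxide's implementation: the function names, the 1024-byte search window, and the regex-based scan are all assumptions made for the example, and the scan ignores catalogs stored inside compressed object streams.

```python
import re

PDF_MARKER = b"%PDF-"
SEARCH_WINDOW = 1024  # assumption: only look for the header near the start of the file


def find_header_offset(data: bytes) -> int:
    """Return the byte offset of the %PDF- marker, or raise if it is absent."""
    pos = data.find(PDF_MARKER, 0, SEARCH_WINDOW)
    if pos < 0:
        raise ValueError("no %PDF- header found in the search window")
    return pos


def shift_xref_offset(recorded: int, header_offset: int) -> int:
    """Translate an offset recorded relative to %PDF- into a real file offset."""
    return recorded + header_offset


# Matches "N G obj ... /Type /Catalog" without crossing an "endobj" boundary.
CATALOG_RE = re.compile(
    rb"(\d+)\s+(\d+)\s+obj\b(?:(?!endobj).)*?/Type\s*/Catalog\b",
    re.DOTALL,
)


def discover_catalog(data: bytes):
    """Fallback for a trailer without /Root: return (object, generation) or None."""
    match = CATALOG_RE.search(data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

With a file that has junk prepended, find_header_offset returns the length of the junk, and every offset read from the xref data is passed through shift_xref_offset before seeking.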

  • Render-path data race under concurrent rendering (#505) — the
    process-wide embedded-font classification cache keyed on
    Arc::as_ptr could return a stale (is_byte_indexed,
    has_unicode_cmap) pair for an unrelated font when an allocation
    address was recycled across threads, intermittently surfacing as
    ParseException [1000] from RenderPage / RenderPageFit under
    Parallel.ForEach. The unsound global cache is removed; the cmap
    classification is now computed locally per call (a cheap ttf_parser
    table probe), so concurrent renders can no longer collide.
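
The hazard behind #505 can be modelled in a few lines. The Python sketch below is an analogy, not the Rust code: Font, probe, and the injectable address_of parameter are hypothetical names, and the recycled address is simulated with a fixed key so the behaviour is deterministic. It shows why a cache keyed on an address can hand back another font's classification, and why a per-call computation cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    # Hypothetical stand-in for an embedded font; not pdf_oxide's real type.
    name: str
    is_byte_indexed: bool
    has_unicode_cmap: bool


def probe(font):
    """Stand-in for the cheap table probe: derive the pair from the font itself."""
    return (font.is_byte_indexed, font.has_unicode_cmap)


_address_cache = {}


def classify_cached(font, address_of=id):
    """Unsound: the key is an address, which does not identify the font living there."""
    key = address_of(font)
    if key not in _address_cache:
        _address_cache[key] = probe(font)
    return _address_cache[key]


def classify_local(font):
    """Sound: no shared state, so nothing can go stale or collide."""
    return probe(font)


# Simulate an allocator handing the same address to two different fonts.
def recycled(_font):
    return 0x7F00


font_a = Font("A", is_byte_indexed=True, has_unicode_cmap=False)
font_b = Font("B", is_byte_indexed=False, has_unicode_cmap=True)

classify_cached(font_a, recycled)          # fills the cache under the shared key
stale = classify_cached(font_b, recycled)  # (True, False): font A's answer
fresh = classify_local(font_b)             # (False, True): font B's own answer
```

Dropping the cache trades a small amount of repeated work for the removal of all shared mutable state on the render path.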

  • Test helper make_type0_font used a non-production Encoding variant
    (#504) — the helper now maps Identity-H / Identity-V to
    Encoding::Identity exactly as the real font parser does, so the
    affected Type0 tests exercise the production code path instead of a
    variant production never produces. This is a test-correctness fix
    only; there is no user-facing behaviour change.

CI / Infrastructure

  • Release-notes title extraction hardened (#506) —
    extract-release-notes.sh now bounds the subtitle scan to the
    requested version's section (no longer silently inheriting an older
    version's > blockquote), concatenates multi-line blockquotes
    instead of truncating at the first line, and fails loudly when the
    version section or its subtitle is missing. A validate-changelog
    PR/release-branch gate plus a release-title sanity check stop a
    malformed CHANGELOG from ever reaching the publish step, and a
    self-contained regression test covers the missing-section,
    missing-subtitle, multi-line, and cross-version false-scrape cases.

  • GitHub Deployments visibility for regular publishes (#493) — each
    publish job in release.yml (crates.io, PyPI, npm, npm-native,
    NuGet, Homebrew/Scoop) now declares an environment:, so
    standard-pipeline publishes appear under the Deployments view with
    their artifact URL, matching what the FIPS pipeline already did.

Thanks

  • @Goldziher (kreuzberg-dev) — opened #509 with a clean standalone
    reproducer (no app code), a pinned test file, a full multi-engine
    cross-check against Poppler, and a 156-PDF corpus survey that
    isolated this as the single legitimate file the parser rejected.
    That report turned a vague "Linearized PDF fails" into a precise
    header-offset + sparse-trailer root cause.

The remaining fixes (#506, #505, #504, #493) were surfaced internally
while reviewing the v0.3.45–v0.3.47 release automation, the post-merge
main CI runs, and the v0.3.47 PR review.


Installation

Rust (crates.io)

cargo add pdf_oxide

Python (PyPI)

pip install pdf_oxide

JavaScript/WASM (npm)

npm install pdf-oxide-wasm

CLI (Homebrew)

brew install yfedoseev/tap/pdf-oxide

CLI (Scoop — Windows)

scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide
scoop install pdf-oxide

CLI (Shell installer)

curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/install.sh | sh

CLI (cargo-binstall)

cargo binstall pdf_oxide_cli

MCP Server (for AI assistants)

cargo install pdf_oxide_mcp

Pre-built Binaries

Download archives for Linux, macOS, and Windows from the assets below. Each archive includes both pdf-oxide (CLI) and pdf-oxide-mcp (MCP server).

Platform Support

Platform  Architecture            Archive
Linux     x86_64 (glibc)          pdf_oxide-linux-x86_64-*.tar.gz
Linux     x86_64 (musl)           pdf_oxide-linux-x86_64-musl-*.tar.gz
Linux     ARM64                   pdf_oxide-linux-aarch64-*.tar.gz
macOS     x86_64 (Intel)          pdf_oxide-macos-x86_64-*.tar.gz
macOS     ARM64 (Apple Silicon)   pdf_oxide-macos-aarch64-*.tar.gz
Windows   x86_64                  pdf_oxide-windows-x86_64-*.zip

Changelog

See CHANGELOG.md for full details.