v0.3.61 | Press-accurate CMYK→RGB rendering via document `/OutputIntents` ICC profiles, vertical writing mode (WMode 1 / tategaki) support, RTL (Hebrew/Arabic) and Indic text-extraction fixes, separation-plate image rendering and ActualText extraction, path flattening (`PathContent::to_points`), Node.js quickstart and form-display fixes, macOS OCR detection, faster table-heavy extraction, and cross-OS + cross-language CI verification

@github-actions github-actions released this 07 Jun 23:29
37825d9

Added

  • Vertical writing mode (WMode 1 / tategaki) support across extraction, rendering, and reading-order pipelines (#645) — Japanese tategaki, Traditional Chinese vertical packaging, and similar -V-suffixed encodings (Identity-V, UniJIS-UTF16-V, UniGB-UTF16-V, UniCNS-UTF16-V, UniKS-UTF16-V) plus CMap streams with /WMode 1 def now drive vertical glyph advance along the y-axis instead of being silently rendered as horizontal. The §9.4.4 axis-swap math lives in a single helper (GraphicsState::advance_text_matrix) consumed by the extractor, page renderer, separation renderer, and text rasterizer — horizontal text pays one predicted-not-taken branch per advance. Per-CID /W2 (§9.7.4.3) and /DW2 arrays are parsed for vertical metrics; ToUnicode /WMode is intentionally ignored per §9.10.2 so a stale tooling leftover can't flip the document. Vertical-majority pages (≥50% of spans tagged wmode == 1) bypass the configured ReadingOrderStrategyType and route through a dedicated right-to-left column-ordering path, since none of the horizontal strategies can produce correct vertical reading order. Thanks @RayVR.
  • PathContent::to_points(tolerance) path flattening (#147) — flattens an extracted vector path into polylines (Vec<Vec<(f32, f32)>>, one inner vec per subpath) for consumers that need sampled coordinates rather than drawing operators (chart/ECG/CAD digitisation). MoveTo/LineTo pass through unchanged; cubic Béziers are adaptively subdivided to stay within tolerance of the true curve. Subpath handling follows ISO 32000-1:2008 §8.5.2 (Table 59). Thanks @mbeschastn0v, and @joelparkerhenderson for the use case.
  • Separation-plate image rendering (#631) — raster Image XObjects are now routed to the matching ink plates in the separation renderer (previously only Form XObjects were handled, so photo content, gradients, and sample-based artwork were absent from per-ink output). Per-pixel routing dispatches by image colour space (ISO 32000-1:2008 §8.9). Thanks @RayVR.
  • /ActualText extraction for structure-tree spans (#646): /ActualText on a StructElem (the form InDesign emits for drop caps, ligature spans, and stylized text in tagged PDFs, §14.9.4) is now applied correctly in extract_text / to_markdown / to_html — emitting the replacement text once, at the right position, instead of duplicating it with the raw descendant glyphs. Marked-content-scope /ActualText already worked. Thanks @RayVR.
  • Article-thread (/Threads) parsing (#458) — a new parser reads a document's article threads (ISO 32000-1:2008 §12.4.3) into per-page bead rectangles, with an accompanying reading-order strategy, shipped as tested public building blocks. The default reading order is unchanged; auto-wiring threads into it is tracked for a future release.
  • Cross-language test-parity suite — one shared functional spec (open, extract, convert, search, structured extraction, create, encrypt, version) is now implemented idiomatically in all nine bindings (Rust, Python, Node, Go, Java, Ruby, PHP, C#, WASM), so every binding is verified to expose the same core behavior.
  • Press-accurate CMYK→RGB via document /OutputIntents ICC profile (#652) — the composite render path now consumes the document's /OutputIntents CMYK DestOutputProfile and routes /DeviceCMYK paint, /Separation / /DeviceN colourants resolving to a /DeviceCMYK alternate, and /ICCBased N=4 spaces lacking a usable embedded profile through qcms (ISO 32000-1:2008 §14.11.5, §10). The conversion is built as qcms::Transform::new_to(src = OutputIntent, dst = sRGB), so it uses the OutputIntent profile's AToB ("device-to-PCS") direction into the CIE PCS and then the sRGB profile's PCS-to-device direction out — composite direction CMYK → CIE PCS → sRGB. Closes the press-vs-screen colour divergence on heavy-yellow / saturated-mid-tone branding artwork that previously rendered through the §10.3.5 additive-clamp fallback. When no /OutputIntents is declared, §10.3.5 is preserved byte-for-byte. Thanks @RayVR.
  • Page-level /DefaultGray / /DefaultRGB / /DefaultCMYK overrides (§8.6.5.6) — when a page or Form XObject's /Resources /ColorSpace declares these defaults, the canonical g / rg / k / K operators (and their stroking siblings) are routed through the override colour space before any document-level /OutputIntents lookup. A /DefaultCMYK [/ICCBased <N=4 stream>] override drives the conversion through its embedded profile; the override takes precedence over the document /OutputIntents for bare device-family paint. Form XObject overrides take precedence inside the form's scope (§7.8.3).
  • Rendering-intent operator (/RI) honoured in the render path (§10.7.3) — the /RI operator was being parsed but its value never reached the colour conversion. The graphics-state intent (/AbsoluteColorimetric / /RelativeColorimetric / /Saturation / /Perceptual, defaulting to /RelativeColorimetric) now flows into every qcms Transform::new_srgb_target build. Two /RI settings on the same page now compile two distinct transforms instead of silently sharing one.
  • ICC v2 and ICC v4 DestOutputProfile profiles both supported through qcms 0.3.0's unconditional header-version check. A v4 LUT8-tag-form profile compiles through the same code path as the v2 equivalent and produces byte-identical RGB.
  • Per-page compiled-transform cache (IccTransformCache, lives on PageRenderer) keyed on (profile.content_hash, intent). Amortises the 17⁴ CLUT precomputation qcms::Transform::new_to runs for CMYK input across paint operators that share a profile and intent: a page emitting 1 000 identical CMYK paints builds one transform, not one thousand. The cache is dropped per page so memory stays bounded across renders.
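The adaptive subdivision described for PathContent::to_points(tolerance) can be illustrated with a standalone sketch. This is not the library's implementation: the names `flatten_cubic`, `cubic_to_polyline`, and `dist_to_segment`, the recursion cap of 16, and the control-point flatness test are all assumptions chosen for illustration. The idea is that a cubic whose two inner control points lie within `tolerance` of the chord is emitted as a single line segment; otherwise it is split at t = 0.5 (de Casteljau) and each half is handled recursively.

```rust
type Point = (f32, f32);

/// Midpoint of two points (one de Casteljau step at t = 0.5).
fn mid(a: Point, b: Point) -> Point {
    ((a.0 + b.0) * 0.5, (a.1 + b.1) * 0.5)
}

/// Distance from `p` to the segment `a`-`b`.
fn dist_to_segment(p: Point, a: Point, b: Point) -> f32 {
    let (dx, dy) = (b.0 - a.0, b.1 - a.1);
    let len_sq = dx * dx + dy * dy;
    // Project `p` onto the segment, clamping to its end points.
    let t = if len_sq == 0.0 {
        0.0
    } else {
        (((p.0 - a.0) * dx + (p.1 - a.1) * dy) / len_sq).clamp(0.0, 1.0)
    };
    let (cx, cy) = (a.0 + t * dx, a.1 + t * dy);
    ((p.0 - cx).powi(2) + (p.1 - cy).powi(2)).sqrt()
}

/// Appends points approximating the cubic p0..p3 to `out`.
/// `p0` is assumed to be in `out` already; `depth` caps the recursion.
fn flatten_cubic(
    p0: Point,
    p1: Point,
    p2: Point,
    p3: Point,
    tolerance: f32,
    depth: u32,
    out: &mut Vec<Point>,
) {
    // The curve lies in the convex hull of its control points, so if both
    // inner control points are within `tolerance` of the chord, so is the curve.
    let flat = dist_to_segment(p1, p0, p3) <= tolerance
        && dist_to_segment(p2, p0, p3) <= tolerance;
    if flat || depth == 0 {
        out.push(p3);
        return;
    }
    // Split the curve into two halves at t = 0.5.
    let (p01, p12, p23) = (mid(p0, p1), mid(p1, p2), mid(p2, p3));
    let (p012, p123) = (mid(p01, p12), mid(p12, p23));
    let m = mid(p012, p123);
    flatten_cubic(p0, p01, p012, m, tolerance, depth - 1, out);
    flatten_cubic(m, p123, p23, p3, tolerance, depth - 1, out);
}

/// Hypothetical convenience wrapper: one cubic segment to one polyline.
fn cubic_to_polyline(p0: Point, p1: Point, p2: Point, p3: Point, tolerance: f32) -> Vec<Point> {
    let mut out = vec![p0];
    flatten_cubic(p0, p1, p2, p3, tolerance, 16, &mut out);
    out
}
```

A cubic whose control points are collinear collapses to its two end points, while a tighter tolerance on a curved segment yields more sample points. Consumers digitising charts or CAD outlines would pick the tolerance in the same units as the path coordinates.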

Changed

  • Faster extraction on table-heavy pages — an output-preserving optimization to the table detector cuts ~30% off extraction time for large, dense documents (e.g. regulatory volumes) with byte-identical results.
  • Cross-OS + FIPS example verification in CI — the core example scenarios for every language binding now run and assert their output on Linux, macOS, and Windows (previously Linux-only — the gap that let #648 reach users), including a FIPS-safe run. This is the guard that prevents another platform-specific quickstart regression.
  • Renderer resolution pipeline refactor (#649) — the copy-pasted paint-resolution arms in page_renderer and separation_renderer are unified into a single layered rendering/resolution module (colour resolution, overprint, blend-mode, clip, per-plate routing), removing quiet divergence between the two renderers. This also fixes PostScript Type 4 calculator tint transforms for /Separation and /DeviceN spot colours over DeviceCMYK/ICCBased alternates, which previously fell through to a flat fallback. Thanks @RayVR.
  • CI reliability hardening (#544) — SHA-pinned the remaining floating GitHub Action and added network retries to reduce transient CI failures.
  • Dependency & CI-action updates: imageproc 0.27, subsetter 0.2.6, log 0.4.32, and the actions/checkout, taiki-e/install-action, and astral-sh/setup-uv actions were bumped (dependabot, #639/#637/#636/#643/#641/#640).
  • FontInfo gains three new pub fields (wmode: u8, cid_vertical_metrics: Option<HashMap<u16, VerticalMetrics>>, cid_default_vertical_metrics: VerticalMetrics) (#645) — source-breaking for downstream code that constructs FontInfo with struct-literal syntax; add the three new fields to fix. Horizontal-only fonts pay no allocation cost (cid_vertical_metrics: None, cid_default_vertical_metrics is Copy). FontInfo continues to NOT be #[non_exhaustive], consistent with the 0.3.60 ascent/descent addition. The natural construction values for the new fields are wmode: 0, cid_vertical_metrics: None, cid_default_vertical_metrics: VerticalMetrics::SPEC_DEFAULT.
  • ReadingOrderConfig::strategy is now overridden on vertical-majority pages (#645) — the configured horizontal strategy (Simple, Geometric, XYCut, StructureTreeFirst) is bypassed when a page has ≥50% vertical-writing spans, and the page is ordered through the tategaki path instead. Per-span wmode is preserved on every output span so consumers can still distinguish the two modes. See ReadingOrderConfig::strategy rustdoc for the rule.
  • ResolvedColor gains an IccCmyk { rgba, cmyk } dual-payload variant (#652): /ICCBased N=4 paint with a parseable embedded profile (and /DefaultCMYK [/ICCBased N=4] overrides) emits both the pre-converted RGBA (consumed by the composite backend) and the original CMYK quadruple (consumed by the per-plate separation router). Source-breaking for downstream code that exhaustively matches on ResolvedColor; add the new arm to fix. The type is not #[non_exhaustive].
  • /ICCBased N=4 with an embedded profile now wins over document /OutputIntents (§8.6.5.5). Pre-this-change, an embedded /ICCBased N=4 colour space with a parseable qcms profile emitted ResolvedColor::Cmyk and was projected through the document /OutputIntents ICC profile by the composite pipeline — inverting the spec's "embedded ICC trumps OutputIntent". The four components are now routed through the embedded profile directly and the OutputIntent is consulted only when the embedded profile fails to parse or qcms refuses to build a CMM.
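The vertical-majority rule that overrides ReadingOrderConfig::strategy can be sketched as a small predicate. The `Span` and `OrderingPath` types and the `choose_ordering_path` function below are hypothetical stand-ins, not pdf_oxide types; only the "≥50% of spans with wmode == 1" threshold comes from the notes above. Treating an empty page as horizontal is an assumption.

```rust
/// Hypothetical stand-in for an extracted span; only the writing mode matters here.
#[derive(Debug, Clone, Copy)]
struct Span {
    wmode: u8, // 0 = horizontal, 1 = vertical
}

/// Which ordering path a page is routed through (illustrative names).
#[derive(Debug, PartialEq)]
enum OrderingPath {
    /// The horizontal strategy the caller configured.
    ConfiguredStrategy,
    /// The dedicated right-to-left column ordering for vertical text.
    VerticalColumns,
}

fn choose_ordering_path(spans: &[Span]) -> OrderingPath {
    let vertical = spans.iter().filter(|s| s.wmode == 1).count();
    // ">= 50%" in integer arithmetic, avoiding float rounding.
    // Assumption: a page with no spans stays on the configured strategy.
    if !spans.is_empty() && vertical * 2 >= spans.len() {
        OrderingPath::VerticalColumns
    } else {
        OrderingPath::ConfiguredStrategy
    }
}
```

Because per-span wmode is preserved on output, a consumer that needs a different policy can apply its own threshold over the emitted spans in the same way.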

Fixed

  • Node.js quickstart (#648) — the documented Quick Start used CommonJS require (the package is ESM-only) and new PdfDocument(path) (not a public constructor), so the first example threw. Samples now use import + PdfDocument.open(path), and the constructor gives an actionable error if handed a path. Thanks @abeq for the report and @lihouwenbin for the docs fix (#651).
  • Form fields filled but not displayed (#647) — for PDFs with an inline AcroForm, a filled field's value was written but the /NeedAppearances flag was dropped on full-rewrite save, so viewers showed the field blank (ISO 32000-1:2008 §12.7.3.3). The flag now survives, and viewers render the filled value. Thanks @mitslabo.
  • macOS OCR engine detection (#632): onnxruntime auto-discovery missed the versioned libonnxruntime.<version>.dylib that macOS actually ships, so OCR was silently skipped; detection is now version-tolerant across Linux/macOS/Windows. Thanks @paliwalvimal.
  • Deterministic table detection — several table-detection steps could order results by per-process hash iteration, yielding run-to-run differences on table-heavy pages; these are now deterministic.
  • Decimal points in CMSY/Symbol math fonts — numbers whose decimal point is drawn from a math font's logicalnot glyph (e.g. 1¬00, and the spaced 1¬ 00) now extract as 1.00.
  • Consistent empty output for unreadable encrypted PDFs — all text surfaces (extract_text, markdown, HTML, plain text) now uniformly return empty for an encrypted PDF that can't be decrypted, instead of one surface diverging (ISO 32000-1:2008 §7.6).
  • RTL (Hebrew/Arabic) word order in tagged PDFs (#656, #657): extract_text on a tagged PDF assembles text from the structure tree and never reached the untagged reverse_rtl_visual_order_runs pass, so a pure-RTL run's word-spans were emitted in visual (left-to-right) order — the whole line reversed. A new geometric pass (order_mcid_spans → row_aware_span_cmp_rtl) now emits each pure-RTL row right-to-left (rightmost word first), reconstructing logical reading order from page geometry independent of how the producer stored the run (ISO 32000-1:2008 §14.8, UAX #9 §3.3.4). Hebrew text-extraction error rate on the multilingual benchmark drops from worst-tier to parity with poppler/pdfium. Two companion fixes improve Arabic: a grapheme-cluster reversal keeps combining marks (kasra/shadda) attached to their base letter instead of floating off, and a space-gap threshold fallback (0.25 em) when a CID subset font omits a space glyph stops the geometric word-gap threshold from collapsing to zero and shattering cursive words into single letters (§9.3.3 — Tw does not apply to composite-font Arabic). Remaining Arabic intra-word phantom gaps (glyph ink-width vs advance-width) and the markdown/HTML converter paths are tracked in #656/#657. Mixed RTL+Latin runs are left untouched pending full bidi.
  • Node.js binding missing barrel exports (#653) — the package's ESM entry re-exports ContentType, ImageFormat, ThumbnailManager/ThumbnailSize, and the OCRDetectionMode/OCRLanguage aliases through managers/index.js, but that barrel never re-exported the hybrid-ml and thumbnail modules (the CJS require path tolerated the gap silently). Strict ESM consumers — including the new cross-language core-parity test — failed at import with "does not provide an export named 'ContentType'". The managers barrel now re-exports all of them.
  • Java binding native-library load on macOS/Windows (#653): NativeLoader loaded the explicit fyi.oxide.pdf.lib.path override unconditionally, but the Maven build defaults that property to a Linux .so path. On macOS (.dylib) and Windows (.dll) the override file is absent, so System.load hard-failed with UnsatisfiedLinkError even though the correct platform native is bundled in the JAR. The loader now checks the override exists and otherwise falls through to the bundled resource. Surfaced by v0.3.61's new cross-OS Java JNI test runs.
  • Number corruption in plain-text table cells — per-glyph table cells (Td <hex> Tj, e.g. 0.99, Q1) were merged into one span without keeping the char_widths array in sync, so the width-based column-spanning-decimal and letter→digit splitters misfired and dropped the decimal point (0.99 → 0 99) or inserted a spurious space (Q1 → Q 1). char_widths is now re-synced on merge (benchmark table CER 0.117→0.067, 0.091→0.061).
  • Spurious word spaces in Indic scripts (Tamil/Bengali/Devanagari) — Brahmic text extracted the right codepoints but inserted a space after nearly every dependent vowel sign (matra), because a matra carries its own advance and the geometric gap test read matra→consonant as a word break. Tamil/Bengali/Telugu/Kannada/Malayalam are now recognised by the complex-script word-boundary path, the matra→base-consonant boundary is suppressed, and should_insert_space gained a combining-mark guard (benchmark text CER: Tamil 0.095→0.035, Bengali 0.175→0.032, Devanagari 0.066→0.016; real word breaks carry an explicit space glyph, §9.3.3).
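The grapheme-cluster reversal mentioned in the RTL fix can be sketched with the standard library alone. This is an illustration, not the code in pdf_oxide: `reverse_by_clusters` and `is_arabic_mark` are hypothetical names, and the mark test is deliberately simplified to the Arabic harakat range U+064B..=U+065F plus U+0670 (a full implementation would use Unicode grapheme segmentation). The point is that reversing character by character detaches a mark from its base letter, while reversing cluster by cluster keeps them together.

```rust
/// Simplified combining-mark test (assumption: Arabic harakat only,
/// U+064B..=U+065F plus superscript alef U+0670).
fn is_arabic_mark(c: char) -> bool {
    matches!(c, '\u{064B}'..='\u{065F}' | '\u{0670}')
}

/// Reverses a run cluster by cluster, where a cluster is a base
/// character followed by any combining marks attached to it.
fn reverse_by_clusters(run: &str) -> String {
    let mut clusters: Vec<String> = Vec::new();
    for c in run.chars() {
        if is_arabic_mark(c) {
            // Attach the mark to the preceding base letter, if there is one.
            if let Some(last) = clusters.last_mut() {
                last.push(c);
                continue;
            }
        }
        // A base letter (or a leading mark with no base) starts a new cluster.
        clusters.push(c.to_string());
    }
    clusters.into_iter().rev().collect()
}
```

For the run beh + kasra, seen + shadda, the cluster-wise reversal yields seen + shadda, beh + kasra, whereas a plain `chars().rev()` would start the output with a bare shadda.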

Known limitations

These are limitations of the upstream qcms 0.3.0 colour engine (items 1–2) and one test-coverage gap (item 3), tracked in #655. The test suite documents each with a HONEST_GAP_* marker wired as an upgrade gate, so that a future fix (or a qcms upgrade) turns the gated test red as soon as it lands:

  • qcms 0.3.0 ignores the CMYK rendering intent. The end-to-end intent chain inside pdf_oxide is correct — gs.rendering_intent → ResolutionContext::rendering_intent → Transform::new_srgb_target's intent parameter → qcms — but qcms 0.3.0 declares the intent as _intent for CLUT-based CMYK conversion (transform.rs:1283-1289) and dispatches the same CLUT for every PDF intent. A qcms upgrade that honours the parameter, or a CMM swap, will surface intent-sensitive behaviour without further code changes; the test qa_round3_qcms_030_treats_cmyk_intent_as_informational is the upgrade gate.
  • qcms 0.3.0 has no Black-Point Compensation (lib.rs:29-36 — upstream documents the choice as intentional). qa_round4_bpc_paper_white_preservation_under_relative_colorimetric is #[ignore]-marked with HONEST_GAP_QCMS_030_NO_BPC.
  • No real-corpus branding-logo regression fixture (HONEST_GAP_NO_REAL_BRANDING_FIXTURE). The synthetic green-mark probe (qa_round4_branding_green_mark_routes_through_output_intent) pins the press-target direction-of-shift through saturation collapse; a vendor-issued press profile plus a CIEDE2000 ΔE assertion against a commercial-viewer baseline would tighten the bound.

Installation

Rust (crates.io)

cargo add pdf_oxide

Python (PyPI)

pip install pdf_oxide

JavaScript/WASM (npm)

npm install pdf-oxide-wasm

CLI (Homebrew)

brew install yfedoseev/tap/pdf-oxide

CLI (Scoop — Windows)

scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide
scoop install pdf-oxide

CLI (Shell installer)

curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/install.sh | sh

CLI (cargo-binstall)

cargo binstall pdf_oxide_cli

MCP Server (for AI assistants)

cargo install pdf_oxide_mcp

Pre-built Binaries
Download archives for Linux, macOS, and Windows from the assets below. Each archive includes both pdf-oxide (CLI) and pdf-oxide-mcp (MCP server).

Platform Support

Platform   Architecture            Archive
Linux      x86_64 (glibc)          pdf_oxide-linux-x86_64-*.tar.gz
Linux      x86_64 (musl)           pdf_oxide-linux-x86_64-musl-*.tar.gz
Linux      ARM64                   pdf_oxide-linux-aarch64-*.tar.gz
macOS      x86_64 (Intel)          pdf_oxide-macos-x86_64-*.tar.gz
macOS      ARM64 (Apple Silicon)   pdf_oxide-macos-aarch64-*.tar.gz
Windows    x86_64                  pdf_oxide-windows-x86_64-*.zip

Changelog

See CHANGELOG.md for full details.