v0.3.61 | Press-accurate CMYK→RGB rendering via document `/OutputIntents` ICC profiles, vertical writing mode (WMode 1 / tategaki) support, RTL (Hebrew/Arabic) and Indic text-extraction fixes, separation-plate image rendering and ActualText extraction, path flattening (`PathContent::to_points`), Node.js quickstart and form-display fixes, macOS OCR detection, faster table-heavy extraction, and cross-OS + cross-language CI verification
Added
- Vertical writing mode (WMode 1 / tategaki) support across extraction, rendering, and reading-order pipelines (#645) — Japanese tategaki, Traditional Chinese vertical packaging, and similar `-V`-suffixed encodings (`Identity-V`, `UniJIS-UTF16-V`, `UniGB-UTF16-V`, `UniCNS-UTF16-V`, `UniKS-UTF16-V`) plus CMap streams with `/WMode 1 def` now drive vertical glyph advance along the y-axis instead of being silently rendered as horizontal. The §9.4.4 axis-swap math lives in a single helper (`GraphicsState::advance_text_matrix`) consumed by the extractor, page renderer, separation renderer, and text rasterizer — horizontal text pays one predicted-not-taken branch per advance. Per-CID `/W2` (§9.7.4.3) and `/DW2` arrays are parsed for vertical metrics; ToUnicode `/WMode` is intentionally ignored per §9.10.2 so a stale tooling leftover can't flip the document. Vertical-majority pages (≥50% of spans tagged `wmode == 1`) bypass the configured `ReadingOrderStrategyType` and route through a dedicated right-to-left column-ordering path, since none of the horizontal strategies can produce correct vertical reading order. Thanks @RayVR.
- `PathContent::to_points(tolerance)` path flattening (#147) — flattens an extracted vector path into polylines (`Vec<Vec<(f32, f32)>>`, one inner vec per subpath) for consumers that need sampled coordinates rather than drawing operators (chart/ECG/CAD digitisation). `MoveTo`/`LineTo` pass through unchanged; cubic Béziers are adaptively subdivided to stay within `tolerance` of the true curve. Subpath handling follows ISO 32000-1:2008 §8.5.2 (Table 59). Thanks @mbeschastn0v, and @joelparkerhenderson for the use case.
- Separation-plate image rendering (#631) — raster Image XObjects are now routed to the matching ink plates in the separation renderer (previously only Form XObjects were handled, so photo content, gradients, and sample-based artwork were absent from per-ink output). Per-pixel routing dispatches by image colour space (ISO 32000-1:2008 §8.9). Thanks @RayVR.
- `/ActualText` extraction for structure-tree spans (#646) — `/ActualText` on a `StructElem` (the form InDesign emits for drop caps, ligature spans, and stylized text in tagged PDFs, §14.9.4) is now applied correctly in `extract_text` / `to_markdown` / `to_html` — emitting the replacement text once, at the right position, instead of duplicating it with the raw descendant glyphs. Marked-content-scope `/ActualText` already worked. Thanks @RayVR.
- Article-thread (`/Threads`) parsing (#458) — a new parser reads a document's article threads (ISO 32000-1:2008 §12.4.3) into per-page bead rectangles, with an accompanying reading-order strategy, shipped as tested public building blocks. The default reading order is unchanged; auto-wiring threads into it is tracked for a future release.
- Cross-language test-parity suite — one shared functional spec (open, extract, convert, search, structured extraction, create, encrypt, version) is now implemented idiomatically in all nine bindings (Rust, Python, Node, Go, Java, Ruby, PHP, C#, WASM), so every binding is verified to expose the same core behavior.
- Press-accurate CMYK→RGB via document `/OutputIntents` ICC profile (#652) — the composite render path now consumes the document's `/OutputIntents` CMYK `DestOutputProfile` and routes `/DeviceCMYK` paint, `/Separation` / `/DeviceN` colourants resolving to a `/DeviceCMYK` alternate, and `/ICCBased N=4` spaces lacking a usable embedded profile through `qcms` (ISO 32000-1:2008 §14.11.5, §10). The conversion is built as `qcms::Transform::new_to(src = OutputIntent, dst = sRGB)`, so it uses the OutputIntent profile's AToB ("device-to-PCS") direction into the CIE PCS and then the sRGB profile's PCS-to-device direction out — composite direction CMYK → CIE PCS → sRGB. Closes the press-vs-screen colour divergence on heavy-yellow / saturated-mid-tone branding artwork that previously rendered through the §10.3.5 additive-clamp fallback. When no `/OutputIntents` is declared, §10.3.5 is preserved byte-for-byte. Thanks @RayVR.
- Page-level `/DefaultGray` / `/DefaultRGB` / `/DefaultCMYK` overrides (§8.6.5.6) — when a page or Form XObject's `/Resources /ColorSpace` declares these defaults, the canonical `g` / `rg` / `k` / `K` operators (and their stroking siblings) are routed through the override colour space before any document-level `/OutputIntents` lookup. A `/DefaultCMYK [/ICCBased <N=4 stream>]` override drives the conversion through its embedded profile; the override takes precedence over the document `/OutputIntents` for bare device-family paint. Form XObject overrides take precedence inside the form's scope (§7.8.3).
- Rendering-intent operator (`/RI`) honoured in the render path (§10.7.3) — the `/RI` operator was being parsed but its value never reached the colour conversion. The graphics-state intent (`/AbsoluteColorimetric` / `/RelativeColorimetric` / `/Saturation` / `/Perceptual`, defaulting to `/RelativeColorimetric`) now flows into every qcms `Transform::new_srgb_target` build. Two `/RI` settings on the same page now compile two distinct transforms instead of silently sharing one.
- ICC v2 and ICC v4 `DestOutputProfile` profiles both supported through qcms 0.3.0's unconditional header-version check. A v4 LUT8-tag-form profile compiles through the same code path as the v2 equivalent and produces byte-identical RGB.
- Per-page compiled-transform cache (`IccTransformCache`, lives on `PageRenderer`) keyed on `(profile.content_hash, intent)`. Amortises the 17⁴ CLUT precomputation `qcms::Transform::new_to` runs for CMYK input across paint operators that share a profile and intent: a page emitting 1 000 identical CMYK paints builds one transform, not one thousand. The cache is dropped per page so memory stays bounded across renders.
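The adaptive subdivision mentioned for `PathContent::to_points(tolerance)` in the list above can be pictured with a small standalone sketch. It is not the crate's implementation: the `Seg` enum, `to_polylines`, `flatten_cubic`, and `MAX_DEPTH` are hypothetical names introduced here, and the flatness test (both control points within `tolerance` of the chord) is one common choice among several. Because a cubic Bézier lies inside the convex hull of its control points, that test keeps every point of an accepted curve piece within `tolerance` of its chord.

```rust
/// A 2-D point, matching the `(f32, f32)` pairs named in the release notes.
type Point = (f32, f32);

/// Hypothetical path segment type for this sketch (not the crate's own enum).
#[derive(Clone, Copy, Debug)]
enum Seg {
    MoveTo(Point),
    LineTo(Point),
    CurveTo(Point, Point, Point), // control 1, control 2, end point
}

/// Recursion cap so degenerate input cannot subdivide forever (assumed value).
const MAX_DEPTH: u32 = 16;

/// Distance from `p` to the segment `a`-`b`; handles a zero-length segment.
fn dist_to_segment(p: Point, a: Point, b: Point) -> f32 {
    let (dx, dy) = (b.0 - a.0, b.1 - a.1);
    let len_sq = dx * dx + dy * dy;
    if len_sq == 0.0 {
        return (p.0 - a.0).hypot(p.1 - a.1);
    }
    let t = (((p.0 - a.0) * dx + (p.1 - a.1) * dy) / len_sq).clamp(0.0, 1.0);
    (p.0 - (a.0 + t * dx)).hypot(p.1 - (a.1 + t * dy))
}

fn midpoint(a: Point, b: Point) -> Point {
    ((a.0 + b.0) * 0.5, (a.1 + b.1) * 0.5)
}

/// Appends the flattened cubic `p0..p3` to `out`, excluding `p0` itself.
fn flatten_cubic(
    p0: Point, p1: Point, p2: Point, p3: Point,
    tolerance: f32, depth: u32, out: &mut Vec<Point>,
) {
    let flat = dist_to_segment(p1, p0, p3) <= tolerance
        && dist_to_segment(p2, p0, p3) <= tolerance;
    if flat || depth == 0 {
        out.push(p3);
        return;
    }
    // de Casteljau split at t = 0.5 into two smaller cubics.
    let (p01, p12, p23) = (midpoint(p0, p1), midpoint(p1, p2), midpoint(p2, p3));
    let (p012, p123) = (midpoint(p01, p12), midpoint(p12, p23));
    let mid = midpoint(p012, p123);
    flatten_cubic(p0, p01, p012, mid, tolerance, depth - 1, out);
    flatten_cubic(mid, p123, p23, p3, tolerance, depth - 1, out);
}

/// One inner vec per subpath; lines pass through, curves are subdivided.
/// Segments that appear before any `MoveTo` are skipped (sketch assumption).
fn to_polylines(segs: &[Seg], tolerance: f32) -> Vec<Vec<Point>> {
    let mut out: Vec<Vec<Point>> = Vec::new();
    for seg in segs {
        match *seg {
            Seg::MoveTo(p) => out.push(vec![p]),
            Seg::LineTo(p) => {
                if let Some(sub) = out.last_mut() {
                    sub.push(p);
                }
            }
            Seg::CurveTo(c1, c2, end) => {
                if let Some(sub) = out.last_mut() {
                    let start = *sub.last().expect("subpath starts with its MoveTo point");
                    flatten_cubic(start, c1, c2, end, tolerance, MAX_DEPTH, sub);
                }
            }
        }
    }
    out
}
```

A smaller `tolerance` yields more vertices on curved segments and leaves straight segments untouched, which is the trade-off a digitisation consumer would tune.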
Changed
- Faster extraction on table-heavy pages — an output-preserving optimization to the table detector cuts ~30% off extraction time for large, dense documents (e.g. regulatory volumes) with byte-identical results.
- Cross-OS + FIPS example verification in CI — the core example scenarios for every language binding now run and assert their output on Linux, macOS, and Windows (previously Linux-only — the gap that let #648 reach users), including a FIPS-safe run. This is the guard that prevents another platform-specific quickstart regression.
- Renderer resolution pipeline refactor (#649) — the copy-pasted paint-resolution arms in `page_renderer` and `separation_renderer` are unified into a single layered `rendering/resolution` module (colour resolution, overprint, blend-mode, clip, per-plate routing), removing quiet divergence between the two renderers. This also fixes PostScript Type 4 calculator tint transforms for `/Separation` and `/DeviceN` spot colours over `DeviceCMYK` / `ICCBased` alternates, which previously fell through to a flat fallback. Thanks @RayVR.
- CI reliability hardening (#544) — SHA-pinned the remaining floating GitHub Action and added network retries to reduce transient CI failures.
- Dependency & CI-action updates — `imageproc` 0.27, `subsetter` 0.2.6, `log` 0.4.32, and the `actions/checkout`, `taiki-e/install-action`, and `astral-sh/setup-uv` actions were bumped (dependabot, #639/#637/#636/#643/#641/#640).
- `FontInfo` gains three new `pub` fields (`wmode: u8`, `cid_vertical_metrics: Option<HashMap<u16, VerticalMetrics>>`, `cid_default_vertical_metrics: VerticalMetrics`) (#645) — source-breaking for downstream code that constructs `FontInfo` with struct-literal syntax; add the three new fields to fix. Horizontal-only fonts pay no allocation cost (`cid_vertical_metrics: None`, `cid_default_vertical_metrics` is `Copy`). `FontInfo` is still not `#[non_exhaustive]`, consistent with the 0.3.60 `ascent`/`descent` addition. The natural construction values for the new fields are `wmode: 0`, `cid_vertical_metrics: None`, `cid_default_vertical_metrics: VerticalMetrics::SPEC_DEFAULT`.
- `ReadingOrderConfig::strategy` is now overridden on vertical-majority pages (#645) — the configured horizontal strategy (`Simple`, `Geometric`, `XYCut`, `StructureTreeFirst`) is bypassed when a page has ≥50% vertical-writing spans, and the page is ordered through the tategaki path instead. Per-span `wmode` is preserved on every output span so consumers can still distinguish the two modes. See the `ReadingOrderConfig::strategy` rustdoc for the rule.
- `ResolvedColor` gains an `IccCmyk { rgba, cmyk }` dual-payload variant (#652) — `/ICCBased N=4` paint with a parseable embedded profile (and `/DefaultCMYK [/ICCBased N=4]` overrides) emits both the pre-converted RGBA (consumed by the composite backend) and the original CMYK quadruple (consumed by the per-plate separation router). Source-breaking for downstream code that exhaustively matches on `ResolvedColor`; add the new arm to fix. The type is not `#[non_exhaustive]`.
- `/ICCBased N=4` with an embedded profile now wins over document `/OutputIntents` (§8.6.5.5). Before this change, an embedded `/ICCBased N=4` colour space with a parseable qcms profile emitted `ResolvedColor::Cmyk` and was projected through the document `/OutputIntents` ICC profile by the composite pipeline — inverting the spec's "embedded ICC trumps OutputIntent". The four components are now routed through the embedded profile directly, and the OutputIntent is consulted only when the embedded profile fails to parse or qcms refuses to build a CMM.
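For downstream code affected by the `ReadingOrderConfig::strategy` override in the list above, the ≥50% rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's code: `Span`, `Strategy`, `PageOrdering`, `is_vertical_majority`, and `effective_ordering` are hypothetical stand-ins defined here, and treating a page with no spans as horizontal is an assumption the release notes do not address.

```rust
/// Hypothetical mirror of the four horizontal strategies named above.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Strategy {
    Simple,
    Geometric,
    XYCut,
    StructureTreeFirst,
}

/// What a page ends up being ordered with in this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum PageOrdering {
    /// The caller's configured horizontal strategy is used.
    Configured(Strategy),
    /// The dedicated vertical (tategaki) path is used instead.
    Tategaki,
}

/// Hypothetical span carrying only the writing-mode tag (0 = horizontal, 1 = vertical).
struct Span {
    wmode: u8,
}

/// True when at least half of the spans are tagged `wmode == 1`.
/// An empty page counts as horizontal here (assumption of this sketch).
fn is_vertical_majority(spans: &[Span]) -> bool {
    if spans.is_empty() {
        return false;
    }
    let vertical = spans.iter().filter(|s| s.wmode == 1).count();
    vertical * 2 >= spans.len()
}

/// Picks the ordering path: vertical-majority pages bypass the configured strategy.
fn effective_ordering(spans: &[Span], configured: Strategy) -> PageOrdering {
    if is_vertical_majority(spans) {
        PageOrdering::Tategaki
    } else {
        PageOrdering::Configured(configured)
    }
}
```

Because the per-span `wmode` survives on every output span, a consumer that needs the old behaviour on mixed pages can still partition spans by writing mode after extraction.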
Fixed
- Node.js quickstart (#648) — the documented Quick Start used CommonJS `require` (the package is ESM-only) and `new PdfDocument(path)` (not a public constructor), so the first example threw. Samples now use `import` + `PdfDocument.open(path)`, and the constructor gives an actionable error if handed a path. Thanks @abeq for the report and @lihouwenbin for the docs fix (#651).
- Form fields filled but not displayed (#647) — for PDFs with an inline AcroForm, a filled field's value was written but the `/NeedAppearances` flag was dropped on full-rewrite save, so viewers showed the field blank (ISO 32000-1:2008 §12.7.3.3). The flag now survives, and viewers render the filled value. Thanks @mitslabo.
- macOS OCR engine detection (#632) — `onnxruntime` auto-discovery missed the versioned `libonnxruntime.<version>.dylib` macOS actually ships, so OCR was silently skipped; detection is now version-tolerant across Linux/macOS/Windows. Thanks @paliwalvimal.
- Deterministic table detection — several table-detection steps could order results by per-process hash iteration, yielding run-to-run differences on table-heavy pages; these are now deterministic.
- Decimal points in `CMSY`/Symbol math fonts — numbers whose decimal point is drawn from a math font's `logicalnot` glyph (e.g. `1¬00`, and the spaced `1¬ 00`) now extract as `1.00`.
- Consistent empty output for unreadable encrypted PDFs — all text surfaces (`extract_text`, markdown, HTML, plain text) now uniformly return empty for an encrypted PDF that can't be decrypted, instead of one surface diverging (ISO 32000-1:2008 §7.6).
- RTL (Hebrew/Arabic) word order in tagged PDFs (#656, #657) — `extract_text` on a tagged PDF assembles text from the structure tree and never reached the untagged `reverse_rtl_visual_order_runs` pass, so a pure-RTL run's word-spans were emitted in visual (left-to-right) order — the whole line reversed. A new geometric pass (`order_mcid_spans` → `row_aware_span_cmp_rtl`) now emits each pure-RTL row right-to-left (rightmost word first), reconstructing logical reading order from page geometry independent of how the producer stored the run (ISO 32000-1:2008 §14.8, UAX #9 §3.3.4). Hebrew text-extraction error rate on the multilingual benchmark drops from worst-tier to parity with poppler/pdfium. Two companion fixes improve Arabic: a grapheme-cluster reversal keeps combining marks (kasra/shadda) attached to their base letter instead of floating off, and a space-gap threshold fallback (0.25 em) when a CID subset font omits a space glyph stops the geometric word-gap threshold from collapsing to zero and shattering cursive words into single letters (§9.3.3 — `Tw` does not apply to composite-font Arabic). Remaining Arabic intra-word phantom gaps (glyph ink-width vs advance-width) and the markdown/HTML converter paths are tracked in #656/#657. Mixed RTL+Latin runs are left untouched pending full bidi.
- Node.js binding missing barrel exports (#653) — the package's ESM entry re-exports `ContentType`, `ImageFormat`, `ThumbnailManager` / `ThumbnailSize`, and the `OCRDetectionMode` / `OCRLanguage` aliases through `managers/index.js`, but that barrel never re-exported the hybrid-ml and thumbnail modules (the CJS `require` path tolerated the gap silently). Strict ESM consumers — including the new cross-language core-parity test — failed at import with "does not provide an export named 'ContentType'". The managers barrel now re-exports all of them.
- Java binding native-library load on macOS/Windows (#653) — `NativeLoader` loaded the explicit `fyi.oxide.pdf.lib.path` override unconditionally, but the Maven build defaults that property to a Linux `.so` path. On macOS (`.dylib`) and Windows (`.dll`) the override file is absent, so `System.load` hard-failed with `UnsatisfiedLinkError` even though the correct platform native is bundled in the JAR. The loader now checks the override exists and otherwise falls through to the bundled resource. Surfaced by v0.3.61's new cross-OS Java JNI test runs.
- Number corruption in plain-text table cells — per-glyph table cells (`Td <hex> Tj`, e.g. `0.99`, `Q1`) were merged into one span without keeping the `char_widths` array in sync, so the width-based column-spanning-decimal and letter→digit splitters misfired and dropped the decimal point (`0.99` → `0 99`) or inserted a spurious space (`Q1` → `Q 1`). `char_widths` is now re-synced on merge (benchmark table CER 0.117→0.067, 0.091→0.061).
- Spurious word spaces in Indic scripts (Tamil/Bengali/Devanagari) — Brahmic text extracted the right codepoints but inserted a space after nearly every dependent vowel sign (matra), because a matra carries its own advance and the geometric gap test read matra→consonant as a word break. Tamil/Bengali/Telugu/Kannada/Malayalam are now recognised by the complex-script word-boundary path, the matra→base-consonant boundary is suppressed, and `should_insert_space` gained a combining-mark guard (benchmark text CER: Tamil 0.095→0.035, Bengali 0.175→0.032, Devanagari 0.066→0.016; real word breaks carry an explicit space glyph, §9.3.3).
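The Arabic grapheme-cluster reversal mentioned in the RTL entry above can be illustrated with a deliberately simplified sketch. It is not the crate's code: `is_arabic_combining_mark` and `reverse_keeping_marks` are hypothetical names, and the mark test covers only the Arabic range U+064B to U+065F plus U+0670, where a full implementation would presumably follow UAX #29 cluster boundaries.

```rust
/// Simplified mark test: Arabic tanwin/harakat (U+064B..=U+065F, which includes
/// kasra U+0650 and shadda U+0651) plus superscript alef (U+0670).
/// Assumption of this sketch; real cluster rules are broader.
fn is_arabic_combining_mark(c: char) -> bool {
    matches!(c, '\u{064B}'..='\u{065F}' | '\u{0670}')
}

/// Reverses a visually ordered run cluster by cluster, so that each
/// combining mark stays directly after the base letter it belongs to.
fn reverse_keeping_marks(visual: &str) -> String {
    let mut clusters: Vec<String> = Vec::new();
    for c in visual.chars() {
        if is_arabic_combining_mark(c) {
            // Attach the mark to the preceding base letter, if there is one.
            if let Some(last) = clusters.last_mut() {
                last.push(c);
                continue;
            }
        }
        // Base letters (and a stray leading mark) start a new cluster.
        clusters.push(c.to_string());
    }
    clusters.into_iter().rev().collect()
}
```

For the run beh + kasra + teh, a plain `chars().rev()` would yield teh + kasra + beh and move the kasra onto the wrong letter, while the cluster-wise reversal yields teh + beh + kasra.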
Known limitations
These are limitations of the upstream qcms 0.3.0 colour engine (items 1–2) and one test-coverage gap (item 3), tracked in #655. The test suite documents each with a `HONEST_GAP_*` marker wired as an upgrade gate, so a future engineer (or a qcms upgrade) flips the gated test RED on landing:
1. qcms 0.3.0 ignores the CMYK rendering intent. The end-to-end intent chain inside pdf_oxide is correct — `gs.rendering_intent` → `ResolutionContext::rendering_intent` → `Transform::new_srgb_target`'s `intent` parameter → qcms — but qcms 0.3.0 declares the intent as `_intent` for CLUT-based CMYK conversion (`transform.rs:1283-1289`) and dispatches the same CLUT for every PDF intent. A qcms upgrade that honours the parameter, or a CMM swap, will surface intent-sensitive behaviour without further code changes; the test `qa_round3_qcms_030_treats_cmyk_intent_as_informational` is the upgrade gate.
2. qcms 0.3.0 has no Black-Point Compensation (`lib.rs:29-36` — upstream documents the choice as intentional). `qa_round4_bpc_paper_white_preservation_under_relative_colorimetric` is `#[ignore]`-marked with `HONEST_GAP_QCMS_030_NO_BPC`.
3. No real-corpus branding-logo regression fixture (`HONEST_GAP_NO_REAL_BRANDING_FIXTURE`). The synthetic green-mark probe (`qa_round4_branding_green_mark_routes_through_output_intent`) pins the press-target direction-of-shift through saturation collapse; a vendor-issued press profile plus a CIEDE2000 ΔE assertion against a commercial-viewer baseline would tighten the bound.
Installation
Rust (crates.io)
`cargo add pdf_oxide`
Python (PyPI)
`pip install pdf_oxide`
JavaScript/WASM (npm)
`npm install pdf-oxide-wasm`
CLI (Homebrew)
`brew install yfedoseev/tap/pdf-oxide`
CLI (Scoop — Windows)
`scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide`
`scoop install pdf-oxide`
CLI (Shell installer)
`curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/install.sh | sh`
CLI (cargo-binstall)
`cargo binstall pdf_oxide_cli`
MCP Server (for AI assistants)
`cargo install pdf_oxide_mcp`
Pre-built Binaries
Download archives for Linux, macOS, and Windows from the assets below. Each archive includes both `pdf-oxide` (CLI) and `pdf-oxide-mcp` (MCP server).
Platform Support
| Platform | Architecture | Archive |
|---|---|---|
| Linux | x86_64 (glibc) | pdf_oxide-linux-x86_64-*.tar.gz |
| Linux | x86_64 (musl) | pdf_oxide-linux-x86_64-musl-*.tar.gz |
| Linux | ARM64 | pdf_oxide-linux-aarch64-*.tar.gz |
| macOS | x86_64 (Intel) | pdf_oxide-macos-x86_64-*.tar.gz |
| macOS | ARM64 (Apple Silicon) | pdf_oxide-macos-aarch64-*.tar.gz |
| Windows | x86_64 | pdf_oxide-windows-x86_64-*.zip |
Changelog
See CHANGELOG.md for full details.