Releases · QrCommunication/gigapdf-lib

23 Jun 22:55

v0.72.0

2154b93

Latest

Fidelity release focused on text extraction and AcroForm rendering on dense
government forms (CERFA). The public API is additive — existing behaviour is
unchanged except where noted as a fix.

Added

FormField now surfaces its text-formatting metadata. Each field exposes
comb (the /Ff comb flag for fixed-pitch character cells), quadding (the
/Q justification: 0 left, 1 centre, 2 right), and the default-appearance font
and size parsed from the field's /DA string as daFont / daSize. This lets
a host reproduce a field's intended layout (combed cells, alignment, font
metrics) without re-parsing the appearance stream.

Fixed

Spurious in-word spaces during text extraction. A new gap-aware
runs_join helper now drives all four reconstruction paths (lines,
paragraphs, lists, tables): a word split across several font runs no longer
emits a phantom space at each run boundary (e.g. ENFANT S is reassembled as
ENFANTS). Spacing is decided from the real inter-run gap, not the mere fact
that the text changed font.
Form-field appearances double-rendered behind the editable text. A
widget_appearances flag makes renderPageNoText / renderPageExcluding
omit the /AP appearance streams of AcroForm widgets, so a filled field's
baked-in value no longer shows through underneath the live, editable overlay.
Borderless prose misdetected as a table. A line_has_gutter guard now
requires a real inter-cell gutter before promoting a borderless block to a
table: a two-run-per-line prose notice is kept as prose, while genuine tables
(wide column gutters) are still recognised.

Assets 2

23 Jun 20:35

github-actions

v0.71.1

2fd6dfb

gigapdf-lib v0.71.1

Documentation-only patch. No code changes — the WASM blob is byte-for-byte
identical to 0.71.0.

Documentation

Complete overhaul of the SDK documentation for 0.71: API reference (signature
matrix for B / B-T / LTV signing, full ~263-method surface, removal of the
phantom OCR methods doc.ocr / ocrText / extractText), USAGE guide (the
four signing-signature levels + the host-fetch two-phase model + an SSRF note),
COOKBOOK (added signTimestamped / signLtv recipes and an image-watermark
recipe), plus the README and sdk/README (npm). No behavioural change — the
WASM is identical to 0.71.0.

Assets 2

23 Jun 18:56

github-actions

v0.71.0

df9f0fc

gigapdf-lib v0.71.0

Long-term validation release: PAdES-LTV builds on the B-T timestamped signatures
from 0.70 by embedding the validation material (certificate chain + revocation
responses) so a signature keeps verifying long after its certificates expire or
are revoked. The public API is additive — existing behaviour is unchanged.

Added

PAdES-LTV (B-LT / B-LTA). New SDK GigaPdfDoc.signLtv() (async) produces a
long-term-validation signature: it first builds a B-T signature
(signTimestamped), then embeds a Document Security Store (/DSS with
/Certs, /OCSPs, /CRLs, and per-signature /VRI) carrying the revocation
material for the certificate chain (B-LT). With archiveTimestamp it also adds
a /DocTimeStamp document timestamp (ETSI.RFC3161 subfilter) over the whole
updated file for B-LTA, refreshing the long-term trust anchor. The engine
computes which OCSP/CRL endpoints to query from the certificates' AIA / CRL-DP
extensions; the host fetches them (the WASM core has no network stack, same
pure-data two-phase model as the TSA). OCSP requests follow RFC 6960; CRLs are
parsed as CertificateList. The exported defaultOcspPost and defaultCrlGet
perform the round trips via fetch, and the revocationFetch / crlFetch
hooks let the host add auth/proxy/retries and apply its own SSRF allow-list.

Fixed

B-T id-aa-timeStampToken now carries the bare TimeStampToken.
signFinishTimestamped / signTimestamped previously embedded the TSA's raw
TimeStampResp (SEQUENCE { PKIStatusInfo, TimeStampToken }) verbatim in the
id-aa-timeStampToken unsigned attribute. The engine now unwraps the response
to the bare TimeStampToken (a CMS ContentInfo) before embedding it — as
required by RFC 3161 §3.3.2 / ETSI EN 319 122 — matching the B-LTA
document-timestamp path. Both a raw TimeStampResp and an already-unwrapped
token are accepted (the PKIStatusInfo gate is still enforced).

Assets 2

23 Jun 16:30

github-actions

v0.70.0

a199811

gigapdf-lib v0.70.0

Fidelity + standards release: advanced (PAdES-B-T) timestamped signatures,
richer shading and JPEG decoding at the rasteriser, complex-script text shaping
for Indic writing systems, CFF flex curves, and RTF image import. The public API
is additive — existing behaviour is unchanged.

Added

PAdES-B-T trusted timestamps (RFC 3161). New SDK
GigaPdfDoc.signTimestamped() (async) embeds an RFC 3161 timestamp token in
the SignerInfo for an advanced-level PAdES-B-T signature — ETSI.CAdES.detached
subfilter, signing-certificate-v2 (ESS) signed attribute, and the
id-aa-timeStampToken unsigned attribute. Uses the engine's pure-data
two-phase TSA flow (core emits the TimeStampReq, host POSTs it, core embeds
the returned token) since the WASM core has no network stack; tsaFetch lets
the host add auth/proxy/retries and apply its own SSRF allow-list, and the
exported defaultTsaPost POSTs application/timestamp-query via fetch
(e.g. FreeTSA). Signs with an imported PKCS#12 or a freshly generated
self-signed identity.
Mesh shadings at the rasteriser. Free-form (type 4), lattice (type 5),
Coons (type 6) and tensor (type 7) shadings are now rendered as Gouraud
triangles (pure, zero-dep decoder; Coons/tensor patches tessellated per
ISO 32000-1 §8.7.4.5.7), with per-vertex colour resolved through
Separation/DeviceN/ICCBased/CMYK/Gray. Axial (2) and radial (3)
shadings are unchanged.
Arithmetic-coded JPEG decoding. SOF9 (sequential) and SOF10 (progressive)
JPEGs now decode via a hand-rolled ISO/IEC 10918-1 Annex MQ arithmetic decoder
with the F.1.4 DC/AC context models and DAC conditioning. Baseline/Huffman
paths are unchanged; lossless (SOF3/SOF11) and 12-bit Huffman (SOF1) remain
gracefully unsupported.
Indic complex-script shaping. A syllabic reordering machine for the
Brahmi-derived scripts (Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil,
Telugu, Kannada, Malayalam) — reph and pre-base matra reordering — plus the
missing OpenType lookups: GSUB 2 (multiple), GSUB 3 (alternate), GSUB 8
(reverse chaining single) and GPOS 3 (cursive attachment). Latin and the
existing contextual paths are unchanged.
CFF/Type2 flex operators. The Type2 charstring interpreter now implements
the four flex operators (flex, flex1, hflex, hflex1, Adobe TN #5177),
each emitting two cubic curves — CFF glyphs using flex no longer drop or
mis-render contour segments.
RTF image import. RTF import parses the \pict group, extracting
\pngblip/\jpegblip payloads as <img src="data:image/…;base64,…">
(display size recovered from \picwgoal/\pichgoal), reusing the HTML
engine's image-embed pipeline. DIB/BMP, WMF/EMF and binary \bin payloads are
skipped (documented limits), guarded by a PNG/JPEG magic-byte check.

Assets 2

23 Jun 11:40

github-actions

v0.69.0

dcdb805

gigapdf-lib v0.69.0

Image-watermark release: stamp a raster image across any range of pages, with
the same ergonomics as the existing text watermark. The text watermark is
unchanged.

Added

Image watermark. Stamp a raster image over pages —
addImageWatermark (SDK) / add_image_watermark (core) /
gp_add_image_watermark (FFI). Accepts PNG / JPEG / WebP / GIF / AVIF
source images and supports per-watermark opacity, anchoring
(center + four corners) with offsets, rotation (about the image center),
scaling to a target size (aspect-follow), and an optional tiling grid.
The image XObject is embedded once and referenced on each target page,
reusing the existing image-embed/raster-transcode pipeline. The text
watermark and add_image behavior are unchanged.

Assets 2

23 Jun 08:05

github-actions

v0.68.0

ee0c6c8

gigapdf-lib v0.68.0

Format-reach + import/render fidelity release: the unified model now exports
Markdown / CSV / EPUB end to end, Office/ODF import preserves far more
structure, the HTML→PDF renderer gains the remaining common CSS, and several
image-codec and rendering bugs are fixed.

Added

Markdown / CSV / EPUB model export. The unified editable model can now be
raised to Markdown (modelToMd), CSV (RFC 4180, modelToCsv) and
EPUB 3 (modelToEpub), alongside the existing
modelTo{Docx,Xlsx,Pptx,Odt,Ods,Odp,Pdf,Html,Rtf} targets (ABI
gp_model_to_{md,csv,epub}).
Complete Markdown modelling. CodeBlock, Blockquote and
HorizontalRule are first-class in the model — full Markdown round-trip
(headings, runs, links, images, nested lists, GFM tables, code blocks,
block-quotes, horizontal rules, footnotes, front-matter) rendered and exported
consistently across formats.
Office / ODF import fidelity. DOCX/XLSX/PPTX and ODF (.odt/.ods/
.odp) import now preserves images, hyperlinks, strikethrough,
highlighting, spreadsheet formulas, grouped shapes, charts, SmartArt text and
master/layout (theme) inheritance.
HTML / CSS → PDF — remaining common CSS. Radial and conic
gradients, font-weight 100–900, box-shadow (blur), elliptical
border-radius, dashed/dotted borders, linear-gradient and
position: sticky.
OpenType text shaping. GPOS mark positioning, GSUB contextual, script
selection and Arabic joining (complex scripts only; Latin unchanged).
Image codecs. SVG <text> rendering and GIF multi-frame decoding.
Run highlight. Character-level background is painted and emitted across
HTML, PDF and Office output.
setTextRunStyle. Run-level style bake exposed in the SDK.
Mermaid flowchart renderer in the HTML engine (graph TD/LR, node shapes,
typed edges + arrowheads, Sugiyama layout → PDF vectors).

Fixed

AVIF multi-tile decode — corrupt images > 9.4 MP. Multi-tile AVIFs were
decoded as a single tile, garbling pixels. The AV1 spec forces multi-tile
above ~9.4 MP, so essentially every modern phone/camera AVIF was silently
corrupted. Each tile is now decoded independently; single-tile and existing
fixtures are byte-for-byte unchanged (validated bit-exact vs dav1d).
WebP lossless (VP8L) — lossless transforms + meta-Huffman now decode real
cwebp/libwebp lossless images correctly.

Changed

Non-Device colorspaces — Pattern fills and Separation/ICCBased colours
in content streams are unified through the raster colour resolver (consistent
with the rasterizer) instead of a device-default fallback.
Docs honesty — README corrected to near-zero-dependency (hand-written
PDF/render/conversion core; RustCrypto for crypto/signatures; Boa for
JS — the earlier from-scratch JS engine is gone), 1198 tests (was 284), and
.wasm ~5.6 MB (was ~540 KB, before Boa was bundled).

Assets 2

23 Jun 00:37

github-actions

v0.67.0

10df65d

gigapdf-lib v0.67.0

Added

Structured-editing ModelOps + permissions API exposed in the SDK. New
applyModelOps variants: paragraph formatting (setParagraphStyle — align/indent/
spacing/line-height), lists (setListLevel/setListMarker/setListOrdered),
absolute block placement (setBlockFrame/setBlockRotation), and table styling
(setCellShading/setRowHeight/setColWidth/setTableBorder). Table structural
edits (insertTableRow/deleteTableRow/insertTableColumn/deleteTableColumn/
setCellSpan + sheet row/column ops) and GigaPdfDoc permission helpers
(permissionsToP/decodePermissions/getPermissions + saveEncrypted({ flags }))
are now callable from JS.

Changed

8 PDF permission flags are functional: /P is computed from named flags per
ISO 32000-1 Table 22 (previously a cosmetic integer).

Assets 2

22 Jun 23:44

github-actions

v0.66.0

954e909

gigapdf-lib v0.66.0

Added

HTML/CSS rendering — LibreOffice-level fidelity. htmlRender gains real CSS
grid (fr/minmax/repeat/span/auto-rows) and complete flexbox
(basis/grow/shrink/wrap/justify/align), multi-column (column-count/columns/
column-gap), pragmatic RTL/bidi (direction/dir, RTL block/inline/run
layout), table fidelity (colspan/rowspan, LibreOffice-level), text styling
(super/sub, underline, strike), @media, font shorthand and further CSS-2 coverage.
Document reconstruction (structuredText) — waves R1–R10. Typed + populated
pageBlocks bodies, merged-cell spans, strikethrough, hyperlinks, paragraph
spacing, super/subscript, document outline + figure captions, list nesting +
continuation lines, multi-column reading order, multiple tables per page
(connected-component split), borderless right/decimal-aligned columns, true
decimal-tab alignment.
PDF permissions — 8 functional flags. getPermissions + correct /P encoding
of the 8 standard permission bits (print, modify, copy, annotate, fill-forms,
extract, assemble, high-res print).
Model structural edits. Table & sheet structural-edit ModelOps.

OCR (native `gigapdf-ocr-rten` crate — host-side, not bundled in the npm package)

Pivoted the OCR engine to PaddleOCR PP-OCR on RTen (pure-Rust ONNX, no C++/
Tesseract): 13 printed languages incl. our own Hebrew model, with automatic
per-line script selection.
Handwriting recognizer (latin_hw) — our own CRNN trained on real handwriting
(IAM/RIMES/NorHand/…; standard nn.LSTM → dynamic-width ONNX), opt-in via
recognize_page_handwriting / recognize_page_with(img, "latin_hw").
Full OCR documentation refresh (architecture, training data, SDK, cookbook).

Assets 2

22 Jun 13:36

github-actions

v0.65.0

750229b

gigapdf-lib v0.65.0

Added

Office→PDF phase-2 fonts — officeToPdfWith(office, fonts) (ABI
gp_office_to_pdf_with_fonts, core office_to_pdf_with_fonts) completes the
two-phase font flow opened by officeNeededFonts: hand back the host-fetched
faces for the families a container references but doesn't embed (e.g.
Carlito for a Calibri reference) and styled runs lay out + paint with the right
metrics instead of drifting onto the bundled fallback. The supplied faces are
merged with whatever the document embeds itself — embedded faces win on
conflict — so an empty fonts array yields exactly officeToPdf's output
(no regression). fonts uses the same packed blob as htmlRender.

Assets 2

22 Jun 13:15

github-actions

v0.64.0

ef959e2

gigapdf-lib v0.64.0

Office↔PDF fidelity program — import all formats → PDF and export PDF → all
formats much closer to 1:1, including complex layouts (boxes/encadrés).

Added

Office→PDF preserves absolute layout — presentation/box geometry is no
longer reflowed into a flat stack. PPTX/ODP shapes, images and tables carrying
an explicit a:xfrm / draw:frame are emitted at their exact coordinates
(EMU/ODF units → pt), with slide backgrounds and a:schemeClr theme colours
resolved. DOCX floating/anchored drawings (wp:anchor) and text boxes
(w:txbxContent) become absolutely-positioned frames (the “encadrés”), and
explicit page breaks (w:br type=page, w:pageBreakBefore, section breaks)
are honoured.
XLSX/ODS render with cell styling — fonts (bold/italic/underline/size/
colour/family), borders, alignment and row heights are read from each cell's
style and applied at render (theme colours resolved); ODS cells were previously
unstyled. Merges, column widths and number formats unchanged.
PDF→Office export preserves absolute layout — text boxes, images and vector
rectangles/paths (fill/stroke/dash) are exported at their exact coordinates for
PPTX/ODP/DOCX/ODT, so an exported deck/doc opened in PowerPoint/Word/Impress/
Writer looks like the source PDF, encadrés included.
Office→PDF embeds the document's own fonts — a self-embedding DOCX/PPTX/
XLSX (word|ppt|xl/fonts/*.odttf, de-obfuscated per ECMA-376 §17.8.1) or ODT/
ODS/ODP (Fonts/*, TTF/OTF) renders with its own typefaces (exact glyphs
and metrics, no reflow drift) instead of the bundled Liberation fallback.
officeNeededFonts(office) / gp_office_needed_fonts — phase-1 for
officeToPdf: returns the fonts a container references but doesn't embed
(HtmlFontRequest[]), so the host can fetch metric clones (Carlito↔Calibri,
Arimo↔Arial, …) into its font cache for correct line-breaking. null for an
unrecognized archive, [] when nothing is needed.
Stateful RTF rendering — rtfToPdf now uses a real RTF parser with a {}
group state stack: character styling (\b \i \ul \strike \cf \fs \f via
font/colour tables), paragraph alignment/indents (\qc\qr\qj\li\fi), tables
(\trowd\cell\row) and correct CP1252 (\'80→€, smart quotes, dashes) instead
of the previous text-only extraction.

Assets 2

Uh oh!

Releases: QrCommunication/gigapdf-lib

gigapdf-lib v0.72.0

Added

Fixed

Uh oh!

gigapdf-lib v0.71.1

Documentation

Uh oh!

gigapdf-lib v0.71.0

Added

Fixed

Uh oh!

gigapdf-lib v0.70.0

Added

Uh oh!

gigapdf-lib v0.69.0

Added

Uh oh!

gigapdf-lib v0.68.0

Added

Fixed

Changed

Uh oh!

gigapdf-lib v0.67.0

Added

Changed

Uh oh!

gigapdf-lib v0.66.0

Added

OCR (native gigapdf-ocr-rten crate — host-side, not bundled in the npm package)

Uh oh!

gigapdf-lib v0.65.0

Added

Uh oh!

gigapdf-lib v0.64.0

Added

Uh oh!

OCR (native `gigapdf-ocr-rten` crate — host-side, not bundled in the npm package)