dlmanning/pretext-rs

pretext-rs

A fast, accurate text measurement and layout library for Rust — a faithful port of pretext (TypeScript). It predicts how a browser lays out text: word/grapheme segmentation, line breaking, bidi, and (optionally) real font measurement that reproduces Chrome's canvas measureText.

Status: early (0.0.x), not yet on crates.io, API may change. macOS and Linux measurement backends are complete and validated against real Chromium; Windows is on the roadmap.

The defining constraint: the shipped library never depends on a browser — not at runtime, not in its own tests. But the goal is to reproduce real browser layout, so a real browser is the authoritative oracle during development. Fidelity is "faithful by construction": each stage mirrors what Chrome/Blink does, with the same algorithm or the same OS API.

What you get

  • Layout core — always built, browser-free, cross-platform: text analysis, line breaking, rich line APIs, and bidi metadata. Bring your own glyph widths.
  • Measurement — optional measure feature (on by default): real font matching + HarfBuzz shaping that reproduces Chrome's measureText, via CoreText (macOS) / fontconfig (Linux).
  • Scripts — Latin, CJK (incl. word-break: keep-all), and SE-Asian (Thai/Lao/Burmese), using ICU4X's dictionary segmenter to match V8's Intl.Segmenter.
  • CSS-ish modes — white-space: normal + overflow-wrap: break-word by default, with opt-in pre-wrap, keep-all, and letter-spacing.

Install

Not yet on crates.io — depend on it via git (or a path):

[dependencies]
pretext = { git = "<repository-url>" }
# Pure layout only (no shaping/platform deps — wasm, bring-your-own widths):
# pretext = { git = "<repository-url>", default-features = false }

Quickstart

Measure a string — matches Chrome's measureText for text the family covers:

use pretext::measure::{SystemMeasurer, CoreTextFonts, Measurer}; // FontconfigFonts on Linux

let mut m = SystemMeasurer::new(CoreTextFonts::new());
let width = m.measure("Hello, world", "16px Helvetica").map(|t| t.width);
// `None` = unparseable font shorthand or no matching face; "" measures to Some(0.0).

Lay text into lines — prepare() once (analysis + measurement), then layout() on every resize (pure arithmetic, no re-measuring):

use pretext::{prepare, layout, MeasurerWidths, PrepareOptions, LayoutEngine};
use pretext::measure::{SystemMeasurer, CoreTextFonts};

let mut measurer = SystemMeasurer::new(CoreTextFonts::new());
let engine = LayoutEngine {
    line_fit_epsilon: measurer.engine().line_fit_epsilon,
    ..LayoutEngine::default()
};
let mut widths = MeasurerWidths { measurer: &mut measurer, font: "16px Helvetica" };

let prepared = prepare(&mut widths, "The quick brown fox jumps over the lazy dog",
                       &PrepareOptions::default(), &engine);

let lines = layout(&prepared, 200.0, /* line_height */ 20.0);
println!("{} lines, {}px tall", lines.line_count, lines.height);

Bring your own widths — no measure feature; pure layout for wasm, a custom rasterizer, or pre-measured text:

use pretext::{prepare, layout, PrepareOptions, LayoutEngine, SegmentMeasurer};

struct MyWidths;
impl SegmentMeasurer for MyWidths {
    fn measure(&mut self, text: &str) -> f64 { text.chars().count() as f64 * 8.0 }
}

let mut widths = MyWidths;
let prepared = prepare(&mut widths, "hello world",
                       &PrepareOptions::default(), &LayoutEngine::default());
let lines = layout(&prepared, 40.0, 16.0);
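
A bring-your-own-widths measurer does not have to be a fixed-advance stub. The following is a minimal sketch, assuming per-character advances were measured ahead of time by some other tool. The trait here is a local stand-in that copies the measure(&mut self, &str) -> f64 signature from the example above so the block compiles without the crate; TableWidths, its fallback field, demo_table, and the advance values are all hypothetical.

```rust
use std::collections::HashMap;

// Local stand-in mirroring the signature used in the example above, so this
// sketch compiles on its own. With the crate available, the same impl body
// would go on pretext's SegmentMeasurer instead.
trait SegmentMeasurer {
    fn measure(&mut self, text: &str) -> f64;
}

/// Hypothetical measurer backed by a pre-measured per-character advance table.
/// Summing per-character advances ignores kerning and shaping.
struct TableWidths {
    advances: HashMap<char, f64>,
    fallback: f64, // advance used for characters missing from the table
}

impl SegmentMeasurer for TableWidths {
    fn measure(&mut self, text: &str) -> f64 {
        text.chars()
            .map(|c| self.advances.get(&c).copied().unwrap_or(self.fallback))
            .sum()
    }
}

/// Builds a table with illustrative values only.
fn demo_table() -> TableWidths {
    let advances = HashMap::from([('i', 4.0), ('m', 12.0), (' ', 4.0)]);
    TableWidths { advances, fallback: 8.0 }
}

fn main() {
    let mut widths = demo_table();
    println!("width = {}", widths.measure("mi m"));
}
```

Because the table is consulted only inside measure, a struct like this could be passed wherever MyWidths is used above, assuming the real trait has no further required methods.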

Beyond the basics, Prepared exposes a rich API: layout_with_lines, walk_line_ranges (no string allocation), measure_line_stats, streaming layout_next_line, and seg_levels() for per-segment bidi embedding levels, plus a rich-inline flow (prepare_rich_inline) for laying out multiple, individually-fonted inline items on shared lines. See AGENTS.md.

Platform support

| Platform | Measurement backend | Status |
| --- | --- | --- |
| macOS | CoreText | ✅ verified locally (exact vs Chromium) |
| Linux | fontconfig | ✅ CI (exact vs Chromium-on-Linux) |
| Windows | DirectWrite | ▢ roadmap — the SystemFonts trait keeps it additive |
| wasm / any | bring-your-own widths (--no-default-features) | ✅ pure layout, no native deps |

The engine profile is browser-specific (the reference sniffs navigator): LayoutEngine / AnalysisProfile expose chrome() / firefox() / safari() and default to Chromium, the authoritative oracle.

How it works

prepare() runs two phases on a clean seam — analysis (normalize → segment → glue/merge rules) and measurement — producing an opaque, width-independent handle. layout() then turns the prepared per-segment widths into lines with pure arithmetic across all eight break kinds (text, collapsible/preserved space, tab, NBSP-style glue, ZWSP, soft hyphen, hard break), without re-measuring.
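
The split between a measuring phase and an arithmetic-only phase can be illustrated with a toy model. This is a sketch of the idea, not the crate's implementation: ToyPrepared, toy_prepare, and toy_layout are hypothetical names, the segmentation is plain whitespace splitting, and only two of the eight break kinds (text and collapsible space) are modeled, with no break-word handling for overlong words.

```rust
/// Toy segment: a measured width plus whether it is a collapsible space.
struct Seg {
    width: f64,
    is_space: bool,
}

/// Width-independent result of the toy prepare phase.
struct ToyPrepared {
    segs: Vec<Seg>,
}

/// Phase 1: split into words and single spaces, measuring each segment once.
/// `measure` is any width function; it is never called again after this.
fn toy_prepare(text: &str, mut measure: impl FnMut(&str) -> f64) -> ToyPrepared {
    let mut segs = Vec::new();
    for (i, word) in text.split_whitespace().enumerate() {
        if i > 0 {
            segs.push(Seg { width: measure(" "), is_space: true });
        }
        segs.push(Seg { width: measure(word), is_space: false });
    }
    ToyPrepared { segs }
}

/// Phase 2: greedy line fitting using only the stored widths.
/// Returns the number of lines for the given maximum width.
fn toy_layout(p: &ToyPrepared, max_width: f64) -> usize {
    let mut lines = 0;
    let mut line_width = 0.0; // width of content placed on the current line
    let mut pending_space = 0.0; // space that only counts if a word follows it
    for seg in &p.segs {
        if seg.is_space {
            pending_space = seg.width;
            continue;
        }
        if lines == 0 {
            lines = 1; // the first word always starts line 1
            line_width = seg.width;
        } else if line_width + pending_space + seg.width <= max_width {
            line_width += pending_space + seg.width;
        } else {
            lines += 1; // break: the pending space is dropped at the line end
            line_width = seg.width;
        }
        pending_space = 0.0;
    }
    lines
}

fn main() {
    let mut calls = 0;
    let prepared = toy_prepare("hello world", |s| {
        calls += 1;
        s.chars().count() as f64 * 8.0
    });
    // Re-layout at several widths reuses the stored widths.
    for w in [40.0, 88.0, 200.0] {
        println!("{w}px -> {} lines", toy_layout(&prepared, w));
    }
    println!("measure calls: {calls}");
}
```

The measure-call counter in main stays fixed no matter how many widths are laid out, which is the property that makes re-layout on resize cheap.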

Measurement reproduces Blink's canvas measureText pipeline stage-for-stage:

| Stage | What | Backend |
| --- | --- | --- |
| 1 | parse ctx.font | font_spec |
| 2–3 | family/weight/style match + typeface resolution | CoreText / fontconfig |
| 4–5 | grapheme + script itemization | measurer::itemize |
| 6 | per-cluster OS fallback (language-aware, emoji routing) | CoreText cascade / FcFontSort |
| 7–8 | HarfBuzz shaping + glyph advances | harfrust / read-fonts |
| 9 | sum advances (f64) | measurer |
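
To make stage 1 concrete, here is a deliberately small sketch of parsing a canvas font string such as "16px Helvetica". It is not the crate's font_spec module: ToyFontSpec and toy_parse_font are hypothetical, and only the "<size>px <family>" subset is accepted, whereas the real CSS font shorthand also carries style, weight, line-height, and family lists.

```rust
/// Hypothetical, heavily simplified result of parsing a canvas font string.
#[derive(Debug, PartialEq)]
struct ToyFontSpec {
    size_px: f64,
    family: String,
}

/// Parses only the "<size>px <family>" subset, e.g. "16px Helvetica".
/// Returns None for anything else (weights, styles, other units, ...).
fn toy_parse_font(s: &str) -> Option<ToyFontSpec> {
    // Split at the first whitespace: size token on the left, family on the right.
    let (size, family) = s.trim().split_once(char::is_whitespace)?;
    let size_px: f64 = size.strip_suffix("px")?.parse().ok()?;
    // Accept an optionally double-quoted family name.
    let family = family.trim().trim_matches('"');
    if family.is_empty() || !size_px.is_finite() || size_px < 0.0 {
        return None;
    }
    Some(ToyFontSpec { size_px, family: family.to_string() })
}

fn main() {
    println!("{:?}", toy_parse_font("16px Helvetica"));
    println!("{:?}", toy_parse_font("Helvetica"));
}
```

Returning None for unparseable input mirrors the behavior noted in the Quickstart comment, where an unparseable shorthand yields None rather than an error.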

harfrust (HarfBuzz on the Fontations stack) is chosen over rustybuzz because it reproduces canonical HarfBuzz on variable fonts, where rustybuzz's static-hmtx advances diverge. Architecture, decisions, and conventions live in AGENTS.md.

Fidelity & testing

The shipped library is browser-free, but every layer is differentially tested — there are no hand-written expected values:

  • Authoritative — real Chromium. A Playwright oracle runs the real pretext (the pinned vendor/pretext submodule) in real headless Chromium and records per-segment widths and lines; the Rust port replays them and must reproduce the browser exactly (line breaks, ranges, text), with accumulated widths within a tight FP tolerance. Measurement is likewise checked against real Chromium measureText, and is exact when the requested family covers the text. See oracle/browser/README.md.
  • Porting guard — deterministic mock. The same TS runs under Node with an exact-dyadic mock canvas, so the Rust output is asserted bit-for-bit — proving faithful porting, isolated from font/width noise. See oracle/README.md.
  • Shaping — canonical HarfBuzz. harfrust glyphs/advances are cross-checked per-glyph against real C HarfBuzz (harfbuzz_rs), including a checked-in variable font shaped at a non-default weight — the case where rustybuzz diverged.
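
The reason the porting guard can assert bit-for-bit while the browser replay needs a tolerance comes down to floating-point representation. The sketch below is an illustration under assumed values: the widths and the 1e-9 epsilon are arbitrary, and total and approx_eq are hypothetical helpers, not part of the test suite.

```rust
/// Sums widths left to right in f64, as a plain accumulation would.
fn total(widths: &[f64]) -> f64 {
    widths.iter().fold(0.0, |acc, w| acc + w)
}

/// Tolerance comparison for real font widths; `eps` is an arbitrary choice here.
fn approx_eq(a: f64, b: f64, eps: f64) -> bool {
    (a - b).abs() <= eps
}

fn main() {
    // Multiples of 1/64: each value and every partial sum is exact in f64.
    let dyadic = [7.25, 3.015625, 12.5, 0.984375];
    let mut reversed = dyadic;
    reversed.reverse();
    // Compare bit patterns of the two totals, summed in opposite orders.
    let same_bits = total(&dyadic).to_bits() == total(&reversed).to_bits();
    println!("dyadic totals bit-identical: {same_bits}");

    // Decimal fractions such as 0.1 are rounded on input, so sums can drift.
    let decimal = [0.1, 0.2];
    println!("exact match: {}", total(&decimal) == 0.3);
    println!("within eps:  {}", approx_eq(total(&decimal), 0.3, 1e-9));
}
```

With dyadic mock widths, any difference between the TypeScript and Rust outputs points at a porting bug rather than at rounding, which is why that layer can demand exact equality.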

Caveat: the HarfBuzz reference is whatever harfbuzz-sys links, not pinned to Chrome's exact build — so shaping is validated as "≈ canonical HarfBuzz" (strong signal), not "≈ this Chrome".

Known limitations

Deliberately out of scope; the differential corpus avoids these, so a green suite does not overstate coverage:

  • Windows backend (roadmap); locl shaping (buffer-level language — e.g. Serbian/Turkish glyph forms); CSS Fonts L4 from-scratch matching; synthetic bold; letter-spacing / word-spacing in the Measurer API.
  • Variable fallback faces shape at the cascade default instance (forcing CSS axes onto arbitrary fallback axes would diverge from Blink).
  • Bare system-cascade fallback (no declared family covers the text) can pick a different face than Chrome's fallback picker — a few px. Name a covering family and it matches exactly, as in CSS. Color emoji is a separate platform/version quirk.
  • Concurrency: CoreTextFonts isn't robust under heavy concurrent first-use; use one source per thread.
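
One way to follow the "one source per thread" advice is a thread-local slot that is built lazily on first use. The sketch below uses only the standard library; ToyFontSource is a stand-in for a real font source such as CoreTextFonts, and with_source and source_ids_from_threads are hypothetical helpers. Whether the crate's own types fit in a thread_local! this way is an assumption.

```rust
use std::cell::RefCell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Counts constructions so the demo can show one instance per thread.
static CREATED: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for a font source that should not be shared across threads.
struct ToyFontSource {
    id: usize,
}

impl ToyFontSource {
    fn new() -> Self {
        ToyFontSource { id: CREATED.fetch_add(1, Ordering::SeqCst) }
    }
}

thread_local! {
    // Lazily built on first use in each thread, never shared across threads.
    static SOURCE: RefCell<ToyFontSource> = RefCell::new(ToyFontSource::new());
}

/// Runs `f` with this thread's private source.
fn with_source<R>(f: impl FnOnce(&mut ToyFontSource) -> R) -> R {
    SOURCE.with(|s| f(&mut *s.borrow_mut()))
}

/// Spawns `n` threads and collects the id of the source each one used.
fn source_ids_from_threads(n: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| with_source(|src| src.id)))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("ids seen by 4 threads: {:?}", source_ids_from_threads(4));
}
```

Each spawned thread constructs its own source, so no first-use initialization is ever raced between threads; the cost is one source per thread rather than one per process.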

Build & test

cargo test                        # full suite (measure + layout)
cargo test --no-default-features  # pure layout (the measure-gated tests cfg out)
cargo clippy --all-targets

The default build includes measure (harfrust + CoreText/fontconfig). --no-default-features drops every shaping/platform dep, leaving only unicode-segmentation + icu_segmenter. Font-dependent tests skip when a font is absent. CI runs the full suite on Linux (Debian bookworm container, fonts pinned to the recorded Chromium vectors); the macOS/CoreText backend is verified locally. Regenerating oracle vectors needs the submodule (git submodule update --init vendor/pretext); the checked-in vectors are replayed as-is. Contributor docs and the pre-commit gate live in AGENTS.md.

License

MIT — see LICENSE.
