Tracking issue: HTML export #5512

@laurmaedje

Purpose

This issue documents the progress of development on HTML export. It's not for discussion of specific features, which should occur in other issues or on Discord.

For design discussions, visit the Discord forge thread. The feature request issue for this tracking issue is #721.

Plan

HTML export will be developed iteratively, merging individual chunks of work instead of building up huge branches. For this reason, it will stay behind a feature flag (typst c file.typ --format html --features html) and warn upon use. It will be available for experimentation, but it's not intended for production use.

There is no timeline yet for moving HTML export out from behind the feature flag. That will likely happen once we are happy with the general usability of the feature and think it's ready for more widespread use.

The work on Typst's HTML export is generously supported by the NLnet foundation with a grant financing 6 months of full-time work. Many thanks to them!

Scope

Our primary focus in this initial phase of work will be on generating semantic HTML with no CSS. This already supports various use cases where CSS can be added separately and, in my view, unlocks the main benefit of supporting HTML export in the first place: sharing the same content between print and web.

Rich customization of the semantic output structure is explicitly in scope. We'll provide sensible defaults via built-in show rules, but you will be able to write your own target-aware code to generate whatever HTML is best for your use case. The goal is to encapsulate this target awareness into show rules & templates as much as possible, such that the main content is left untouched by it.
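
As a rough illustration of keeping target awareness inside a template, here is a minimal sketch. It assumes the experimental `target()` function and `html.elem` constructor as they exist behind the feature flag, and both may still change. The template name `web-aware` and the `class: "quote"` attribute are hypothetical and chosen only for this example.

```typ
// Compile with: typst c file.typ --format html --features html

// Hypothetical template: all target checks live here, not in the content.
#let web-aware(doc) = {
  show quote.where(block: true): it => context {
    if target() == "html" {
      // Assumption: on the web, wrap block quotes in <figure class="quote">.
      html.elem("figure", attrs: (class: "quote"), it)
    } else {
      // Any other export target keeps the default appearance.
      it
    }
  }
  doc
}

#show: web-aware

// The main content stays free of any target-specific code.
#quote(block: true)[Content is written once and shared between print and web.]
```

The same document can then be compiled to PDF without changes, since the show rule falls back to the unmodified element when the target is not HTML.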

We also want to cover a range of use cases, from single-file output or even just fragments of HTML for inclusion in web pages to multi-file output with fine control over paths & assets (e.g. for a blog).

Things that we'll likely work on in the future, but which are not contained in the initial phase of work, include CSS and EPUB support.

Milestones

The milestones are split into "Infrastructure" and "Library" milestones. The "Infrastructure" milestones refer to specific new features, APIs, compiler changes, and such. The "Library" milestones track how well Typst's standard library elements work in HTML export.

HTML export itself is mostly unaware of the concrete way Typst elements map to HTML elements. Instead, elements check the export target in their show rules and either produce html.elems or layout elements (just like a user would). Thus, the usability and completeness of HTML export is dependent both on compiler systems and on library show rules.
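
The following sketch shows what such a target check could look like when written by a user. It is not how any built-in element is implemented. The `badge` function and its class name are hypothetical, and the `target()` and `html.elem` APIs are assumed to behave as in the current experimental state.

```typ
// Hypothetical user-defined element with one branch per export target.
#let badge(body) = context {
  if target() == "html" {
    // HTML branch: emit a semantic element and leave styling to CSS.
    html.elem("span", attrs: (class: "badge"), body)
  } else {
    // Layout branch: build the visual appearance from layout elements.
    box(stroke: 0.5pt, inset: 2pt, body)
  }
}

This feature is #badge[experimental].
```

Because the decision is made inside the function, callers write `#badge[...]` the same way regardless of the output format.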

The milestones will be expanded as work progresses.

Infrastructure

  • Feature flag
  • Make basic language features work in HTML export
    • Realization, show rules, context
    • Introspection (except for concrete positions)
  • HTML writing
    • Basic writing with support for escaping rules and allowed charsets
    • Minified output
    • Basic pretty-printing
    • Ensure spec-compliant HTML element nesting
    • Share printing code between SVG and HTML export
    • Prettier output
  • HTML APIs
    • target API for output-aware show rules
      • target API that returns string
      • Finalize what the API should look like (str vs method or something else)
    • Raw HTML API
      • Generic html.elem API for outputting arbitrary HTML elements
      • Respect raw <html>, <head>, <body>
      • Typed html.* element constructors (e.g. html.div)
      • Support raw bodies for things like pre, script, or style
    • Frame API for embedded layout-as-SVG
  • Output formats
    • Single-file output format
      • CLI integration
      • typst watch support
      • Test runner support
    • Directory output
      • CLI integration
      • typst watch support
      • Test runner support
      • html.asset API for creating a raw output file
      • html.document API for creating an HTML output file
      • Support for auto-generated asset paths (e.g. for image)
    • Fragment output
      • CLI integration
      • Test runner support
  • Linking
    • Linking within a single page
      • Linking to a label, with derived ID
      • Linking to a location, with auto-generated ID
    • Linking between normal page content & frames
    • Linking across HTML files
  • Safety
    • Define safety levels
    • Enforce safety depending on level (e.g. no <script>)
  • Control over how semantic vs visual the output should be. (Needs to be respected by individual elements.)
    • Define semanticness levels
  • Accessibility audit
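
To make the "Frame API for embedded layout-as-SVG" item above more concrete, here is a small sketch. It assumes the experimental `html.frame` function, which lays out its body with the regular layout engine and embeds the result as an SVG. Using it for equations is only an illustration of the idea, not a statement about how math will eventually be exported.

```typ
// Compile with: typst c file.typ --format html --features html

// Assumption: equations have no semantic HTML mapping yet, so in HTML
// export we fall back to rendering them through the layout engine.
#show math.equation: it => context {
  if target() == "html" {
    html.frame(it)
  } else {
    it
  }
}

The area of a circle is given by

$ A = pi r^2 $
```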

Library

An element can be in different stages of completeness:

Description Level
Does not exist yet 🛑 -
Support is not planned at all ❎ x
Support is not planned for initial phase /
No support, does not appear or is totally broken ❌ 0
Partially works, but not all cases/properties are handled ⚠️ 1
Works and semantically fully handled ✅ 2
Fully handled not just semantically, but also style-wise 🎨 3
Unknown or undecided

Since we mostly aim for semantic HTML support for now, the target is to bring all elements to level 2 or x.

Note that an element may be listed at a lower level than it actually is, either because the table hasn't yet been updated or because nobody checked whether support is comprehensive yet. If marked with a +, the level is conservative and might actually be higher. Rows with ... indicate that the table is incomplete and will be expanded in the future.

Model

Element Level
Bibliography (bibliography)  ✅
Bullet List (list)  ✅
Cite (cite)  ✅
Document (document)  ✅
Emphasis (emph)  ✅
Figure (figure)  ✅
Footnote (footnote)  ✅
Heading (heading), maps to <h2>, ...  ✅
Link (link)  ✅
Numbered List (enum)  ✅
Outline (outline)  ✅
Paragraph (par)  ✅
Paragraph break (parbreak)  ✅
Quote (quote)  ✅
Reference (ref)  ✅
Strong Emphasis (strong)  ✅
Table (table)  ✅
Term List (terms)  ✅
Title (title), maps to <h1>  ✅

N/A: Numbering (numbering)

Text

Element Level
Highlight (highlight)  ✅
Line Break (linebreak)  ✅
Lowercase (lower)  ✅
Overline (overline)  ✅
Raw Text / Code (raw)  ✅
Small Capitals (smallcaps)  ✅
Smartquote (smartquote)  ✅
Strikethrough (strike)  ✅
Subscript (sub)  ✅
Superscript (super)  ✅
Text (text)  ✅
Underline (underline)  ✅
Uppercase (upper)  ✅

N/A: Lorem (lorem)

Math

Element Level
... ...

Layout

Element Level
Box (box) ⚠️ 1
Block (block) ⚠️ 1
... ...

Visualize

Element Level
Circle (circle)
Curve (curve)
Ellipse (ellipse)
Image (image)
Line (line)
Polygon (polygon)
Rectangle (rect)
Square (square)

N/A: Color (color), Gradient (gradient), Stroke (stroke), Tiling (tiling)
