experiment: enable QUARTO_PDF_STANDARD and run latex/typst tests#14097
gordonwoodhull wants to merge 2 commits into main
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add environment variable fallback for pdf-standard option so any document without an explicit pdf-standard setting inherits from QUARTO_PDF_STANDARD (comma-separated, e.g. "ua-1" or "a-2b,ua-1"). Also add tools/find-tests.ts to find test documents by format and tools/filter-pdf-errors.ts to extract and summarize PDF validation errors from render logs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
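The fallback described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `resolvePdfStandard` and its parameters are hypothetical, and it assumes an explicit `pdf-standard` setting always wins over the environment variable.

```typescript
// Hypothetical helper: names are illustrative and not taken from the PR.
// `explicit` stands for the document's own pdf-standard setting (if any);
// `envValue` stands for the raw value of QUARTO_PDF_STANDARD (if set).
function resolvePdfStandard(
  explicit: string | string[] | undefined,
  envValue: string | undefined,
): string[] {
  // Assumption: an explicit document setting always takes precedence.
  if (explicit !== undefined) {
    return Array.isArray(explicit) ? explicit : [explicit];
  }
  // No explicit setting and no (or empty) environment variable: no standards.
  if (!envValue) {
    return [];
  }
  // Comma-separated list, e.g. "ua-1" or "a-2b,ua-1"; tolerate stray spaces.
  return envValue
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}
```

In a Deno-based tool the second argument would presumably come from `Deno.env.get("QUARTO_PDF_STANDARD")`; it is passed in here so the sketch stays runtime-neutral.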
✅ Snyk checks have passed. No issues have been found so far.
Here is Claude's analysis of the LaTeX failures. The big structural one is margin layout. There's one knitr problem where we could embed fonts, but otherwise everything is upstream.
PDF UA-2 LaTeX Failure Analysis
22 files fail PDF/UA-2 validation (excluding 3 tests that use non-LuaTeX engines or
1. Missing
| File | Notes |
|---|---|
| 2025/03/21/issue-12344.qmd | No title. Tests column-page-right layout. |
| 2024/07/18/10324.qmd | No title. Tests R subcaptions with tinytable. |
| 2024/09/02/10655.qmd | No title. Tests font auto-install. |
| 2024/08/30/10291/latex-hyphen-lang-es-no-install.qmd | No title. Tests Spanish hyphenation without install. |
| 2024/08/30/10291/latex-hyphen-lang-es.qmd | No title. Tests Spanish hyphenation. |
| 2023/03/03/article-layout/table-endnotes-4324.qmd | No title. Tests table endnotes with margin references. |
| 2023/04/24/format-links.qmd | Has Title: Test 123 (capital T). Pandoc requires lowercase title:. |
| 2023/11/02/latex-quarto-markdown-base64.qmd | No title. Tests base64-encoded markdown in LaTeX. |
| 2023/11/02/7262.qmd | No title. Tests layout-ncol with figure + table. |
| 2023/11/15/4370.qmd | No title. Tests R figure layout. |
| 2023/11/14/7568.qmd | No title. Tests code annotations in LaTeX. |
| 2023/07/24/code-annotation-false.qmd | No title. Tests code-annotations: false. |
| 2023/01/17/format-variants.qmd | No title. Tests format variant syntax. |
| article-layout/tables/compute-table-screen.qmd | No title. Tests screen-width table. |
| 2022/09/30/caption-footnotes/test.qmd | No title. Tests caption footnotes. (Also has Caption error, see below.) |
Problem in: Test files. These tests predate PDF/UA-2 and simply lack a document
title. One special case: format-links.qmd uses Title: (capital T) which Pandoc
treats as a custom metadata key, not the document title.
Quarto could help: Quarto could auto-generate a dc:title fallback (e.g. from the
filename) when tagging is on and no title is set. But fundamentally, real documents
should have titles.
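A quick way to spot affected test files before rendering is to check the front matter for a lowercase title: key. The sketch below is an assumption-laden illustration, not part of the PR: `hasDocumentTitle` is a hypothetical name, and it scans front matter lines with a regular expression instead of using a real YAML parser, so it only sees top-level keys written at column zero.

```typescript
// Hypothetical check: returns true when the YAML front matter contains a
// top-level, lowercase `title:` key. `Title:` (capital T) does not count,
// matching the format-links.qmd case described above.
function hasDocumentTitle(qmdSource: string): boolean {
  // Front matter must open on the first line and close with a `---` line.
  const match = qmdSource.match(/^---\r?\n([\s\S]*?)\r?\n---\s*(\r?\n|$)/);
  if (!match) {
    return false;
  }
  const frontMatter = match[1] ?? "";
  // Simplification: line-based scan, not a YAML parse.
  return frontMatter
    .split(/\r?\n/)
    .some((line) => /^title\s*:/.test(line));
}
```

For example, a document whose front matter contains only `Title: Test 123` is reported as untitled, while one with `title: Test 123` passes.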
2. Margin layout breaks PDF structure tree (5 files)
Errors: StructTreeRoot shall not contain <P>/<Caption>/<Figure>/<Table>/<Div>/<Part>,
The structure tree root shall contain a single Document structure element,
<P> shall not contain <Aside>/<Part>/<P>
Cause: All these tests use margin layout (.column-margin or column: margin).
Quarto's filters generate:
- \marginnote{\begin{footnotesize}...\end{footnotesize}} for text
  (src/resources/filters/layout/latex.lua:215-238)
- \begin{marginfigure}...\end{marginfigure} for figures
  (src/resources/filters/layout/latex.lua:554)
- \begin{margintable}...\end{margintable} for tables
  (src/resources/filters/layout/latex.lua:643-663)
These come from the sidenotes and marginnote LaTeX packages (injected at
src/resources/filters/layout/meta.lua:106-116). These packages predate PDF
tagging and don't cooperate with tagpdf:
- \marginnote creates content in a separate output stream that escapes the
  Document structure element entirely, placing children at StructTreeRoot.
- tagpdf assigns an <Aside> role to margin content but nests it inside an
  active <P> context, which is invalid per PDF/UA-2.
- The result is structure elements (<P>, <Caption>, <Figure>, etc.) appearing
  directly under StructTreeRoot instead of inside <Document>.
Files:
| File | Margin content |
|---|---|
| 2024/05/06/9582.qmd | Multiple .column-margin divs with figures and text |
| 2024/06/24/10112.qmd | R table with column: margin |
| typst/margin-layout/margin-figure-crossref-interleaved.qmd | Figures with .column-margin class |
| article-layout/tables/compute-table-margin.qmd | R table with column: margin |
| 2022/09/30/caption-footnotes/test.qmd | Also affected (uses figure captions; see section 3) |
Problem in: LaTeX packages (sidenotes/marginnote). These packages are not
tag-aware. Fixing this requires either:
- Upstream patches to sidenotes/marginnote for tagpdf compatibility
- A different LaTeX approach for margin content when tagging is enabled
- Skipping UA-2 validation for margin-layout tests until the ecosystem catches up
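The third option needs some way to recognize margin-layout tests. A minimal heuristic sketch, assuming the two markers named above (.column-margin and column: margin) are enough to identify them; `usesMarginLayout` is a hypothetical name, and other margin-related options are deliberately not covered:

```typescript
// Hypothetical heuristic: flags a .qmd source as using margin layout when it
// contains either marker mentioned in this analysis. It is a plain text scan,
// so a marker inside a code sample or comment would also match.
function usesMarginLayout(qmdSource: string): boolean {
  // Div or span class, e.g. `::: {.column-margin}`.
  const marginClass = /\.column-margin\b/;
  // Cell or YAML option, e.g. `#| column: margin`.
  const marginOption = /\bcolumn:\s*margin\b/;
  return marginClass.test(qmdSource) || marginOption.test(qmdSource);
}
```

A test runner could use a check like this to skip UA-2 validation for matching files while still validating everything else.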
Upstream status (as of Feb 2026):
The LaTeX kernel's built-in \marginpar command does have tagging support (tags as
<Aside>), but Quarto doesn't use \marginpar directly; it uses the marginnote
and sidenotes packages, both of which are tracked as currently-incompatible in
the LaTeX tagging project:
- marginnote: latex3/tagging-project#165
  (opened Jul 2024, still open). Errors with paragraph hook counting mismatches.
  The LaTeX team has a commented-out namespace entry mapping marginnote -> Aside
  in latex-lab-namespace.dtx, indicating intent but no implementation. Package
  maintainer (Markus Kohm) is listed as inactive. No comments, no assignees,
  no PRs on the issue.
- sidenotes: latex3/tagging-project#555
  (opened Aug 2024, still open). Basic \sidenote{} works (uses \marginpar
  internally), but \sidenote[][offset]{text}, \sidecaption, marginfigure,
  and margintable environments all fail because they depend on marginnote.
  No public repo or issue tracker for the package. Priority 7 (low) in the
  tagging status tracker.
- All 10 area: marginpars issues in latex3/tagging-project are open with
  zero closed. No milestones, no active PRs, minimal discussion.
There is no timeline for resolution. Margin layout and PDF/UA-2 are incompatible
in the current LaTeX ecosystem.
3. <Document> shall not contain <Caption> (2 files)
Error: <Document> shall not contain <Caption>
Cause: When a figure caption contains complex content (footnotes, citations),
LaTeX's footnote processing extracts content out of the figure environment. tagpdf
loses the parent-child relationship and places <Caption> directly under <Document>
instead of inside <Figure>.
Files:
| File | Caption content |
|---|---|
| 2022/09/30/caption-footnotes/test.qmd | Figures with footnotes and citations in captions |
| 2023/01/17/online-image-mediabag.qmd | Online image; may relate to implicit figure wrapping |
Problem in: Pandoc + LaTeX tagging. Pandoc's generated LaTeX for captions with
footnotes doesn't maintain proper structural nesting for tagpdf. This is an
upstream issue in how pandoc emits LaTeX figure environments with complex captions.
4. Fonts not embedded (3 files)
Error: The font programs for all fonts used for rendering within a conforming file shall be embedded within that file, as defined in ISO 32000-2:2020, 9.9
Cause: R's graphics devices (used by knitr to generate plot PDFs) don't always
embed all fonts. When these plot PDFs are included in the final document, unembedded
font references carry over.
Files:
| File | R content |
|---|---|
| 2022/12/9/jats/example.qmd | plot(cars), knitr::kable(head(mtcars)), embedded notebook |
| 2023/11/02/7262.qmd | plot(cars) via knitr in layout-ncol |
| 2023/11/15/4370.qmd | plot(1), plot(2) via knitr |
Problem in: R/knitr (external tool). R's default pdf() device doesn't always
embed fonts. Quarto could mitigate this by configuring knitr's default graphics device
to use cairo_pdf() or setting pdf(embed=TRUE) when PDF standards are active.
5. <Link> shall not contain <Link> (1 file)
Error: <Link> shall not contain <Link>
Cause: This test defines custom crossref types with list-of commands
(\listofsupptbls, \listofdiagrams in raw LaTeX). The list-of entries are
hyperlinked (clickable navigation), and the cross-references inside them also
generate \hyperref links. This creates nested <Link> elements in the structure
tree.
File: crossrefs/float/latex/latex-custom-categories.qmd
Problem in: LaTeX tagging (tagpdf + hyperref interaction). When hyperref
generates links inside list-of entries that are themselves linked, the structure
tree gets nested <Link> elements. This is an upstream LaTeX issue.
6. <Sect> shall not contain content items (1 file)
Error: <Sect> shall not contain content items
Cause: This document has complex metadata (multiple authors with affiliations,
funding, citations, licenses) and toc: true. The author block or TOC generation
produces text content directly inside a <Sect> structure element without a <P>
wrapper.
File: 2022/12/9/jats/example.qmd
Problem in: Pandoc + LaTeX tagging. Pandoc's LaTeX output has loose text in
section contexts that tagpdf doesn't automatically wrap in <P> elements.
Summary
| Root cause | Files | Owner | Effort |
|---|---|---|---|
| Missing title: in YAML | 15 | Test files | Low -- add titles |
| Margin layout breaks structure tree | 5 | LaTeX packages | High -- needs upstream work |
| Caption with footnotes misplaced | 2 | Pandoc + LaTeX | Medium -- pandoc change needed |
| R plots don't embed fonts | 3 | R/knitr | Medium -- configure knitr defaults |
| Nested hyperlinks in list-of | 1 | LaTeX (tagpdf/hyperref) | High -- upstream fix |
| Loose content in Sect | 1 | Pandoc + LaTeX | Medium -- pandoc change needed |
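A tally like the one in this table can be sketched as a small classifier over validation error messages. This is an illustration only and not the logic of tools/filter-pdf-errors.ts: the names `RootCause`, `classifyError`, and `tallyRootCauses` are hypothetical, the patterns are copied from the error strings quoted in sections 2 through 6, and the missing-title error text is not shown in this analysis, so it falls into "other" here.

```typescript
// Hypothetical root-cause buckets, mirroring sections 2-6 of the analysis.
type RootCause =
  | "margin-layout"
  | "caption-misplaced"
  | "fonts-not-embedded"
  | "nested-link"
  | "loose-content-in-sect"
  | "other";

// Ordered rules: the first matching pattern decides the bucket.
const RULES: Array<[RegExp, RootCause]> = [
  [/^<Document> shall not contain <Caption>/, "caption-misplaced"],
  [/^<Link> shall not contain <Link>/, "nested-link"],
  [/^<Sect> shall not contain content items/, "loose-content-in-sect"],
  [/font programs for all fonts/, "fonts-not-embedded"],
  [/^StructTreeRoot shall not contain/, "margin-layout"],
  [/structure tree root shall contain a single Document/i, "margin-layout"],
  [/^<P> shall not contain <(Aside|Part|P)>/, "margin-layout"],
];

function classifyError(message: string): RootCause {
  for (const [pattern, cause] of RULES) {
    if (pattern.test(message)) {
      return cause;
    }
  }
  return "other";
}

// Counts messages per bucket; one message contributes exactly one count.
function tallyRootCauses(messages: string[]): Map<RootCause, number> {
  const counts = new Map<RootCause, number>();
  for (const message of messages) {
    const cause = classifyError(message);
    counts.set(cause, (counts.get(cause) ?? 0) + 1);
  }
  return counts;
}
```

Note that this counts messages, not files; the table above counts files, and a file can appear under more than one root cause.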
Here are results of adding a QUARTO_PDF_STANDARD environment variable and running the LaTeX and Typst tests.
Idk if we want to keep the vibe-coded tools, so I'll leave this as an experimental branch
- quarto run --dev tools/find-tests.ts <format> <directory> to find all qmd tests that list or test the format (perhaps a filter in run-tests.sh would be better but this was easy)
- quarto run tools/filter-pdf-errors.ts - takes logs of rendering all qmds and creates report below
Typst Results
Typst has only 3 kinds of error, only one unexpected:
We do not have any tests that get past Typst's built-in validation that do not pass veraPDF.
The invalid document structure is orange-book, filed upstream:
LaTeX Results
These are much more varied, so I'll include the report in full for now (and go shovel more snow):