Fix caption used as fallback alt text in PDF/UA by gordonwoodhull · Pull Request #14142 · quarto-dev/quarto-cli

gordonwoodhull · 2026-03-02T18:57:56Z

Summary

Removes the anti-pattern where figure captions are copied into image alt text for PDF/UA compliance. Captions describe a figure's significance in context; alt text describes what the image looks like. Using one as the other is an accessibility anti-pattern that merely silences validators without helping screen reader users.

This was introduced in the Jan 2026 PDF/UA work (commits a867c3c24 and ba75b374f) to satisfy PDF/UA validators, which require every <Figure> structure element to have an /Alt string.

Fixes #14107

What changed

LaTeX (latex.lua): Removed 3 caption-as-alt blocks that copied image.caption into image.attributes["alt"] before clearing the caption for separate rendering.

Typst (pandoc3_figure.lua, floatreftarget.lua, typst.lua): Used a marker attribute (_quarto_no_caption_alt) to prevent the caption-as-alt fallback for figure images, while preserving it for inline images where image.caption IS the standard markdown alt text (![alt text](img.svg) in running text).

The distinction matters because of how Pandoc 3 represents alt text in its AST.

Pandoc 3 AST analysis

In Pandoc 3, {alt="text"} on an image does not set image.attributes["alt"] — it replaces image.caption with the alt value. The ![visible caption] text moves to figure.caption.long, while image.caption becomes the alt text. This means image.caption serves double duty:

- Inline images: image.caption IS the alt text (the ![...] content is alt by the HTML spec)
- Figure images: image.caption is a copy of the visible caption (unless overridden by {alt="..."})

To distinguish these cases, we compare image.caption to the figure's visible caption (figure.caption.long or float.caption_long). If they match, no explicit alt was provided — we set _quarto_no_caption_alt to suppress the fallback. If they differ, {alt="..."} was used and image.caption IS the explicit alt text.

This is the same heuristic Pandoc's own Markdown writer uses (confirmed via Pandoc source — it compares image alt to figure caption to decide whether to emit {alt="..."}). Pandoc itself acknowledges the ambiguity: the AST cannot distinguish "caption that happens to match alt" from "no explicit alt provided." The recommended workaround is Quarto's fig-alt attribute, which flows through a separate, unambiguous path.
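The comparison described above can be sketched as follows. This is a simplified TypeScript model, not the code in the PR: the real filters are Lua and compare Pandoc inlines, whereas this sketch uses plain strings. The `FigureImage` type, the function names, and the marker value `"true"` are assumptions made for illustration.

```typescript
// Toy model of the data involved. The real filters are Lua and operate on
// Pandoc inlines; plain strings are used here only to keep the sketch short.
interface FigureImage {
  caption: string;                    // stands in for image.caption
  attributes: Record<string, string>; // stands in for image.attributes
}

const NO_CAPTION_ALT = "_quarto_no_caption_alt";

// True when image.caption differs from the figure's visible caption,
// which is taken to mean that {alt="..."} replaced it.
function hasExplicitAlt(image: FigureImage, figureCaption: string): boolean {
  return image.caption !== figureCaption;
}

// Marks figure images whose caption merely mirrors the visible caption so a
// later stage can skip the caption-as-alt fallback. The marker value "true"
// is an assumption; only the attribute name comes from the PR description.
function markFigureImage(image: FigureImage, figureCaption: string): FigureImage {
  if (hasExplicitAlt(image, figureCaption)) {
    return image; // explicit alt: leave image.caption to be used as alt text
  }
  return {
    ...image,
    attributes: { ...image.attributes, [NO_CAPTION_ALT]: "true" },
  };
}
```

Note that the sketch inherits the ambiguity described above: a caption that intentionally equals the alt text is treated as "no explicit alt".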

How `fig-alt` vs `alt` work

- {alt="text"}: Pandoc replaces image.caption, no separate attribute. Works but ambiguous when caption and alt are intentionally identical.
- {fig-alt="text"}: Quarto stores as image.attributes["fig-alt"], propagated explicitly to \includegraphics[alt=...] (LaTeX) or image(alt: "...") (Typst). Always unambiguous. This is the recommended approach.
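One way to picture how the two attributes interact is a single resolution function. The sketch below is an assumption-laden TypeScript model, not Quarto's implementation: the `ImageLike` type, the `inline` flag, and the exact precedence order (fig-alt first, then the inline or explicit-alt caption) are illustrative choices inferred from the description above.

```typescript
// Hypothetical model of an image as seen by the writer stage.
interface ImageLike {
  caption: string;                    // stands in for image.caption
  attributes: Record<string, string>; // stands in for image.attributes
  inline: boolean;                    // assumed flag: true for images in running text
}

// Returns the alt text to emit, or undefined when none should be emitted.
function resolveAlt(image: ImageLike): string | undefined {
  // 1. fig-alt is unambiguous, so it wins whenever it is present.
  const figAlt = image.attributes["fig-alt"];
  if (figAlt !== undefined && figAlt !== "") {
    return figAlt;
  }
  // 2. Inline images: the ![...] content is the alt text.
  if (image.inline) {
    return image.caption || undefined;
  }
  // 3. Figure images marked as "caption only": no fallback to the caption.
  if (image.attributes["_quarto_no_caption_alt"] !== undefined) {
    return undefined;
  }
  // 4. Figure images without the marker: {alt="..."} replaced image.caption.
  return image.caption || undefined;
}
```

Returning `undefined` in case 3 is what lets the missing-alt warning surface instead of being silenced by a copied caption.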

Test plan

- caption-not-alt-ua.qmd: verifies caption is NOT copied to \includegraphics[alt=] and that Quarto warns about missing alt text
- typst-image-alt-text.qmd: TC1 moved to must-not-match; TC8 added for fig-alt
- ua-image-alt-text.qmd: now uses explicit fig-alt
- All 39 pdf-standard tests pass

Remove the caption-as-alt fallback introduced in the PDF/UA compliance work (a867c3c, ba75b37). Using captions as alt text is an accessibility anti-pattern: captions describe a figure's significance in context, while alt text describes what the image looks like.

LaTeX: remove 3 caption-as-alt blocks in latex.lua, and add fig-alt to alt conversion in pandoc3_figure.lua for Pandoc 3 Figures without cross-ref labels.

Typst: mark figure images with _quarto_no_caption_alt so that the caption-as-alt fallback in typst.lua only fires for inline images (where image.caption IS the standard markdown alt text).

Key insight: In Pandoc 3, {alt="text"} replaces the Image's caption content rather than populating image.attributes["alt"]. So image.caption serves double duty as both visible caption and alt text override. We distinguish the two cases by comparing image.caption to figure.caption: when they match, the caption was NOT overridden (the bug case we suppress); when they differ, an explicit {alt="..."} was provided (which we preserve). This is the same heuristic Pandoc's own Markdown writer uses when round-tripping Figures.

Explicit fig-alt (Quarto's dedicated attribute) flows through a completely separate path and always works unambiguously.

Fixes #14107

- Fix regex in parse-error.ts to handle tagpdf line continuations when filenames are long (the warning wraps across multiple (tagpdf) lines)
- Update warning message to recommend fig-alt instead of caption-as-alt
- Add printsMessage check to caption-not-alt-ua test to verify the missing alt text warning is surfaced
- Use labeled figure in caption-not-alt-ua test to avoid unrelated UA-2 structural nesting issue with unlabeled figures
- Add ua2-unlabeled-figure-caption test documenting known LaTeX/tagpdf limitation where unlabeled captioned figures produce <Caption> directly under <Document> instead of inside a grouping element

gordonwoodhull · 2026-03-02T18:58:13Z

Second commit: Fix tagpdf warning parsing and document UA-2 structural issue

The second commit (32d1172ab) addresses two things discovered while testing:

1. tagpdf warning regex fix (`parse-error.ts`)

The regex that parses tagpdf's "Alternative text for graphic is missing" warning assumed that the filename and the trailing "instead." appeared on a single (tagpdf) continuation line. When filenames are long, tagpdf wraps the message across multiple lines:

Package tagpdf Warning: Alternative text for graphic is missing.
(tagpdf)                Using 'caption-not-alt-ua_files/mediabag/penrose.pdf'
(tagpdf)                instead.

Fixed the regex to allow an optional (tagpdf) line break before the trailing "instead." token. Also updated the warning message to recommend fig-alt instead of the old ![alt text](image.png) advice (which was itself the caption-as-alt anti-pattern).
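A pattern that tolerates the wrapped form might look like the sketch below. It is not the regex from parse-error.ts; the constant and function names are hypothetical, and the pattern is written only against the log excerpt shown above plus the single-line form the old regex assumed.

```typescript
// Matches the tagpdf missing-alt warning and captures the graphic's filename.
// The optional group allows a line break plus a "(tagpdf)" continuation
// prefix between the quoted filename and the trailing "instead.".
const MISSING_ALT_WARNING =
  /Package tagpdf Warning: Alternative text for graphic is missing\.\s*\r?\n\(tagpdf\)\s+Using '([^']+)'(?:\s*\r?\n\(tagpdf\))?\s+instead\./;

// Returns the filename named in the warning, or undefined if the log
// contains no such warning. (Hypothetical helper, for illustration only.)
function findGraphicWithoutAlt(log: string): string | undefined {
  const match = MISSING_ALT_WARNING.exec(log);
  return match?.[1];
}

// Example input: the wrapped form quoted in the PR description.
const wrappedLog = [
  "Package tagpdf Warning: Alternative text for graphic is missing.",
  "(tagpdf)                Using 'caption-not-alt-ua_files/mediabag/penrose.pdf'",
  "(tagpdf)                instead.",
].join("\n");

console.log(findGraphicWithoutAlt(wrappedLog));
```

The sketch does not handle a filename that is itself split across lines; whether tagpdf ever does that is not stated in this PR.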

2. UA-2 structural issue with unlabeled captioned figures (`ua2-unlabeled-figure-caption.qmd`)

Discovered that unlabeled captioned figures (![caption](img.svg) without {#fig-label}) produce invalid UA-2 structure. These go through pandoc3_figure.lua → latexImageFigure() → bare \begin{figure}[H], and tagpdf places <Caption> as a sibling of <Figure> directly under <Document>:

/Document
  /Caption    ← UA-2 violation: must be inside a grouping element
  /Figure     ← orphaned from its caption

Labeled figures ({#fig-label}) go through FloatRefTarget, which wraps in a \Div that provides the grouping context:

/Document
  /Div
    /Caption  ← properly nested ✓
    /Figure

This is a pre-existing LaTeX/tagpdf limitation, not caused by our changes. Added ua2-unlabeled-figure-caption.qmd to document it with the expected verapdf validation warning, and updated caption-not-alt-ua.qmd to use a labeled figure to avoid conflating the two issues.
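The structural difference between the two trees can be expressed as a small check. The sketch below is a toy TypeScript model of a PDF structure tree, not what veraPDF does: the `StructNode` type and `findUngroupedCaptions` function are hypothetical, and the rule is simplified to "a Caption whose direct parent is Document".

```typescript
// Minimal stand-in for a tagged-PDF structure element.
interface StructNode {
  tag: string;
  children?: StructNode[];
}

// Collects the paths of Caption elements that sit directly under Document,
// which is the nesting problem described above (simplified rule).
function findUngroupedCaptions(root: StructNode): string[] {
  const problems: string[] = [];
  const visit = (node: StructNode, path: string): void => {
    for (const child of node.children ?? []) {
      const childPath = `${path}/${child.tag}`;
      if (child.tag === "Caption" && node.tag === "Document") {
        problems.push(childPath);
      }
      visit(child, childPath);
    }
  };
  visit(root, `/${root.tag}`);
  return problems;
}

// The two trees from the description: unlabeled vs. labeled figure.
const unlabeled: StructNode = {
  tag: "Document",
  children: [{ tag: "Caption" }, { tag: "Figure" }],
};
const labeled: StructNode = {
  tag: "Document",
  children: [{ tag: "Div", children: [{ tag: "Caption" }, { tag: "Figure" }] }],
};

console.log(findUngroupedCaptions(unlabeled)); // one path reported
console.log(findUngroupedCaptions(labeled));   // empty list
```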

posit-snyk-bot · 2026-03-02T19:00:14Z

✅ Snyk checks have passed. No issues have been found so far.

Status	Scanner	Critical	High	Medium	Low	Total (0)
✅	Open Source Security	0	0	0	0	0 issues
✅	Licenses	0	0	0	0	0 issues


gordonwoodhull added 2 commits March 2, 2026 13:18

gordonwoodhull added this to the v1.9 milestone Mar 2, 2026

gordonwoodhull merged commit 4127c26 into main Mar 3, 2026
89 of 93 checks passed

gordonwoodhull deleted the bugfix/14107 branch March 3, 2026 03:49
