Fix display math round-trip inside blockquotes / lists (issue #181) #188

Merged
cscheid merged 2 commits into main from bugfix/issue-181
May 12, 2026

@cscheid cscheid commented May 12, 2026

Summary

Fixes #181. Display math `$$ ... $$` inside a blockquote retained the `> ` continuation prefix verbatim in `Math.text`. The cause: `pandoc_display_math` in the tree-sitter grammar matches its body as a single regex token between the `$$` delimiters, so it never goes through the block-continuation machinery. The qmd writer then correctly re-prefixed every line on output, which doubled the `> ` and added another level on each round trip.
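To make the compounding concrete, here is a minimal sketch of the failure mode. Both functions are hypothetical stand-ins written for illustration (they are not the pampa writer or extractor): one takes everything between the `$$` delimiters verbatim, the other prefixes every output line with `> `.

```rust
/// Hypothetical stand-in for the pre-fix extractor: returns everything
/// between the first and last `$$` verbatim, continuation prefixes included.
fn extract_math_verbatim(qmd: &str) -> &str {
    let open = qmd.find("$$").expect("opening $$") + 2;
    let close = qmd.rfind("$$").expect("closing $$");
    &qmd[open..close]
}

/// Hypothetical stand-in for the writer: wraps the math text in `$$` and
/// prefixes every resulting line with "> ", as a blockquote writer would.
fn write_blockquote_math(math_text: &str) -> String {
    let block = format!("$${}$$", math_text);
    block
        .split('\n')
        .map(|line| format!("> {}", line))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Feeding `"> $$\n> p = q\n> $$"` through `extract_math_verbatim` and then `write_blockquote_math` yields interior lines that start with `> > `, and each further pass adds one more level.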

The fix is in the AST extractor (`crates/pampa/src/pandoc/treesitter.rs`):

  • The opening `$$` sits at column `C = node.start_position().column`. On every interior line of the math, the bytes at columns `0..C` are the accumulated continuation prefix added by all enclosing blocks (any combination of blockquotes, list items, fenced divs, etc.).
  • Strip those bytes column-wise, but only if every one of them is in `{>, space, tab}`. Lines that don't fit that pattern (lazy continuation, where the user wrote no explicit `> `) are left alone rather than having real content chewed off.

Because the column already encodes the cumulative prefix width, this handles arbitrary mixed nesting (`> - $$`, `- > $$`, `> - > $$`, `> > $$`, divs around either, etc.) without enumerating block types or walking ancestors per block kind.
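The column-wise stripping described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `treesitter.rs`: the function name is hypothetical, `start_col` stands for the byte column of the opening `$$`, and lines shorter than `start_col` are simply left untouched.

```rust
/// Hypothetical helper: remove the block-continuation prefix from the
/// interior lines of a display-math body. `start_col` is the byte column
/// of the opening `$$` (an assumption of this sketch).
fn strip_continuation_prefix(body: &str, start_col: usize) -> String {
    let mut out = String::with_capacity(body.len());
    for (i, line) in body.split('\n').enumerate() {
        if i == 0 {
            // Text on the same line as the opening `$$` carries no prefix.
            out.push_str(line);
            continue;
        }
        out.push('\n');
        let bytes = line.as_bytes();
        // Strip only when all of columns 0..start_col are '>', space, or tab.
        let strippable = bytes.len() >= start_col
            && bytes[..start_col]
                .iter()
                .all(|b| matches!(b, b'>' | b' ' | b'\t'));
        if strippable {
            // Safe slice: every byte before start_col is ASCII.
            out.push_str(&line[start_col..]);
        } else {
            // Lazy continuation or real content: leave the line alone.
            out.push_str(line);
        }
    }
    out
}
```

With `start_col = 2`, the body `"\n> p = q\n> "` becomes `"\np = q\n"`, while a lazily continued body `"\np = q\n"` passes through unchanged because `p ` is not a valid prefix.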

I first attempted a structural grammar fix: making `pandoc_display_math` line-structured like `pandoc_code_block`, so that each interior line goes through `_soft_line_break` (which consumes `block_continuation`). That made existing tests ("Display math with list markers should remain a single paragraph", "Display math inside fenced div should parse correctly") produce ERROR nodes, because `pandoc_display_math` is an inline element living inside `_inlines`, and crossing soft line breaks at the inline level conflicts with `_inlines`'s own line structure. The AST-extraction approach was the smaller, safer fix.

Triage doc and minimal repros live at `claude-notes/issue-reports/181/` for context.

Test plan

Regression fixtures added under `crates/pampa/tests/roundtrip_tests/qmd-json-qmd/`:

  • `display_math_in_blockquote.qmd`: the reporter's exact input
  • `display_math_in_nested_blockquote.qmd`: `> > $$ ... $$`
  • `display_math_in_list_in_blockquote.qmd`: `> - $$ ... $$`
  • `display_math_in_blockquote_in_list.qmd`: `- > $$ ... $$`
  • `display_math_in_bq_list_bq.qmd`: `> - > $$ ... $$`

All five previously diverged on qmd → JSON → qmd → JSON and now round-trip cleanly via `test_qmd_roundtrip_consistency`.
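The property these fixtures check can be sketched generically: parse, write, parse again, and compare the two parse results. The function below is a hypothetical illustration of that shape only; it is not the implementation of `test_qmd_roundtrip_consistency`, and `parse` and `write` are placeholder closures supplied by the caller.

```rust
/// Hypothetical round-trip check: the AST obtained from the input must equal
/// the AST obtained after writing that AST back out and re-parsing it.
fn roundtrip_is_stable<A: PartialEq>(
    input: &str,
    parse: impl Fn(&str) -> A,
    write: impl Fn(&A) -> String,
) -> bool {
    let first = parse(input);
    let rewritten = write(&first);
    let second = parse(&rewritten);
    first == second
}
```

A parser that leaves `> ` inside its output while the writer adds it again fails this check on the first pass, which is the divergence the fixtures used to show.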

  • `cargo nextest run -p pampa`: 3685 passed, 2 skipped
  • `cargo xtask verify --skip-hub-tests`: full Rust workspace + WASM hub-client build + trace-viewer tests pass
  • End-to-end check on the reporter's repro through `cargo run --bin pampa`: AST is clean (`Math DisplayMath "\np = q\n"`), round-tripped qmd is correctly `> `-prefixed, and re-parsing yields the same AST as the original (idempotent)
  • Hub-client vitest run not exercised: there is a pre-existing `ERR_MODULE_NOT_FOUND` failure on main HEAD unrelated to this change

Tracks beads bd-q6ed.

🤖 Generated with Claude Code

cscheid and others added 2 commits May 12, 2026 12:34
…-q6ed)

Parser-side defect: `pandoc_display_math` grammar matches its body as a
single regex between `$$` delimiters and so never consumes
`block_continuation` markers, leaving the literal `> ` prefix bytes
inside `Math.text`. The qmd writer is correct; the grammar needs to be
line-structured the way `pandoc_code_block` already is.

See claude-notes/issue-reports/181/triage.md for full evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (bd-q6ed)

The `pandoc_display_math` grammar rule matches its body as a single regex
between `$$` delimiters and so never goes through the block-continuation
machinery. When the math sits inside a blockquote (or any combination of
blockquotes, list items, etc.), the `> ` and indentation bytes those
enclosing blocks would normally consume end up captured verbatim in
`Math.text`. The qmd writer then re-prefixes every line on output, so
each round trip adds another `> ` / indent level (issue #181).

Fix it in the AST extractor: the opening `$$` sits at some column C, so
on every interior line of the math, bytes at columns 0..C are the
accumulated continuation prefix added by all enclosing blocks. Strip
those bytes column-wise — but only if every one is in `{>, space, tab}`,
so lazy-continuation lines (no explicit `> `) aren't chewed.

This handles arbitrary mixed nesting (`> - $$`, `- > $$`, `> - > $$`,
`> > $$`, divs around either, etc.) without enumerating block types,
because the column already encodes the cumulative prefix width.

Regression fixtures in `crates/pampa/tests/roundtrip_tests/qmd-json-qmd/`:
- display_math_in_blockquote.qmd (reporter's exact input)
- display_math_in_nested_blockquote.qmd
- display_math_in_list_in_blockquote.qmd
- display_math_in_blockquote_in_list.qmd
- display_math_in_bq_list_bq.qmd

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cscheid cscheid merged commit 1cbf79a into main May 12, 2026
4 checks passed
@cscheid cscheid deleted the bugfix/issue-181 branch May 12, 2026 18:28

Development

Successfully merging this pull request may close these issues.

Display math inside a blockquote has its > prefix doubled on round trip

1 participant