include shortcodes + source-info infrastructure #135

Merged
gordonwoodhull merged 9 commits into main from feature/include-shortcode
Apr 24, 2026

@gordonwoodhull gordonwoodhull commented Apr 24, 2026

@cscheid, here are include shortcodes. I think this is ready to merge.

PR also includes the current version of the plans for TS Engine Extension (subject to further review).

Summary

  • Adds IncludeExpansionStage so block-level {{< include file.qmd >}} expands at the AST level, before engine execution. Included files' body blocks are spliced into the parent AST; frontmatter is stripped; recursion is supported; missing and circular includes produce diagnostics.
  • Lays down the source-provenance infrastructure the include stage (and later engine work) needs: uniform Block::source_info() / Inline::source_info() accessors, a QMD writer that returns a byte-range-to-AST-node SourceInfo::Concat, and SourceInfo + source_context fields on ExecutionContext.
  • Ships the design plans that motivate the infrastructure. Only Plan 0 is implemented here; the rest are spec-only.

What's in this PR

User-visible

{{< include file.qmd >}} as the sole content of a paragraph now expands: the referenced file is parsed, its YAML frontmatter is stripped, and its body blocks are spliced into the parent at that position. This matches Quarto 1 behavior:

  • Recursive includes work; circular includes emit a diagnostic instead of looping.
  • Missing files emit a diagnostic instead of panicking.
  • Inline {{< include … >}} (not sole paragraph content) is left to ShortcodeResolveTransform as before.
  • Included files register in DocumentAst's ast_context.source_context (for map_offset) and in the top-level source_context (for ariadne error snippets), so cross-references from included files are indexed and errors resolve to the correct file.
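The expansion rules above can be sketched as a small recursive function. This is a minimal model, not the code in IncludeExpansionStage: the `Block` enum, the `expand` function, and the in-memory `files` map are hypothetical stand-ins, and real parsing, frontmatter stripping, and FileId registration are assumed to have happened already. Only the two diagnostic messages are taken from this PR.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an AST block (hypothetical, not pampa's type).
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    /// Ordinary content, tagged with the file it came from.
    Para { text: String, file: String },
    /// A paragraph whose sole content is an include shortcode.
    Include { target: String },
}

/// Expand block-level includes in `file`, splicing child blocks in place.
/// `files` maps a path to its already-parsed, frontmatter-free body blocks.
/// `stack` holds the chain of files currently being expanded.
pub fn expand(
    file: &str,
    files: &HashMap<String, Vec<Block>>,
    stack: &mut Vec<String>,
    diagnostics: &mut Vec<String>,
) -> Vec<Block> {
    let mut out = Vec::new();
    let Some(blocks) = files.get(file) else {
        // Missing file: report it and keep going instead of panicking.
        diagnostics.push(format!("Include file not found: {file}"));
        return out;
    };
    stack.push(file.to_string());
    for block in blocks {
        match block {
            // The target is already on the include chain: report, do not recurse.
            Block::Include { target } if stack.iter().any(|f| f == target) => {
                diagnostics.push(format!("Circular include: {target}"));
            }
            // Otherwise splice the child's expanded blocks at this position.
            Block::Include { target } => {
                out.extend(expand(target, files, stack, diagnostics));
            }
            other => out.push(other.clone()),
        }
    }
    stack.pop();
    out
}
```

Tracking the chain of open files, rather than a set of every file seen so far, lets the same child be included twice from different places while still catching a.qmd → b.qmd → a.qmd.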

Infrastructure

  • Uniform source_info() accessors on Block and Inline. Restructures Inline::Attr(Attr, AttrSourceInfo) → Inline::Attr(InlineAttr) so every variant has a source_info field. Removes four scattered copies of match-every-variant helper functions across pampa.
  • write_with_source_info on the QMD writer: serializes an AST to bytes and returns a SourceInfo::Concat tiling the output — YAML as one piece, each top-level block (including its preceding blank-line separator) as one piece. The existing write() is unchanged; ~19 other callsites are unaffected.
  • source_info + Arc<SourceContext> on ExecutionContext. EngineExecutionStage now calls write_with_source_info and threads both through. Engines can map error positions across the serialize-and-ship boundary. Backward compatible via Default; no engine currently reads these fields.
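To illustrate how an engine could map an error position across the serialize-and-ship boundary, here is a rough sketch of offset resolution over a tiled list of pieces. The `Piece` struct and this `map_offset` signature are assumptions made for illustration; the real SourceInfo::Concat and map_offset in the codebase may be shaped differently. The sketch also assumes each piece maps linearly onto its origin, which only holds where the writer reproduces the source text byte for byte.

```rust
/// Hypothetical, simplified model of one entry in a concatenated source map.
#[derive(Debug, Clone, PartialEq)]
pub struct Piece {
    /// Number of bytes this piece occupies in the serialized output.
    pub len: usize,
    /// File the piece originated from.
    pub file_id: usize,
    /// Byte offset of the piece's start within that file.
    pub start: usize,
}

/// Map an offset in the serialized output back to (file_id, offset in file).
/// Returns None if the offset lies past the end of the tiled output.
pub fn map_offset(pieces: &[Piece], offset: usize) -> Option<(usize, usize)> {
    let mut base = 0; // output offset at which the current piece begins
    for piece in pieces {
        if offset < base + piece.len {
            return Some((piece.file_id, piece.start + (offset - base)));
        }
        base += piece.len;
    }
    None
}
```

Because the pieces tile the output with no gaps, every in-range offset lands in exactly one piece, so a linear scan (or a binary search over prefix sums) is enough.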

Plans (spec only)

~3,900 lines of Markdown in claude-notes/plans/:

  • 2026-04-18-plan0-include-expansion-and-source-info.md — the plan this PR implements.
  • 2026-04-16-ts-engine-extensions-subprocess.md plus Plans 1a / 1b / 1c / 2 / 3 / 4 — the TS-engine-extensions initiative that the source-info plumbing is a prerequisite for. No code in this PR; those plans land in follow-up PRs.
  • 2026-04-23-ipynb-filters-and-engine-partitioning.md — side research note from Plan 2/3 design.

Test plan

  • cargo nextest run --workspace passes (verified locally: 3650/3650 pampa, 1008/1008 quarto-core)
  • cargo xtask verify passes (full Rust + hub-client + WASM build + tests)
  • Review Plan 0 and confirm it matches what landed
  • End-to-end: a QMD file with {{< include child.qmd >}} as a full paragraph — confirm child.qmd's blocks appear inline in rendered output
  • Circular include (a.qmd → b.qmd → a.qmd): confirm a diagnostic rather than a hang
  • Missing-file include: confirm a diagnostic rather than a panic
  • Inline {{< include >}} inside a larger paragraph is left unexpanded
  • Included file's code cells execute under the correct engine (included cells should be visible to EngineExecutionStage)
  • Cross-reference defined in an included file resolves from the parent

Plans for implementing TypeScript engine extensions in q2 via a Deno
subprocess architecture, validated against the Julia engine from Quarto 1.

- Plan 0: Pre-engine include expansion + SourceInfo in ExecutionContext
- Plan 1a: JSON protocol types, subprocess management, TsEngine struct
- Plan 1b: Extension integration, 4-phase detection, echo engine test
- Plan 2: @quarto/markdown package + QuartoAPI assembly
- Plan 3: @quarto/jupyter package (notebook → markdown conversion)
- Plan 4: End-to-end Julia engine validation

Key design decisions documented through discussion:
- Include shortcodes resolved at AST level before engine execution
- QMD writer produces SourceInfo (Concat) mapping serialized text to AST nodes
- SourceInfo serialized as byte-range pieces in protocol; harness reconstructs
  MappedString with .map() provenance (same concept, different implementations)
- Engine's target()/partitionedMarkdown() not in protocol (q2 owns pipeline)
- markdownForFile stays for non-QMD files (percent scripts, engine-specific)
- claimsFile runs before ParseDocument; QMD files skip to parse directly
- Backward compatible: engines written for Quarto 1 work without modification

Also updates extensions grand plan Phase 8 as superseded by this work.
@gordonwoodhull gordonwoodhull changed the title from "Expand block-level {{< include >}} shortcodes + source-info infrastructure" to "include shortcodes + source-info infrastructure" Apr 24, 2026
…cate

Prerequisite for include shortcode expansion and engine source provenance.

Restructure Inline::Attr from tuple variant Attr(Attr, AttrSourceInfo)
to wrapper struct Attr(InlineAttr) with a precomputed source_info field
(from AttrSourceInfo::combined()). This gives every Inline variant a
source_info field, enabling a uniform API.

Add Block::source_info() -> &SourceInfo and Inline::source_info() ->
&SourceInfo methods that match on all variants. Replace 4 duplicate
get_block_source_info/block_source_info free functions in pampa
(treesitter.rs, postprocess.rs, diagnostics.rs, incremental.rs), plus
test-file copies and get_inline_source_info, with calls to the new
methods.
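The shape of this restructuring can be sketched with a two-variant enum. All types below are simplified, hypothetical stand-ins (the real Inline has many more variants, and the real InlineAttr derives its source_info from AttrSourceInfo::combined()); the point is only that a wrapper struct lets one exhaustive match serve every caller.

```rust
/// Hypothetical stand-in for the real SourceInfo; only the shape matters.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct SourceInfo {
    pub file_id: usize,
    pub start: usize,
    pub end: usize,
}

#[derive(Debug, Clone, Default)]
pub struct Attr {
    pub id: String,
}

/// Wrapper struct replacing a tuple variant, so this variant owns a
/// precomputed `source_info` like every other variant.
#[derive(Debug, Clone)]
pub struct InlineAttr {
    pub attr: Attr,
    pub source_info: SourceInfo,
}

#[derive(Debug, Clone)]
pub struct Str {
    pub text: String,
    pub source_info: SourceInfo,
}

#[derive(Debug, Clone)]
pub enum Inline {
    Str(Str),
    Attr(InlineAttr),
}

impl Inline {
    /// One exhaustive match: adding a variant becomes a compile error here
    /// instead of a silent gap in one of several scattered helpers.
    pub fn source_info(&self) -> &SourceInfo {
        match self {
            Inline::Str(s) => &s.source_info,
            Inline::Attr(a) => &a.source_info,
        }
    }
}
```

Returning a reference keeps the accessor allocation-free, which matters when it is called once per node during a tree walk.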
The QMD writer now has an API that serializes a Pandoc AST to bytes and
returns a SourceInfo::Concat mapping byte ranges in the output to the
source_info of the AST nodes that produced them. This enables engines
(and error reporting) to map positions in the serialized text back to
original source files — including through include expansion boundaries.

The pieces tile the entire output with no gaps: YAML frontmatter is one
piece, each top-level block (including its preceding blank-line
separator) is one piece. The existing write() function is unchanged;
all ~19 other callsites are unaffected.

6 new tests verify: piece lengths sum to buffer length, frontmatter +
blocks are tracked, multi-file ASTs map to correct FileIds, map_offset
resolves within a single file, empty documents work, and round-trip
parse-serialize-map_offset resolves code block offsets accurately.
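The tiling invariant can be shown with a toy writer. This is a sketch under simplifying assumptions: blocks arrive as pre-rendered strings, and `write_with_pieces` and `Piece` are hypothetical names, not the real write_with_source_info API. It records one piece for the frontmatter and one per top-level block, with each blank-line separator counted as part of the block that follows it.

```rust
/// Hypothetical piece: a byte length in the output plus the index of the
/// top-level block that produced it (None for the YAML frontmatter).
#[derive(Debug, PartialEq)]
pub struct Piece {
    pub len: usize,
    pub block_index: Option<usize>,
}

/// Serialize optional frontmatter and pre-rendered blocks, recording one
/// piece per unit so that the pieces tile the output with no gaps.
pub fn write_with_pieces(yaml: Option<&str>, blocks: &[&str]) -> (String, Vec<Piece>) {
    let mut out = String::new();
    let mut pieces = Vec::new();
    if let Some(yaml) = yaml {
        let start = out.len();
        out.push_str("---\n");
        out.push_str(yaml);
        out.push_str("---\n");
        pieces.push(Piece { len: out.len() - start, block_index: None });
    }
    for (i, block) in blocks.iter().enumerate() {
        let start = out.len();
        // The blank-line separator belongs to the block that follows it.
        if !out.is_empty() {
            out.push('\n');
        }
        out.push_str(block);
        out.push('\n');
        pieces.push(Piece { len: out.len() - start, block_index: Some(i) });
    }
    (out, pieces)
}
```

Measuring each piece as the difference in buffer length before and after writing it makes the "lengths sum to buffer length" property hold by construction, including for an empty document.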
Add source_info: SourceInfo and source_context: Arc<SourceContext> to
ExecutionContext so engines can map error positions back to original
source files. Defaults preserve backward compatibility — existing
engines are unaffected.

Update serialize_ast_to_qmd to use write_with_source_info, returning
(String, SourceInfo). EngineExecutionStage::run() clones the
DocumentAst source_context into an Arc (one-time clone; the context is
finalized after include expansion) and passes both to ExecutionContext.

No engine uses these fields yet — they provide the infrastructure for
include-aware error reporting and for the TS engine extension protocol
(which will serialize source map entries for the engine host).
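A minimal sketch of how Default keeps older call sites compiling when fields are added. The struct bodies here are hypothetical placeholders (the real ExecutionContext, SourceInfo, and SourceContext carry more than this), and `legacy_context` is an illustrative helper, not a function in the codebase.

```rust
use std::sync::Arc;

/// Hypothetical placeholder: byte-range pieces as (length, file id) pairs.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SourceInfo {
    pub pieces: Vec<(usize, usize)>,
}

/// Hypothetical placeholder: the files known to the document.
#[derive(Debug, Default, Clone)]
pub struct SourceContext {
    pub files: Vec<String>,
}

#[derive(Debug, Default, Clone)]
pub struct ExecutionContext {
    /// Stand-in for the fields that existed before this change.
    pub document_path: String,
    /// New: maps serialized text back to AST nodes.
    pub source_info: SourceInfo,
    /// New: shared, read-only view of the finalized source context.
    pub source_context: Arc<SourceContext>,
}

/// A call site that predates the new fields keeps compiling as long as it
/// fills the remaining fields from Default.
pub fn legacy_context(path: &str) -> ExecutionContext {
    ExecutionContext {
        document_path: path.to_string(),
        ..Default::default()
    }
}
```

Wrapping the context in an Arc means the one-time clone is paid once per document; handing it to further consumers only bumps a reference count.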

6 new tests: ExecutionContext field access, serialization produces
Concat, map_offset resolves within a single file, start/end offsets
resolve.

New IncludeExpansionStage resolves block-level {{< include file.qmd >}}
shortcodes by parsing included files and splicing their AST blocks into
the main document. This runs after metadata merge and before pre-engine
sugaring, so included code cells are visible to engines and included
cross-references are indexed.

Key behaviors matching Quarto 1:
- Only block-level includes expanded (shortcode must be sole paragraph
  content); inline includes are left for ShortcodeResolveTransform
- Included files' YAML frontmatter is stripped (only body blocks spliced)
- Recursive includes supported; circular includes produce a diagnostic
- Missing files produce a diagnostic (not a panic)

Included files are registered in both SourceContexts on DocumentAst
(ast_context.source_context for map_offset resolution, top-level
source_context for ariadne error snippets) with consistent FileIds.
The parse-then-remap pattern (same as EngineExecutionStage) ensures
AST nodes from included files carry SourceInfo pointing to the correct
file.

13 new tests: extract_include_path variants (5), simple include,
recursive include, circular include, missing file, frontmatter
stripping, correct FileId on included blocks, inline include not
expanded, include with code cell.
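The block-level versus inline distinction can be illustrated on plain text. The name extract_include_path comes from the test list above, but its real signature is not shown in this PR and it presumably inspects AST nodes rather than strings; the version below is a hypothetical string-based approximation that ignores quoted paths and paths containing spaces.

```rust
/// Return the include target if `paragraph` consists solely of an
/// `{{< include PATH >}}` shortcode, otherwise None.
/// String-based approximation; an AST-level check would differ.
pub fn extract_include_path(paragraph: &str) -> Option<&str> {
    let inner = paragraph
        .trim()
        .strip_prefix("{{<")?
        .strip_suffix(">}}")?;
    // Another opener or closer inside means the paragraph holds more than
    // one shortcode, so it is not a sole block-level include.
    if inner.contains("{{<") || inner.contains(">}}") {
        return None;
    }
    let mut words = inner.split_whitespace();
    if words.next()? != "include" {
        return None;
    }
    let path = words.next()?;
    // Anything after the path means this is not a plain include.
    if words.next().is_some() {
        return None;
    }
    Some(path)
}
```

A paragraph such as `See {{< include child.qmd >}} here` yields None, which corresponds to the inline case left for ShortcodeResolveTransform.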
Consolidate @quarto/markdown and @quarto/jupyter into a single
@quarto/api package with subpath exports (text/, markdown/, jupyter/,
format/, path/, system/, console/, crypto/). Two QuartoAPI namespaces
(quarto.text and quarto.mappedString) both pull from @quarto/api/text,
matching Q1's typing.

Introduce a PlatformHost interface so @quarto/api contains no Deno or
Node globals; all I/O goes through a host plugged in by the consumer.
This lets the same package run under @quarto/engine-host-deno today
and a future @quarto/engine-host-wasm (browser/VFS) without changes.

Rename @quarto/engine-host → @quarto/engine-host-deno to reserve room
for the WASM sibling. Drop Q1's QuartoAPIRegistry/register.ts bootstrap;
port the implementations and assemble them via direct construction in
engine-host.

Also split the original Plan 1a (929 lines) into three sub-plans along
Rust/Deno/integration seams:
- Plan 1a (Rust core, 814 lines): protocol types, subprocess management,
  ExecutionEngine trait extensions, TsEngine struct
- Plan 1b (new, 284 lines): @quarto/engine-host-deno package — esbuild
  bundle, host.ts dispatch loop, deno-host.ts PlatformHost impl,
  mapped-source.ts rehydration, quarto-api.ts stub, engine-loader.ts
- Plan 1c (renamed from old 1b, 333 lines): _extension.yml engine
  parsing, deno bundle build step, registry migration to StageContext,
  4-phase detection rewrite, echo engine E2E test

Dependency chain: 1a → 1b → 1c, which Plans 2/3/4 all pick up from.

Add 2026-04-23-ipynb-filters-and-engine-partitioning.md research note
capturing protocol refinements found while spec'ing Plans 2/3 (notably
that partitionedMarkdown needs to be in the Rust ExecutionEngine trait
so Jupyter's ipynb-filter path works).

Five fixtures in crates/quarto/tests/smoke-all/includes/ exercise the
include shortcode end-to-end through the real render pipeline:

- basic/      child content appears in parent output
- crossref/   @fig-ref in parent resolves to a definition in the included
              file (validates IncludeExpansion running before
              PreEngineSugaring)
- code-cell/  code block from the included file survives to
              EngineExecution (execute.eval: false so no kernel needed)
- circular/   circular include produces a "Circular include" WARN
              diagnostic; render continues
- missing/    missing file produces an "Include file not found" WARN
              diagnostic; render continues

Following Q1's fixture idiom (tests/smoke/smoke-all.test.ts), the
circular and missing fixtures pair printsMessage with an explicit
noErrors: default. printsMessage alone does NOT suppress the default
noErrorsOrWarnings assertion — match Q1 semantics exactly.

Also downgrades the "Circular include" and "Include file not found"
diagnostics from ERROR to WARNING in IncludeExpansionStage, since the
pipeline continues rendering after emitting them.
…e hints

Expand Plan 1a's protocol and TsEngine trait surface to cover the full
Quarto 1 ExecutionEngine interface, and add a static-hints mechanism
so q2 can skip launching the Deno subprocess when an engine is clearly
irrelevant:

- Extension authors can declare `languages` and `file-extensions` in
  _extension.yml engine entries (e.g. a project with only {python}
  blocks doesn't need the Julia engine).
- If hints are omitted, q2 falls back to always consulting the engine
  dynamically — safe but slower.
- The dynamic claimsLanguage / claimsFile remain the precise check;
  hints are an optimization, not a replacement.
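One way the hint check could look, as a sketch only. `EngineHints`, `Decision`, and `decide` are hypothetical names, and the rule for combining the two hint kinds (consult the engine if either declared hint matches) is an assumption; the plan text above does not specify it.

```rust
/// Hypothetical in-memory form of the optional `languages` and
/// `file-extensions` hints from an `_extension.yml` engine entry.
#[derive(Debug, Default, Clone)]
pub struct EngineHints {
    pub languages: Option<Vec<String>>,
    pub file_extensions: Option<Vec<String>>,
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    /// The hints rule the engine out; no subprocess is needed for it.
    Skip,
    /// Fall through to the dynamic claimsLanguage / claimsFile check.
    AskEngine,
}

/// Decide whether the engine is worth consulting for a document.
/// Assumption: a match on either declared hint is enough to ask.
pub fn decide(hints: &EngineHints, doc_languages: &[&str], doc_extension: &str) -> Decision {
    // No hints at all: safe but slower, always ask the engine.
    if hints.languages.is_none() && hints.file_extensions.is_none() {
        return Decision::AskEngine;
    }
    let lang_match = hints.languages.as_ref().is_some_and(|langs| {
        doc_languages
            .iter()
            .any(|l| langs.iter().any(|h| h.as_str() == *l))
    });
    let ext_match = hints
        .file_extensions
        .as_ref()
        .is_some_and(|exts| exts.iter().any(|e| e.as_str() == doc_extension));
    if lang_match || ext_match {
        Decision::AskEngine
    } else {
        Decision::Skip
    }
}
```

The function never returns a positive claim: a hint match only means the dynamic check is still worth running, which keeps the hints an optimization and not a replacement.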

Additional refinements to the shared-subprocess routing model and the
ipynb-filters / engine-partitioning research note.

Step 4 was previously conditional on changes to quarto-core or
quarto-pandoc-types, and motivated as "ensure hub-client/WASM builds
work" — which missed that xtask is also what enforces CI's `-D warnings`
strictness. A Rust-only PR whose plain `cargo build` / `cargo nextest`
pass can still fail CI on an unused-import or dead-code warning, which
happened with make_paragraph in the include-expansion stage.

New wording requires xtask verify on every push, with --skip-hub-build
as the fast path for Rust-only changes, and names the warnings reason
so the step's value is discoverable.
@gordonwoodhull gordonwoodhull force-pushed the feature/include-shortcode branch from 64d743d to 0ab7ae7 Compare April 24, 2026 18:13
@gordonwoodhull gordonwoodhull merged commit 349148a into main Apr 24, 2026
4 checks passed
@gordonwoodhull gordonwoodhull deleted the feature/include-shortcode branch April 24, 2026 19:18