include shortcodes + source-info infrastructure #135
Merged
gordonwoodhull merged 9 commits into main on Apr 24, 2026
Plans for implementing TypeScript engine extensions in q2 via a Deno
subprocess architecture, validated against the Julia engine from Quarto 1.
- Plan 0: Pre-engine include expansion + SourceInfo in ExecutionContext
- Plan 1a: JSON protocol types, subprocess management, TsEngine struct
- Plan 1b: Extension integration, 4-phase detection, echo engine test
- Plan 2: @quarto/markdown package + QuartoAPI assembly
- Plan 3: @quarto/jupyter package (notebook → markdown conversion)
- Plan 4: End-to-end Julia engine validation
Key design decisions documented through discussion:
- Include shortcodes resolved at AST level before engine execution
- QMD writer produces SourceInfo (Concat) mapping serialized text to AST nodes
- SourceInfo serialized as byte-range pieces in protocol; harness
reconstructs MappedString with .map() provenance (same concept,
different implementations)
- Engine's target()/partitionedMarkdown() not in protocol (q2 owns pipeline)
- markdownForFile stays for non-QMD files (percent scripts, engine-specific)
- claimsFile runs before ParseDocument; QMD files skip to parse directly
- Backward compatible: engines written for Quarto 1 work without modification
Also updates extensions grand plan Phase 8 as superseded by this work.
…cate
Prerequisite for include shortcode expansion and engine source provenance.
Restructure Inline::Attr from tuple variant Attr(Attr, AttrSourceInfo) to
wrapper struct Attr(InlineAttr) with a precomputed source_info field (from
AttrSourceInfo::combined()). This gives every Inline variant a source_info
field, enabling a uniform API.
Add Block::source_info() -> &SourceInfo and Inline::source_info() ->
&SourceInfo methods that match on all variants. Replace 4 duplicate
get_block_source_info/block_source_info free functions in pampa
(treesitter.rs, postprocess.rs, diagnostics.rs, incremental.rs), plus
test-file copies and get_inline_source_info, with calls to the new methods.
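To illustrate why the wrapper-struct change makes a uniform accessor possible, here is a minimal sketch. The `SourceInfo` fields, the `id` field on `InlineAttr`, and the reduced set of variants are simplified stand-ins chosen for this example, not the real pampa definitions.

```rust
// Simplified stand-ins for illustration only; the real types have many more
// variants and a richer SourceInfo.
#[derive(Debug, Clone, PartialEq)]
pub struct SourceInfo {
    pub file_id: usize,
    pub start: usize,
    pub end: usize,
}

#[derive(Debug)]
pub struct InlineAttr {
    pub id: String, // hypothetical payload field
    // Precomputed once at construction, so the accessor can return a reference.
    pub source_info: SourceInfo,
}

#[derive(Debug)]
pub enum Inline {
    Str { text: String, source_info: SourceInfo },
    Emph { content: Vec<Inline>, source_info: SourceInfo },
    // Wrapper-struct variant: the payload carries its own source_info field.
    Attr(InlineAttr),
}

impl Inline {
    /// A single exhaustive match; adding a variant forces this to be updated.
    pub fn source_info(&self) -> &SourceInfo {
        match self {
            Inline::Str { source_info, .. } => source_info,
            Inline::Emph { source_info, .. } => source_info,
            Inline::Attr(attr) => &attr.source_info,
        }
    }
}
```

With a tuple variant holding a separate `AttrSourceInfo`, the accessor would have to compute a combined value on each call and could not return a plain reference, which is presumably why the combined value is precomputed.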
The QMD writer now has an API that serializes a Pandoc AST to bytes and returns a SourceInfo::Concat mapping byte ranges in the output to the source_info of the AST nodes that produced them. This enables engines (and error reporting) to map positions in the serialized text back to original source files — including through include expansion boundaries. The pieces tile the entire output with no gaps: YAML frontmatter is one piece, each top-level block (including its preceding blank-line separator) is one piece. The existing write() function is unchanged; all ~19 other callsites are unaffected. 6 new tests verify: piece lengths sum to buffer length, frontmatter + blocks are tracked, multi-file ASTs map to correct FileIds, map_offset resolves within a single file, empty documents work, and round-trip parse-serialize-map_offset resolves code block offsets accurately.
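The tiling invariant described above can be sketched as follows. This is an illustration under stated assumptions, not the writer's actual code: `Piece`, `TopLevelBlock`, `write_with_pieces`, and `piece_at` are hypothetical names, blocks are pre-rendered strings, and the lookup only resolves to the containing piece rather than to an exact source offset.

```rust
/// One tile of the serialized output (a simplified stand-in for an entry of
/// SourceInfo::Concat; all field names here are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct Piece {
    pub file_id: usize,      // file the originating AST node came from
    pub source_start: usize, // where that node starts in its source file
    pub len: usize,          // bytes this piece occupies in the output
}

/// Hypothetical pre-rendered top-level block.
pub struct TopLevelBlock {
    pub text: String,
    pub file_id: usize,
    pub source_start: usize,
}

pub fn write_with_pieces(
    frontmatter: Option<&str>,
    blocks: &[TopLevelBlock],
) -> (String, Vec<Piece>) {
    let mut out = String::new();
    let mut pieces = Vec::new();
    if let Some(yaml) = frontmatter {
        // YAML frontmatter is a single piece; this sketch assumes it belongs
        // to the main file (file_id 0) and that `yaml` ends with a newline.
        out.push_str("---\n");
        out.push_str(yaml);
        out.push_str("---\n");
        pieces.push(Piece { file_id: 0, source_start: 0, len: out.len() });
    }
    for block in blocks {
        let before = out.len();
        if before > 0 {
            out.push('\n'); // the blank-line separator belongs to this block's piece
        }
        out.push_str(&block.text);
        out.push('\n');
        pieces.push(Piece {
            file_id: block.file_id,
            source_start: block.source_start,
            len: out.len() - before,
        });
    }
    (out, pieces)
}

/// Find the piece covering an output byte offset, or None past the end.
pub fn piece_at(pieces: &[Piece], offset: usize) -> Option<&Piece> {
    let mut start = 0;
    for piece in pieces {
        if offset < start + piece.len {
            return Some(piece);
        }
        start += piece.len;
    }
    None
}
```

Because every byte written is attributed to exactly one piece, the piece lengths sum to the buffer length, which is the first property the new tests check.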
Add source_info: SourceInfo and source_context: Arc<SourceContext> to ExecutionContext so engines can map error positions back to original source files. Defaults preserve backward compatibility — existing engines are unaffected. Update serialize_ast_to_qmd to use write_with_source_info, returning (String, SourceInfo). EngineExecutionStage::run() clones the DocumentAst source_context into an Arc (one-time clone; the context is finalized after include expansion) and passes both to ExecutionContext. No engine uses these fields yet — they provide the infrastructure for include-aware error reporting and for the TS engine extension protocol (which will serialize source map entries for the engine host). 6 new tests: ExecutionContext field access, serialization produces Concat, map_offset resolves within a single file, start/end offsets resolve.
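A minimal sketch of how defaulted fields can keep existing construction sites compiling, assuming callers build the context with struct-update syntax. The `cwd` field and the contents of `SourceInfo` and `SourceContext` below are placeholders invented for the example.

```rust
use std::sync::Arc;

// Placeholder shapes; the real SourceInfo and SourceContext are richer.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SourceInfo {
    pub pieces: Vec<(usize, usize)>, // hypothetical (file_id, len) pairs
}

#[derive(Debug, Default)]
pub struct SourceContext {
    pub files: Vec<String>,
}

#[derive(Debug, Default)]
pub struct ExecutionContext {
    pub cwd: String, // hypothetical pre-existing field
    // New fields: both have defaults, so callers that do not know about them
    // can keep using `..Default::default()`.
    pub source_info: SourceInfo,
    // Shared, read-only context; cloning the Arc only bumps a reference count.
    pub source_context: Arc<SourceContext>,
}
```

A caller written before the change, such as `ExecutionContext { cwd, ..Default::default() }`, would see an empty `SourceInfo` and an empty context, while the execution stage can fill both in.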
New IncludeExpansionStage resolves block-level {{< include file.qmd >}}
shortcodes by parsing included files and splicing their AST blocks into
the main document. This runs after metadata merge and before pre-engine
sugaring, so included code cells are visible to engines and included
cross-references are indexed.
Key behaviors matching Quarto 1:
- Only block-level includes expanded (shortcode must be sole paragraph
content); inline includes are left for ShortcodeResolveTransform
- Included files' YAML frontmatter is stripped (only body blocks spliced)
- Recursive includes supported; circular includes produce a diagnostic
- Missing files produce a diagnostic (not a panic)
Included files are registered in both SourceContexts on DocumentAst
(ast_context.source_context for map_offset resolution, top-level
source_context for ariadne error snippets) with consistent FileIds.
The parse-then-remap pattern (same as EngineExecutionStage) ensures
AST nodes from included files carry SourceInfo pointing to the correct
file.
13 new tests: extract_include_path variants (5), simple include,
recursive include, circular include, missing file, frontmatter
stripping, correct FileId on included blocks, inline include not
expanded, include with code cell.
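The behaviors listed in this commit can be sketched with a toy expander. Everything here is an assumption made for illustration: files live in an in-memory map, "parsing" is a split on blank lines, `Block` has a single variant, and the helper bodies are not taken from the stage's implementation. Only the name `extract_include_path` and the two diagnostic messages come from this PR.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    /// Paragraph text plus the file it originated from.
    Para { text: String, file: String },
}

/// Returns the target only when `text` is exactly one include shortcode.
pub fn extract_include_path(text: &str) -> Option<&str> {
    let inner = text.trim().strip_prefix("{{<")?.strip_suffix(">}}")?;
    let rest = inner.trim().strip_prefix("include")?;
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    let path = rest.trim();
    // Reject empty paths and anything that still contains whitespace, which
    // also rules out two shortcodes inside one paragraph.
    if path.is_empty() || path.contains(char::is_whitespace) {
        None
    } else {
        Some(path)
    }
}

/// Drops a leading `---` ... `---` YAML block, if present.
fn strip_frontmatter(src: &str) -> &str {
    if let Some(rest) = src.strip_prefix("---\n") {
        if let Some(end) = rest.find("\n---\n") {
            return &rest[end + "\n---\n".len()..];
        }
    }
    src
}

pub fn expand(
    file: &str,
    files: &HashMap<&str, &str>,
    stack: &mut Vec<String>,
    diagnostics: &mut Vec<String>,
) -> Vec<Block> {
    let Some(src) = files.get(file) else {
        diagnostics.push(format!("Include file not found: {file}"));
        return Vec::new();
    };
    if stack.iter().any(|f| f == file) {
        diagnostics.push(format!("Circular include: {file}"));
        return Vec::new();
    }
    stack.push(file.to_string());
    // Only included files lose their frontmatter; the root document's
    // metadata handling is out of scope for this sketch.
    let body = if stack.len() > 1 { strip_frontmatter(src) } else { *src };
    let mut blocks = Vec::new();
    for para in body.split("\n\n").map(str::trim).filter(|p| !p.is_empty()) {
        match extract_include_path(para) {
            // Block-level include: splice the child's blocks in place.
            Some(path) => blocks.extend(expand(path, files, stack, diagnostics)),
            // Anything else, including inline includes, is kept untouched.
            None => blocks.push(Block::Para {
                text: para.to_string(),
                file: file.to_string(),
            }),
        }
    }
    stack.pop();
    blocks
}
```

Tracking the chain of open files in `stack` is one common way to detect cycles; a missing or circular include yields a diagnostic and contributes no blocks, so the rest of the document still expands.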
Consolidate @quarto/markdown and @quarto/jupyter into a single @quarto/api
package with subpath exports (text/, markdown/, jupyter/, format/, path/,
system/, console/, crypto/). Two QuartoAPI namespaces (quarto.text and
quarto.mappedString) both pull from @quarto/api/text, matching Q1's typing.
Introduce a PlatformHost interface so @quarto/api contains no Deno or Node
globals; all I/O goes through a host plugged in by the consumer. This lets
the same package run under @quarto/engine-host-deno today and a future
@quarto/engine-host-wasm (browser/VFS) without changes.
Rename @quarto/engine-host → @quarto/engine-host-deno to reserve room for
the WASM sibling. Drop Q1's QuartoAPIRegistry/register.ts bootstrap; port
the implementations and assemble them via direct construction in
engine-host.
Also split the original Plan 1a (929 lines) into three sub-plans along
Rust/Deno/integration seams:
- Plan 1a (Rust core, 814 lines): protocol types, subprocess management,
ExecutionEngine trait extensions, TsEngine struct
- Plan 1b (new, 284 lines): @quarto/engine-host-deno package (esbuild
bundle, host.ts dispatch loop, deno-host.ts PlatformHost impl,
mapped-source.ts rehydration, quarto-api.ts stub, engine-loader.ts)
- Plan 1c (renamed from old 1b, 333 lines): _extension.yml engine parsing,
deno bundle build step, registry migration to StageContext, 4-phase
detection rewrite, echo engine E2E test
Dependency chain: 1a → 1b → 1c, which Plans 2/3/4 all pick up from.
Add 2026-04-23-ipynb-filters-and-engine-partitioning.md research note
capturing protocol refinements found while spec'ing Plans 2/3 (notably that
partitionedMarkdown needs to be in the Rust ExecutionEngine trait so
Jupyter's ipynb-filter path works).
Five fixtures in crates/quarto/tests/smoke-all/includes/ exercise the
include shortcode end-to-end through the real render pipeline:
- basic/ child content appears in parent output
- crossref/ @fig-ref in parent resolves to a definition in the included
file (validates IncludeExpansion running before
PreEngineSugaring)
- code-cell/ code block from the included file survives to
EngineExecution (execute.eval: false so no kernel needed)
- circular/ circular include produces a "Circular include" WARN
diagnostic; render continues
- missing/ missing file produces an "Include file not found" WARN
diagnostic; render continues
Following Q1's fixture idiom (tests/smoke/smoke-all.test.ts), the
circular and missing fixtures pair printsMessage with an explicit
noErrors: default. printsMessage alone does NOT suppress the default
noErrorsOrWarnings assertion — match Q1 semantics exactly.
Also downgrades the "Circular include" and "Include file not found"
diagnostics from ERROR to WARNING in IncludeExpansionStage, since the
pipeline continues rendering after emitting them.
…e hints
Expand Plan 1a's protocol and TsEngine trait surface to cover the full
Quarto 1 ExecutionEngine interface, and add a static-hints mechanism
so q2 can skip launching the Deno subprocess when an engine is clearly
irrelevant:
- Extension authors can declare `languages` and `file-extensions` in
_extension.yml engine entries. These static hints let q2 skip
launching the Deno subprocess when the engine is clearly irrelevant
(e.g. a project with only {python} blocks doesn't need the Julia
engine).
- If hints are omitted, q2 falls back to always consulting the engine
dynamically — safe but slower.
- The dynamic claimsLanguage / claimsFile remain the precise check;
hints are an optimization, not a replacement.
Additional refinements to the shared-subprocess routing model and the
ipynb-filters / engine-partitioning research note.
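As a rough sketch of the hint check described above: the type and function names are hypothetical, and the rule for partially declared hints (an omitted list never rules an engine out) is an assumption of this example rather than something the plan states.

```rust
/// Static hints from an _extension.yml engine entry (hypothetical shape).
#[derive(Debug, Default)]
pub struct EngineHints {
    /// `languages`, if the extension author declared it.
    pub languages: Option<Vec<String>>,
    /// `file-extensions`, if the extension author declared it.
    pub file_extensions: Option<Vec<String>>,
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    /// Hints prove the engine is irrelevant; do not launch the subprocess.
    Skip,
    /// Launch the subprocess and let claimsLanguage / claimsFile decide.
    AskEngine,
}

/// An omitted hint cannot rule the engine out, so it counts as a possible match.
fn may_match(declared: &Option<Vec<String>>, present: &[&str]) -> bool {
    match declared {
        None => true,
        Some(list) => list.iter().any(|d| present.contains(&d.as_str())),
    }
}

pub fn should_consult(
    hints: &EngineHints,
    doc_languages: &[&str],
    doc_extension: &str,
) -> Decision {
    let language_hit = may_match(&hints.languages, doc_languages);
    let extension_hit = may_match(&hints.file_extensions, &[doc_extension]);
    if language_hit || extension_hit {
        Decision::AskEngine
    } else {
        Decision::Skip
    }
}
```

The check only ever answers "definitely not" or "ask the engine", which keeps the dynamic calls as the precise test and the hints as a pure optimization.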
Step 4 was previously conditional on changes to quarto-core or quarto-pandoc-types, and motivated as "ensure hub-client/WASM builds work" — which missed that xtask is also what enforces CI's `-D warnings` strictness. A Rust-only PR whose plain `cargo build` / `cargo nextest` pass can still fail CI on an unused-import or dead-code warning, which happened with make_paragraph in the include-expansion stage. New wording requires xtask verify on every push, with --skip-hub-build as the fast path for Rust-only changes, and names the warnings reason so the step's value is discoverable.
64d743d to 0ab7ae7
@cscheid, here are include shortcodes. I think this is ready to merge.
PR also includes the current version of the plans for TS Engine Extension (subject to further review).
Summary
- Adds `IncludeExpansionStage` so block-level `{{< include file.qmd >}}` expands at the AST level, before engine execution. Included files' body blocks are spliced into the parent AST; frontmatter is stripped; recursion is supported; missing and circular includes produce diagnostics.
- `Block::source_info()` / `Inline::source_info()` accessors, a QMD writer that returns a byte-range-to-AST-node `SourceInfo::Concat`, and `SourceInfo` + `source_context` fields on `ExecutionContext`.
What's in this PR
User-visible
- `{{< include file.qmd >}}` as the sole content of a paragraph now expands: the referenced file is parsed, its YAML frontmatter is stripped, and its body blocks are spliced into the parent at that position. Matches Quarto 1 behavior: `{{< include … >}}` that is not the sole paragraph content is left to `ShortcodeResolveTransform` as before.
- Included files are registered in `DocumentAst`'s `ast_context.source_context` (for `map_offset`) and in the top-level `source_context` (for ariadne error snippets), so cross-references from included files are indexed and errors resolve to the correct file.
Infrastructure
- `source_info()` accessors on `Block` and `Inline`. Restructures `Inline::Attr(Attr, AttrSourceInfo)` → `Inline::Attr(InlineAttr)` so every variant has a `source_info` field. Removes four scattered copies of match-every-variant helper functions across pampa.
- `write_with_source_info` on the QMD writer: serializes an AST to bytes and returns a `SourceInfo::Concat` tiling the output, with YAML as one piece and each top-level block (including its preceding blank-line separator) as one piece. The existing `write()` is unchanged; ~19 other callsites are unaffected.
- `source_info` + `Arc<SourceContext>` on `ExecutionContext`. `EngineExecutionStage` now calls `write_with_source_info` and threads both through. Engines can map error positions across the serialize-and-ship boundary. Backward compatible via `Default`; no engine currently reads these fields.
Plans (spec only)
~3,900 lines of Markdown in `claude-notes/plans/`:
- `2026-04-18-plan0-include-expansion-and-source-info.md`: the plan this PR implements.
- `2026-04-16-ts-engine-extensions-subprocess.md` plus Plans 1a / 1b / 1c / 2 / 3 / 4: the TS-engine-extensions initiative that the source-info plumbing is a prerequisite for. No code in this PR; those plans land in follow-up PRs.
- `2026-04-23-ipynb-filters-and-engine-partitioning.md`: side research note from Plan 2/3 design.
Test plan
- `cargo nextest run --workspace` passes (verified locally: 3650/3650 pampa, 1008/1008 quarto-core)
- `cargo xtask verify` passes (full Rust + hub-client + WASM build + tests)
- `{{< include child.qmd >}}` as a full paragraph: confirm `child.qmd`'s blocks appear inline in rendered output
- `{{< include >}}` inside a larger paragraph is left unexpanded
- … `EngineExecutionStage`)