feat(describegpt): add --markdown-template for customizable Markdown output#3834
Merged
jqnatividad merged 7 commits intomasterfrom May 8, 2026
Merged
feat(describegpt): add --markdown-template for customizable Markdown output#3834jqnatividad merged 7 commits intomasterfrom
jqnatividad merged 7 commits intomasterfrom
Conversation
…output
Adds a `--markdown-template <file>` flag and a default
`resources/describegpt_md_defaults.toml` so users can customize the
Markdown layout without recompiling. Per-inference-type templates
(`dictionary_md_template`, `description_md_template`, `tags_md_template`,
`custom_prompt_md_template`) drive the H1/REASONING/TOKEN-USAGE wrapper,
and a separate `dictionary_md_body_template` drives the per-field table
that fills `{{ llm_response }}` for the Dictionary kind. The wrapper
templates and the body template share one Mini Jinja Environment with
custom filters: `pipe_escape`, `br_replace`, `human_count`, `dict_cell`,
`humanize_examples`. Defaults reproduce the legacy hardcoded `format!()`
output byte-for-byte; two embedded tests lock that in.
Also fixes a pre-existing race in `get_prompt_file` where two threads
racing to initialize `PROMPT_FILE` would panic the loser via
`OnceLock::set().unwrap()`; now the loser silently uses the winner's
value (same pattern as the new `get_md_template_file`).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Thread a single SharedRenderCtx (attribution + timestamp) through
render_dictionary_md_body and render_markdown_template so the body
footer's attribution and the wrapper's `{{ timestamp }}` /
`{{ generated_by_signature }}` are byte-identical in the same phase.
- Cache the Mini Jinja Environment in a LazyLock so filters
(pipe_escape, br_replace, human_count, dict_cell, humanize_examples)
are registered once instead of on every render call.
- Add `default_wrapper_templates_are_byte_identical` test plus a TOML
comment locking the four wrapper templates in sync — drift fails
loudly instead of silently diverging.
- Reuse the shared timestamp from the SharedRenderCtx (no second
`chrono::Utc::now()` call per dictionary phase).
- Add a doc comment on PROMPT_FILE / MD_TEMPLATE_FILE explaining the
test-binary singleton constraint and what tests touching custom
templates would need.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | -11 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a TOML-driven MiniJinja templating layer for qsv describegpt Markdown output so users can customize the rendered Markdown (including the dictionary table body) without recompiling, while preserving byte-identical legacy output via default templates and tests.
Changes:
- Introduces
--markdown-template <file>and a loader that parses/caches a Markdown template TOML (or uses an embedded default). - Refactors dictionary Markdown rendering into MiniJinja templates + custom filters, using a shared per-phase render context (single attribution + timestamp).
- Adds
resources/describegpt_md_defaults.toml, updates generated help docs, and adds tests asserting byte-identity with the legacy Markdown output.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/cmd/describegpt/formatters.rs | Removes the legacy hardcoded dictionary Markdown formatter (dictionary Markdown now template-driven). |
| src/cmd/describegpt.rs | Adds --markdown-template, template loading/rendering helpers, shared render context, and byte-identity tests; also fixes a OnceLock::set().unwrap() race in prompt loading. |
| resources/describegpt_md_defaults.toml | New embedded default Markdown template TOML (wrapper + dictionary-body templates, with documented variables/filters). |
| docs/help/describegpt.md | Regenerated help docs to include --markdown-template. |
- USAGE help and generated docs now mention dictionary_md_body_template and the available custom Mini Jinja filters, pointing users at the inline variable/filter docs in the default TOML. - Markdown template TOML parse error now includes the source path (or "<default embedded TOML>") so users can tell which file failed. - Metadata fields (name/description/author/version) on MarkdownTemplateFile are now `#[serde(default)]` so users can supply a minimal --markdown-template file without copying the metadata header. - Replaced struct-level `#[allow(dead_code)]` with per-field allows on only the genuinely-unused metadata fields, so future genuinely unused additions get caught. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Added MarkdownTemplateOverride struct with per-field Option<String>
fields and an `apply_to` overlay that fills any omitted field from
the embedded defaults. Now a user can drop in a --markdown-template
TOML containing only the templates they want to change instead of
copying the entire default file. USAGE updated to mention this.
- New `markdown_template_override_falls_back_per_field` test locks in
that omitted templates fall back to embedded defaults.
- Inlined `source_label` inside the toml::from_str map_err closure
using `args.flag_markdown_template.as_deref().unwrap_or("...")`,
removing the cross-branch tuple binding that was fragile under
future refactors.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI uses `cargo +nightly fmt --check`, which respects the unstable `struct_field_align_threshold = 20` in rustfmt.toml. My local stable rustfmt silently ignores that option, so a few struct field declarations landed with hand-aligned colons that nightly rejects. Run nightly fmt to bring the file in line with what CI expects. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…indows CI)
The markdown_default_template_byte_identical_to_legacy and
dictionary_body_default_template_byte_identical_to_legacy tests fail on
Windows runners because:
- `* text=auto` in .gitattributes lets git check out *.toml with CRLF
on Windows; `include_str!` then bundles the embedded default
`describegpt_md_defaults.toml` with CRLF in its template strings.
- Mini Jinja preserves the source line endings verbatim, so the
rendered Markdown carries CRLF on Windows.
- The legacy `format!()` baseline the tests compare against is LF only.
Two-pronged fix:
1. Code-level: normalize CRLF -> LF in `get_md_template_file` before
handing content to `toml::from_str`, so both the embedded default
and any user-supplied --markdown-template TOML render with LF
regardless of the platform git checked them out on.
2. .gitattributes: add `*.toml text eol=lf` so the TOML files are
committed AND checked out as LF on every platform.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
--markdown-template <file>todescribegptso users can customize Markdown output without recompiling, with an embeddedresources/describegpt_md_defaults.tomldefault that reproduces the legacyformat!()wrapper byte-for-byte.formatters::format_dictionary_markdowninto adictionary_md_body_template, exposing structuredentries+addl_col_namesplus custom Mini Jinja filters (pipe_escape,br_replace,human_count,dict_cell,humanize_examples) so the per-field layout is also TOML-driven.SharedRenderCtx(one attribution + one timestamp) through both render helpers per phase, caches the Mini JinjaEnvironmentin aLazyLock, and locks in byte-identity to the legacy output via three embedded tests.OnceLock::set().unwrap()race inget_prompt_filethat could panic the loser when two threads raced to initialize the prompt file (latent until two unit tests started exercising it concurrently).Test plan
cargo build --locked --bin qsv -F all_featurescargo test describegpt -F all_features— 64 integration tests passcargo test cmd::describegpt::tests -F all_features— 16 embedded module tests pass (including 3 new byte-identity tests)QSV_TEST_DESCRIBEGPT=1 cargo test describegpt -F all_features— 64 CI tests pass against live LM Studiocargo clippy --bin qsv -F all_features --no-deps— cleanqsv --generate-help-md—docs/help/describegpt.mdregeneratedsniff: refactor to just use the csv crate and drop the qsv-sniffer crate #1976 — addressed all 5 Low findings, closedFor #3735
🤖 Generated with Claude Code