Skip to content

feat(describegpt): add --markdown-template for customizable Markdown output#3834

Merged
jqnatividad merged 7 commits intomasterfrom
describegpt-markdown-template
May 8, 2026
Merged

feat(describegpt): add --markdown-template for customizable Markdown output#3834
jqnatividad merged 7 commits intomasterfrom
describegpt-markdown-template

Conversation

@jqnatividad
Copy link
Copy Markdown
Collaborator

@jqnatividad jqnatividad commented May 7, 2026

Summary

  • Adds --markdown-template <file> to describegpt so users can customize Markdown output without recompiling, with an embedded resources/describegpt_md_defaults.toml default that reproduces the legacy format!() wrapper byte-for-byte.
  • Refactors the dictionary table from formatters::format_dictionary_markdown into a dictionary_md_body_template, exposing structured entries + addl_col_names plus custom Mini Jinja filters (pipe_escape, br_replace, human_count, dict_cell, humanize_examples) so the per-field layout is also TOML-driven.
  • Threads a single SharedRenderCtx (one attribution + one timestamp) through both render helpers per phase, caches the Mini Jinja Environment in a LazyLock, and locks in byte-identity to the legacy output via three embedded tests.
  • Drive-by: fixes a pre-existing OnceLock::set().unwrap() race in get_prompt_file that could panic the loser when two threads raced to initialize the prompt file (latent until two unit tests started exercising it concurrently).

Test plan

  • cargo build --locked --bin qsv -F all_features
  • cargo test describegpt -F all_features — 64 integration tests pass
  • cargo test cmd::describegpt::tests -F all_features — 16 embedded module tests pass (including 3 new byte-identity tests)
  • QSV_TEST_DESCRIBEGPT=1 cargo test describegpt -F all_features — 64 CI tests pass against live LM Studio
  • cargo clippy --bin qsv -F all_features --no-deps — clean
  • qsv --generate-help-mddocs/help/describegpt.md regenerated
  • roborev review sniff: refactor to just use the csv crate and drop the qsv-sniffer crate #1976 — addressed all 5 Low findings, closed

For #3735

🤖 Generated with Claude Code

jqnatividad and others added 3 commits May 7, 2026 16:22
…output

Adds a `--markdown-template <file>` flag and a default
`resources/describegpt_md_defaults.toml` so users can customize the
Markdown layout without recompiling. Per-inference-type templates
(`dictionary_md_template`, `description_md_template`, `tags_md_template`,
`custom_prompt_md_template`) drive the H1/REASONING/TOKEN-USAGE wrapper,
and a separate `dictionary_md_body_template` drives the per-field table
that fills `{{ llm_response }}` for the Dictionary kind. The wrapper
templates and the body template share one Mini Jinja Environment with
custom filters: `pipe_escape`, `br_replace`, `human_count`, `dict_cell`,
`humanize_examples`. Defaults reproduce the legacy hardcoded `format!()`
output byte-for-byte; two embedded tests lock that in.

Also fixes a pre-existing race in `get_prompt_file` where two threads
racing to initialize `PROMPT_FILE` would panic the loser via
`OnceLock::set().unwrap()`; now the loser silently uses the winner's
value (same pattern as the new `get_md_template_file`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Thread a single SharedRenderCtx (attribution + timestamp) through
  render_dictionary_md_body and render_markdown_template so the body
  footer's attribution and the wrapper's `{{ timestamp }}` /
  `{{ generated_by_signature }}` are byte-identical in the same phase.
- Cache the Mini Jinja Environment in a LazyLock so filters
  (pipe_escape, br_replace, human_count, dict_cell, humanize_examples)
  are registered once instead of on every render call.
- Add `default_wrapper_templates_are_byte_identical` test plus a TOML
  comment locking the four wrapper templates in sync — drift fails
  loudly instead of silently diverging.
- Reuse the shared timestamp from the SharedRenderCtx (no second
  `chrono::Utc::now()` call per dictionary phase).
- Add a doc comment on PROMPT_FILE / MD_TEMPLATE_FILE explaining the
  test-binary singleton constraint and what tests touching custom
  templates would need.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/cmd/describegpt.rs Dismissed
Comment thread src/cmd/describegpt.rs Dismissed
@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 7, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics -11 complexity

Metric Results
Complexity -11

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a TOML-driven MiniJinja templating layer for qsv describegpt Markdown output so users can customize the rendered Markdown (including the dictionary table body) without recompiling, while preserving byte-identical legacy output via default templates and tests.

Changes:

  • Introduces --markdown-template <file> and a loader that parses/caches a Markdown template TOML (or uses an embedded default).
  • Refactors dictionary Markdown rendering into MiniJinja templates + custom filters, using a shared per-phase render context (single attribution + timestamp).
  • Adds resources/describegpt_md_defaults.toml, updates generated help docs, and adds tests asserting byte-identity with the legacy Markdown output.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
src/cmd/describegpt/formatters.rs Removes the legacy hardcoded dictionary Markdown formatter (dictionary Markdown now template-driven).
src/cmd/describegpt.rs Adds --markdown-template, template loading/rendering helpers, shared render context, and byte-identity tests; also fixes a OnceLock::set().unwrap() race in prompt loading.
resources/describegpt_md_defaults.toml New embedded default Markdown template TOML (wrapper + dictionary-body templates, with documented variables/filters).
docs/help/describegpt.md Regenerated help docs to include --markdown-template.

Comment thread src/cmd/describegpt.rs Outdated
Comment thread docs/help/describegpt.md Outdated
Comment thread src/cmd/describegpt.rs Outdated
Comment thread src/cmd/describegpt.rs Outdated
jqnatividad and others added 4 commits May 7, 2026 16:44
- USAGE help and generated docs now mention dictionary_md_body_template
  and the available custom Mini Jinja filters, pointing users at the
  inline variable/filter docs in the default TOML.
- Markdown template TOML parse error now includes the source path (or
  "<default embedded TOML>") so users can tell which file failed.
- Metadata fields (name/description/author/version) on
  MarkdownTemplateFile are now `#[serde(default)]` so users can supply
  a minimal --markdown-template file without copying the metadata
  header.
- Replaced struct-level `#[allow(dead_code)]` with per-field allows
  on only the genuinely-unused metadata fields, so future genuinely
  unused additions get caught.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Added MarkdownTemplateOverride struct with per-field Option<String>
  fields and an `apply_to` overlay that fills any omitted field from
  the embedded defaults. Now a user can drop in a --markdown-template
  TOML containing only the templates they want to change instead of
  copying the entire default file. USAGE updated to mention this.
- New `markdown_template_override_falls_back_per_field` test locks in
  that omitted templates fall back to embedded defaults.
- Inlined `source_label` inside the toml::from_str map_err closure
  using `args.flag_markdown_template.as_deref().unwrap_or("...")`,
  removing the cross-branch tuple binding that was fragile under
  future refactors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI uses `cargo +nightly fmt --check`, which respects the unstable
`struct_field_align_threshold = 20` in rustfmt.toml. My local stable
rustfmt silently ignores that option, so a few struct field declarations
landed with hand-aligned colons that nightly rejects. Run nightly fmt
to bring the file in line with what CI expects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…indows CI)

The markdown_default_template_byte_identical_to_legacy and
dictionary_body_default_template_byte_identical_to_legacy tests fail on
Windows runners because:
  - `* text=auto` in .gitattributes lets git check out *.toml with CRLF
    on Windows; `include_str!` then bundles the embedded default
    `describegpt_md_defaults.toml` with CRLF in its template strings.
  - Mini Jinja preserves the source line endings verbatim, so the
    rendered Markdown carries CRLF on Windows.
  - The legacy `format!()` baseline the tests compare against is LF only.

Two-pronged fix:
  1. Code-level: normalize CRLF -> LF in `get_md_template_file` before
     handing content to `toml::from_str`, so both the embedded default
     and any user-supplied --markdown-template TOML render with LF
     regardless of the platform git checked them out on.
  2. .gitattributes: add `*.toml text eol=lf` so the TOML files are
     committed AND checked out as LF on every platform.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jqnatividad jqnatividad merged commit a8d5cc5 into master May 8, 2026
15 of 17 checks passed
@jqnatividad jqnatividad deleted the describegpt-markdown-template branch May 8, 2026 08:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants