feat(template): add data-wrangling MiniJinja filters shared across commands by jqnatividad · Pull Request #3921 · dathere/qsv

jqnatividad · 2026-05-29T15:37:43Z

Summary

Adds a shared src/minijinja_filters.rs module of pure, always-on MiniJinja filters/functions that close real gaps for data-wrangling templates, wired into all four MiniJinja-powered commands (template, fetchpost, describegpt, profile) via a single register(env) call.

Each filter was verified to be genuinely missing — not provided by minijinja 2.20 core, minijinja-contrib, or qsv's existing filters (e.g. nothing anywhere offers regex; pycompat has no zfill/rjust/ljust; core round has no rounding-mode arg).

Filters / functions added

| Name | Purpose |
| --- | --- |
| `regex_replace(pattern, repl)` | Replace ALL matches (`$1`/`${name}` capture refs) |
| `regex_match(pattern)` | Bool, for use in `{% if %}` |
| `regex_find(pattern)` | First whole match, or `""` |
| `floor` / `ceil` | Round down / up to a whole integer |
| `datefmt(fmt[, prefer_dmy])` | Parse a messy date string (19+ formats via `qsv-dateparser`) and reformat |
| `zfill(width)` / `lpad(width[, fill])` / `rpad(...)` | Padding incl. leading-zero preservation |
| `slugify` | URL/DB/CKAN-safe slug |
| `blake3` | BLAKE3 hex digest (stable surrogate/content keys) |
| `fromjson` / `parse_json` | Parse a JSON-in-a-cell string into an indexable value |
| `coalesce(a, b, ...)` | First arg that isn't undefined/none/empty |
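To make the padding and coalescing rows concrete, here is a minimal std-only sketch of what such helpers might compute. It is not the code in `src/minijinja_filters.rs`: the function signatures, the Python-like sign handling in `zfill`, and modelling "undefined/none" as `Option::None` are all assumptions made for illustration.

```rust
/// Zero-pad on the left to `width` characters, keeping a leading sign first.
/// Assumption: Python-like `str.zfill` semantics.
fn zfill(s: &str, width: usize) -> String {
    let len = s.chars().count();
    if len >= width {
        return s.to_string();
    }
    let zeros = "0".repeat(width - len);
    match s.chars().next() {
        // The sign is ASCII, so slicing at byte 1 is safe.
        Some(sign @ ('+' | '-')) => format!("{}{}{}", sign, zeros, &s[1..]),
        _ => format!("{}{}", zeros, s),
    }
}

/// Pad on the left with `fill` until the value is `width` characters long.
fn lpad(s: &str, width: usize, fill: char) -> String {
    let pad = width.saturating_sub(s.chars().count());
    let mut out: String = std::iter::repeat(fill).take(pad).collect();
    out.push_str(s);
    out
}

/// Pad on the right with `fill` until the value is `width` characters long.
fn rpad(s: &str, width: usize, fill: char) -> String {
    let pad = width.saturating_sub(s.chars().count());
    let mut out = s.to_string();
    out.extend(std::iter::repeat(fill).take(pad));
    out
}

/// Return the first argument that is neither missing nor empty.
fn coalesce<'a>(args: &[Option<&'a str>]) -> Option<&'a str> {
    args.iter().copied().flatten().find(|s| !s.is_empty())
}

fn main() {
    // A ZIP code that lost its leading zero in a spreadsheet round trip.
    println!("{}", zfill("2134", 5));
    println!("[{}] [{}]", lpad("7", 3, ' '), rpad("ab", 4, '.'));
    println!("{:?}", coalesce(&[None, Some(""), Some("fallback")]));
}
```

Counting `chars()` rather than bytes keeps the width meaningful for non-ASCII cell values; whether the real filters do the same is not stated in this PR.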

Design notes

- No cargo feature gate: regex, blake3, qsv-dateparser, serde_json are always compiled in, so the filters exist in qsv, qsvlite, qsvdp, and qsvmcp.
- All functions are pure and Send + Sync, so the single Environment that template shares across rayon worker threads can call them concurrently. The regex cache uses read-lock-then-clone (Regex is Arc-backed), so matching never holds the lock.
- Errors map to minijinja::Error; in template, a bad pattern/value degrades to a per-row RENDERING ERROR (caught + counted), not a crash.
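The read-lock-then-clone pattern mentioned above can be sketched with the standard library alone. This is an illustration, not the module's actual cache: `PatternCache`, `get_or_compile`, and the `compile` closure are hypothetical names, a generic `Arc<T>` stands in for a compiled `Regex`, and "stop inserting once full" is an assumed policy (the PR only states that the cache is capped at 256 entries).

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Cap taken from the PR's follow-up commit; the eviction policy is assumed.
const MAX_ENTRIES: usize = 256;

/// Hypothetical cache; `T` stands in for a compiled regex.
struct PatternCache<T> {
    map: RwLock<HashMap<String, Arc<T>>>,
}

impl<T> PatternCache<T> {
    fn new() -> Self {
        Self { map: RwLock::new(HashMap::new()) }
    }

    fn get_or_compile<F>(&self, pattern: &str, compile: F) -> Result<Arc<T>, String>
    where
        F: FnOnce(&str) -> Result<T, String>,
    {
        // Fast path: take the read lock, clone the Arc handle, and release
        // the lock at the end of this statement, before any matching runs.
        let cached = self.map.read().unwrap().get(pattern).cloned();
        if let Some(hit) = cached {
            return Ok(hit);
        }

        // Slow path: compile outside of any lock, then insert if there is room.
        let compiled = Arc::new(compile(pattern)?);
        let mut map = self.map.write().unwrap();
        if map.len() < MAX_ENTRIES {
            map.insert(pattern.to_string(), Arc::clone(&compiled));
        }
        Ok(compiled)
    }

    fn len(&self) -> usize {
        self.map.read().unwrap().len()
    }
}

fn main() {
    let cache: PatternCache<String> = PatternCache::new();
    // The "compiler" here just uppercases the pattern text.
    let first = cache.get_or_compile("a+", |p| Ok(p.to_uppercase())).unwrap();
    let second = cache.get_or_compile("a+", |p| Ok(p.to_uppercase())).unwrap();
    println!("same handle: {}", Arc::ptr_eq(&first, &second));
    println!("entries: {}", cache.len());
}
```

Because callers only ever hold a cloned handle, a slow match on one thread cannot block another thread from looking up or inserting a pattern.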

Context

This is the outcome of evaluating a "Luau-in-templates" idea, which on review mostly overlapped with existing capabilities (pycompat, printf format, qsv's format_float/round_banker/lookup) and carried real costs (per-thread Lua VM, per-row context serialization, two languages in one template). These targeted filters deliver the practical value at a fraction of the complexity; heavy logic remains better served by qsv luau in a pipeline.

Testing

- 13 new tests in tests/test_template.rs; all passing.
- Full suites pass with no regressions: template (52), profile (64), describegpt (74).
- Builds clean: -F all_features, -F lite, -F datapusher_plus.
- cargo clippy -F all_features clean for the new code; cargo +nightly fmt applied.
- template USAGE updated and docs/help/template.md regenerated via qsv --generate-help-md.

🤖 Generated with Claude Code

…mmands

Add a shared `src/minijinja_filters.rs` module registering pure, always-on MiniJinja filters/functions that fill real gaps (verified absent from minijinja 2.20 core, minijinja-contrib, and qsv's existing filters):

- regex_replace / regex_match / regex_find (runtime-cached patterns)
- floor / ceil (core round has no rounding mode)
- datefmt(fmt[, prefer_dmy]) (parse 19+ messy date formats via qsv-dateparser, then reformat)
- zfill / lpad / rpad (pycompat lacks zfill/rjust/ljust)
- slugify (URL/DB/CKAN-safe slugs)
- blake3 (stable surrogate/content keys)
- fromjson / parse_json (JSON-in-a-cell -> indexable value)
- coalesce(a, b, ...) (first non-empty arg)

Wired into all four MiniJinja-powered commands via a single register() call: template, fetchpost, describegpt, and profile.

No cargo feature gate -- every dependency used (regex, blake3, qsv-dateparser, serde_json) is always compiled in, so the filters are available in qsv, qsvlite, qsvdp, and qsvmcp.

All functions are pure and Send + Sync, so the single Environment that `template` shares across rayon worker threads calls them concurrently; the regex cache uses read-lock-then-clone so matching never holds the lock.

Adds 13 tests to tests/test_template.rs; updates `template` USAGE and regenerates docs/help/template.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codacy-production · 2026-05-29T15:38:31Z

Up to standards ✅

🟢 Issues: 0 new issues

🟢 Metrics: complexity 38


)

- regex cache: cap at 256 entries so data-derived patterns (e.g. regex_match(pattern_column)) can't grow the process-global cache unbounded and exhaust memory.
- floor/ceil: reject NaN/infinity and out-of-i64-range values via a to_i64() guard instead of an `as` cast that silently saturates to 0/i64::MIN/i64::MAX.
- add regression tests for the out-of-range floor guard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
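The saturation this commit guards against is easy to reproduce. The snippet below is a small demonstration of Rust's float-to-integer `as` cast, not code from the PR; `naive_floor` is a hypothetical name for the kind of unguarded implementation being replaced.

```rust
/// Hypothetical unguarded floor: the `as` cast never fails, it saturates.
fn naive_floor(v: f64) -> i64 {
    v.floor() as i64
}

fn main() {
    // None of these inputs has a meaningful i64 floor, yet each call
    // returns an ordinary-looking integer instead of an error.
    for v in [f64::NAN, f64::INFINITY, f64::NEG_INFINITY, 1e30] {
        println!("{:>10} -> {}", v, naive_floor(v));
    }
}
```

In a template, such a value would render as a plausible number in the output row, which is why the commit turns these cases into errors.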

…borev #2589)

i64::MAX as f64 rounds up to 2^63 (9223372036854775808.0), so the prior inclusive range check admitted 2^63 and then saturated it to i64::MAX on cast. Switch to an exclusive upper bound at 2^63 (i64::MIN is exactly representable, so the lower bound stays inclusive). Extend the regression test to cover the finite out-of-range value 2^63, not only infinity.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
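The boundary argument in this commit can be checked directly. The following is a minimal sketch of a range guard with an inclusive lower bound and an exclusive upper bound at 2^63; `floor_to_i64` and `TWO_POW_63` are illustrative names, not identifiers from the PR.

```rust
/// 2^63 as an f64; this is also what `i64::MAX as f64` evaluates to.
const TWO_POW_63: f64 = 9_223_372_036_854_775_808.0;

/// Hypothetical guarded floor: `None` for NaN, infinities, and anything
/// outside the i64 range.
fn floor_to_i64(v: f64) -> Option<i64> {
    let f = v.floor();
    // NaN fails both comparisons, so it is rejected without a special case.
    if f >= i64::MIN as f64 && f < TWO_POW_63 {
        Some(f as i64)
    } else {
        None
    }
}

fn main() {
    // The inclusive check `f <= i64::MAX as f64` would have admitted 2^63.
    println!("{}", i64::MAX as f64 == TWO_POW_63);
    println!("{:?}", floor_to_i64(TWO_POW_63));
    println!("{:?}", floor_to_i64(i64::MIN as f64));
    println!("{:?}", floor_to_i64(-2.5));
}
```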

)

floor/ceil of an integer is the integer itself, so integer-string inputs now short-circuit via an i64 parse before any f64 conversion. This makes valid boundary values like i64::MAX (9223372036854775807, which rounds UP to 2^63 as f64) round-trip exactly instead of being rejected by the 2^63 range guard. Only genuinely fractional inputs take the f64 path. Add an i64::MAX/i64::MIN boundary regression test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Returning a float removes the i64-cast range/precision guardrails entirely (no saturation, no 2^63 boundary edge cases, no precision loss for huge integers). NaN/infinity now render transparently rather than silently becoming a wrong integer. Users pipe `|int` when an integer is wanted (`{{ v|floor|int }}`). Updates the USAGE note + regenerates docs/help/template.md, and replaces the i64-cast regression tests with float-output, `|int`, and non-numeric-error cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Returning pure f64 lost precision for large integer inputs (e.g. i64::MAX or any ID above 2^53 rounded during the f64 parse). Restore an integer-string fast path: integer inputs pass through exactly as an integer Value, and only genuinely fractional inputs go through f64 (returning a float). No i64 cast, so still no saturation/range pitfalls. Re-add boundary regression coverage for 2^53+1, i64::MAX, and i64::MIN.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
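The precision loss described here happens at parse time, before any rounding is applied. A short demonstration, using a hypothetical `floor_via_f64` helper that models the float-only approach this commit moves away from:

```rust
/// Hypothetical float-only floor: every input goes through an f64 parse.
fn floor_via_f64(s: &str) -> Option<f64> {
    s.trim().parse::<f64>().ok().map(f64::floor)
}

fn main() {
    // 2^53 + 1 is the smallest positive integer an f64 cannot represent.
    let id = "9007199254740993";
    let exact: i64 = id.parse().unwrap();
    let approx = floor_via_f64(id).unwrap();
    println!("exact  = {}", exact);
    println!("approx = {}", approx);
    println!("differs: {}", approx as i64 != exact);
}
```

For an identifier column, an off-by-one like this silently points at a different record, which is the motivation for passing integer strings through untouched.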

…2593)

Extend the exact integer fast path to u64 so large unsigned IDs (up to u64::MAX) pass through unchanged. Integer-syntax strings that fit neither i64 nor u64 now error instead of silently approximating through f64, making the "integers stay exact" contract honest. Add regression coverage for i64::MAX+1 and u64::MAX (exact) plus u64::MAX+1 and i64::MIN-1 (error).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
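The resulting decision order (i64, then u64, then an error for out-of-range integer syntax, then f64 for fractional input) might look roughly like the sketch below. It operates on strings and returns a small enum rather than a MiniJinja `Value`; `Rounded`, `floor_exact`, and `looks_like_integer` are hypothetical names, and the real filter's handling of whitespace or signs may differ.

```rust
#[derive(Debug, PartialEq)]
enum Rounded {
    Int(i64),
    UInt(u64),
    Float(f64),
}

/// True for an optional sign followed by one or more ASCII digits.
fn looks_like_integer(s: &str) -> bool {
    let digits = s
        .strip_prefix('-')
        .or_else(|| s.strip_prefix('+'))
        .unwrap_or(s);
    !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit())
}

fn floor_exact(input: &str) -> Result<Rounded, String> {
    let s = input.trim();
    // Fast paths: the floor of an integer is the integer itself.
    if let Ok(i) = s.parse::<i64>() {
        return Ok(Rounded::Int(i));
    }
    if let Ok(u) = s.parse::<u64>() {
        return Ok(Rounded::UInt(u));
    }
    // Integer syntax that fits neither type: refuse rather than approximate.
    if looks_like_integer(s) {
        return Err(format!("integer out of range: {}", s));
    }
    // Only genuinely fractional (or otherwise float-syntax) input reaches f64.
    s.parse::<f64>()
        .map(|f| Rounded::Float(f.floor()))
        .map_err(|_| format!("not a number: {}", s))
}

fn main() {
    for s in ["2.7", "9223372036854775807", "18446744073709551615", "18446744073709551616"] {
        println!("{} -> {:?}", s, floor_exact(s));
    }
}
```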

Copilot

Pull request overview

Adds a shared minijinja_filters module for data-wrangling-oriented MiniJinja filters/functions and wires it into the MiniJinja-powered commands so templates can use the same helpers consistently.

Changes:

Adds shared filters/functions for regex, rounding, date formatting, padding, slugging, hashing, JSON parsing, and coalescing.
Registers the shared filter set in template, fetchpost, describegpt, and profile, plus all binary entry points.
Adds template integration tests and updates template help documentation.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `src/minijinja_filters.rs` | Implements and registers the shared MiniJinja filters/functions. |
| `src/main.rs` | Adds the shared module to the main binary. |
| `src/mainlite.rs` | Adds the shared module to the lite binary. |
| `src/maindp.rs` | Adds the shared module to the datapusher-plus binary. |
| `src/cmd/template.rs` | Documents and registers the shared filters for `template`. |
| `src/cmd/fetchpost.rs` | Registers the shared filters for payload templates. |
| `src/cmd/describegpt.rs` | Registers the shared filters for markdown and prompt rendering. |
| `src/cmd/profile/formula_helpers.rs` | Registers the shared filters in profile formula environments. |
| `tests/test_template.rs` | Adds coverage for the new filters through `qsv template`. |
| `docs/help/template.md` | Updates generated help for the new template filters. |

The new USAGE block started with `qsv ` (triggering help_markdown_gen's console auto-fence) and relied on indentation the generator strips, so docs/help/template.md rendered the intro as a console block and collapsed the aligned filter list. Reword the intro to not start with `qsv ` and wrap the list in an explicit ``` fence (preserved verbatim by the generator), so the columns and multi-line entries render correctly. Regenerate the help md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

#2595)

format_examples passed literal `> ...` blockquote lines (e.g. GitHub `[!NOTE]` admonitions) straight to the catch-all without a trailing blank line, so the note wasn't closed and the following paragraph could be absorbed as a CommonMark lazy continuation. Add a blockquote branch mirroring the existing `#`-comment handling: emit the line, then a blank line once the next non-empty line is not a blockquote line. Regenerate docs/help/template.md (only file affected).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
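A simplified version of the blockquote rule can be sketched as follows. This is not the generator's `format_examples` function: `close_blockquotes` is a hypothetical helper that only inspects the immediately following line, and it assumes the input is already split into lines.

```rust
/// Hypothetical helper: copy lines through, inserting a blank line after a
/// blockquote line when the next line is ordinary (non-empty, non-`>`) text.
fn close_blockquotes(lines: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for (i, line) in lines.iter().enumerate() {
        out.push(line.to_string());
        if line.trim_start().starts_with('>') {
            let needs_blank = matches!(
                lines.get(i + 1),
                Some(next) if !next.trim().is_empty()
                    && !next.trim_start().starts_with('>')
            );
            if needs_blank {
                // Without this separator, CommonMark treats the next line
                // as a lazy continuation of the blockquote paragraph.
                out.push(String::new());
            }
        }
    }
    out
}

fn main() {
    let input = ["> [!NOTE]", "> Filters are always on.", "Next paragraph."];
    for line in close_blockquotes(&input) {
        println!("{}", line);
    }
}
```

Adjacent `>` lines are left together, so a multi-line admonition stays a single blockquote.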

…roborev #2596)

Add regression tests for the blockquote fix: one asserts a blank line separates a `> [!NOTE]` block from following prose (guarding against the lazy-continuation bug), another asserts adjacent `>` lines stay in a single blockquote.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

jqnatividad and others added 6 commits May 29, 2026 11:47

jqnatividad requested a review from Copilot May 29, 2026 16:56

Copilot started reviewing on behalf of jqnatividad May 29, 2026 16:57

Copilot AI reviewed May 29, 2026

Comment thread docs/help/template.md Outdated

jqnatividad and others added 3 commits May 29, 2026 14:41

jqnatividad merged commit 428090e into master May 29, 2026
17 checks passed

jqnatividad deleted the more-minijinja-custom-functions branch May 29, 2026 18:56

jqnatividad merged 10 commits into master from more-minijinja-custom-functions
