MiniJinja Templating
Tier: Intermediate
Commands that use it: template, fetchpost, describegpt, profile
Note
This page is the cross-cutting templating layer. Per-command flags live in /docs/help/. For the full template language, see the MiniJinja docs and the Jinja2 template syntax reference.
Several qsv commands embed MiniJinja (currently v2.20), a Rust implementation of Jinja2. Wherever you see the ⛩️ symbol in the Command Reference, that command renders text from your data with the same template language — so a filter you learn for template works the same in a fetchpost payload, a describegpt prompt, or a profile formula.
| Command | What gets templated | How to supply the template |
|---|---|---|
| `template` | An arbitrary text/Markdown/HTML/CSV-per-row document | `--template <str>` or `-t, --template-file <file>` |
| `fetchpost` | The HTTP POST request body (JSON or any content type) | `-t, --payload-tpl <file>` |
| `describegpt` | The LLM prompt(s) sent for inference | `--prompt-file <toml>` (templated prompt fields) |
| `profile` | CKAN scheming `formula` / `suggestion_formula` fields → derived metadata | `--spec <yaml>` |
In every case, CSV column values become template variables and the rendered output is what the command emits or sends.
For per-row commands (template, fetchpost), each row is rendered independently:
- Column headers become variable names. Non-alphanumeric characters are converted to underscores: `first name` → `{{ first_name }}`, `us-state` → `{{ us_state }}`.
- With `--no-headers`, columns are addressed by 1-based index with a `_c` prefix: `{{ _c1 }}`, `{{ _c2 }}`, … (`template`).
- `QSV_ROWNO` holds the current 1-based row number, which is handy for output filenames (`--outfilename` in `template`).
- All field values are strings. Cast with `|int` or `|float` before math or before any filter that needs a number (see Tips below).
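
The naming rules above can be sketched in a few lines of Python. This is an illustration only, not qsv's code: the helper names `header_to_var` and `no_header_vars` are hypothetical, and the sketch assumes each non-alphanumeric ASCII character maps to exactly one underscore, which may differ from qsv's behavior on edge cases.

```python
import re


def header_to_var(header: str) -> str:
    """Map a CSV header to a template variable name (illustrative).

    Assumption: every non-alphanumeric character becomes one underscore.
    """
    return re.sub(r"[^A-Za-z0-9]", "_", header)


def no_header_vars(n_columns: int) -> list[str]:
    """Variable names used with --no-headers: 1-based index, _c prefix."""
    return [f"_c{i}" for i in range(1, n_columns + 1)]


print(header_to_var("first name"))  # first_name
print(header_to_var("us-state"))    # us_state
print(no_header_vars(3))            # ['_c1', '_c2', '_c3']
```

Because every value reaches the template as a string, a header such as `unit price` would be read as `{{ unit_price|float }}` before any arithmetic.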
template (-J, --globals-json <file>) and fetchpost accept a JSON file of values shared across every row render, accessed under the qsv_g namespace:
```jinja
Report for {{ qsv_g.school_name }} — {{ qsv_g.year }}
Student: {{ last_name|title }}, {{ first_name|title }}
```

Beyond the stock Jinja2 filters and the `minijinja-contrib` set, `template` registers these qsv-specific helpers. They do not require casting; they accept the raw string field for convenience:
| Filter | Purpose | Example |
|---|---|---|
| `substr(start[, end])` | Substring by byte range | `{{ code \| …` |
| `format_float(precision)` | Parse → fixed-precision float (max 16) | `{{ balance \| …` |
| `human_count` | Integer with thousands separators | `{{ rows \| …` |
| `human_float_count` | Float with thousands separators | `{{ amount \| …` |
| `round_banker(places)` | Banker's rounding (round-half-to-even) | `{{ rate \| …` |
| `to_bool` | Truthiness of `true`/`1`/`yes`/`t`/`y` or non-zero number | `{% if active \| …` |
| `lookup("table", "Column")` | Value from a registered lookup table | `{{ us_state \| …` |
Plus one custom function:
| Function | Purpose |
|---|---|
| `register_lookup("name", "resource")` | Load a lookup table (local path, HTTP/HTTPS, `dathere://`, or `ckan://`) and bind it to `name` for `\|lookup`. Returns `true` on success. |
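
For readers unfamiliar with banker's rounding, the round-half-to-even rule behind `round_banker` can be sketched in Python with the standard `decimal` module. This is an illustration of the rounding rule only: the function below is not qsv's implementation, and the string output format is an assumption.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_banker(value: str, places: int) -> str:
    """Round a decimal string half-to-even (illustrative, not qsv's code)."""
    quantum = Decimal(10) ** -places  # places=2 -> Decimal('0.01')
    return str(Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN))


# Ties go to the nearest even digit, so they do not always round up.
print(round_banker("2.5", 0))    # 2
print(round_banker("3.5", 0))    # 4
print(round_banker("0.125", 2))  # 0.12
print(round_banker("0.135", 2))  # 0.14
```

Working on the decimal string rather than a binary float keeps the tie cases exact, which is why the sketch uses `Decimal` instead of the built-in `round`.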
Important
Filter errors are values, not crashes. When a custom filter can't parse its input (e.g. format_float on non-numeric text), it returns the --customfilter-error string (default <FILTER_ERROR>) instead of aborting the run. Set --customfilter-error "<empty string>" to emit nothing on error.
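
The "errors are values" behavior can be pictured with a small Python stand-in. It is a sketch under stated assumptions: the function mirrors the name `format_float` for readability, the `on_error` parameter is a hypothetical stand-in for `--customfilter-error`, and the precision cap of 16 is not modeled.

```python
FILTER_ERROR = "<FILTER_ERROR>"  # default sentinel named in the note above


def format_float(value: str, precision: int, on_error: str = FILTER_ERROR) -> str:
    """Return a fixed-precision float string, or a sentinel on bad input.

    Illustrative only: a parse failure yields a value instead of raising,
    so one bad cell does not stop the rest of the run.
    """
    try:
        number = float(value)
    except ValueError:
        return on_error
    return f"{number:.{precision}f}"


print(format_float("12.3456", 2))          # 12.35
print(format_float("n/a", 2))              # <FILTER_ERROR>
print(repr(format_float("n/a", 2, "")))    # ''
```

Searching the rendered output for the sentinel string afterwards is one way to find the rows whose inputs could not be parsed.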
register_lookup() is pre-scanned from the template body before rendering, so the table is ready on the first row. Because of this pre-scan, a register_lookup(...) call buried inside a conditional still runs at startup.
```jinja
{% set ok = register_lookup("us_states", "dathere://us-states-example.csv") -%}
{% if ok and us_state not in ["DE", "CA"] -%}
{% set tax = us_state|lookup("us_states", "Sales Tax (2023)")|float -%}
{{ us_state|lookup("us_states", "Name") }}: {{ tax }}%
{% endif %}
```

See Lookup Tables for resource schemes, caching (`--cache-dir`, `QSV_CACHE_DIR`), and CKAN options (`--ckan-api`, `--ckan-token`).
The filters below fill gaps that neither MiniJinja core nor `minijinja-contrib` covers (regex, integer-exact rounding, messy-date parsing, padding, slugs, hashing). They are registered on every MiniJinja-powered command (`template`, `fetchpost`, `describegpt`, and `profile`) and are present in all binary variants (`qsv`, `qsvlite`, `qsvdp`, `qsvmcp`), with no cargo feature gate. Values are coerced from strings, so you usually don't need `|float` first.
| Filter | Purpose | Example |
|---|---|---|
| `regex_replace("pat", "rep")` | Replace all regex matches; `$1` / `${name}` capture refs in the replacement | `{{ phone \| …` |
| `regex_match("pat")` | `true` if the regex matches anywhere | `{% if id \| …` |
| `regex_find("pat")` | First whole match, or `""` if none | `{{ text \| …` |
| `floor` / `ceil` | Round down / up. Integer inputs stay exact (incl. values beyond f64's 2⁵³ range, up to u64); fractional inputs return a float | `{{ "42.7" \| …` |
| `datefmt("fmt"[, prefer_dmy])` | Parse a messy date string (19+ formats, via qsv-dateparser) and reformat with a chrono format string. Unlike contrib's `dateformat`, this parses arbitrary strings | `{{ d \| …` |
| `zfill(width)` | Left-pad with zeros, keeping a leading sign | `{{ "42" \| …` |
| `lpad(width[, fill])` / `rpad(width[, fill])` | Left / right pad to `width` with `fill` (default space) | `{{ name \| …` |
| `slugify` | URL/DB/CKAN-safe slug (lowercase, non-alphanumeric runs → `-`, trimmed) | `{{ title \| …` |
| `blake3` | BLAKE3 hex digest of the value: stable surrogate / content keys for dedup, joins, change-detection | `{{ row \| …` |
| `fromjson` / `parse_json` | Parse a JSON-in-a-cell string into an indexable value | `{{ (meta \| …` |
Plus one function:
| Function | Purpose |
|---|---|
| `coalesce(a, b, …)` | First argument that is not undefined / none / empty string (broader than the single-fallback `default`/`d`) |
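
To make the descriptions of `zfill`, `slugify`, and `coalesce` concrete, here is a rough Python model of the stated semantics. It is a sketch, not qsv's code: the `slugify` below assumes ASCII-only handling, and returning `None` when every `coalesce` argument is empty is an assumption, since the table does not say what happens in that case.

```python
import re


def slugify(text: str) -> str:
    """Lowercase, turn non-alphanumeric runs into '-', trim the ends."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def coalesce(*values):
    """Return the first value that is neither None nor an empty string."""
    for value in values:
        if value is not None and value != "":
            return value
    return None  # assumption: nothing usable was supplied


# Python's str.zfill also keeps a leading sign in front of the padding.
print("-42".zfill(5))                    # -0042
print(slugify("  Hello, World! 2024 "))  # hello-world-2024
print(coalesce(None, "", "fallback"))    # fallback
```

Note that `coalesce` in this model keeps values such as `"0"`, because only missing and empty inputs are skipped.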
Notes:
- `floor`/`ceil` precision. Integer inputs pass through exactly (signed `i64` and large unsigned `u64` IDs alike); an integer literal too large for either is rejected with an error rather than silently approximated. Fractional inputs go through `f64` and return a float; pipe `|int` for a clean integer.
- Regex caching. Compiled patterns are cached (bounded) for reuse across rows, so a literal pattern compiles once.
- Errors. Invalid regex, unparseable dates, malformed JSON, and non-numeric `floor`/`ceil` inputs raise a template error. In `template`, that surfaces as a per-row `RENDERING ERROR` (counted), not a crash.
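
The precision note is easier to appreciate with numbers. The Python sketch below shows why routing a large integer ID through a 64-bit float would corrupt it, and models the "integers stay exact, fractions become floats" split. The helper `floor_value` is hypothetical, and Python integers are unbounded, so the `i64`/`u64` range check described above is not modeled.

```python
import math


def floor_value(text: str):
    """Floor a string value: exact for integers, float for fractions."""
    try:
        return int(text)  # integer input passes through untouched
    except ValueError:
        # Fractional input goes through a 64-bit float; non-numeric text
        # raises ValueError here, mirroring the template error above.
        return float(math.floor(float(text)))


big_id = 2**53 + 1
print(big_id)                           # 9007199254740993
print(int(float(big_id)))               # 9007199254740992 (off by one)
print(floor_value("9007199254740993"))  # 9007199254740993
print(floor_value("42.7"))              # 42.0
```

The off-by-one on the second line is the silent approximation that the exact integer path avoids.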
describegpt --infer-content-type uses datefmt in its default dictionary template. For Date/DateTime fields, the LLM infers the column's actual strftime format (validated against the data) and stamps it onto the Content Type, e.g. date:%m/%d/%Y or datetime:%m/%d/%Y %I:%M:%S %p. The dictionary's Min and Max come from qsv stats normalized to RFC 3339 (2013-01-24), which looks different from how the dates actually appear in the data. The template extracts the inferred format and reformats Min/Max so they match the column's real presentation (and the verbatim Examples/Enumeration values):
```jinja
{% set df = entry.content_type | regex_replace("^(date|datetime):", "")
   if entry.content_type | regex_match("^(date|datetime):") else "" %}
{% if df and entry.min %}{{ entry.min | datefmt(df) }}{% else %}{{ entry.min }}{% endif %}
```

So a column whose dates read `01/24/2013` shows Min/Max as `01/24/2013` too, not `2013-01-24`. Fields without an inferred date format (bare `date`/`datetime`, or non-date content types) are left unchanged. Custom `--prompt-file` dictionary templates can adopt the same `{% set df %}` pattern. The `^(date|datetime):` anchor matches only the bare-token prefix up to the first `:`, so formats containing colons (`%I:%M:%S`) are preserved intact.
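
The prefix-stripping logic in that template can be checked outside qsv with a short Python model. This is a sketch with assumptions: the helper names are hypothetical, Python's `strftime` stands in for chrono's format strings (the two agree on the specifiers used here but are not identical in general), and Min is assumed to be a date-only value such as `2013-01-24`.

```python
import re
from datetime import datetime

PREFIX = re.compile(r"^(date|datetime):")


def inferred_format(content_type: str) -> str:
    """Strip the 'date:' / 'datetime:' prefix, or return '' if absent."""
    return PREFIX.sub("", content_type) if PREFIX.match(content_type) else ""


def reformat_min(min_value: str, content_type: str) -> str:
    """Re-render a date-only Min value in the column's own format."""
    fmt = inferred_format(content_type)
    if not fmt or not min_value:
        return min_value  # bare date/datetime or non-date type: unchanged
    return datetime.strptime(min_value, "%Y-%m-%d").strftime(fmt)


print(inferred_format("datetime:%m/%d/%Y %I:%M:%S %p"))  # %m/%d/%Y %I:%M:%S %p
print(reformat_min("2013-01-24", "date:%m/%d/%Y"))       # 01/24/2013
print(reformat_min("2013-01-24", "date"))                # 2013-01-24
```

The first call shows the point about anchoring: only the leading token is removed, so the colons inside `%I:%M:%S` survive.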
profile's --spec formulas run in a richer environment — a native Rust port of DataPusher+'s jinja2_helpers.py — with metadata-oriented helpers such as format_bytes, format_date, format_coordinates, calculate_percentage, sanitize_iso_8601_interval, spatial_extent_wkt, temporal_resolution, guess_accrual_periodicity, build_csvw_schema, and build_croissant_fields. These are specific to the profiling pipeline; see Metadata Profiling and DataPusher+'s dataset-druf.yaml for usage.
qsv compiles MiniJinja with these capabilities, so they're available in every templated command:
- `pycompat`: lets you call Python-style string methods directly, e.g. `{{ name.upper() }}`, `{{ s.startswith("A") }}`, `{{ s.strip() }}`.
- `datetime` + `timezone` (`minijinja-contrib`): `datetimeformat`, `dateformat`, `timeformat`, and `now()` for date/time rendering.
- `urlencode`: percent-encodes values, which matters for `fetchpost` payloads, e.g. `{{ q|urlencode }}`.
- `loop_controls`: `{% break %}` / `{% continue %}` inside `{% for %}` loops.
- `rand` (`minijinja-contrib`): `random` / `randrange`-style helpers.
- Text shaping (`minijinja-contrib`): `wordwrap`, `wordcount`, Unicode-aware word wrapping.
- `json`: `{{ obj|tojson }}` for emitting valid JSON (the backbone of `fetchpost` JSON payloads).
- `speedups` / `stacker`: performance and deep-recursion safety; no syntax impact.
Cast before you compute. Fields arrive as strings. Stock Jinja math and many filters need numbers:
```jinja
{{ (balance|float - discount|float)|format_float(2) }}
{{ "%.1f"|format(score|float) }}
```

Trim whitespace with `-`. Add a minus to a block's delimiters to strip surrounding whitespace/newlines, which is essential when generating compact JSON or clean CSV:
```jinja
{%- for r in rows -%}
{{ r.id }}{% if not loop.last %},{% endif %}
{%- endfor -%}
```

Comments don't render. Use `{# … #}` for notes that won't appear in output.
`fetchpost`: keep JSON valid. Build the body with `|tojson` and `|urlencode` rather than hand-quoting. `fetchpost` validates the rendered JSON and aborts on malformed output, so let the filters do the escaping:

```jinja
{"name": {{ full_name|tojson }}, "q": "{{ query|urlencode }}"}
```

`describegpt`: prompts are templates too. The `--prompt-file` TOML's prompt fields are MiniJinja, so you can interpolate dataset stats/frequency/dictionary context into the LLM prompt. See `resources/describegpt_defaults.toml` for the built-in templates.
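
Returning to the `fetchpost` tip above, the reason to prefer filters over hand-quoting shows up as soon as a value contains a quote. The Python sketch below uses `json.dumps` as a stand-in for `|tojson` and `urllib.parse.quote` as a stand-in for `|urlencode`; these are analogues chosen for illustration, and MiniJinja's exact escaping (for example of spaces) may differ.

```python
import json
from urllib.parse import quote


def render_body(full_name: str, query: str) -> str:
    """Build the same body as the template, with escaping delegated."""
    return '{"name": %s, "q": "%s"}' % (json.dumps(full_name), quote(query, safe=""))


def render_body_naive(full_name: str, query: str) -> str:
    """Hand-quoted variant: breaks when a value contains a double quote."""
    return '{"name": "%s", "q": "%s"}' % (full_name, query)


name = 'Ann "AJ" O\'Neil'
good = json.loads(render_body(name, "a&b c"))
print(good["name"] == name)  # True
print(good["q"])             # a%26b%20c

try:
    json.loads(render_body_naive(name, "a&b c"))
except json.JSONDecodeError:
    print("naive body is not valid JSON")
```

The naive variant is the kind of malformed output that `fetchpost` is described as rejecting.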
Use loop.* variables. loop.index, loop.first, loop.last, loop.length make separators and headers easy in template documents.
Python string methods work (via pycompat): {{ code.replace("-", "") }}, {{ name.title() }}, {{ s.split(",") }}.
- MiniJinja crate docs · `minijinja-contrib` · Jinja2 template syntax
- `template`: render CSV rows into any text format · `/docs/help/template.md`
- `fetchpost`: MiniJinja-templated POST bodies · `/docs/help/fetchpost.md`
- `describegpt`: templated LLM prompts · `/docs/help/describegpt.md`
- Metadata Profiling (`profile`): CKAN scheming formulas
- Lookup Tables: `register_lookup` / `|lookup` resource schemes & caching
- `tests/test_template.rs`: worked examples · `scripts/template.tpl`: a complex real template
qsv — GitHub · Releases · Discussions · qsv pro · Try it online · Benchmarks · datHere · DeepWiki · Dual-licensed MIT / Unlicense