MiniJinja Templating

Tier: Intermediate · Commands that use it: template, fetchpost, describegpt, profile

Note

This page is the cross-cutting templating layer. Per-command flags live in /docs/help/. For the full template language, see the MiniJinja docs and the Jinja2 template syntax reference.

Several qsv commands embed MiniJinja (currently v2.20), a Rust implementation of Jinja2. Wherever you see the ⛩️ symbol in the Command Reference, that command renders text from your data with the same template language — so a filter you learn for template works the same in a fetchpost payload, a describegpt prompt, or a profile formula.

Where MiniJinja shows up

| Command | What gets templated | How to supply the template |
|---|---|---|
| `template` | An arbitrary text/Markdown/HTML/CSV-per-row document | `--template <str>` or `-t, --template-file <file>` |
| `fetchpost` | The HTTP POST request body (JSON or any content type) | `-t, --payload-tpl <file>` |
| `describegpt` | The LLM prompt(s) sent for inference | `--prompt-file <toml>` (templated prompt fields) |
| `profile` | CKAN scheming `formula` / `suggestion_formula` fields → derived metadata | `--spec <yaml>` |

In every case, CSV column values become template variables and the rendered output is what the command emits or sends.

How CSV data maps to template variables

For per-row commands (template, fetchpost), each row is rendered independently:

Column headers become variable names. Non-alphanumeric characters are converted to underscores: first name → {{ first_name }}, us-state → {{ us_state }}.
With --no-headers (in template), columns are addressed by a 1-based index with a _c prefix: {{ _c1 }}, {{ _c2 }}, …
QSV_ROWNO holds the current 1-based row number — handy for output filenames (--outfilename in template).
All field values are strings. Cast with |int or |float before math or before any filter that needs a number (see Tips below).
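The mapping rules above can be mirrored in a few lines of Python. This is only an illustrative sketch, not qsv's implementation: the helper names `to_var_name` and `row_context` are hypothetical, and it assumes every ASCII non-alphanumeric character becomes exactly one underscore.

```python
import re

def to_var_name(header: str) -> str:
    """Replace every character outside [A-Za-z0-9] with an underscore (assumed rule)."""
    return re.sub(r"[^A-Za-z0-9]", "_", header)

def row_context(headers, row, rowno):
    """Build the variables for one row; headers=None mimics --no-headers."""
    if headers is None:
        # 1-based positional names: _c1, _c2, ...
        names = [f"_c{i}" for i in range(1, len(row) + 1)]
    else:
        names = [to_var_name(h) for h in headers]
    ctx = dict(zip(names, row))   # every field value stays a string
    ctx["QSV_ROWNO"] = rowno      # 1-based row number
    return ctx

with_headers = row_context(["first name", "us-state"], ["Ada", "NY"], 1)
no_headers = row_context(None, ["Ada", "NY"], 1)
print(with_headers)
print(no_headers)
```

Running a real header row through a helper like this before writing a template is a quick way to predict which variable names to expect.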

Shared globals (`qsv_g`)

template (-J, --globals-json <file>) and fetchpost accept a JSON file of values shared across every row render, accessed under the qsv_g namespace:

Report for {{ qsv_g.school_name }} — {{ qsv_g.year }}
Student: {{ last_name|title }}, {{ first_name|title }}

qsv's custom filters & functions (`template`)

Beyond the stock Jinja2 filters and the minijinja-contrib set, template registers these qsv-specific helpers. They do not require casting — they accept the raw string field for convenience:

| Filter | Purpose | Example |
|---|---|---|
| `substr(start[, end])` | Substring by byte range | `{{ code\|…` |
| `format_float(precision)` | Parse → fixed-precision float (max 16) | `{{ balance\|…` |
| `human_count` | Integer with thousands separators | `{{ rows\|…` |
| `human_float_count` | Float with thousands separators | `{{ amount\|…` |
| `round_banker(places)` | Banker's rounding (round-half-to-even) | `{{ rate\|…` |
| `to_bool` | Truthiness of `true/1/yes/t/y` or non-zero number | `{% if active\|…` |
| `lookup("table", "Column")` | Value from a registered lookup table | `{{ us_state\|…` |

Plus one custom function:

| Function | Purpose |
|---|---|
| `register_lookup("name", "resource")` | Load a lookup table (local path, HTTP/HTTPS, `dathere://`, or `ckan://`) and bind it to `name` for `\|lookup`. Returns true on success. |
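For readers unfamiliar with the round-half-to-even rule that `round_banker` is described as using, here is a small Python sketch built on the standard `decimal` module. The function below is a stand-in written for illustration, not qsv's code. It works on exact decimal strings, whereas a filter that parses into a binary float may see ties differently.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_banker(value: str, places: int) -> str:
    """Round a decimal string half-to-even: ties go to the nearest even digit."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal("0.01")
    return str(Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN))

# Ties alternate between rounding down and up, which removes the upward
# bias that "always round half up" introduces when summing many values.
halves = [round_banker(v, 0) for v in ("0.5", "1.5", "2.5", "3.5")]
cents = [round_banker(v, 2) for v in ("1.245", "1.255")]
print(halves, cents)
```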

Important

Filter errors are values, not crashes. When a custom filter can't parse its input (e.g. format_float on non-numeric text), it returns the --customfilter-error string (default <FILTER_ERROR>) instead of aborting the run. Set --customfilter-error "<empty string>" to emit nothing on error.
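The "errors are values" behavior can be pictured with a short Python sketch. It is an analogy under stated assumptions: `format_float` here is a hypothetical stand-in, and the `on_error` parameter plays the role of the `--customfilter-error` string.

```python
FILTER_ERROR = "<FILTER_ERROR>"  # default sentinel named in the text above

def format_float(value: str, precision: int, on_error: str = FILTER_ERROR) -> str:
    """Format a string field as a fixed-precision float, or return the sentinel."""
    try:
        return f"{float(value):.{precision}f}"
    except ValueError:
        # Unparseable input becomes a value in the output instead of stopping the run.
        return on_error

rendered = [format_float(v, 2) for v in ("3.14159", "n/a", "42")]
silent = format_float("n/a", 2, on_error="")  # emit nothing on error
print(rendered, repr(silent))
```

Because the sentinel lands in the rendered output, searching the output for it afterwards is a simple way to find rows that need cleaning.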

Lookup tables in templates

register_lookup() is pre-scanned from the template body before rendering, so the table is ready on the first row. Because of this pre-scan, a register_lookup(...) call buried inside a conditional still runs at startup.

{% set ok = register_lookup("us_states", "dathere://us-states-example.csv") -%}
{% if ok and us_state not in ["DE", "CA"] -%}
  {% set tax = us_state|lookup("us_states", "Sales Tax (2023)")|float -%}
  {{ us_state|lookup("us_states", "Name") }}: {{ tax }}%
{% endif %}
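To see why a call inside a conditional still runs, consider how a textual pre-scan might work. The Python sketch below is purely hypothetical (the regex and the `prescan_lookups` name are assumptions, and qsv's actual scanner may differ): it searches the template source for `register_lookup(...)` calls without evaluating any surrounding `{% if %}`.

```python
import re

# Assumed pattern: register_lookup("name", "resource") with double-quoted arguments.
REGISTER_RE = re.compile(r'register_lookup\(\s*"([^"]+)"\s*,\s*"([^"]+)"\s*\)')

def prescan_lookups(template_src: str):
    """Return (name, resource) pairs found anywhere in the template text."""
    return REGISTER_RE.findall(template_src)

template_src = '''
{% if false %}
  {% set ok = register_lookup("us_states", "dathere://us-states-example.csv") %}
{% endif %}
'''

# The scan works on raw text, so the enclosing condition is never consulted.
found = prescan_lookups(template_src)
print(found)
```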

See Lookup Tables for resource schemes, caching (--cache-dir, QSV_CACHE_DIR), and CKAN options (--ckan-api, --ckan-token).

Shared data-wrangling filters & functions (all commands)

These fill gaps that neither MiniJinja core nor minijinja-contrib cover (regex, integer-exact rounding, messy-date parsing, padding, slugs, hashing). They are registered on every MiniJinja-powered command — template, fetchpost, describegpt, and profile — and are present in all binary variants (qsv, qsvlite, qsvdp, qsvmcp), with no cargo feature gate. Values are coerced from strings, so you usually don't need |float first.

| Filter | Purpose | Example |
|---|---|---|
| `regex_replace("pat", "rep")` | Replace all regex matches; `$1` / `${name}` capture refs in the replacement | `{{ phone\|…` |
| `regex_match("pat")` | `true` if the regex matches anywhere | `{% if id\|…` |
| `regex_find("pat")` | First whole match, or `""` if none | `{{ text\|…` |
| `floor` / `ceil` | Round down / up. Integer inputs stay exact (incl. values beyond f64's 2⁵³ range, up to u64); fractional inputs return a float | `{{ "42.7"\|…` |
| `datefmt("fmt"[, prefer_dmy])` | Parse a messy date string (19+ formats, via `qsv-dateparser`) and reformat with a chrono format string. Unlike contrib's `dateformat`, this parses arbitrary strings | `{{ d\|…` |
| `zfill(width)` | Left-pad with zeros, keeping a leading sign | `{{ "42"\|…` |
| `lpad(width[, fill])` / `rpad(width[, fill])` | Left / right pad to `width` with `fill` (default space) | `{{ name\|…` |
| `slugify` | URL/DB/CKAN-safe slug (lowercase, non-alphanumeric runs → `-`, trimmed) | `{{ title\|…` |
| `blake3` | BLAKE3 hex digest of the value: stable surrogate / content keys for dedup, joins, change-detection | `{{ row\|…` |
| `fromjson` / `parse_json` | Parse a JSON-in-a-cell string into an indexable value | `{{ (meta\|…` |
Plus one function:

| Function | Purpose |
|---|---|
| `coalesce(a, b, …)` | First argument that is not undefined / none / empty string (broader than the single-fallback `default`/`d`) |
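Three of these helpers have semantics simple enough to mirror in plain Python, which can help when predicting output. The functions below are stand-ins written from the descriptions in the tables, not qsv's code. The ASCII-only slug rule and the `None` return when every `coalesce` argument is empty are assumptions.

```python
import re

def slugify(text: str) -> str:
    """Lowercase, collapse non-alphanumeric runs to '-', trim leading/trailing '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def zfill(text: str, width: int) -> str:
    """Left-pad with zeros; Python's str.zfill also keeps a leading sign in place."""
    return text.zfill(width)

def coalesce(*args):
    """Return the first argument that is neither None nor the empty string."""
    for arg in args:
        if arg is not None and arg != "":
            return arg
    return None  # assumed result when nothing qualifies

slug = slugify("  Annual Report: 2023/24! ")
padded = zfill("-42", 5)
first = coalesce(None, "", "fallback")
print(slug, padded, first)
```

Note that `coalesce("", "0")` returns `"0"`: only undefined, none, and empty values are skipped, so a string that merely looks falsy still counts.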

Notes:

floor/ceil precision. Integer inputs pass through exactly (signed i64 and large unsigned u64 IDs alike); an integer literal too large for either is rejected with an error rather than silently approximated. Fractional inputs go through f64 and return a float — pipe |int for a clean integer.
Regex caching. Compiled patterns are cached (bounded) for reuse across rows, so a literal pattern compiles once.
Errors. Invalid regex, unparseable dates, malformed JSON, and non-numeric floor/ceil inputs raise a template error. In template, that surfaces as a per-row RENDERING ERROR (counted), not a crash.
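The precision note about floor/ceil is easier to appreciate with a concrete number. The Python sketch below shows what is lost when a large integer ID takes a detour through a 64-bit float, and a hypothetical `floor_value` helper that avoids the detour for integer inputs. Python integers are unbounded, so the i64/u64 limits described above are not modeled here.

```python
import math

def floor_value(text: str):
    """Integer strings pass through exactly; fractional strings go via float."""
    try:
        return int(text)                       # exact, no float involved
    except ValueError:
        return float(math.floor(float(text)))  # fractional input returns a float

big_id = "9007199254740993"       # 2**53 + 1, not representable as an f64
exact = floor_value(big_id)
via_float = int(float(big_id))    # the float detour silently changes the ID
print(exact, via_float, floor_value("42.7"))
```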

`datefmt` in action: `describegpt` dictionary Min/Max

describegpt --infer-content-type uses datefmt in its default dictionary template. For Date/DateTime fields, the LLM infers the column's actual strftime format (validated against the data) and stamps it onto the Content Type, e.g. date:%m/%d/%Y or datetime:%m/%d/%Y %I:%M:%S %p. The dictionary's Min and Max come from qsv stats normalized to RFC 3339 (2013-01-24), which looks different from how the dates actually appear in the data. The template extracts the inferred format and reformats Min/Max so they match the column's real presentation (and the verbatim Examples/Enumeration values):

{% set df = entry.content_type | regex_replace("^(date|datetime):", "")
            if entry.content_type | regex_match("^(date|datetime):") else "" %}
{% if df and entry.min %}{{ entry.min | datefmt(df) }}{% else %}{{ entry.min }}{% endif %}

So a column whose dates read 01/24/2013 shows Min/Max as 01/24/2013 too, not 2013-01-24. Fields without an inferred date format (bare date/datetime, or non-date content types) are left unchanged. Custom --prompt-file dictionary templates can adopt the same {% set df %} pattern. The ^(date|datetime): anchor matches only the bare-token prefix up to the first :, so formats containing colons (%I:%M:%S) are preserved intact.
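The same extract-then-reformat logic can be sketched in Python for experimentation. This is an approximation, not the template itself: `extract_format` and `display_value` are hypothetical names, and Python's `strftime` stands in for the chrono format string that `datefmt` takes (the specifiers used here are common to both).

```python
import re
from datetime import datetime

PREFIX_RE = r"^(date|datetime):"

def extract_format(content_type: str) -> str:
    """Return the strftime format after the 'date:'/'datetime:' prefix, else ''."""
    if re.match(PREFIX_RE, content_type):
        return re.sub(PREFIX_RE, "", content_type)
    return ""

def display_value(content_type: str, rfc3339_value: str) -> str:
    """Reformat a normalized Min/Max value to match the column's own presentation."""
    fmt = extract_format(content_type)
    if not fmt or not rfc3339_value:
        return rfc3339_value  # bare date/datetime or non-date types stay unchanged
    return datetime.fromisoformat(rfc3339_value).strftime(fmt)

shown = display_value("date:%m/%d/%Y", "2013-01-24")
untouched = display_value("date", "2013-01-24")
# Colons inside the format survive because the regex is anchored at the start.
time_fmt = extract_format("datetime:%m/%d/%Y %I:%M:%S %p")
print(shown, untouched, time_fmt)
```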

`profile` formula helpers

profile's --spec formulas run in a richer environment — a native Rust port of DataPusher+'s jinja2_helpers.py — with metadata-oriented helpers such as format_bytes, format_date, format_coordinates, calculate_percentage, sanitize_iso_8601_interval, spatial_extent_wkt, temporal_resolution, guess_accrual_periodicity, build_csvw_schema, and build_croissant_fields. These are specific to the profiling pipeline; see Metadata Profiling and DataPusher+'s dataset-druf.yaml for usage.

What's enabled (and why it matters)

qsv compiles MiniJinja with these capabilities, so they're available in every templated command:

pycompat — call Python-style string methods directly: {{ name.upper() }}, {{ s.startswith("A") }}, {{ s.strip() }}.
datetime + timezone (minijinja-contrib) — datetimeformat, dateformat, timeformat, now() for date/time rendering.
urlencode — percent-encode values, important for fetchpost payloads: {{ q|urlencode }}.
loop_controls — {% break %} / {% continue %} inside {% for %} loops.
rand (minijinja-contrib) — random/randrange-style helpers.
Text shaping (minijinja-contrib) — wordwrap, wordcount, Unicode-aware word wrapping.
json — {{ obj|tojson }} for emitting valid JSON (the backbone of fetchpost JSON payloads).
speedups / stacker — performance and deep-recursion safety; no syntax impact.

Tips & tricks

Cast before you compute. Fields arrive as strings. Stock Jinja math and many filters need numbers:

{{ (balance|float - discount|float)|format_float(2) }}
{{ "%.1f"|format(score|float) }}
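As a rough analogy, the two template lines above correspond to the following Python, where every field also starts out as a string. The sample values are made up for illustration.

```python
# Field values as they would arrive from a CSV row: all strings.
balance, discount, score = "100.50", "20.25", "87.46"

# Cast first, then compute, then format to a fixed precision.
net = f"{float(balance) - float(discount):.2f}"
score_1dp = "%.1f" % float(score)

# Without the cast, Python refuses to subtract one string from another.
try:
    balance - discount
    cast_needed = False
except TypeError:
    cast_needed = True

print(net, score_1dp, cast_needed)
```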

Trim whitespace with -. Add a minus to a block's delimiters to strip surrounding whitespace/newlines — essential when generating compact JSON or clean CSV:

{%- for r in rows -%}
  {{ r.id }}{% if not loop.last %},{% endif %}
{%- endfor -%}

Comments don't render. Use {# … #} for notes that won't appear in output.

fetchpost: keep JSON valid. Build the body with |tojson and |urlencode rather than hand-quoting. fetchpost validates the rendered JSON and aborts on malformed output, so let the filters do the escaping:

{"name": {{ full_name|tojson }}, "q": "{{ query|urlencode }}"}
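The difference between hand-quoting and filter-based escaping shows up as soon as a value contains a quote. The Python sketch below uses `json.dumps` and `urllib.parse.quote` as approximate stand-ins for `|tojson` and `|urlencode`. The exact escaping MiniJinja emits may differ in detail, and the sample name and query are invented.

```python
import json
from urllib.parse import quote

def build_payload(full_name: str, query: str) -> str:
    """Let the encoders do the quoting, mirroring the payload template above."""
    return '{"name": %s, "q": "%s"}' % (json.dumps(full_name), quote(query, safe=""))

def naive_payload(full_name: str) -> str:
    """Hand-quoted version: breaks when the value contains a double quote."""
    return '{"name": "%s"}' % full_name

name = 'Ann "AJ" O\'Neil'
parsed = json.loads(build_payload(name, "a b&c"))

try:
    json.loads(naive_payload(name))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

print(parsed, naive_ok)
```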

describegpt: prompts are templates too. The --prompt-file TOML's prompt fields are MiniJinja — you can interpolate dataset stats/frequency/dictionary context into the LLM prompt. See resources/describegpt_defaults.toml for the built-in templates.

Use loop.* variables. loop.index, loop.first, loop.last, loop.length make separators and headers easy in template documents.

Python string methods work (via pycompat): {{ code.replace("-", "") }}, {{ name.title() }}, {{ s.split(",") }}.
