MiniJinja Templating
Tier: Intermediate
Commands that use it: template, fetchpost, describegpt, profile
This page is the cross-cutting templating layer. Per-command flags live in
/docs/help/. For the full template language, see the MiniJinja docs and the Jinja2 template syntax reference.
Several qsv commands embed MiniJinja (currently v2.20), a Rust implementation of Jinja2. Wherever you see the ⛩️ symbol in the Command Reference, that command renders text from your data with the same template language — so a filter you learn for template works the same in a fetchpost payload, a describegpt prompt, or a profile formula.
| Command | What gets templated | How to supply the template |
|---|---|---|
| `template` | An arbitrary text/Markdown/HTML/CSV-per-row document | `--template <str>` or `-t, --template-file <file>` |
| `fetchpost` | The HTTP POST request body (JSON or any content type) | `-t, --payload-tpl <file>` |
| `describegpt` | The LLM prompt(s) sent for inference | `--prompt-file <toml>` (templated prompt fields) |
| `profile` | CKAN scheming `formula` / `suggestion_formula` fields → derived metadata | `--spec <yaml>` |

In every case, CSV column values become template variables and the rendered output is what the command emits or sends.
For per-row commands (template, fetchpost), each row is rendered independently:
- Column headers become variable names. Non-alphanumeric characters are converted to underscores: `first name` → `{{ first_name }}`, `us-state` → `{{ us_state }}`.
- With `--no-headers`, columns are addressed by 1-based index with a `_c` prefix: `{{ _c1 }}`, `{{ _c2 }}`, … (`template`).
- `QSV_ROWNO` holds the current 1-based row number, which is handy for output filenames (`--outfilename` in `template`).
- All field values are strings. Cast with `|int` or `|float` before math or before any filter that needs a number (see Tips below).

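
The naming rules in the list above can be modelled in a few lines. This is a minimal Python sketch, not qsv's Rust implementation: `header_to_var` and `row_context` are hypothetical names, it assumes "alphanumeric" means ASCII letters and digits, and it stores `QSV_ROWNO` as an integer, none of which this page spells out.

```python
import re


def header_to_var(header: str) -> str:
    """Turn a CSV header into a template variable name.

    Assumption: every character outside ASCII letters and digits becomes "_".
    """
    return re.sub(r"[^A-Za-z0-9]", "_", header)


def row_context(row, rowno, headers=None):
    """Build the variables one row render would see (hypothetical helper)."""
    if headers is None:
        # --no-headers case: 1-based positional names _c1, _c2, ...
        names = [f"_c{i}" for i in range(1, len(row) + 1)]
    else:
        names = [header_to_var(h) for h in headers]
    context = dict(zip(names, row))  # field values stay strings
    context["QSV_ROWNO"] = rowno     # 1-based row number (type is an assumption)
    return context


print(row_context(["Ada", "NY"], 1, headers=["first name", "us-state"]))
print(row_context(["Ada", "NY"], 1))
```

The second call shows the `--no-headers` shape, where the same row is reachable only as `_c1` and `_c2`.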
template (-J, --globals-json <file>) and fetchpost accept a JSON file of values shared across every row render, accessed under the qsv_g namespace:
```jinja
Report for {{ qsv_g.school_name }} — {{ qsv_g.year }}
Student: {{ last_name|title }}, {{ first_name|title }}
```

Beyond the stock Jinja2 filters and the `minijinja-contrib` set, `template` registers these qsv-specific helpers. They do not require casting, since they accept the raw string field directly:

| Filter | Purpose | Example |
|---|---|---|
| `substr(start[, end])` | Substring by byte range | `{{ code \|` … |
| `format_float(precision)` | Parse → fixed-precision float (max 16) | `{{ balance \|` … |
| `human_count` | Integer with thousands separators | `{{ rows \|` … |
| `human_float_count` | Float with thousands separators | `{{ amount \|` … |
| `round_banker(places)` | Banker's rounding (round-half-to-even) | `{{ rate \|` … |
| `to_bool` | Truthiness of `true`/`1`/`yes`/`t`/`y` or non-zero number | `{% if active \|` … |
| `lookup("table", "Column")` | Value from a registered lookup table | `{{ us_state \|` … |

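
To make the semantics of three of these filters concrete, here is a rough Python model. It is a sketch of the behaviour described in the table, not qsv's code: the function names simply mirror the filter names, and the details the table leaves open (a comma as the thousands separator, case-insensitive matching and whitespace trimming in `to_bool`) are assumptions.

```python
from decimal import ROUND_HALF_EVEN, Decimal


def round_banker(value: str, places: int) -> str:
    """Round half to even, e.g. 2.5 -> 2 and 3.5 -> 4 at zero places."""
    quantum = Decimal(10) ** -places  # places=2 gives Decimal("0.01")
    return str(Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN))


def human_count(value: str) -> str:
    """Integer with thousands separators (comma assumed)."""
    return f"{int(value):,}"


def to_bool(value: str) -> bool:
    """True for true/1/yes/t/y or any non-zero number."""
    text = value.strip().lower()  # assumption: case and padding are ignored
    if text in {"true", "1", "yes", "t", "y"}:
        return True
    try:
        return float(text) != 0.0
    except ValueError:
        return False


print(round_banker("2.5", 0), round_banker("3.5", 0))
print(human_count("1234567"))
print(to_bool("yes"), to_bool("0"), to_bool("no"))
```

Each function takes the raw string field, which matches the note that these helpers do not need a prior `|int` or `|float` cast.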
Plus one custom function:
| Function | Purpose |
|---|---|
| `register_lookup("name", "resource")` | Load a lookup table (local path, HTTP/HTTPS, `dathere://`, or `ckan://`) and bind it to `name` for `\|lookup`. Returns `true` on success. |

Filter errors are values, not crashes. When a custom filter can't parse its input (e.g. `format_float` on non-numeric text), it returns the `--customfilter-error` string (default `<FILTER_ERROR>`) instead of aborting the run. Set `--customfilter-error "<empty string>"` to emit nothing on error.

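
The "errors are values" idea can be sketched as a filter that catches its own parse failure and returns a sentinel. The Python below is illustrative only: `format_float` here is a stand-in for the qsv filter of the same name, the `on_error` parameter plays the role of `--customfilter-error`, and clamping the precision at 16 is an assumption about how the documented maximum is enforced.

```python
FILTER_ERROR = "<FILTER_ERROR>"  # the default quoted above


def format_float(value: str, precision: int, on_error: str = FILTER_ERROR) -> str:
    """Parse value and format it with a fixed number of decimals.

    On unparseable input, return the sentinel instead of raising.
    """
    try:
        number = float(value)
    except ValueError:
        return on_error
    # Assumption: precision above the documented maximum of 16 is clamped.
    return f"{number:.{min(precision, 16)}f}"


rows = ["1234.5", "n/a", "7"]
print([format_float(v, 2) for v in rows])
print([format_float(v, 2, on_error="") for v in rows])
```

Because the bad row yields a string rather than an exception, the remaining rows still render, and the sentinel is easy to search for in the output afterwards.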
register_lookup() is pre-scanned from the template body before rendering, so the table is ready on the first row. Because of this pre-scan, a register_lookup(...) call buried inside a conditional still runs at startup.
```jinja
{% set ok = register_lookup("us_states", "dathere://us-states-example.csv") -%}
{% if ok and us_state not in ["DE", "CA"] -%}
{% set tax = us_state|lookup("us_states", "Sales Tax (2023)")|float -%}
{{ us_state|lookup("us_states", "Name") }}: {{ tax }}%
{% endif %}
```

See Lookup Tables for resource schemes, caching (`--cache-dir`, `QSV_CACHE_DIR`), and CKAN options (`--ckan-api`, `--ckan-token`).

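
One way to picture the `register_lookup()` pre-scan is as a plain text search over the template source, which is why control flow around the call makes no difference. The Python sketch below is a guess at the general shape, not qsv's actual scanner: `prescan_lookups` is a hypothetical name, and the pattern assumes both arguments are double-quoted string literals.

```python
import re

# Assumption: both arguments are double-quoted string literals.
REGISTER_CALL = re.compile(
    r'register_lookup\(\s*"([^"]+)"\s*,\s*"([^"]+)"\s*\)'
)


def prescan_lookups(template_source: str):
    """Return (name, resource) pairs for every call found in the source text."""
    return REGISTER_CALL.findall(template_source)


template = '''
{% if us_state == "NY" %}
  {% set ok = register_lookup("us_states", "dathere://us-states-example.csv") %}
{% endif %}
'''
print(prescan_lookups(template))
```

The call sits inside an `{% if %}` block, yet the scan still finds it, since nothing is evaluated at this stage.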
profile's --spec formulas run in a richer environment — a native Rust port of DataPusher+'s jinja2_helpers.py — with metadata-oriented helpers such as format_bytes, format_date, format_coordinates, calculate_percentage, sanitize_iso_8601_interval, spatial_extent_wkt, temporal_resolution, guess_accrual_periodicity, build_csvw_schema, and build_croissant_fields. These are specific to the profiling pipeline; see Metadata Profiling and DataPusher+'s dataset-druf.yaml for usage.
qsv compiles MiniJinja with these capabilities, so they're available in every templated command:
- `pycompat`: call Python-style string methods directly, e.g. `{{ name.upper() }}`, `{{ s.startswith("A") }}`, `{{ s.strip() }}`.
- `datetime` + `timezone` (`minijinja-contrib`): `datetimeformat`, `dateformat`, `timeformat`, `now()` for date/time rendering.
- `urlencode`: percent-encode values, important for `fetchpost` payloads: `{{ q|urlencode }}`.
- `loop_controls`: `{% break %}` / `{% continue %}` inside `{% for %}` loops.
- `rand` (`minijinja-contrib`): `random`/`randrange`-style helpers.
- Text shaping (`minijinja-contrib`): `wordwrap`, `wordcount`, Unicode-aware word wrapping.
- `json`: `{{ obj|tojson }}` for emitting valid JSON (the backbone of `fetchpost` JSON payloads).
- `speedups` / `stacker`: performance and deep-recursion safety; no syntax impact.

Cast before you compute. Fields arrive as strings. Stock Jinja math and many filters need numbers:
```jinja
{{ (balance|float - discount|float)|format_float(2) }}
{{ "%.1f"|format(score|float) }}
```

Trim whitespace with `-`. Add a minus to a block's delimiters to strip surrounding whitespace/newlines, which is essential when generating compact JSON or clean CSV:
```jinja
{%- for r in rows -%}
{{ r.id }}{% if not loop.last %},{% endif %}
{%- endfor -%}
```

Comments don't render. Use `{# … #}` for notes that won't appear in output.

fetchpost: keep JSON valid. Build the body with |tojson and |urlencode rather than hand-quoting. fetchpost validates the rendered JSON and aborts on malformed output, so let the filters do the escaping:
```jinja
{"name": {{ full_name|tojson }}, "q": "{{ query|urlencode }}"}
```

describegpt: prompts are templates too. The `--prompt-file` TOML's prompt fields are MiniJinja, so you can interpolate dataset stats/frequency/dictionary context into the LLM prompt. See `resources/describegpt_defaults.toml` for the built-in templates.

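
Returning to the fetchpost tip above, the reason to prefer `|tojson` and `|urlencode` over hand-quoting shows up as soon as a value contains a quote or an ampersand. The Python sketch below uses `json.dumps` and `urllib.parse.quote` as stand-ins for the two filters; treating them as equivalent is an assumption (MiniJinja may leave a slightly different set of characters unencoded), and the sample values are invented for illustration.

```python
import json
from urllib.parse import quote

full_name = 'Ann "AJ" Lee'       # contains double quotes
query = "tax rate & year=2023"   # contains a space, "&" and "="


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


# Hand-quoted: the embedded quotes end the string early.
naive = '{"name": "' + full_name + '"}'

# Filter-style: json.dumps adds the quotes and escapes, quote() percent-encodes.
safe = '{"name": ' + json.dumps(full_name) + ', "q": "' + quote(query, safe="") + '"}'

print(is_valid_json(naive), is_valid_json(safe))
print(safe)
```

Note that the JSON-encoding step supplies its own surrounding quotes, which is why the template above writes `{{ full_name|tojson }}` without wrapping it in `"..."`.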
Use loop.* variables. loop.index, loop.first, loop.last, loop.length make separators and headers easy in template documents.
Python string methods work (via pycompat): {{ code.replace("-", "") }}, {{ name.title() }}, {{ s.split(",") }}.
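
For readers coming from Python, the three expressions above correspond to the following plain Python calls. This only shows the Python side; the sample values are invented, and whether `pycompat` matches Python in every edge case is not something this page states.

```python
code = "AB-12-CD"
name = "ada lovelace"
s = "red,green,blue"

print(code.replace("-", ""))  # strip the dashes
print(name.title())           # capitalise each word
print(s.split(","))           # split into a list on commas
```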
- MiniJinja crate docs · `minijinja-contrib` · Jinja2 template syntax
- `template`: render CSV rows into any text format · `/docs/help/template.md`
- `fetchpost`: MiniJinja-templated POST bodies · `/docs/help/fetchpost.md`
- `describegpt`: templated LLM prompts · `/docs/help/describegpt.md`
- Metadata Profiling (`profile`): CKAN scheming formulas
- Lookup Tables: `register_lookup` / `|lookup` resource schemes & caching
- `tests/test_template.rs`: worked examples · `scripts/template.tpl`: a complex real template