Background
mlxcel renders model-supplied chat templates (tokenizer_config.json chat_template / chat_template.jinja) through minijinja in src/server/chat_template.rs. Unlike CVE-2026-5760 (an RCE in a Python framework), mlxcel's minijinja-based path cannot reach code execution (see the companion regression-test issue). The realistic residual risk of rendering an untrusted template is CPU/memory denial-of-service: a deliberately pathological template (e.g. deeply nested {% for i in range(...) %} loops, or expressions building very large strings) can consume unbounded CPU/memory during rendering.
minijinja provides Environment::set_fuel(Some(n)) to cap the number of operations executed per render, returning an error once the budget is exhausted. Today configure_environment() (src/server/chat_template.rs:586) sets no fuel budget, so rendering is unbounded. Rendering happens both at model-load time (the supports_tools probe) and per request.
Threat model / severity
- Severity: LOW (defense-in-depth) for the current architecture. Template source comes only from (1) the model the operator loads at startup, or (2) the operator's
--chat-template flag. HTTP clients cannot supply template source, and the request model field does not trigger loading of arbitrary models — model loading is operator-controlled at startup (src/server/model_worker.rs, src/server/startup.rs). Triggering this therefore requires the operator to deliberately load a malicious third-party model.
- Severity rises to MEDIUM in a multi-tenant / hosted deployment where untrusted parties could cause arbitrary (e.g. Hugging Face) models to be loaded. This bound is the right preventive control before such a deployment.
Tasks
Acceptance criteria
- A pathological template render returns an error promptly instead of consuming unbounded CPU/memory.
- All locally-available real model templates still render successfully (no false positives), verified via the audit test.
- The chosen
set_fuel budget and its rationale are documented in configure_environment.
References
- CVE-2026-5760 / CERT VU#915947 (motivating context — note mlxcel is NOT RCE-vulnerable; this is DoS defense-in-depth)
- minijinja
Environment::set_fuel documentation
src/server/chat_template.rs:586 (configure_environment)
Companion hardening issue: #128
Background
mlxcel renders model-supplied chat templates (
tokenizer_config.jsonchat_template/chat_template.jinja) throughminijinjainsrc/server/chat_template.rs. Unlike CVE-2026-5760 (an RCE in a Python framework), mlxcel's minijinja-based path cannot reach code execution (see the companion regression-test issue). The realistic residual risk of rendering an untrusted template is CPU/memory denial-of-service: a deliberately pathological template (e.g. deeply nested{% for i in range(...) %}loops, or expressions building very large strings) can consume unbounded CPU/memory during rendering.minijinja provides
Environment::set_fuel(Some(n))to cap the number of operations executed per render, returning an error once the budget is exhausted. Todayconfigure_environment()(src/server/chat_template.rs:586) sets no fuel budget, so rendering is unbounded. Rendering happens both at model-load time (thesupports_toolsprobe) and per request.Threat model / severity
--chat-templateflag. HTTP clients cannot supply template source, and the requestmodelfield does not trigger loading of arbitrary models — model loading is operator-controlled at startup (src/server/model_worker.rs,src/server/startup.rs). Triggering this therefore requires the operator to deliberately load a malicious third-party model.Tasks
fuelcargo feature (add"fuel"to theminijinjafeatures inCargo.toml, alongside the existingloop_controls).set_fuelis gated behind this feature.configure_environment(), callenv.set_fuel(Some(N))with a budget large enough never to affect legitimate templates (tool-rich Qwen/Nemotron-style templates with long histories do substantial work — choose a generous bound and document the rationale in a comment).Result) and maps to a sensible server response — never a panic.cargo test --release --lib test_all_local_model_templates_render -- --ignored --nocaptureover the localmodels/directory and confirm every model still renders within budget.Acceptance criteria
set_fuelbudget and its rationale are documented inconfigure_environment.References
Environment::set_fueldocumentationsrc/server/chat_template.rs:586(configure_environment)Companion hardening issue: #128