security: bound chat-template rendering with minijinja fuel to limit DoS from untrusted model templates #129

## Background

mlxcel renders model-supplied chat templates (the `chat_template` field of `tokenizer_config.json`, or a standalone `chat_template.jinja`) through `minijinja` in `src/server/chat_template.rs`. Unlike CVE-2026-5760 (an RCE in a Python framework), mlxcel's minijinja-based path **cannot** reach code execution (see the companion regression-test issue). The realistic residual risk of rendering an untrusted template is **CPU/memory denial-of-service**: a deliberately pathological template (e.g. deeply nested `{% for i in range(...) %}` loops, or expressions that build very large strings) can consume unbounded CPU and memory during rendering.

minijinja provides `Environment::set_fuel(Some(n))` to cap the number of operations executed per render, returning an error once the budget is exhausted. Today `configure_environment()` (`src/server/chat_template.rs:586`) sets no fuel budget, so rendering is unbounded. Rendering happens both at model-load time (the `supports_tools` probe) and per request.
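
To illustrate the accounting idea behind a fuel budget, here is a minimal, dependency-free sketch. It is a toy model, not minijinja's implementation: `Fuel`, `RenderError`, and `render_nested` are hypothetical names, and charging exactly one unit per loop step is an assumption made for simplicity.

```rust
/// Toy stand-in for a render error; this is not minijinja's error type.
#[derive(Debug, PartialEq)]
enum RenderError {
    OutOfFuel,
}

/// Hypothetical fuel meter. `None` means "unbounded", mirroring the
/// `Option` shape of the budget passed to `set_fuel`.
struct Fuel {
    remaining: Option<u64>,
}

impl Fuel {
    /// Charge one unit, or fail once the budget is spent.
    fn consume(&mut self) -> Result<(), RenderError> {
        match &mut self.remaining {
            None => Ok(()),
            Some(0) => Err(RenderError::OutOfFuel),
            Some(n) => {
                *n -= 1;
                Ok(())
            }
        }
    }
}

/// Simulates rendering `depth` nested loops of `width` iterations each.
/// Every step (loop entry or innermost body) costs one unit of fuel.
/// Returns how many times the innermost body ran.
fn render_nested(depth: u32, width: u64, fuel: &mut Fuel) -> Result<u64, RenderError> {
    fuel.consume()?;
    if depth == 0 {
        return Ok(1);
    }
    let mut emitted = 0;
    for _ in 0..width {
        // `?` propagates exhaustion immediately, so the loop stops early.
        emitted += render_nested(depth - 1, width, fuel)?;
    }
    Ok(emitted)
}
```

In this model, two nested loops of 10 iterations complete comfortably under a budget of 1,000 units, while three nested loops of 1,000,000 iterations under a budget of 10,000 units stop with `RenderError::OutOfFuel` after 10,000 steps instead of attempting the full iteration space.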

## Threat model / severity

- **Severity: LOW (defense-in-depth)** for the current architecture. Template source comes only from (1) the model the **operator** loads at startup, or (2) the operator's `--chat-template` flag. HTTP clients cannot supply template source, and the request `model` field does not trigger loading of arbitrary models — model loading is operator-controlled at startup (`src/server/model_worker.rs`, `src/server/startup.rs`). Triggering this therefore requires the operator to deliberately load a malicious third-party model.
- **Severity rises to MEDIUM** in a multi-tenant / hosted deployment where untrusted parties could cause arbitrary (e.g. Hugging Face) models to be loaded. This bound is the right preventive control before such a deployment.

## Tasks

- [ ] Enable the minijinja `fuel` cargo feature (add `"fuel"` to the `minijinja` features in `Cargo.toml`, alongside the existing `loop_controls`). `set_fuel` is gated behind this feature.
- [ ] In `configure_environment()`, call `env.set_fuel(Some(N))` with a budget large enough never to affect legitimate templates (tool-rich Qwen/Nemotron-style templates with long histories do substantial work — choose a generous bound and document the rationale in a comment).
- [ ] Ensure fuel exhaustion surfaces as a clean rendering error (the render path already returns `Result`) and maps to a sensible server response — never a panic.
- [ ] Validate no regression against real templates: run the existing audit `cargo test --release --lib test_all_local_model_templates_render -- --ignored --nocapture` over the local `models/` directory and confirm every model still renders within budget.
- [ ] Add a unit test with a pathological template (e.g. a large nested-range loop) asserting it now errors via fuel exhaustion instead of running unboundedly.
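
The "clean rendering error, never a panic" task could look roughly like the sketch below. It is dependency-free and entirely hypothetical: `TemplateFailure`, `ErrorResponse`, `to_response`, and `respond` are illustrative names, not existing mlxcel types, and returning HTTP 500 is an assumption (the template comes from the operator-loaded model, not from the client). In the real render path the classification would presumably come from the minijinja error, for example by checking whether `Error::kind()` is `ErrorKind::OutOfFuel`.

```rust
/// Hypothetical classification of a render failure, modelled as a plain
/// enum so the sketch needs no external crates.
#[derive(Debug)]
enum TemplateFailure {
    /// The fuel budget was exhausted during rendering.
    OutOfFuel,
    /// Any other rendering error, carrying its message.
    Other(String),
}

/// Hypothetical error payload handed back to the HTTP layer.
#[derive(Debug)]
struct ErrorResponse {
    status: u16,
    message: String,
}

/// Map a render failure to a response. Both cases use 500 here (an
/// assumption); the messages differ so operators can tell them apart.
fn to_response(failure: &TemplateFailure) -> ErrorResponse {
    match failure {
        TemplateFailure::OutOfFuel => ErrorResponse {
            status: 500,
            message: "chat template exceeded its rendering fuel budget".to_string(),
        },
        TemplateFailure::Other(msg) => ErrorResponse {
            status: 500,
            message: format!("chat template failed to render: {}", msg),
        },
    }
}

/// Propagate the failure with `map_err` rather than `unwrap`/`expect`,
/// so fuel exhaustion can never turn into a panic.
fn respond(rendered: Result<String, TemplateFailure>) -> Result<String, ErrorResponse> {
    rendered.map_err(|failure| to_response(&failure))
}
```

Keeping the fuel case as its own variant also gives the new unit test something precise to assert on, rather than matching on an error string.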

## Acceptance criteria

- A pathological template render returns an error promptly instead of consuming unbounded CPU/memory.
- All locally-available real model templates still render successfully (no false positives), verified via the audit test.
- The chosen `set_fuel` budget and its rationale are documented in `configure_environment`.

## References

- CVE-2026-5760 / CERT VU#915947 (motivating context — note mlxcel is **NOT** RCE-vulnerable; this is DoS defense-in-depth)
- minijinja `Environment::set_fuel` documentation
- `src/server/chat_template.rs:586` (`configure_environment`)


---
Companion hardening issue: #128

