A lightweight, single-header C++11 Jinja2 template engine designed for LLM chat templates (HuggingFace style).
It focuses on supporting the subset of Jinja2 used by modern Large Language Models (LLMs) like Llama 3, Qwen 2.5/3, DeepSeek, and others, enabling seamless inference integration in C++ environments.
- C++11 Compatible: Ensures maximum compatibility across older compiler versions and embedded systems.
- Lightweight: Minimal dependencies (only `nlohmann/json`).
- LLM Focused: Native support for `messages`, `tools`, `add_generation_prompt`, and special tokens.
- Strictly Typed: Uses `nlohmann::json` for context management.
- Custom Function Interop: Easily inject C++ functions (e.g., `strftime_now`) into templates.
- Robust: Validated against official Python `transformers` outputs using fuzzy matching tests.
The library is a single header file: just copy `jinja.hpp` into your project's include directory (or the project root).
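For example, a plain compiler invocation might look like this (paths are illustrative; `nlohmann/json` must also be on the include path):

```bash
# Illustrative build: jinja.hpp copied into ./include
g++ -std=c++11 -Iinclude main.cpp -o main
```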
You can check the library version using standard macros:
```cpp
#include "jinja.hpp"

#if JINJA_VERSION_MAJOR >= 0
// Use jinja.cpp features
#endif
```

Tested and verified with templates from:
- Qwen 2.5 / 3 (Coder, Math, VL, Omni, Instruct, Thinking, QwQ)
- DeepSeek (V3, R1)
- Llama 3 / 3.1 / 3.2 (Instruct & Vision)
- Mistral
- Gemma
- SmolLM
- Phi
- And more...
- CMake 3.10+
- C++11 compatible compiler (GCC, Clang, MSVC)
```bash
mkdir build
cd build
cmake ..
make
```

The project includes a comprehensive test suite based on real-world model templates.
```bash
./test_main
```

Basic rendering works like this:

```cpp
#include "jinja.hpp"
#include <iostream>

int main() {
    std::string template_str = "Hello {{ name }}!";
    jinja::Template tpl(template_str);

    nlohmann::json context;
    context["name"] = "World";

    std::string result = tpl.render(context);
    std::cout << result << std::endl; // Output: Hello World!
    return 0;
}
```
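Chat templates lean heavily on Jinja2 control flow. Below is a minimal sketch of a loop over `messages`; the template string is illustrative, using the `{% for %}` / `{% if %}` constructs the supported model templates rely on:

```cpp
#include "jinja.hpp"
#include <iostream>

int main() {
    // Illustrative template: skip system messages, print the rest
    std::string template_str =
        "{% for message in messages %}"
        "{% if message['role'] != 'system' %}"
        "<|{{ message['role'] }}|>{{ message['content'] }}\n"
        "{% endif %}"
        "{% endfor %}";
    jinja::Template tpl(template_str);

    nlohmann::json context;
    context["messages"] = nlohmann::json::array({
        {{"role", "system"}, {"content", "You are helpful."}},
        {{"role", "user"}, {"content", "Hello!"}}
    });
    std::cout << tpl.render(context) << std::endl; // <|user|>Hello!
    return 0;
}
```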
To apply a model's chat template:

```cpp
#include "jinja.hpp"

// Load your tokenizer_config.json's "chat_template"
std::string chat_template_str = "...";
jinja::Template tpl(chat_template_str);

nlohmann::json messages = nlohmann::json::array({
    {{"role", "user"}, {"content", "Hello!"}}
});

// Apply template
std::string prompt = tpl.apply_chat_template(
    messages,
    true,                    // add_generation_prompt
    nlohmann::json::array()  // tools (empty here)
);
```
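For tool calling, `tools` can be populated in the JSON-schema style that HuggingFace chat templates expect. A sketch with an illustrative function definition (names and fields are examples, not part of this library's API):

```cpp
// Illustrative tool definition (JSON-schema style used by HF chat templates)
nlohmann::json tools = nlohmann::json::array({
    {
        {"type", "function"},
        {"function", {
            {"name", "get_weather"},
            {"description", "Get the current weather for a city"},
            {"parameters", {
                {"type", "object"},
                {"properties", {
                    {"city", {{"type", "string"}}}
                }},
                {"required", {"city"}}
            }}
        }}
    }
});
prompt = tpl.apply_chat_template(messages, true, tools);
```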
You can register custom C++ functions to be called from within the template.
tpl.add_function("strftime_now", [](const std::vector<nlohmann::json>& args) {
// Return current time string
return "2025-12-16";
});For detailed implementation details, see doc/implementation_details.md.
Apache License 2.0. See LICENSE file for details.