RFC: Jinja2cpp Support on ExecuTorch #15147

@DannyYuyang-quic

Description

🚀 The feature, motivation and pitch

Context

Many LLMs/VLMs today rely on structured prompting via the chat_template.jinja file defined on HuggingFace. These templates are essential for formatting multi-turn conversations and for matching the prompt format the model saw during pretraining/fine-tuning.

Currently, ExecuTorch does not support chat_template.jinja at runtime, which limits the ability to run chat-style LLMs out of the box, i.e., without host-side preprocessing or manual prompt formatting.

Motivation

  • Consistency: Ensures the same prompt formatting used during calibration/inference on HuggingFace is preserved at runtime.
  • Portability: Avoids duplicating chat-template logic in the runtime.
  • Usability: Lets developers pass structured chat messages (e.g., role/content or system/content pairs) directly to the runtime without manual formatting.

Details

I’m exploring whether we could integrate Jinja2Cpp as a third-party dependency in ExecuTorch to support chat_template.jinja at runtime.

This would enable structured prompting for chat-style LLMs directly on-device, without requiring host-side preprocessing or manual prompt formatting.

References

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

Metadata

Assignees

No one assigned

Labels

rfc (Request for comment and feedback on a post, proposal, etc.)
