feat(gpt-oss): Add `{% generation %}` markers for training chat template by casinca · Pull Request #5484 · huggingface/trl

casinca · 2026-04-09T11:37:43Z

What does this PR do?

This PR adds {% generation %} markers to the gpt-oss chat template. Part of #5471.

gptoss_training.jinja is a copy of the existing gptoss.jinja with the assistant output wrapped in {% generation %} / {% endgeneration %}, just like Qwen3.

Diff: gptoss.jinja vs gptoss_training.jinja

@@ -1,4 +1,9 @@
 {#-
+  Training variant of the GPT-OSS chat template (see gptoss.jinja for the original).
+  Modifications vs the original:
+    - Added {% generation %} / {% endgeneration %} around assistant message output to support
+      assistant-only loss masking in SFT training.
+
   In addition to the normal inputs of `messages` and `tools`, this template also accepts the
   following kwargs:
   - "builtin_tools": A list, can contain "browser" and/or "python".        
@@ -270,6 +275,7 @@
                 {{- raise_exception("You have passed a message containing <|channel|> tags in the thinking field. Instead of doing this, you should pass analysis messages (the string between '<|message|>' and '<|end|>') in the 'thinking' field, and final messages (the string between '<|message|>' and '<|end|>') in the 'content' field.") }}
             {%- endif %}
         {%- endif %}
+        {%- generation %}
         {%- if "tool_calls" in message %}
             {#- We need very careful handling here - we want to drop the tool call analysis message if the model #}
             {#- has output a later <|final|> message, but otherwise we want to retain it. This is the only case #}
@@ -314,6 +320,7 @@
             {{- "<|start|>assistant<|channel|>final<|message|>" + message.content + "<|end|>" }}
             {%- set last_tool_call.name = none %}
         {%- endif %}
+        {%- endgeneration %}
     {%- elif message.role == 'tool' -%}
         {%- if last_tool_call.name is none %}
             {{- raise_exception("Message has tool role, but there was no previous assistant message with a tool call!") }}

No prefix-preservation fixes were needed; this was tested against the check from "Narrow prefix-preserving check to the actual requirement" (#5458).
The TestGetTrainingChatTemplate suite passes.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

AI writing disclosure

We welcome the use of AI tools to help with contributions. For transparency and to help us improve our review process, please indicate the level of AI involvement in this PR.

No AI usage: the PR was written entirely by a human.
AI-assisted: some parts were suggested or improved by AI, but the PR was written and reviewed by a human.
AI-generated: the PR was mostly or fully generated by an AI tool.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

👋 @qgallouedec I was waiting for your #5470 to be merged, to get a clean template for this PR.

Note

Medium Risk
Adds a new GPT-OSS training chat template and updates template-selection logic, which can change rendered prompts and assistant token masks during SFT/GRPO when the patch is applied. Risk is limited to GPT-OSS/Qwen3 identity-matched templates but could affect training correctness if the template diverges from the original.

Overview
Adds GPT-OSS support to get_training_chat_template by introducing a new gptoss_training.jinja template and returning it when the tokenizer’s template matches gptoss.jinja.
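The identity-matched selection described above can be sketched as a plain lookup. This is only an illustration: the function name `select_training_template`, the `TRAINING_VARIANTS` mapping, and the placeholder template strings are hypothetical, and the actual signature and behaviour of TRL's get_training_chat_template may differ.

```python
from typing import Optional

# Placeholder template sources (hypothetical stand-ins). In practice these
# would be the full contents of gptoss.jinja and gptoss_training.jinja.
GPTOSS_TEMPLATE = "{# original gpt-oss template #}"
GPTOSS_TRAINING_TEMPLATE = "{# gpt-oss template with generation markers #}"

# Maps an original template to its training variant.
TRAINING_VARIANTS = {GPTOSS_TEMPLATE: GPTOSS_TRAINING_TEMPLATE}


def select_training_template(chat_template: str) -> Optional[str]:
    """Return the training variant for an exactly matching template, else None."""
    return TRAINING_VARIANTS.get(chat_template)
```

Because the match is on exact template identity, a tokenizer with a customized template gets no replacement in this sketch, which is consistent with the risk note above limiting the change to identity-matched templates.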

The new GPT-OSS training template wraps assistant rendering with {% generation %} / {% endgeneration %} to enable correct return_assistant_tokens_mask=True behavior for assistant-only loss. Documentation is updated to describe the new training template, and the existing TestGetTrainingChatTemplate parametrization is extended to cover GPT-OSS.
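To make the effect of the markers concrete, here is a toy simulation in plain Python, with no Jinja or transformers involved. It assumes a simplified tokenizer in which each `<|...|>` marker is one token; the helper names and the sample conversation are made up for illustration. Segments flagged as "in generation" stand in for text rendered between {% generation %} and {% endgeneration %}.

```python
import re

# Toy tokenizer (assumption): special markers such as <|start|> are single
# tokens; remaining text splits into words and punctuation.
TOKEN_RE = re.compile(r"<\|[a-z_]+\|>|\w+|[^\w\s]")


def tokenize_with_offsets(text):
    """Return (token, start, end) triples for the toy tokenizer."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_RE.finditer(text)]


def render_with_spans(segments):
    """Concatenate (text, in_generation) segments and record generation spans."""
    rendered, spans = "", []
    for text, in_generation in segments:
        start = len(rendered)
        rendered += text
        if in_generation:
            spans.append((start, len(rendered)))
    return rendered, spans


def assistant_mask(text, spans):
    """Mark a token 1 if it lies fully inside a generation span, else 0."""
    return [
        int(any(s <= start and end <= e for s, e in spans))
        for _, start, end in tokenize_with_offsets(text)
    ]


segments = [
    ("<|start|>user<|message|>Hi<|end|>", False),
    ("<|start|>assistant<|channel|>final<|message|>Hello!<|end|>", True),
]
rendered, spans = render_with_spans(segments)
mask = assistant_mask(rendered, spans)
```

In this sketch the five user-turn tokens get 0 and the eight assistant-turn tokens get 1, so an assistant-only loss would ignore the user turn.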

Reviewed by Cursor Bugbot for commit ece2009.

casinca · 2026-04-09T19:37:12Z

Note

On {% generation %} placement (see the diff in the initial PR description):

The Qwen3 training template is cleaner: the role prefix (<|im_start|>assistant\n) is easily placed outside the {% generation %} block.
In GPT-OSS, the <|start|>assistant prefix is included inside the block for simplicity, because the template uses varying prefixes across branches (<|channel|>analysis, <|channel|>final...). Matching Qwen3 1:1 would require refactoring each branch.

I preferred to keep things simple for now, but I'll refactor if necessary.

qgallouedec · 2026-04-09T22:03:29Z

lgtm!

@codex review

chatgpt-codex-connector · 2026-04-09T22:08:27Z

Codex Review: Didn't find any major issues. Swish!


qgallouedec · 2026-04-09T22:16:24Z

Replying to casinca's note above on {% generation %} placement:

Yes, I just checked the template. Matching Qwen3 would need a good amount of refactoring, and I don't think that is something we want. You'll get one or two unwanted tokens in the loss; I think that should be fine.
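The "one or two unwanted tokens" can be illustrated by counting loss-bearing tokens with the role prefix inside versus outside the generation block. This is a rough sketch using a toy tokenizer (an assumption: each `<|...|>` marker and each word counts as one token); the real gpt-oss tokenizer may split the prefix differently.

```python
import re

# Toy tokenizer (assumption): special markers are single tokens; other text
# splits into words and punctuation.
TOKEN_RE = re.compile(r"<\|[a-z_]+\|>|\w+|[^\w\s]")


def count_tokens(text):
    """Count tokens under the toy tokenizer."""
    return len(TOKEN_RE.findall(text))


ROLE_PREFIX = "<|start|>assistant"
BODY = "<|channel|>final<|message|>Hello!<|end|>"

# Prefix inside the generation block (the placement chosen in this PR)
# versus outside it (the Qwen3-style placement).
inside = count_tokens(ROLE_PREFIX + BODY)
outside = count_tokens(BODY)
extra = inside - outside
```

Under these assumptions `extra` is 2 per assistant message: the `<|start|>` marker and the role name.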

HuggingFaceDocBuilderDev · 2026-04-09T22:18:58Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

casinca added 5 commits April 9, 2026 13:31

init

b339a00

add gptoss_training.ninja

c067339

load+add gpt-oss training template

da2c14c

added gptoss_training.jinja diffs

63c3bcb

added gpt entry for TestGetTrainingChatTemplate

cb5e7aa

casinca changed the title ~~feat(gpt-oss): Add {% generation %} chat template~~ feat(gpt-oss): Add {% generation %} markers for training chat template Apr 9, 2026

casinca marked this pull request as ready for review April 9, 2026 13:23

Merge branch 'main' into gpt-oss-gen-support-chat-template

ece2009

qgallouedec approved these changes Apr 10, 2026


qgallouedec merged commit 6f6440b into huggingface:main Apr 10, 2026
10 of 12 checks passed

casinca deleted the gpt-oss-gen-support-chat-template branch April 10, 2026 09:37

qgallouedec mentioned this pull request Apr 10, 2026

Tracking: Add {% generation %} chat templates for common model families #5471

Open

21 tasks

qgallouedec merged 6 commits into huggingface:main from casinca:gpt-oss-gen-support-chat-template
