feat(gpt-oss): Add {% generation %} markers for training chat template #5484
Note On The Qwen3 training template is cleaner, the role ( I preferred to keep things simple in the first place but if necessary I'll refactor.
lgtm! @codex review
Codex Review: Didn't find any major issues. Swish!
Yes, I just checked the template. We would need a good amount of refactoring, and I don't think this is something we want. You'll get one or two unwanted tokens in the loss; I think that should be fine.
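To make the trade-off above concrete, here is a minimal sketch of how an assistant mask typically becomes training labels, and what "one or two unwanted tokens" looks like. The token ids, both masks, and the helper name labels_from_mask are made up for illustration; only the -100 ignore value is the usual PyTorch cross-entropy convention.

```python
# Illustrative only: token ids and masks below are invented for this sketch.
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def labels_from_mask(input_ids, assistant_mask):
    """Keep the token id where the mask is 1, otherwise use IGNORE_INDEX."""
    if len(input_ids) != len(assistant_mask):
        raise ValueError("input_ids and assistant_mask must have the same length")
    return [tok if keep else IGNORE_INDEX for tok, keep in zip(input_ids, assistant_mask)]


# Hypothetical sequence: prompt tokens (11-13), one header token (21),
# assistant content (22-24), and a trailing token (31).
input_ids = [11, 12, 13, 21, 22, 23, 24, 31]

# Mask a fully refactored template might give: content only.
ideal_mask = [0, 0, 0, 0, 1, 1, 1, 0]
# Mask a simpler wrapping might give: one extra header token is included.
actual_mask = [0, 0, 0, 1, 1, 1, 1, 0]

ideal = labels_from_mask(input_ids, ideal_mask)
actual = labels_from_mask(input_ids, actual_mask)

# Positions where the two label sequences differ, i.e. the unwanted tokens.
extra = [i for i, (a, b) in enumerate(zip(ideal, actual)) if a != b]
print(extra)
```

In this toy case the two label sequences differ at a single position, which is the kind of small discrepancy the comment above considers acceptable.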
What does this PR do?
This PR adds {% generation %} tags/markers for gpt-oss (part of #5471).
gptoss_training.jinja is copied from the existing gptoss.jinja, with the {% generation %} / {%- endgeneration %} wrapping added, just like qwen3.
Diff: gptoss.jinja vs gptoss_training.jinja
The TestGetTrainingChatTemplate suite passes.
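For readers unfamiliar with the markers, the idea can be sketched as follows: everything rendered between {% generation %} and {% endgeneration %} is flagged as assistant output. This is a toy, character-level illustration using only the standard library; it is not how transformers implements the tag, and the <user>, <assistant>, and <end> strings are placeholders rather than real gpt-oss tokens.

```python
import re

# Toy illustration only: NOT the transformers implementation of the tag.
# Matches a generation block, tolerating the "-" whitespace-control variants.
GEN_BLOCK = re.compile(
    r"\{%-? generation -?%\}(.*?)\{%-? endgeneration -?%\}", re.DOTALL
)


def strip_markers_and_mask(marked_text):
    """Return (plain_text, char_mask); char_mask[i] is 1 inside a generation block."""
    plain_parts, mask = [], []
    cursor = 0
    for match in GEN_BLOCK.finditer(marked_text):
        outside = marked_text[cursor:match.start()]  # text before the block
        inside = match.group(1)                      # text wrapped by the markers
        plain_parts += [outside, inside]
        mask += [0] * len(outside) + [1] * len(inside)
        cursor = match.end()
    tail = marked_text[cursor:]
    plain_parts.append(tail)
    mask += [0] * len(tail)
    return "".join(plain_parts), mask


# Placeholder conversation; the special-token strings are hypothetical.
marked = "<user>hi<end><assistant>{% generation %}hello<end>{%- endgeneration %}"
plain, mask = strip_markers_and_mask(marked)

# Characters that would count toward an assistant-only loss in this toy setup.
masked_text = "".join(ch for ch, m in zip(plain, mask) if m)
print(plain)
print(masked_text)
```

Where the markers are placed decides which text ends up in the mask, which is why the training template only needs to differ from the original by the added wrapping.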
👋 @qgallouedec I was waiting for your #5470 to be merged, to get a clean template for this PR.
Note
Medium Risk
Adds a new GPT-OSS training chat template and updates template-selection logic, which can change rendered prompts and assistant token masks during SFT/GRPO when the patch is applied. Risk is limited to GPT-OSS/Qwen3 identity-matched templates but could affect training correctness if the template diverges from the original.
Overview
Adds GPT-OSS support to get_training_chat_template by introducing a new gptoss_training.jinja template and returning it when the tokenizer’s template matches gptoss.jinja.
The new GPT-OSS training template wraps assistant rendering with {% generation %}/{% endgeneration %} to enable correct return_assistant_tokens_mask=True behavior for assistant-only loss. Documentation is updated to describe the new training template, and the existing TestGetTrainingChatTemplate parametrization is extended to cover GPT-OSS.
Reviewed by Cursor Bugbot for commit ece2009.
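The identity-matched selection described in the overview could look roughly like the sketch below. The function name pick_training_template and the two template strings are hypothetical stand-ins; the real get_training_chat_template and the actual contents of gptoss.jinja and gptoss_training.jinja may differ.

```python
# Hypothetical stand-ins for the contents of gptoss.jinja and gptoss_training.jinja.
GPTOSS_TEMPLATE = "{{ role }}: {{ content }}"
GPTOSS_TRAINING_TEMPLATE = (
    "{{ role }}: {% generation %}{{ content }}{% endgeneration %}"
)

# Maps each known original template to its training variant.
TRAINING_TEMPLATES = {GPTOSS_TEMPLATE: GPTOSS_TRAINING_TEMPLATE}


def pick_training_template(chat_template):
    """Return the training variant on an exact string match, else None."""
    return TRAINING_TEMPLATES.get(chat_template)


# An unmodified template is swapped for its training variant.
matched = pick_training_template(GPTOSS_TEMPLATE)
# Any deviation from the known original, even whitespace, yields no match.
unmatched = pick_training_template(GPTOSS_TEMPLATE + " ")
print(matched is not None, unmatched is None)
```

Exact matching keeps the patch conservative: a tokenizer whose template was customized is left untouched rather than silently replaced.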