Hugging Face Chat Template Missing <|constrain|> Token for Tool Calls #91

@JackLingjie

Background

While using the gpt-oss model series, I discovered an inconsistency between the official OpenAI Harmony rendering logic and the chat_template.jinja provided in the Hugging Face repository.

According to the OpenAI Harmony specification (as referenced in Harmony Issue #58), structured outputs and tool calls require the content type <|constrain|>json. However, the current Jinja template in the HF tokenizer_config.json does not render this token, even when tools are provided to the template.

The Problem

  • Inconsistent Inference: The model is trained to expect and produce <|constrain|>json for tool arguments. When the chat template omits this token during few-shot prompting or conversation history reconstruction, it breaks the model's structural consistency and may degrade tool-calling performance.

  • Template Implementation Gap: The current Jinja implementation appears to emit a bare json content type for tool calls, without the <|constrain|> token, which is inconsistent with the Harmony format's requirements for tool-call messages.

Steps to Reproduce

Run the following script with openai/gpt-oss-20b (or gpt-oss-120b):

from transformers import AutoTokenizer

# Using the official GPT-OSS tokenizer
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b", use_fast=False)

obj = {
    "tools": [
        {
            "function": {
                "name": "python_executor",
                "description": "Executes Python code and returns the output or error message.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "code": {"type": "string", "description": "Python code to execute"}
                    },
                    "required": ["code"]
                }
            }
        }
    ],
    "messages": [
        {"role": "system", "content": "You are Qwen, a helpful AI assistant that can execute Python code using available tools."},
        {"role": "user", "content": "Please run this code: print(2 + 3)"},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "function": {
                        "name": "python_executor",
                        "arguments": {"code": "print(2 + 3)"}
                    }
                }
            ]
        },
        {"role": "tool", "content": "{\"stdout\": \"5\\n\", \"stderr\": \"\"}"}
    ]
}

text = tokenizer.apply_chat_template(
    obj["messages"],
    tokenize=False,
    add_generation_prompt=True,
    tools=obj["tools"]
)
print(text)

Actual vs Expected Output

Actual Output (Current Template)

...<|start|>assistant to=functions.python_executor<|channel|>commentary json<|message|>{"code": "print(2 + 3)"}<|call|>...

Expected Output (Per Harmony Spec)

...<|start|>assistant to=functions.python_executor<|channel|>commentary <|constrain|>json<|message|>{"code": "print(2 + 3)"}<|call|>...

Key Difference

Actual: <|channel|>commentary json
Expected: <|channel|>commentary <|constrain|>json

The <|constrain|> special token is missing between commentary and json.
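
To check a rendered prompt for this difference automatically, something like the following could be used. It is a minimal sketch that works on the rendered string only: the helper name headers_missing_constrain is hypothetical, and it assumes a tool-call header is whatever sits between <|channel|>commentary and the next <|message|> and ends in json.

```python
import re

CONSTRAIN = "<|constrain|>"

# Header text between the commentary channel marker and the message body.
# Assumption: tool calls are rendered on the commentary channel, as in the
# outputs shown in this issue.
HEADER = re.compile(r"<\|channel\|>commentary(.*?)<\|message\|>", re.DOTALL)


def headers_missing_constrain(rendered: str) -> list[str]:
    """Return commentary headers that declare json without <|constrain|>."""
    missing = []
    for match in HEADER.finditer(rendered):
        header = match.group(1).strip()
        # Headers without a json content type (e.g. plain commentary) are skipped.
        if header.endswith("json") and CONSTRAIN not in header:
            missing.append(header)
    return missing


actual = '<|channel|>commentary json<|message|>{"code": "print(2 + 3)"}<|call|>'
expected = '<|channel|>commentary <|constrain|>json<|message|>{"code": "print(2 + 3)"}<|call|>'

print(headers_missing_constrain(actual))    # non-empty: token is missing
print(headers_missing_constrain(expected))  # empty: token is present
```

The same function can be applied to the text returned by apply_chat_template(..., tokenize=False) in the reproduction script.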

Impact

This discrepancy means:

  1. Tool-calling examples in conversation history are not properly formatted
  2. Few-shot prompting with tools may not work as intended
  3. The model may not correctly recognize tool call formatting during inference
  4. Behavior diverges from the official OpenAI Harmony implementation
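
Until the template is fixed, one possible stopgap is to post-process the rendered string before tokenizing it. The sketch below is not an official fix: the helper name add_constrain_token is hypothetical, and it assumes the template emits exactly "<|channel|>commentary json<|message|>" for tool calls, as in the actual output shown above.

```python
import re

# Tool-call header exactly as the current template renders it (assumption
# based on the "Actual" output in this issue).
_UNCONSTRAINED = re.compile(r"(<\|channel\|>commentary )(json<\|message\|>)")


def add_constrain_token(rendered: str) -> str:
    """Insert <|constrain|> before a bare json content type in tool-call headers."""
    return _UNCONSTRAINED.sub(r"\1<|constrain|>\2", rendered)


actual = '<|channel|>commentary json<|message|>{"code": "print(2 + 3)"}<|call|>'
fixed = add_constrain_token(actual)
print(fixed)
```

Applying the function twice leaves the text unchanged, since an already constrained header no longer matches the pattern. Note that a message body containing the same literal sequence would also be rewritten, so a template-level fix remains preferable.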
