Description
Describe the bug
When Codex upstream responses omit output_tokens_details.reasoning_tokens, the reasoning token count is not included in the translated OpenAI response, even though reasoning content was received and streamed to the client.
CLI Type
Codex
Model Name
gpt-5.1-codex-max, gpt-5.1-codex-mini (any Codex model with reasoning)
LLM Client
Irrelevant
Request Information
Any Codex request that returns reasoning content but whose usage response omits the output_tokens_details.reasoning_tokens field.
Expected behavior
The usage.completion_tokens_details.reasoning_tokens field should be populated with an estimated value based on the reasoning content that was streamed/received.
OS Type
Irrelevant
Additional context
This affects both streaming and non-streaming responses. The reasoning content is successfully passed through, but token accounting is incomplete when upstream doesn't provide the breakdown.
Root Cause
The ConvertCodexResponseToOpenAI and ConvertCodexResponseToOpenAINonStream functions in internal/translator/codex/openai/chat-completions/codex_openai_response.go only set reasoning_tokens when upstream provides the value. They don't track or estimate tokens from the actual reasoning content received.
Proposed Fix
- Add an AccumulatedReasoningLen field to ConvertCliToOpenAIParams to track reasoning content length during streaming
- When upstream omits reasoning_tokens, estimate from the accumulated content (~4 chars per token), as shown in the sketch below
- Apply the same estimation logic for non-streaming responses using the reasoning text length
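For illustration, here is a minimal, self-contained sketch of the estimation fallback described above. The helper name estimateReasoningTokens and the surrounding plumbing are hypothetical and do not reflect the actual structure of codex_openai_response.go; it only demonstrates the ~4 characters-per-token heuristic and where the fallback would apply.

```go
package main

import "fmt"

// estimateReasoningTokens approximates a token count from the number of
// reasoning characters received, using a rough ~4 characters-per-token heuristic.
func estimateReasoningTokens(reasoningChars int) int {
	if reasoningChars <= 0 {
		return 0
	}
	return (reasoningChars + 3) / 4 // round up
}

func main() {
	// During streaming, the translator would accumulate the length of each
	// reasoning delta (the proposed AccumulatedReasoningLen field); here we
	// simulate that with a running counter.
	accumulatedReasoningLen := 0
	for _, delta := range []string{"First, inspect the diff. ", "Then check the tests."} {
		accumulatedReasoningLen += len(delta)
	}

	// If upstream omitted output_tokens_details.reasoning_tokens, fall back to
	// the estimate; otherwise keep the upstream value (0 stands in for "absent").
	upstreamReasoningTokens := 0
	reasoningTokens := upstreamReasoningTokens
	if reasoningTokens == 0 {
		reasoningTokens = estimateReasoningTokens(accumulatedReasoningLen)
	}
	fmt.Println("reasoning_tokens:", reasoningTokens)
}
```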