Status: Open
Labels: bug (Something isn't working), custom-model (Issues related to custom model providers, including local models)
Description
What version of Codex is running?
0.56.0
What subscription do you have?
no subscription
Which model were you using?
glm4.6, deepseek-v3.2
What platform is your computer?
Microsoft Windows NT 10.0.26100.0 x64
What issue are you seeing?
Summary
When using Codex to initialize my workspace (generate AGENTS.md), the run hung for ~10 minutes with no progress while repeatedly retrying a function_call, so I cancelled the request. The logs show an endless loop: the LLM returns a function_call with invalid parameters (a parse error), Codex calls the LLM again, the LLM returns the same invalid function_call, and this repeats until the available tokens are exhausted (~10M tokens consumed). Switching to another model (DeepSeek) completed the same job successfully.
Root cause analysis (what I found)
- The Responses API SSE parser in codex-rs/core/src/client.rs (process_sse) deserializes Completed events with serde_json::from_value; this path retains token_usage properly.
- The Chat API SSE handler in codex-rs/core/src/chat_completions.rs (process_chat_sse) parses SSE events manually and currently hardcodes token_usage: None for Completed events. Because token_usage is None upstream, token accounting and usage-limit protection never run.
- With token usage never set, the system does not halt on token-usage limits and keeps retrying the LLM after parse failures, causing a runaway loop when the model (glm4.6 in this case) keeps returning malformed function_call data.
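To make the missing-accounting point concrete, here is a minimal std-only sketch of pulling token usage out of a Chat Completions streaming chunk instead of hardcoding None. It assumes the provider emits a final chunk carrying a `usage` object (standard Chat Completions behavior when `stream_options: {"include_usage": true}` is set); the function and the crude string scan are illustrative only, not the codex-rs types, and a real fix would deserialize with serde as the Responses path already does.

```rust
// Crude stand-in for reading `usage.total_tokens` from a Chat
// Completions streaming chunk. Illustrative only: the real fix would
// deserialize the chunk with serde rather than scan the raw string.
fn total_tokens(data: &str) -> Option<u64> {
    let key = "\"total_tokens\":";
    let start = data.find(key)? + key.len();
    let digits = data[start..].trim_start();
    let end = digits
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(digits.len());
    digits[..end].parse().ok()
}

fn main() {
    // Final chunk of a stream opened with stream_options.include_usage.
    let final_chunk = r#"{"choices":[],"usage":{"prompt_tokens":120,"completion_tokens":8,"total_tokens":128}}"#;
    assert_eq!(total_tokens(final_chunk), Some(128));

    // Ordinary delta chunks carry no usage object, so accounting
    // currently sees None for the whole turn.
    let delta_chunk = r#"{"choices":[{"delta":{"content":"hi"}}]}"#;
    assert_eq!(total_tokens(delta_chunk), None);
}
```

If the Completed event were populated from this value instead of None, the existing usage-limit checks would have a chance to stop the loop.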
Log excerpt (the same function_call and parse error repeat about once per second):
{"timestamp":"2025-11-16T12:04:49.622Z","type":"event_msg","payload":{"type":"token_count","info":null,"rate_limits":null}}
{"timestamp":"2025-11-16T12:04:49.622Z","type":"response_item","payload":{"type":"function_call","name":"shell","arguments":"{\"command\":\"find . -type f -name\"}","call_id":"call_45cc0a4361d64dd783b51bb1"}}
{"timestamp":"2025-11-16T12:04:49.622Z","type":"response_item","payload":{"type":"function_call_output","call_id":"call_45cc0a4361d64dd783b51bb1","output":"failed to parse function arguments: Error(\"invalid type: string \\\"find . -type f -name\\\", expected a sequence\", line: 1, column: 33)"}}
{"timestamp":"2025-11-16T12:04:49.622Z","type":"response_item","payload":{"type":"message","role":"assistant","content":[{"type":"output_text","text":"\n"}]}}
{"timestamp":"2025-11-16T12:04:49.622Z","type":"turn_context","payload":{"cwd":"/home/robby/workspace/a2a-java/server-common","approval_policy":"on-request","sandbox_policy":{"mode":"workspace-write","network_access":false,"exclude_tmpdir_env_var":false,"exclude_slash_tmp":false},"model":"glm-4.6","summary":"auto"}}
{"timestamp":"2025-11-16T12:04:50.585Z","type":"event_msg","payload":{"type":"agent_message","message":"\n"}}
{"timestamp":"2025-11-16T12:04:50.585Z","type":"event_msg","payload":{"type":"token_count","info":null,"rate_limits":null}}
{"timestamp":"2025-11-16T12:04:50.585Z","type":"response_item","payload":{"type":"function_call","name":"shell","arguments":"{\"command\":\"find . -type f -name\"}","call_id":"call_0cdfc98564e94ef185db5cc5"}}
{"timestamp":"2025-11-16T12:04:50.585Z","type":"response_item","payload":{"type":"function_call_output","call_id":"call_0cdfc98564e94ef185db5cc5","output":"failed to parse function arguments: Error(\"invalid type: string \\\"find . -type f -name\\\", expected a sequence\", line: 1, column: 33)"}}
{"timestamp":"2025-11-16T12:04:50.586Z","type":"response_item","payload":{"type":"message","role":"assistant","content":[{"type":"output_text","text":"\n"}]}}
{"timestamp":"2025-11-16T12:04:50.586Z","type":"turn_context","payload":{"cwd":"/home/robby/workspace/a2a-java/server-common","approval_policy":"on-request","sandbox_policy":{"mode":"workspace-write","network_access":false,"exclude_tmpdir_env_var":false,"exclude_slash_tmp":false},"model":"glm-4.6","summary":"auto"}}
{"timestamp":"2025-11-16T12:04:51.549Z","type":"event_msg","payload":{"type":"agent_message","message":"\n"}}
{"timestamp":"2025-11-16T12:04:51.549Z","type":"event_msg","payload":{"type":"token_count","info":null,"rate_limits":null}}
{"timestamp":"2025-11-16T12:04:51.549Z","type":"response_item","payload":{"type":"function_call","name":"shell","arguments":"{\"command\":\"find . -type f -name\"}","call_id":"call_1246ce81e1044a38a772af3e"}}
{"timestamp":"2025-11-16T12:04:51.549Z","type":"response_item","payload":{"type":"function_call_output","call_id":"call_1246ce81e1044a38a772af3e","output":"failed to parse function arguments: Error(\"invalid type: string \\\"find . -type f -name\\\", expected a sequence\", line: 1, column: 33)"}}
{"timestamp":"2025-11-16T12:04:51.549Z","type":"response_item","payload":{"type":"message","role":"assistant","content":[{"type":"output_text","text":"\n"}]}}
{"timestamp":"2025-11-16T12:04:51.549Z","type":"turn_context","payload":{"cwd":"/home/robby/workspace/a2a-java/server-common","approval_policy":"on-request","sandbox_policy":{"mode":"workspace-write","network_access":false,"exclude_tmpdir_env_var":false,"exclude_slash_tmp":false},"model":"glm-4.6","summary":"auto"}}
{"timestamp":"2025-11-16T12:04:52.507Z","type":"event_msg","payload":{"type":"agent_message","message":"\n"}}
What steps can reproduce the bug?
Reproduction steps
- Use Codex to initialize a workspace (generate AGENTS.md) with the glm4.6 model via the Chat API.
- Observe the LLM returning a function_call whose parameters do not conform to the expected schema.
- Codex logs a parse error for the function_call parameters, then retries calling the LLM.
- The LLM returns the same malformed function_call and the loop repeats until tokens are exhausted or the request is cancelled.
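The malformed call in the logs can be reproduced in isolation: the model sends `command` as a single string, while the shell tool expects a sequence of argv tokens. This std-only sketch checks only that shape distinction; the well-formed example (with a hypothetical `*.md` pattern) and the helper name are my own illustration, not codex-rs code, which performs the real check via serde deserialization into a Vec.

```rust
// Illustrates the argument shape the parser rejects: `command` must be
// a JSON array of argv tokens, but the model keeps sending a string.
// Crude shape check for demonstration; codex-rs uses serde instead.
fn command_is_sequence(args_json: &str) -> bool {
    let key = "\"command\":";
    match args_json.find(key) {
        Some(i) => args_json[i + key.len()..].trim_start().starts_with('['),
        None => false,
    }
}

fn main() {
    // Exactly what glm4.6 returned on every retry (a string).
    let malformed = r#"{"command":"find . -type f -name"}"#;
    // A hypothetical well-formed equivalent (an array of tokens).
    let wellformed = r#"{"command":["find",".","-type","f","-name","*.md"]}"#;
    assert!(!command_is_sequence(malformed));
    assert!(command_is_sequence(wellformed));
}
```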
What is the expected behavior?
When the LLM returns a malformed function_call, Codex should:
- Stop retrying indefinitely.
- Surface a clear error to the user after a bounded number of retries.
- Parse and propagate token_usage from the model response so token accounting and usage-limit protections work correctly.
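The bounded-retry behavior described above could be sketched as follows. This is a minimal illustration under assumptions: `MAX_PARSE_RETRIES`, the closure-based turn driver, and the string error type are all hypothetical, not the codex-rs API; the point is only that a counter plus a cap turns the runaway loop into a surfaced error.

```rust
// Hypothetical retry cap for malformed function_call responses.
const MAX_PARSE_RETRIES: u32 = 3;

// Drives one turn: `next_call` stands in for "ask the LLM and try to
// parse its function_call arguments". After MAX_PARSE_RETRIES
// consecutive parse failures, give up and surface an error instead of
// looping until tokens run out.
fn run_turn(
    mut next_call: impl FnMut() -> Result<String, String>,
) -> Result<String, String> {
    let mut failures = 0;
    loop {
        match next_call() {
            Ok(output) => return Ok(output),
            Err(e) => {
                failures += 1;
                if failures >= MAX_PARSE_RETRIES {
                    return Err(format!(
                        "model returned unparsable function_call {failures} times: {e}"
                    ));
                }
            }
        }
    }
}

fn main() {
    // A model that always returns the same malformed arguments, as in
    // the logs above: the turn now fails after a bounded number of tries.
    let always_bad = || Err("invalid type: string, expected a sequence".to_string());
    assert!(run_turn(always_bad).is_err());

    // A model that answers correctly succeeds on the first call.
    assert_eq!(run_turn(|| Ok("done".to_string())), Ok("done".to_string()));
}
```

Combined with real token_usage propagation, either mechanism alone would have stopped the ~10M-token loop.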
Additional information
