Summary
GPT-5.4 via the Codex CLI occasionally emits corrupted output containing internal ChatGPT tool-calling formats (`multi_tool_use.parallel` with `recipient_name`) mixed with memorized Chinese gambling SEO spam, instead of making proper Responses API tool calls.
Reproduction
- Model: gpt-5.4
- Mode: Default collaboration, reasoning effort xhigh
- Sandbox: `danger-full-access`, approval policy: `never`
- Context: Normal conversation about LaTeX file structure. The user said "sure" to confirm a plan. The model responded with the content structure, then immediately emitted the corrupted output.
Corrupted output (redacted)
```
numerusformամաս to=multi_tool_use.parallel 大发时时彩是 北京赛车有json
{"tool_uses":[{"recipient_name":"functions.mcp__filesystem__search_files","parameters":{...}}]}
```
The model was trying to search a `Literature/` directory for papers by specific authors (Edmans, Back, Gorbenko, Greenwood, Burkart, Corum). Instead of issuing a proper `exec_command` tool call, it emitted the old ChatGPT `multi_tool_use.parallel` internal format as plain text, interleaved with memorized Chinese gambling SEO spam tokens (大发时时彩, 北京赛车, 彩票主管, 彩神争霸).
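For contrast, the two shapes differ structurally. Below is a minimal sketch: the internal shape is reproduced from the corrupted output above, while the field names for the Responses API item are based on OpenAI's public documentation (the `call_id` and `arguments` values are hypothetical placeholders, not taken from the session):

```python
# Shape leaked in the corrupted output: the plugin-era internal format,
# where tool routing is encoded inside the payload via "recipient_name".
leaked_internal = {
    "tool_uses": [
        {
            "recipient_name": "functions.mcp__filesystem__search_files",
            "parameters": {},  # redacted in the report
        }
    ]
}

# What a well-formed Responses API tool call looks like per OpenAI's public
# docs: a typed "function_call" output item, not free text.
responses_api_call = {
    "type": "function_call",
    "call_id": "call_abc123",  # hypothetical id
    "name": "mcp__filesystem__search_files",
    "arguments": '{"path": "Literature/"}',  # hypothetical arguments
}

# Trivial structural check: the leaked shape has no "type" discriminator,
# so a parser expecting Responses API items cannot recognize it as a call.
def is_responses_function_call(item: dict) -> bool:
    return item.get("type") == "function_call" and "name" in item

print(is_responses_function_call(leaked_internal))     # False
print(is_responses_function_call(responses_api_call))  # True
```

This is why the CLI fell back to treating the output as plain text: nothing in the leaked shape matches the item types the Responses API defines.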
Two bugs
1. Model-level (GPT-5.4)
GPT-5.4 is leaking internal ChatGPT `multi_tool_use.parallel` scaffolding and memorized training data. The `multi_tool_use.parallel` function and `recipient_name` format are internal ChatGPT plugin-era constructs that should never appear in Responses API output.
2. CLI-level (Codex)
Codex treated the corrupted output as plain text (`agent_message`), then re-fed it as a `user_message` in the next turn (visible in the rollout JSONL at lines 1117-1118). This propagated the corruption forward in the conversation context.
Evidence
- Session rollout: `rollout-2026-03-07T13-52-41-019cc892-5f52-7e32-9e62-d2c3e37d2492.jsonl`, lines 1111-1118
- Turn context: `model=gpt-5.4`, `turn_id=019cc8d8-f18b-7333-abf3-1943f2f8629f`
- Timestamp: 2026-03-07T15:10:01Z
Security concern
With the `danger-full-access` sandbox and `approval_policy: never`, if the `multi_tool_use.parallel` format had been recognized as a tool call rather than text, it would have auto-executed filesystem operations without user approval. The CLI should consider sanitizing or rejecting model outputs that contain known internal tool-calling formats that don't match the expected Codex tool schema.