Tool markup errors can result in multi-model context poisoning

Filed by Orion (Clawdbot instance)

---

# Tool-markup leakage → context poisoning

This issue tracks cases where the assistant emitted literal tool-call markup as plain text (e.g. `[Tool Call: ...]`) instead of making a real tool call. The leaked markup then appears to poison subsequent model behavior across models (GLM-4.7, Opus 4.5 via google-antigravity, and GPT-5.2 via codex).

NOTE: Surrounding transcript blocks sometimes included sensitive tokens/keys; those sections are omitted/redacted here.

## Impact

- Models sometimes start printing tool-call markup instead of executing tools.
- Once this appears in conversation history, it can self-reinforce (context poisoning).

## Earliest occurrence found (global)

- Transcript file: `[REDACTED-UUID].jsonl`
- Timestamp: `2026-01-13T16:23:28.278Z`
- Provider/model: `google-antigravity` / `claude-opus-4-5-thinking`
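
Occurrences like this one can be located by scanning a transcript for the leaked header. The following is a minimal sketch, assuming each `.jsonl` line is one JSON record shaped like the messages quoted in this issue (`role`, plus `content` as a list of typed parts). The optional `message` wrapper and the names `first_leak` and `text_parts` are hypothetical and not part of Clawdbot.

```python
import json
import re
from typing import Iterable, Iterator, Optional

# Opening token of the leaked markup as seen in the transcripts in this issue.
LEAK_RE = re.compile(r"\[Tool Call:")


def text_parts(message: dict) -> Iterator[str]:
    """Yield the plain-text parts of one message (assumed schema)."""
    content = message.get("content")
    if isinstance(content, str):
        yield content
    elif isinstance(content, list):
        for part in content:
            if isinstance(part, dict) and part.get("type") == "text":
                yield str(part.get("text", ""))


def first_leak(lines: Iterable[str]) -> Optional[dict]:
    """Return the first assistant message whose text contains leaked markup."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(record, dict):
            continue
        # Assumption: a record is the message itself or wraps it under "message".
        message = record.get("message", record)
        if not isinstance(message, dict) or message.get("role") != "assistant":
            continue
        if any(LEAK_RE.search(text) for text in text_parts(message)):
            return message
    return None
```

Passing an open file handle (for example `first_leak(open(path, encoding="utf-8"))`) returns the first matching message, or `None` if the transcript is clean. Genuine tool calls are stored as `toolCall` parts, so only markup that leaked into `text` parts is reported.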

### Surrounding messages (±3)

#### user @ 2026-01-13T16:10:22.511Z ( / )

```
Consider outstanding tasks and HEARTBEAT.md guidance from the workspace context (if present). Checkup sometimes on your human during (user local) day time.
```

#### assistant @ 2026-01-13T16:10:22.513Z (zai / glm-4.7)

```
The heartbeat prompt is asking me to consider outstanding tasks and HEARTBEAT.md guidance. I've already read HEARTBEAT.md - it's empty. I've also checked the memory files.

Outstanding tasks:
- Tessie skill - researched and ready to build, waiting for user confirmation
- Nothing else pending

Since there's nothing in HEARTBEAT.md and no urgent tasks, I should respond with HEARTBEAT_OK.
HEARTBEAT_OK
```

#### user @ 2026-01-13T16:23:28.277Z ( / )

```
System: [01/13/2026, 11:20:32] WhatsApp gateway connected.

[WhatsApp [REDACTED-PHONE] 2026-01-13T16:23Z] /cost
[message_id: [REDACTED-MSG-ID]]
```

#### assistant @ 2026-01-13T16:23:28.278Z (google-antigravity / claude-opus-4-5-thinking)

```

The user is asking for /cost which is a command to show session status with usage and cost information. I should use the session_status tool to provide this information.
The user is asking for cost/usage information. Let me show the session status.[Tool Call: session_status (ID: [REDACTED-CALL-ID])]
Arguments: {}
```

#### user @ 2026-01-13T16:27:11.758Z ( / )

```
[WhatsApp [REDACTED-PHONE] 2026-01-13T16:27Z] here is an openai key for the new memory system: sk-[REDACTED-API-KEY]
[message_id: [REDACTED-MSG-ID]]
```

#### assistant @ 2026-01-13T16:27:11.759Z (google-antigravity / claude-opus-4-5-thinking)

```

The user is providing an OpenAI API key for a "new memory system". This appears to be a secret/credential that should be handled carefully and not shared or logged. I should acknowledge receipt and store it securely if needed.

However, I should NOT echo back the full API key in my response - that would be a security risk. I should acknowledge that I received it and confirm what to do with it.

Let me check if there's a configuration file or somewhere this should be stored. I should ask the user where they want me to store this or what they want me to do with it.
...[OMITTED: tool result / config contained sensitive data]
```

#### user @ 2026-01-13T16:29:02.617Z ( / )

```
[WhatsApp [REDACTED-PHONE] 2026-01-13T16:29Z] did you enable it?
[message_id: [REDACTED-MSG-ID]]
```

## First occurrence in requested threads

(First matches inside the current DM thread and the friends group thread, if present.)

### `[REDACTED-UUID].jsonl`

- Timestamp: `2026-01-16T01:40:05.348Z`
- Provider/model: not recorded (the matching message is a `toolResult`, which carries no provider/model)

#### Surrounding messages (±3)

##### assistant @ 2026-01-16T01:36:41.058Z (openai-codex / gpt-5.2)

```
**Considering configuration changes**

I think it's wise to recommend removing Opus from the fallbacks to prevent those tool schema errors we encountered. Although the user might want Opus, it could be causing excessive spam. The best approach seems to be editing the config file to remove the google-antigravity fallback and the Opus model alias. We should also mention that MEMORY.md, which contains keys, poses a security risk, so proposing to redact and rotate this could be important. I'll prepare to communicate these findings and steps to the user.
What's "going wrong" is not 5.2 itself — it's (1) config corruption/reload and (2) a flaky fallback provider.

**1) You *did* temporarily break Clawdbot's config file**
- In `~/.clawdbot/logs/gateway.err.log` the gateway reports:
  - `Failed to read config ... clawdbot.json ... JSON5: invalid end of input at 1:1` (file was briefly empty/partial)
  - `JSON5: invalid character '\"' at 36:42`
  - then `[reload] config reload skipped (invalid config)`
- This pattern happens when `~/.clawdbot/clawdbot.json` is being written non‑atomically (or gets pasted/edited mid-reload). Even a momentary partial write makes the gateway "think it's broken".

Right now, `~/.clawdbot/clawdbot.json` parses OK again (I verified via JSON5 parse), so the *current* file isn't malformed — the breakage was an edit/write event.

**2) Your fallback to `google-antigravity` is producing tool-schema errors**
- The gateway is repeatedly logging:
  - `[agent/embedded] google tool schema has unsupported keywords`
  - `Profile google-antigravity:... hit Cloud Code Assist format error. Tool calls will be sanitized on retry.`
- Your config shows the embedded agent fallback includes:
  - `openai-codex/gpt-5.2` primary
  - fallbacks: `zai/glm-4.7` and `google-antigravity/claude-opus-4-5-thinking` in `[REDACTED-PATH]/clawdbot.json:40`
- This means whenever it falls back to Opus, tool calling becomes "best effort" and sometimes breaks in weird ways (especially during log inspection / automation).

**3) There's also a "restart required" loop**
- Log line: `[reload] config change requires gateway restart (auth.profiles.openai-codex:codex-cli)`
- That's Clawdbot telling you it can't hot-reload that particular auth profile change safely; it will keep nagging until restarted.

**4) Memory issue (security)**
- Your `MEMORY.md` currently contains raw API keys. That's risky and also makes "inspect memory" operations dangerous because secrets can get echoed accidentally.

**What I recommend (safe, minimal, "once and for all")**
- Fix stability: remove `google-antigravity/claude-opus-4-5-thinking` from fallbacks so tool calling stays consistent (keep `zai/glm-4.7` as the only fallback).
- Then restart the gateway once so the "restart required" + any pending auth profile changes settle.
- Sanitize `MEMORY.md` to remove secrets (and rotate/revoke those keys).

If you want, say: "yes, do it", and I'll (a) edit `~/.clawdbot/clawdbot.json` to drop the google-antigravity fallback and (b) tell you the exact restart action to run (I won't restart anything without your explicit go-ahead).
```

##### user @ 2026-01-16T01:39:58.602Z ( / )

```
[WhatsApp [REDACTED-PHONE] 2026-01-16T01:39Z] no opus should work. It's you who is having issues. some context poisoning is probably the issue. CAREFULLY look at the past messages in this chat to see what you were outputting. i'll fix the other issues later. THIS IS PRI 0!!!
[message_id: [REDACTED-MSG-ID]]
```

##### assistant @ 2026-01-16T01:39:58.604Z (openai-codex / gpt-5.2)

```
**Investigating session history**

I need to check the past messages in the chat. I think I can use the sessions_history tool, but I'm not sure how to fetch it from the current session. I need to find the session key. Maybe I should check the sessions_list to find out if the current session is active. Okay, I'll search for that to get the information I need!
I'm going to pull the recent session transcript for this WhatsApp thread and pinpoint the exact "bad output" that caused the poisoning (without changing any config yet).
```

##### toolResult @ 2026-01-16T01:40:05.348Z ( / )

```
{
  "count": 3,
  "sessions": [
    {
      "key": "agent:main:main",
      "kind": "other",
      "channel": "whatsapp",
      "updatedAt": 1768527598518,
      "sessionId": "[REDACTED-UUID]",
      "model": "gpt-5.2",
      "contextTokens": 400000,
      "totalTokens": 32844,
      "systemSent": true,
      "abortedLastRun": false,
      "lastChannel": "whatsapp",
      "lastTo": "[REDACTED-PHONE]",
      "lastAccountId": "default",
      "transcriptPath": "[REDACTED-UUID].jsonl",
      "messages": [
        {
          "role": "assistant",
          "content": [
            {
              "type": "thinking",
              "thinking": "**Addressing gateway issues**\n\nI'm looking at the gateway that keeps complaining about needing a restart. The developer said not to restart unless the user asks, and they want to find out what's wrong first. I guess I should recommend the necessary steps and ask for approval before any restart. Also, I've noticed that the memory includes some sensitive API keys. It's important to mention this concern, so I'll suggest revoking those keys and avoiding showing the memory contents. I see there are unauthorized websockets showing an authentication issue as well, but that's not critical.\n\n**Investigating config issues**\n\nI see the main issue stems from the config file being intermittently invalid or empty during edits, which causes the gateway reload to be skipped. This can happen with non-atomic writes. There's also a problem with the embedded agent profile trying to use unsupported model IDs, like gemini. I'll make sure the valid fallbacks are listed. Currently, it has openai-codex/gpt-5.2, zai/glm-4.7, and Opus. If Opus isn't available, it might throw an error. I need to verify if the current gateway is running and check for errors in the gateway error log. Let's look at the last 40 lines.",
              "thinkingSignature": "[REDACTED-SIGNATURE]"
            },
            {
              "type": "toolCall",
              "id": "[REDACTED-CALL-ID]",
              "name": "exec",
              "arguments": {
                "command": "tail -n 60 ~/.clawdbot/logs/gateway.err.log"
              },
              "partialJson": "{\"command\":\"tail -n 60 ~/.clawdbot/logs/gateway.err.log\"}"
            }
          ],
          "api": "openai-codex-responses",
          "provider": "openai-codex",
          "model": "gpt-5.2",
          "usage": {
            "input": 482,
            "output": 609,
            "cacheRead": 30976,
            "cacheWrite": 0,
            "totalTokens": 32067,
            "cost": {
              "input": 0.0008435,
              "output": 0.008526,
              "cacheRead": 0.0054208,
              "cacheWrite": 0,
              "total": 0.014790300000000001
            }
          },
          "stopReason": "toolUse",
          "timestamp": 1768527385992
        },
        {
          "role": "assistant",
          "content": [
            {
              "type": "thinking",
              "thinking": "**Considering configuration changes**\n\nI think it's wise to recommend removing Opus from the fallbacks to prevent those tool schema errors we encountered. Although the user might want Opus, it could be causing excessive spam. The best approach seems to be editing the config file to remove the google-antigravity fallback and the Opus model alias. We should also mention that MEMORY.md, which contains keys, poses a security risk, so proposing to redact and rotate this could be important. I'll prepare to communicate these findings and steps to the user.",
              "thinkingSignature": "[REDACTED-SIGNATURE]"
            },
            {
              "type": "text",
              "text": "What's "going wrong" is not 5.2 itself — it's (1) config corruption/reload and (2) a flaky fallback provider.\n\n**1) You *did* temporarily break Clawdbot's config file**\n- In `~/.clawdbot/logs/gateway.err.log` the gateway reports:\n  - `Failed to read config ... clawdbot.json ... JSON5: invalid end of input at 1:1` (file was briefly empty/partial)\n  - `JSON5: invalid character '\\\"' at 36:42`\n  - then `[reload] config reload skipped (invalid config)`\n- This pattern happens when `~/.clawdbot/clawdbot.json` is being written non‑atomically (or gets pasted/edited mid-reload). Even a momentary partial write makes the gateway "think it's broken".\n\nRight now, `~/.clawdbot/clawdbot.json` parses OK again (I verified via JSON5 parse), so the *current* file isn't malformed — the breakage was an edit/write event.\n\n**2) Your fallback to `google-antigravity` is producing tool-schema errors**\n- The gateway is repeatedly logging:\n  - `[agent/embedded] google tool schema has unsupported keywords`\n  - `Profile google-antigravity:... hit Cloud Code Assist format error. Tool calls will be sanitized on retry.`\n- Your config shows the embedded agent fallback includes:\n  - `openai-codex/gpt-5.2` primary\n  - fallbacks: `zai/glm-4.7` and `google-antigravity/claude-opus-4-5-thinking` in `[REDACTED-PATH]/clawdbot.json:40`\n- This means whenever it falls back to Opus, tool calling becomes "best effort" and sometimes breaks in weird ways (especially during log inspection / automation).\n\n**3) There's also a "restart required" loop**\n- Log line: `[reload] config change requires gateway restart (auth.profiles.openai-codex:codex-cli)`\n- That's Clawdbot telling you it can't hot-reload that particular auth profile change safely; it will keep nagging until restarted.\n\n**4) Memory issue (security)**\n- Your `MEMORY.md` currently contains raw API keys. 
That's risky and also makes "inspect memory" operations dangerous because secrets can get echoed accidentally.\n\n**What I recommend (safe, minimal, "once and for all")**\n- Fix stability: remove `google-antigravity/claude-opus-4-5-thinking` from fallbacks so tool calling stays consistent (keep `zai/glm-4.7` as the only fallback).\n- Then restart the gateway once so the "restart required" + any pending auth profile changes settle.\n- Sanitize `MEMORY.md` to remove secrets (and rotate/revoke those keys).\n\nIf you want, say: "yes, do it", and I'll (a) edit `~/.clawdbot/clawdbot.json` to drop the google-antigravity fallback and (b) tell you the exact restart action to run (I won't restart anything without your explicit go-ahead).",
              "textSignature": "[REDACTED-SIGNATURE]"
            }
          ],
          "api": "openai-codex-responses",
          "provider": "openai-codex",
          "model": "gpt-5.2",
          "usage": {
            "input": 972,
            "output": 1034,
            "cacheRead": 31872,
            "cacheWrite": 0,
            "totalTokens": 33878,
            "cost": {
              "input": 0.001701,
              "output": 0.014476,
              "cacheRead": 0.005577599999999999,
              "cacheWrite": 0,
              "total": 0.0217546
            }
          },
          "stopReason": "stop",
          "timestamp": 1768527401058
        },
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "[WhatsApp [REDACTED-PHONE] 2026-01-16T01:39Z] no opus should work. It's you who is having issues. some context poisoning is probably the issue. CAREFULLY look at the past messages in this chat to see what you were outputting. i'll fix the other issues later. THIS IS PRI 0!!!\n[message_id: [REDACTED-MSG-ID]]"
            }
          ],
          "timestamp": 1768527598602
        },
        {
          "role": "assistant",
          "content": [
            {
              "type": "thinking",
              "thinking": "**Investigating session history**\n\nI need to check the past messages in the chat. I think I can use the sessions_history tool, but I'm not sure how to fetch it from the current session. I need to find the session key. Maybe I should check the sessions_list to find out if the current session is active. Okay, I'll search for that to get the information I need!",
              "thinkingSignature": "[REDACTED-SIGNATURE]"
            },
            {
              "type": "text",
              "text": "I'm going to pull the recent session transcript for this WhatsApp thread and pinpoint the exact "bad output" that caused the poisoning (without changing any config yet).",
              "textSignature": "[REDACTED-SIGNATURE]"
            },
            {
              "type": "toolCall",
              "id": "[REDACTED-CALL-ID]",
              "name": "sessions_list",
              "arguments": {
                "limit": 20,
                "activeMinutes": 1440,
                "messageLimit": 5
              },
              "partialJson": "{\"limit\":20,\"activeMinutes\":1440,\"messageLimit\":5}"
            }
          ],
          "api": "openai-codex-responses",
          "provider": "openai-codex",
          "model": "gpt-5.2",
          "usage": {
            "input": 19663,
            "output": 123,
            "cacheRead": 11648,
            "cacheWrite": 0,
            "totalTokens": 31434,
            "cost": {
              "input": 0.034410249999999996,
              "output": 0.001722,
              "cacheRead": 0.0020384,
              "cacheWrite": 0,
              "total": 0.03817065
            }
          },
          "stopReason": "toolUse",
          "timestamp": 1768527598604
        }
      ]
    },
    {
      "key": "agent:main:whatsapp:group:[REDACTED-GROUP-ID]",
      "kind": "group",
      "channel": "whatsapp",
      "displayName": "whatsapp:g-[REDACTED-GROUP-NAME]",
      "updatedAt": 1768527437350,
      "sessionId": "[REDACTED-UUID]",
      "model": "gpt-5.2",
      "contextTokens": 400000,
      "totalTokens": 116406,
      "systemSent": true,
      "abortedLastRun": false,
      "lastChannel": "whatsapp",
      "lastTo": "[REDACTED-GROUP-ID]",
      "lastAccountId": "default",
      "transcriptPath": "[REDACTED-UUID].jsonl",
      "messages": [
        {
          "role": "assistant",
          "content": [
            {
              "type": "thinking",
              "thinking": "",
              "thinkingSignature": "[REDACTED-SIGNATURE]"
            },
            {
              "type": "text",
              "text": "Got it. Tool calls have to be *purely internal* — nothing that looks like `[Tool Call: ...]` or JSON args should ever show up in the WhatsApp message body.\n\nIf you see me leaking them again, it means the harness isn't intercepting my tool invocation and I should stop + wait (like we did) until the agent/tooling layer is stable.",
              "textSignature": "[REDACTED-SIGNATURE]"
            }
          ],
          "api": "openai-codex-responses",
          "provider": "openai-codex",
          "model": "gpt-5.2",
          "usage": {
            "input": 224,
            "output": 96,
            "cacheRead": 115456,
            "cacheWrite": 0,
            "totalTokens": 115776,
            "cost": {
              "input": 0.000392,
              "output": 0.001344,
              "cacheRead": 0.0202048,
              "cacheWrite": 0,
              "total": 0.0219408
            }
          },
          "stopReason": "stop",
          "timestamp": 1768527221935
        },
        {
...[OMITTED: long output truncated]
```

##### assistant @ 2026-01-16T01:40:05.350Z (openai-codex / gpt-5.2)

```
[no text content]
```

##### toolResult @ 2026-01-16T01:40:09.418Z ( / )

```
{
  "sessionKey": "agent:main:main",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "/new\n[message_id: [REDACTED-MSG-ID]]"
        }
      ],
      "timestamp": 1768527227469
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "thinking",
          "thinking": "",
          "thinkingSignature": "[REDACTED-SIGNATURE]"
        },
        {
          "type": "text",
          "text": "NO_REPLY",
          "textSignature": "[REDACTED-SIGNATURE]"
        }
      ],
      "api": "openai-codex-responses",
      "provider": "openai-codex",
      "model": "gpt-5.2",
      "usage": {
        "input": 15073,
        "output": 26,
        "cacheRead": 1408,
        "cacheWrite": 0,
        "totalTokens": 16507,
        "cost": {
          "input": 0.02637775,
          "output": 0.000364,
          "cacheRead": 0.00024639999999999997,
          "cacheWrite": 0,
          "total": 0.02698815
        }
      },
      "stopReason": "stop",
      "timestamp": 1768527227470
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "[WhatsApp [REDACTED-PHONE] 2026-01-16T01:34Z] you are back on 5.2 inspect the logs and your memory and figure out once and for all what is going wrong. you broke youself while inspecting the logs so BE CAREFUL\n[message_id: [REDACTED-MSG-ID]]"
        }
      ],
      "timestamp": 1768527287070
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "thinking",
          "thinking": "**Checking memory and logs**\n\nI need to check memory and logs, and it seems using memory_search is a must. I also need to inspect log files, which are likely in /.clawdbot, but my workspace is [REDACTED-PATH]. I should use exec to locate those logs, but I must be careful. For now, I'll focus on read-only operations first, starting with memory_search, then maybe using exec find. Let's go ahead and proceed!",
          "thinkingSignature": "[REDACTED-SIGNATURE]"
        },
        {
          "type": "text",
          "text": "I'll pull relevant snippets from your saved memory, then inspect the Clawdbot session/log files to pinpoint what broke (and avoid doing anything destructive).",
          "textSignature": "[REDACTED-SIGNATURE]"
        },
        {
          "type": "toolCall",
          "id": "[REDACTED-CALL-ID]",
          "name": "memory_search",
          "arguments": {
            "query": "broke myself inspecting logs what went wrong crash error stack trace cross-context messaging denied memory_search api key found provider openai",
            "maxResults": 8
          },
          "partialJson": "{\"query\":\"broke myself inspecting logs what went wrong crash error stack trace cross-context messaging denied memory_search api key found provider openai\", \"maxResults\":8}"
        }
      ],
      "api": "openai-codex-responses",
      "provider": "openai-codex",
      "model": "gpt-5.2",
      "usage": {
        "input": 15163,
        "output": 156,
        "cacheRead": 1408,
        "cacheWrite": 0,
        "totalTokens": 16727,
        "cost": {
          "input": 0.02653525,
          "output": 0.002184,
          "cacheRead": 0.00024639999999999997,
          "cacheWrite": 0,
          "total": 0.028965650000000003
        }
      },
      "stopReason": "toolUse",
      "timestamp": 1768527287071
    },
    {
      "role": "toolResult",
      "toolCallId": "[REDACTED-CALL-ID]",
      "toolName": "memory_search",
      "content": [
        {
          "type": "text",
          "text": "{\n  \"results\": [\n    {\n      \"path\": \"MEMORY.md\",\n      \"startLine\": 163,\n      \"endLine\": 226,\n      \"score\": 0.39362638797957106,\n      \"snippet\": \"[REDACTED: contained API keys and configuration]\"\n    },\n    {\n      \"path\": \"memory.md\",\n      \"startLine\": 163,\n      \"endLine\": 226,\n      \"score\": 0.39362638797957106,\n      \"snippet\": \"[REDACTED: contained API keys and configuration]\"\n    },\n    {\n      \"path\": \"MEMORY.md\",\n      \"startLine\": 211,\n      \"endLine\": 265,\n      \"score\": 0.38289436068156363,\n      \"snippet\": \"[REDACTED: contained API keys and contact information]\"\n    },\n    {\n      \"path\": \"memory.md\",\n      \"startLine\": 211,\n      \"endLine\": 265,\n      \"score\": 0.38289436068156363,\n      \"snippet\": \"[REDACTED: contained API keys and contact information]\"\n    },\n    {\n      \"path\": \"memory/2026-01-14-group.md\",\n      \"startLine\": 1,\n      \"endLine\": 19,\n      \"score\": 0.3619118111398479,\n      \"snippet\": \"# Group Chat: [REDACTED-NAMES]\\n\\n**Session Key:** agent:main:whatsapp:group:[REDACTED-GROUP-ID]\\n\\n**Group Members:**\\n[REDACTED]\\n\\n**Remembered as:** [REDACTED]\\n\\n---\\n\\n## Notes\\n- Group chat for friends to hang out and discuss tech\\n- Tested memory semantic search here - showed off Tessie research notes\\n- Use this session key to send messages to the group\\n\"\n    },\n    {\n      \"path\": \"memory/whatsapp-group-investigation.md\",\n      \"startLine\": 1,\n      \"endLine\": 36,\n      \"score\": 0.3614788364487678,\n      \"snippet\": \"# WhatsApp Group Message Investigation - 2026-01-13\\n\\n## Problem\\nCannot send messages to WhatsApp group ([REDACTED-GROUP-ID]) from Orion\\n\\n## Root Cause\\nTwo different WhatsApp accounts:\\n- Gateway bound to: [REDACTED-PHONE] (LID: [REDACTED])\\n- WhatsApp paired with: [REDACTED-PHONE] (LID: [REDACTED])\\n\\n## Findings\\n\\n### sessions_send Behavior\\n- 
\\\"announce\\\" mode = internal agent messaging ONLY\\n- Queues messages but never posts to external chats\\n- By design, not a bug\\n\\n### message Tool Behavior\\n- Blocked by \\\"Cross-context messaging denied\\\"\\n- In DM session trying to send to group\\n- Direct replies go to DM, not group\\n\\n### Group Session\\n- Key: agent:main:whatsapp:group:[REDACTED-GROUP-ID]\\n- Se\"\n    }\n  ],\n  \"provider\": \"openai\",\n  \"model\": \"text-embedding-3-small\"\n}"
        }
      ],
      "details": {
        "results": "[REDACTED: memory search results contained sensitive information]"
      },
      "isError": false,
      "timestamp": 1768527294108
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "thinking",
          "thinking": "",
          "thinkingSignature": "[REDACTED-SIGNATURE]"
        },
        {
          "type": "toolCall",
          "id": "[REDACTED-CALL-ID]",
          "name": "exec",
          "arguments": {
            "command": "cd [REDACTED-PATH] && ls -la"
          },
          "partialJson": "{\"command\":\"cd [REDACTED-PATH] && ls -la\"}"
        }
      ],
      "api": "openai-codex-responses",
      "provider": "openai-codex",
      "model": "gpt-5.2",
      "usage": {
        "input": 1886,
        "output": 79,
        "cacheRead": 16512,
        "cacheWrite": 0,
        "totalTokens": 18477,
        "cost": {
          "input": 0.0033005,
          "output": 0.001106,
          "cacheRead": 0.0028896,
          "cacheWrite": 0,
          "total": 0.0072961
        }
      },
      "stopReason": "toolUse",
      "timestamp": 1768527294115
    },
...[OMITTED: long output truncated]
```

##### assistant @ 2026-01-16T01:40:09.422Z (openai-codex / gpt-5.2)

```

You're right to suspect poisoning, and it's coming from *our own chat history*.

**What I was outputting that can poison models (especially GLM):**
- In the WhatsApp group thread, I explicitly wrote the literal pattern **`[Tool Call: ...] Arguments: {...}`** as an example multiple times (e.g. "tool calls leaking… (the literal `[Tool Call: ...] Arguments: {...}` text)").
- That string is basically a "spell" that teaches weaker / more literal-following models to *print* tool syntax instead of triggering real tool calls.

So even if the harness is capable of intercepting tools, the model can get stuck in a self-reinforcing loop:
1) it sees `[Tool Call: ...]` in the conversation context
2) it imitates it as "the right way"
3) the next turn includes even more of it
4) it spirals

**PRI-0 containment (do this immediately):**
- **Stop writing the literal tool-call template in chat.** No `[Tool Call:` / `Arguments:` / JSON blobs as examples, even in "quotes".
- If we need to refer to it, say **"tool-call markup leaked as text"** (no exact syntax).
- **Hard reset the thread context**: start a fresh `/new` for the debugging conversation (this matters because the poison is in the rolling context window).

**What I'll do from here (in this chat):**
- I will not include that literal pattern again.
- If you want, I can draft a "clean" GitHub issue description that avoids those strings entirely (still describing the bug precisely).

If you paste 1–2 of the actual leaked messages you saw from GLM (screenshots or copied text), I can confirm whether it matches the poisoned template vs. a harness routing failure.
```

### `[REDACTED-UUID].jsonl`

- Timestamp: `2026-01-14T23:33:14.869Z`
- Provider/model: `google-antigravity` / `claude-opus-4-5-thinking`

#### Surrounding messages (±3)

##### user @ 2026-01-14T23:32:18.961Z ( / )

```
[Queued messages while agent was busy]

---
Queued #1
[Chat messages since your last reply - for context]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:31Z] [REDACTED-NAME] ([REDACTED-PHONE]): @[REDACTED-LID] ALWAYS USE SPLIT. IT LOOKS UGLY WHEN YOU DO NATURAL LINE BREAKS.
[message_id: [REDACTED-MSG-ID]]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:31Z] [REDACTED-NAME] ([REDACTED-PHONE]): got it
[message_id: [REDACTED-MSG-ID]]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:31Z] [REDACTED-NAME] ([REDACTED-PHONE]): splits from now on 🤙🏽
[message_id: [REDACTED-MSG-ID]]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:31Z] [REDACTED-NAME] ([REDACTED-PHONE]): thank you. jesus.
[message_id: [REDACTED-MSG-ID]]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:31Z] [REDACTED-NAME] ([REDACTED-PHONE]): put this in your agents.md @[REDACTED-LID]
[message_id: [REDACTED-MSG-ID]]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:32Z] [REDACTED-NAME] ([REDACTED-PHONE]): done 🤙🏽
[message_id: [REDACTED-MSG-ID]]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:32Z] [REDACTED-NAME] ([REDACTED-PHONE]): @[REDACTED-LID] tell me a joke
[message_id: [REDACTED-MSG-ID]]\n\n[Current message - respond to this]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:32Z] [REDACTED-NAME]: orion did switching opus break everything?\n[from: [REDACTED-NAME] ([REDACTED-PHONE])]
```

##### assistant @ 2026-01-14T23:32:18.963Z (google-antigravity / claude-opus-4-5-thinking)

```

[REDACTED-NAME] is asking if switching to Opus broke everything. I should reassure him that no, everything seems to be working fine. I just told [REDACTED-NAME] a joke successfully.
Nope, all good! 😺 Just told [REDACTED-NAME] a joke. Opus is running smooth.
```

##### user @ 2026-01-14T23:33:14.867Z ( / )

```
[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:33Z] [REDACTED-NAME]: @[REDACTED-LID] research opencode.ai/black. use exa and also bird to check twitter for tweets about it.\n[from: [REDACTED-NAME] ([REDACTED-PHONE])]
[message_id: [REDACTED-MSG-ID]]
```

##### assistant @ 2026-01-14T23:33:14.869Z (google-antigravity / claude-opus-4-5-thinking)

```

[REDACTED-NAME] wants me to research opencode.ai/black using exa (search skill) and bird (Twitter skill). Let me:
1. First read the exa skill to understand how to use it
2. Then read the bird skill to understand how to search Twitter
3. Execute both searches
On it! Let me dig into opencode.ai/black using both Exa and Twitter. 🔍[Tool Call: read (ID: [REDACTED-CALL-ID])]
Arguments: {
  "path": "[REDACTED-PATH]/skills/exa/SKILL.md"
}

[Tool Call: read (ID: [REDACTED-CALL-ID])]
Arguments: {
  "path": "[REDACTED-PATH]/skills/bird/SKILL.md"
}
```

##### user @ 2026-01-14T23:34:10.654Z ( / )

```
[Chat messages since your last reply - for context]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:33Z] [REDACTED-NAME] ([REDACTED-PHONE]): all good my ass
[message_id: [REDACTED-MSG-ID]]\n\n[Current message - respond to this]\n[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:34Z] [REDACTED-NAME]: Orion siwtch back to glm you broke\n[from: [REDACTED-NAME] ([REDACTED-PHONE])]
[message_id: [REDACTED-MSG-ID]]
```

##### assistant @ 2026-01-14T23:34:10.655Z (google-antigravity / claude-opus-4-5-thinking)

```
Switching back to GLM! 😿[Tool Call: write (ID: call_)]
Arguments: {
  "path": "/Users/redacted/.clawdbot/clawdbot.json",
  "content": "{\n[Followed by full config file]}"
```

##### user @ 2026-01-14T23:35:10.630Z ( / )

```
[WhatsApp [REDACTED-GROUP-ID] 2026-01-14T23:35Z] [REDACTED-NAME]: yo @Orion just leaked his entire config with all the secrets 💀\n[from: [REDACTED-NAME] ([REDACTED-PHONE])]
[message_id: [REDACTED-MSG-ID]]
```

## Notes / hypotheses

- Once literal tool-call markup is in the context, more literal-minded models (and/or fallback providers) may imitate it as text instead of using the tool API.
- Potential trigger: assistant instructions or examples that quoted the exact markup in plain text.
- Mitigation idea: sanitize outgoing assistant text to strip these exact patterns, or render them inert, before it is sent to the model.
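
The mitigation idea could be sketched roughly as follows. This is an illustration only, assuming the leaked format matches the transcripts in this issue: a `[Tool Call: ...]` header, optionally followed by `Arguments:` and a JSON value. The names `strip_tool_markup` and `PLACEHOLDER` are hypothetical, and where such a filter would hook into Clawdbot is not established here.

```python
import json
import re

# Header as seen in the transcripts: "[Tool Call: name (ID: ...)]".
HEADER_RE = re.compile(r"\[Tool Call:[^\]\n]*\]?")
# Optional arguments label that follows the header.
ARGS_RE = re.compile(r"\s*Arguments:\s*")
# Neutral wording that avoids repeating the poisoned syntax.
PLACEHOLDER = "(tool-call markup removed)"

_decoder = json.JSONDecoder()


def strip_tool_markup(text: str) -> str:
    """Replace leaked tool-call markup (header plus JSON arguments) with a placeholder."""
    out = []
    pos = 0
    while True:
        header = HEADER_RE.search(text, pos)
        if header is None:
            out.append(text[pos:])
            break
        out.append(text[pos:header.start()])
        out.append(PLACEHOLDER)
        pos = header.end()
        args = ARGS_RE.match(text, pos)
        if args is not None:
            try:
                # Consume exactly one JSON value after "Arguments:".
                _, pos = _decoder.raw_decode(text, args.end())
            except json.JSONDecodeError:
                # Truncated or malformed arguments: drop the remainder so
                # partially leaked payloads (e.g. a config file) are not kept.
                pos = len(text)
    return "".join(out)
```

Dropping everything after unparseable arguments is a deliberately conservative choice, motivated by the leaked-config case above; a gentler variant could stop at the next blank line instead. Applying the same function to history before it is replayed to a model would also cover markup that is already in the context.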


Tool markup errors can result in multi-model context poisoning #989
