Skip to content

Empty content on tool-call envelopes to stop model echo#240

Merged
lezama merged 1 commit into
mainfrom
fix/tool-call-empty-content
May 29, 2026
Merged

Empty content on tool-call envelopes to stop model echo#240
lezama merged 1 commit into
mainfrom
fix/tool-call-empty-content

Conversation

@lezama
Copy link
Copy Markdown
Contributor

@lezama lezama commented May 29, 2026

Problem

WP_Agent_Conversation_Loop records each tool call in the transcript as an envelope whose content is the human-readable string "Calling <tool_name>". Adapters that flatten the transcript to provider-side text (e.g. openclawp's WP AI Client adapter, which projects {role, content} per message and drops empty-content entries) pass that string back to the model on subsequent turns as a previous assistant text turn.

On long multi-tool conversations — especially with cheaper / faster models like Gemini Flash — the model pattern-matches the recurring "Calling X" precedent and begins emitting the literal string as its own text output instead of issuing a real function call or producing a final answer. The user sees the debug string as the bot's reply.

Repro (production-observed)

Surfaced in wp-carpeta (WhatsApp bot channel, Gemini gemini-3.5-flash, 5-turn conversation, 4 prior search-* tool calls):

turn 1: tool_call_count=2  (find-lote, search-datos)
turn 2: tool_call_count=1  (search-wa-messages)
turn 3: tool_call_count=1  (search-wa-messages)
turn 4: tool_call_count=1  (search-wa-messages)
turn 5: tool_call_count=0  output: "Calling carpeta__search-wa-messages"   ← regurgitated as text

OpenclaWP_Message_Adapter::last_assistant_text() then returns that string as the channel reply. The bot delivers "Calling carpeta__search-wa-messages" to the user.

Fix

Pass '' as the envelope content. The structured tool-call data (tool_name + parameters + tool_call_id + turn) already lives in the envelope's metadata/payload, which is what providers should be using to construct the actual function_call structure. Adapters that need a human-readable label for log/UI surfaces can derive one from metadata.tool_name.

This means downstream adapters that previously skipped empty-text messages (if ( '' === \$text ) continue;) will now naturally exclude these envelopes from the LLM-facing transcript — which is the correct behavior, since the function-calling pathway uses the structured tool declarations + responses, not in-transcript text.

Tests

All existing smoke tests pass — no assertion in the suite relied on the "Calling X" content string.

$ composer test
... (full suite green)

Compat notes

  • Provider adapters that already use envelope metadata for tool-call construction (the canonical path) are unaffected.
  • Adapters that displayed the human-readable label can derive it locally from metadata.tool_name (e.g. "Calling " . \$msg['metadata']['tool_name']).
  • Transcript persisters / replayers receive an envelope with the same type=tool_call + payload — only the rendered content changes.

🤖 Generated with Claude Code

When a tool call is recorded in the conversation transcript, the
envelope's `content` field carried a human-readable string ("Calling
<tool_name>"). Adapters that flatten the transcript to provider text
(e.g. openclawp's WP AI Client adapter) pass that string back to the
model on subsequent turns as a previous assistant text turn.

On long multi-tool conversations — especially with smaller / cheaper
models like Gemini Flash — the model pattern-matches that recurring
"Calling X" precedent and starts emitting the literal string as its
own text output instead of issuing a real function call or producing
a final answer. The user sees the debug string as the bot's reply.

Concrete failure mode observed on a wp-carpeta deployment (Gemini
gemini-3.5-flash, 5-turn convo with 4 search-* tool calls): turn 5
emitted `"Calling carpeta__search-wa-messages"` as plain text with
`tool_call_count=0`, breaking the channel response.

Fix: pass `''` as content. The structured tool-call data
(tool_name + parameters + tool_call_id + turn) is already in the
envelope's metadata/payload, which is what providers should be using
to construct the actual function_call structure. Adapters that need
a human-readable label for log/UI surfaces can derive one from
`metadata.tool_name`.

Smoke tests pass unchanged — no assertion in the suite relied on the
"Calling X" content string.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@lezama lezama merged commit f0879d6 into main May 29, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant