Skip to content

fix: normalize message trimming behavior across all model providers#2516

Merged
simonferquel merged 5 commits intodocker:mainfrom
simonferquel-clanker:fix/message-trimming-consistency
Apr 27, 2026
Merged

fix: normalize message trimming behavior across all model providers#2516
simonferquel merged 5 commits intodocker:mainfrom
simonferquel-clanker:fix/message-trimming-consistency

Conversation

@simonferquel-clanker
Copy link
Copy Markdown
Contributor

Summary

Fixes #2515.

Ensures strings.TrimSpace() is used only as a skip guard — never to mutate the content that reaches the provider API. The original, untrimmed value is always forwarded when the content is non-whitespace.

Changes by provider

Anthropic (standard — pkg/model/provider/anthropic/client.go)

  • Single-part user message: if txt := strings.TrimSpace(msg.Content); txt != ""if strings.TrimSpace(msg.Content) != "", now sends msg.Content.
  • convertUserMultiContent text parts: same pattern — sends part.Text.
  • convertToolResultBlock single-string path: sends original msg.Content; whitespace-only is normalized to "" (not skipped — tool_result must be present for every preceding tool_use).
  • convertToolResultBlock multi-content text parts: sends part.Text.

Anthropic Beta (pkg/model/provider/anthropic/beta_converter.go)

  • Same fixes as standard, applied to convertBetaMessages, convertBetaUserMultiContent, and convertBetaToolResultBlock.

OpenAI Chat Completions / oaistream (pkg/model/provider/oaistream/messages.go)

  • ConvertMultiContent: switched from pre-allocated slice to append-based; whitespace-only / empty text parts are now skipped (previously a zero-value ChatCompletionContentPartUnionParam was included).
  • Tool-role multi-content text parts: added TrimSpace == "" skip guard.

OpenAI Responses API (pkg/model/provider/openai/client.go)

  • Multi-part user message text parts in convertMessagesToResponseInput: added TrimSpace == "" skip guard.
  • Note: single-part user messages have no guard (consistent with previous behavior; the single-part path is unchanged).

Google Gemini (pkg/model/provider/gemini/client.go)

  • Single-part user/assistant messages: guard strengthened from msg.Content != "" to strings.TrimSpace(msg.Content) != "" — whitespace-only messages (e.g. " ") are now dropped rather than forwarded.
  • convertMultiContent text parts: added TrimSpace == "" skip guard.

Tests

  • Updated oaistream/messages_test.go: the existing "image without URL" test was asserting the old broken behavior (zero-value entry for nil-URL image); corrected to wantCount: 0 and renamed to "image with nil URL produces no part".
  • Added "whitespace-only text part skipped" test case to ConvertMultiContent.
  • go build ./pkg/model/provider/... and go test ./pkg/model/provider/... all pass.

Ensure strings.TrimSpace() is used only as a skip guard, never to mutate
the content that reaches the provider API. Fixes inconsistencies across
Anthropic, OpenAI, Bedrock, Gemini, and oaistream-based providers.

Changes:
- Anthropic (standard + Beta): single-part user messages, multi-part text
  parts, and tool-result content now send the original untrimmed value.
  Tool-result blocks are normalized to empty string (not skipped) when
  whitespace-only, because every tool_use must have a corresponding
  tool_result.
- OpenAI oaistream / Chat Completions: ConvertMultiContent switched from
  pre-allocated slice to append-based; adds TrimSpace guard on text parts
  so whitespace-only and empty parts are not forwarded. Same guard added
  to tool-role multi-content text parts.
- OpenAI Responses API: adds TrimSpace guard on multi-part user message
  text parts in convertMessagesToResponseInput.
- Gemini: strengthens single-part guard from != "" to TrimSpace != "";
  adds TrimSpace guard in convertMultiContent for multi-part text parts.
- Test: updates the oaistream ConvertMultiContent test to assert the
  corrected behavior (nil-URL image part produces no output part, not a
  zero-value entry); adds whitespace-only text part test case.

Fixes docker#2515

Assisted-By: docker-agent
@simonferquel-clanker simonferquel-clanker requested a review from a team as a code owner April 27, 2026 07:49
…opic API

The previous Anthropic provider code trimmed trailing whitespace before
sending content to the API. Now that the original untrimmed content is sent,
cassettes need to reflect the actual bodies:

- TestDebug_Title/Anthropic*: sessiontitle generator builds prompts with
  fmt.Fprintf('%d. %s\n', ...) + a trailing '\n\n' in the format string,
  producing three trailing newlines. The old TrimSpace stripped these; the
  new code sends them as-is.

- TestExec_Anthropic_ToolCall: list_directory tool output ends with '\n'
  (from fmt.Fprintf 'FILE %s\n'). The old TrimSpace on tool results stripped
  this; the new code sends the original content.

Assisted-By: docker-agent
trungutt
trungutt previously approved these changes Apr 27, 2026
@rumpl
Copy link
Copy Markdown
Member

rumpl commented Apr 27, 2026

Is it possible to normalize the behaviour in only one place? In the place where we make the slice of messages of the session?

…Messages

Addresses reviewer feedback: 'Is it possible to normalise the behaviour in only
one place? In the place where we make the slice of messages of the session?'

Add normalizeMessageContent() to session.GetMessages as the single, authoritative
place where whitespace-only content is pruned before any provider converter sees
it. This replaces the patchwork of per-provider TrimSpace skip-guards.

Rules applied by normalizeMessageContent:
- Tool-result messages (Role=tool) are always forwarded, even when empty, because
  every tool_use block must have a corresponding tool_result.
- Assistant messages with ToolCalls or FunctionCall are always forwarded even when
  Content is empty — the content field is optional when tool calls are present.
- Any other message whose Content is whitespace-only (and has no MultiContent) is
  dropped.
- Within MultiContent, text parts whose Text is whitespace-only are stripped;
  non-text parts (images, files) are preserved as-is. If all parts are stripped the
  whole message is dropped.

Provider-converter changes:
- All per-converter TrimSpace skip-guards for user/assistant/system messages
  removed (they are now redundant): Anthropic std+Beta convertMessages,
  oaistream ConvertMultiContent + tool-role multi-content, OpenAI Responses API
  user multi-content, Gemini convertMessagesToGemini + convertMultiContent.
- Pre-existing assistant empty-skip guards in oaistream and openai/client.go are
  kept: they handle streaming artifacts (max_tokens assistant deltas) that are
  produced outside of GetMessages and may not pass through normalizeMessageContent.
- Anthropic system blocks (extractSystemBlocks / extractBetaSystemBlocks) continue
  to TrimSpace because YAML literal-block instructions (instruction: |) always
  append a trailing newline, and the system-block builder is the right place to
  normalise that for the Anthropic API payload.
- The Anthropic send-original correctness fix (tool_result and conversation-message
  content: send original string, not trimmed value) is unchanged.

Tests updated throughout to document the new 'converter passes through, session
layer filters' contract.

Assisted-By: docker-agent
@simonferquel-clanker
Copy link
Copy Markdown
Contributor Author

Done — added normalizeMessageContent() to session.GetMessages as the single authoritative place.

What it does:

  • Drops messages whose Content is whitespace-only and have no MultiContent, ToolCalls, or FunctionCall
  • Strips whitespace-only text parts from MultiContent slices (preserves image/file parts); drops the message if all parts are stripped
  • Always forwards Role=tool messages (even empty) — every tool_use must have a corresponding tool_result
  • Always forwards assistant messages that have ToolCalls — content is optional when tool calls are present

What stayed in the providers:

  • extractSystemBlocks / extractBetaSystemBlocks still trim system-message text, because YAML instruction: | blocks always append a trailing newline and the system-block builder is the right place to handle that
  • The two pre-existing assistant empty-skip guards in oaistream and openai/client.go are kept: they handle streaming artifacts (empty assistant deltas from max_tokens cutoffs) that can be stored in the session and must still be discarded at conversion time
  • Anthropic's tool-result normalization (whitespace → empty string, not skip) is kept because tool results cannot be dropped

All per-provider skip-guards for user/assistant/multi-content text parts have been removed.

@simonferquel simonferquel merged commit 04ee63f into docker:main Apr 27, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistent message/part trimming behavior across model providers

4 participants