fix(tools): truncate MCP tool output and handle unsupported content types#1843
fix(tools): truncate MCP tool output and handle unsupported content types#1843
Conversation
…ypes MCP tools (e.g. Playwright) can return arbitrarily large output (500KB+ DOM, multi-MB base64 screenshots) that bypasses the 50K char limit enforced on built-in tools via ToolResultBuilder. This causes context overflow (API 400) or wire pipe stalls. Add a 100K character budget to convert_mcp_tool_result(): - Text parts are truncated when the cumulative budget is exceeded - A truncation notice with pagination guidance is appended - Unsupported content types (ValueError from convert_mcp_content) are caught and replaced with a placeholder instead of crashing The 100K char limit aligns with Claude Code's DEFAULT_MAX_MCP_OUTPUT_TOKENS (25K tokens × 4 chars/token = 100K chars). Codex CLI has no MCP truncation at all. Tested on Linux (Docker, Python 3.14.4) and Windows (SSH, Python 3.14.2).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d03315c7d6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| converted = TextPart(text=f"[Unsupported content: {exc}]") | ||
|
|
||
| # Apply character budget to text parts | ||
| if isinstance(converted, TextPart): |
There was a problem hiding this comment.
Enforce output cap for non-text MCP parts
The new budget check only runs when converted is a TextPart, so large ImageContent/AudioContent blocks converted to data URLs bypass MCP_MAX_OUTPUT_CHARS entirely. In practice, a single multi-megabyte base64 image from an MCP tool (e.g. Playwright screenshot) will still be forwarded unbounded, which can trigger the same request-size/context overflow and wire-stall failures this change is trying to prevent.
Useful? React with 👍 / 👎.
- Add isinstance narrowing via _text() helper for ContentPart.text access - Wrap uri string literal with AnyUrl() for BlobResourceContents
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 83f37663df
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if isinstance(converted, TextPart): | ||
| if char_budget <= 0: | ||
| truncated = True | ||
| continue | ||
| if len(converted.text) > char_budget: |
There was a problem hiding this comment.
Cap non-text MCP output before appending
The new truncation logic only runs for TextPart, so ImageContent/AudioContent converted by convert_mcp_content into data: URLs are appended unbounded. In MCP responses that include large base64 blobs (for example Playwright screenshots), this still allows multi-megabyte payloads to enter tool messages, which can overflow model context/request limits and recreate the same turn-stall failures this change is intended to prevent.
Useful? React with 👍 / 👎.
| converted = TextPart(text=f"[Unsupported content: {exc}]") | ||
|
|
||
| # Apply character budget to text parts | ||
| if isinstance(converted, TextPart): |
There was a problem hiding this comment.
This budget only applies to TextParts, but convert_mcp_content() maps MCP images to ImageURLPart data URLs without truncation (packages/kosong/src/kosong/tooling/mcp.py#L15-L17). Those image payloads are then forwarded to providers unchanged (packages/kosong/src/kosong/contrib/chat_provider/openai_responses.py#L408-L410, packages/kosong/src/kosong/contrib/chat_provider/anthropic.py#L500-L523). That means multi-MB screenshots still bypass the new 100K cap, so the context-overflow / wire-stall failure mode described in the PR remains for non-text MCP results.
…udio/video) The previous fix only applied the 100K char budget to TextPart, leaving inline data: URLs from ImageContent/AudioContent/EmbeddedResource uncapped. A Playwright screenshot (10-50MB base64) would pass through unchanged and blow up the context. Now all content types share the same budget: - Text parts: truncated in-place (unchanged) - Media parts (image/audio/video): measured by data URL length; dropped entirely when they exceed remaining budget - Truncation notice appended when any content is dropped Add 4 regression tests for oversized media scenarios.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d555acac0c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| "Skipping unsupported MCP content part: {error}", | ||
| error=exc, | ||
| ) | ||
| converted = TextPart(text=f"[Unsupported content: {exc}]") |
There was a problem hiding this comment.
Use fixed placeholder for unsupported MCP content
convert_mcp_tool_result currently builds the fallback text as "[Unsupported content: {exc}]", but convert_mcp_content raises ValueError(f"Unsupported MCP tool result part: {part}") for unknown blocks (see packages/kosong/src/kosong/tooling/mcp.py). If an unsupported block contains large inline data (e.g., an embedded text/blob resource), this placeholder can consume most or all of the 100K budget and crowd out later valid parts, while also leaking raw payload text to the model instead of a small diagnostic. Please emit a bounded placeholder (e.g., type-only) rather than echoing the full exception string.
Useful? React with 👍 / 👎.
Summary
convert_mcp_tool_result()— MCP tools like Playwright can return 500KB+ DOM or multi-MB base64, bypassing the 50K limit on built-in tools (ToolResultBuilder), causing context overflow (API 400) or wire pipe stallsValueErrorfromconvert_mcp_content()for unsupported content types/MIME — previously crashed the entire turn, now replaced with[Unsupported content: ...]placeholderChanges
src/kimi_cli/soul/toolset.py:650MCP_MAX_OUTPUT_CHARS = 100_000constantsrc/kimi_cli/soul/toolset.py:656_media_part_size()helper for image/audio/video budgetsrc/kimi_cli/soul/toolset.py:667-735convert_mcp_tool_result()with unified char budget + ValueError catchtests/tools/test_mcp_tool_result.pyTest plan
tests/tools/+tests/core/+tests/hooks/— 980 passedruff check+ruff format --check+pyright— 0 errors