Deal with large tool results, enforce size limit, trigger compaction #94

Merged
stippi merged 2 commits into main from tool-result-size-limit on Mar 11, 2026

stippi commented Mar 11, 2026

No description provided.

stippi added 2 commits March 10, 2026 16:28
Two-layer defense against tool results that blow up the LLM context:

Layer 1: read_files pre-read size check
- Files over 200KB are rejected with an error showing exact size and line count
- Detects single-line/minified files and mentions it in the error message
- New 'ignore_size_limit' parameter lets the LLM explicitly opt in to large reads
- Line-range reads always bypass the limit since the user chose a specific range
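The Layer 1 rules can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ReadRequest`, `FileTooLarge`, `check_size`, and the exact byte value of the limit are hypothetical, and only `ignore_size_limit` and the 200KB figure come from the commit message. The sketch inspects a byte slice for simplicity, whereas a real pre-read check would likely start from file metadata.

```rust
use std::fmt;

/// "200KB" from the commit message; 200 * 1024 is an assumed interpretation.
const MAX_FILE_SIZE_BYTES: usize = 200 * 1024;

/// Hypothetical request shape; only `ignore_size_limit` is named in the PR.
struct ReadRequest {
    line_range: Option<(usize, usize)>,
    ignore_size_limit: bool,
}

/// Hypothetical error carrying the details the commit message says are reported.
#[derive(Debug, PartialEq)]
struct FileTooLarge {
    size_bytes: usize,
    line_count: usize,
    single_line: bool,
}

impl fmt::Display for FileTooLarge {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "file is {} bytes ({} lines), over the {} byte limit",
            self.size_bytes, self.line_count, MAX_FILE_SIZE_BYTES
        )?;
        if self.single_line {
            write!(f, "; it appears to be a single-line or minified file")?;
        }
        Ok(())
    }
}

/// Returns Ok(()) when the read may proceed, Err with size details otherwise.
fn check_size(contents: &[u8], req: &ReadRequest) -> Result<(), FileTooLarge> {
    // Explicit opt-in and line-range reads bypass the limit.
    if req.ignore_size_limit || req.line_range.is_some() {
        return Ok(());
    }
    if contents.len() <= MAX_FILE_SIZE_BYTES {
        return Ok(());
    }
    // Count lines; a trailing newline does not start a new line.
    let newlines = contents.iter().filter(|&&b| b == b'\n').count();
    let line_count = if contents.last() == Some(&b'\n') {
        newlines
    } else {
        newlines + 1
    };
    Err(FileTooLarge {
        size_bytes: contents.len(),
        line_count,
        single_line: line_count <= 1,
    })
}
```

Returning a structured error rather than a preformatted string keeps the size, line count, and minified-file hint available to whatever builds the final message for the LLM.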

Layer 2: Prompt-too-long error recovery in agent loop
- Catches 'prompt too long' errors from all LLM providers (Anthropic, OpenAI, etc.)
- Replaces large tool results (>50KB) from the current turn with
  PromptTooLongError placeholders containing actionable suggestions
- Scoped to current turn only — older tool results are not touched
- Fallback when no single result is large enough: drops the last
  assistant/tool-result exchange and forces context compaction
- Works generically for all tools (read_files, execute_command, web_fetch, etc.)
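The Layer 2 recovery path might look something like the sketch below. The `Message` and `Recovery` types, the function name, and the definition of "current turn" as everything after the last user message are all assumptions made for illustration; the PR only states the 50KB threshold, the current-turn scoping, and the drop-and-compact fallback.

```rust
/// ">50KB" from the commit message; 50 * 1024 is an assumed interpretation.
const REPLACE_THRESHOLD_BYTES: usize = 50 * 1024;

/// Hypothetical, simplified conversation model.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    User(String),
    Assistant(String),
    ToolResult { tool_id: String, content: String },
}

/// Hypothetical outcome of one recovery attempt.
#[derive(Debug, PartialEq)]
enum Recovery {
    /// Ids of tool results that were replaced with placeholders.
    Replaced(Vec<String>),
    /// Nothing was large enough: the last exchange was dropped and the
    /// caller should force context compaction.
    DroppedExchangeAndCompact,
}

fn recover_from_prompt_too_long(messages: &mut Vec<Message>) -> Recovery {
    // Assumption: the current turn is everything after the last user message.
    let turn_start = messages
        .iter()
        .rposition(|m| matches!(m, Message::User(_)))
        .map_or(0, |i| i + 1);

    let mut replaced = Vec::new();
    for msg in &mut messages[turn_start..] {
        if let Message::ToolResult { tool_id, content } = msg {
            if content.len() > REPLACE_THRESHOLD_BYTES {
                *content = format!(
                    "PromptTooLongError: a {} byte result was removed; request a smaller portion",
                    content.len()
                );
                replaced.push(tool_id.clone());
            }
        }
    }
    if !replaced.is_empty() {
        return Recovery::Replaced(replaced);
    }

    // Fallback: drop trailing tool results and the assistant message before them.
    while matches!(messages.last(), Some(Message::ToolResult { .. })) {
        messages.pop();
    }
    if matches!(messages.last(), Some(Message::Assistant(_))) {
        messages.pop();
    }
    Recovery::DroppedExchangeAndCompact
}
```

In this sketch the caller would retry the LLM request after `Replaced`, and trigger compaction before retrying after `DroppedExchangeAndCompact`.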

New type: PromptTooLongError — a ToolResult placeholder with tool-specific
guidance (line ranges, search_files, head/tail, CSS selectors).
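As a rough illustration of what tool-specific guidance could mean, the sketch below maps tool names to suggestions. The struct fields, the method names, and the pairing of each hint with a particular tool are assumptions inferred from the lists above; the actual `PromptTooLongError` type in the PR may be shaped differently.

```rust
/// Hypothetical shape for the placeholder; field names are assumptions.
struct PromptTooLongError {
    tool_name: String,
    original_size_bytes: usize,
}

impl PromptTooLongError {
    /// Picks a hint for the tool that produced the oversized result.
    /// The tool-to-hint pairing here is an assumed mapping.
    fn suggestion(&self) -> &'static str {
        match self.tool_name.as_str() {
            "read_files" => "read a specific line range, or use search_files to locate the relevant part",
            "execute_command" => "pipe the output through head or tail to limit it",
            "web_fetch" => "use a CSS selector to extract only the relevant part of the page",
            _ => "request a smaller portion of the data",
        }
    }

    /// Renders the text that would stand in for the original tool result.
    fn render(&self) -> String {
        format!(
            "Prompt too long: the {} result ({} bytes) was removed. Suggestion: {}.",
            self.tool_name,
            self.original_size_bytes,
            self.suggestion()
        )
    }
}
```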

Includes 5 new tests for the read_files size check covering: rejection,
ignore_size_limit bypass, line-range bypass, line count reporting, and
minified file detection.

UI updates:
- replace_large_tool_results() now returns Vec<(tool_id, error_message)>
  instead of bool, enabling the caller to send UpdateToolStatus events
- Caller sends ToolStatus::Error with 'Prompt Too Long' message for each
  replaced tool, updating both GPUI and terminal UI tool blocks
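The caller side of this change could be sketched as below, assuming the UI is reached through a channel. `UiEvent`, `notify_replaced`, and the use of `std::sync::mpsc` are hypothetical stand-ins; only the `Vec<(tool_id, error_message)>` return shape, `UpdateToolStatus`, and `ToolStatus::Error` are named in the commit message.

```rust
use std::sync::mpsc::Sender;

/// Only the `Error` variant is mentioned in the PR; others are omitted here.
#[derive(Debug, PartialEq)]
enum ToolStatus {
    Error,
}

/// Hypothetical UI event type modelled on the UpdateToolStatus event.
#[derive(Debug, PartialEq)]
enum UiEvent {
    UpdateToolStatus {
        tool_id: String,
        status: ToolStatus,
        message: String,
    },
}

/// Sends one error status per replaced tool result and returns how many
/// events were delivered. `replaced` mirrors the Vec<(tool_id, error_message)>
/// that replace_large_tool_results() is described as returning.
fn notify_replaced(replaced: Vec<(String, String)>, ui: &Sender<UiEvent>) -> usize {
    let mut sent = 0;
    for (tool_id, message) in replaced {
        let event = UiEvent::UpdateToolStatus {
            tool_id,
            status: ToolStatus::Error,
            message,
        };
        // A send fails only if the receiving side has been dropped.
        if ui.send(event).is_ok() {
            sent += 1;
        }
    }
    sent
}
```

Returning the list of replaced ids instead of a `bool` is what makes this per-tool notification possible: a boolean would tell the caller that something was replaced, but not which tool blocks to mark as failed.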

Integration tests with mocked LLM provider:
- test_prompt_too_long_replaces_large_tool_results: verifies the full
  flow of tool execution -> prompt-too-long error -> result replacement
  -> successful retry, including UI event assertions
- test_prompt_too_long_fallback_drops_exchange_and_compacts: verifies
  the fallback path when no tool result exceeds the 50KB replacement
  threshold — exchange is dropped and compaction is forced
stippi merged commit 3837af5 into main Mar 11, 2026
1 check passed
stippi deleted the tool-result-size-limit branch March 11, 2026 12:25