Problem
Tool outputs (nmap, ffuf, sqlmap, etc.) are currently kept in conversation history and sent to the LLM on every subsequent iteration until memory compression triggers at 90K tokens.
Current behavior:
```
Iteration 1:  [system] + [nmap_output]              → 50KB sent
Iteration 2:  [system] + [nmap_output] + [response] → 55KB sent
Iteration 3:  [system] + [nmap_output] + [...]      → 60KB sent
...
Iteration 50: still sending that nmap output        → 150KB sent
```
This leads to:
- Wasted tokens: Same output sent 50+ times before compression
- Increased latency: Larger context = slower LLM responses
- Higher costs: Paying for redundant input tokens
- Late compression: 90K threshold means ~50-80 iterations of waste
Current Mitigations (Insufficient)
- Truncation (`executor.py:187-190`): outputs > 10K chars truncated to 8K
- Memory compression (`memory_compressor.py`): summarizes at 90K tokens
These help but don't address the core issue: tool outputs remain in context long after the agent has processed them.
Proposed Solution: File-Backed Tool Results
Concept
Store large tool outputs to disk immediately and include only a reference plus a short summary in the conversation history.
Implementation Sketch
```python
# In executor.py - process_tool_invocations()
from pathlib import Path
from uuid import uuid4


async def _format_tool_result(tool_name: str, result: str, run_dir: Path) -> str:
    """Format tool result, backing large outputs to file."""
    INLINE_THRESHOLD = 2000  # chars

    if len(result) <= INLINE_THRESHOLD:
        return f"<tool_result><tool_name>{tool_name}</tool_name><result>{result}</result></tool_result>"

    # Store full output to file
    tool_results_dir = run_dir / "tool_results"
    tool_results_dir.mkdir(exist_ok=True)
    result_id = f"{tool_name}_{uuid4().hex[:8]}"
    result_file = tool_results_dir / f"{result_id}.txt"
    result_file.write_text(result)

    # Generate summary for context
    summary = await _summarize_tool_output(tool_name, result)  # Or use heuristics

    return f"""<tool_result>
<tool_name>{tool_name}</tool_name>
<result_file>{result_file}</result_file>
<summary>{summary}</summary>
<hint>Use read_tool_result("{result_id}") to access full output if needed</hint>
</tool_result>"""
```

New Tool: read_tool_result
Allow agents to retrieve full output when needed:
```python
@register_tool
async def read_tool_result(result_id: str, lines: str = "all") -> str:
    """Retrieve stored tool output.

    Args:
        result_id: ID from tool_result reference
        lines: "all", "first:100", "last:50", "grep:pattern", etc.
    """
    ...
```

Benefits
| Metric | Current | With File-Backing |
|---|---|---|
| Context per iteration | 50KB+ | ~2KB (summary only) |
| Token cost (100 iter) | ~500K input tokens | ~50K input tokens |
| Full data accessible | ✅ Always in context | ✅ On-demand via tool |
| Agent autonomy | N/A | Can request details when needed |
Alternative Approaches
1. Immediate Summarization
Summarize tool outputs right after execution instead of waiting for the 90K-token threshold.
- Pro: Simpler, no new tool needed
- Con: Lossy, agent can't access original details
2. Sliding Window
Keep only the last N tool outputs in context (a minimal sketch follows below).
- Pro: Simple to implement
- Con: May lose relevant earlier outputs
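As an illustration of the sliding-window option, here is a minimal sketch assuming the conversation history is a list of message dicts and tool outputs are tagged with a "tool" role; both are assumptions about the actual message schema, not existing Strix code:

```python
def prune_tool_results(history: list[dict], keep_last: int = 3) -> list[dict]:
    """Replace all but the last `keep_last` tool-result messages with a short stub.

    Assumes each message is a dict with "role"/"content" keys and that tool
    outputs use the role "tool"; the real message schema may differ.
    """
    tool_indices = [i for i, msg in enumerate(history) if msg.get("role") == "tool"]
    # Guard against keep_last == 0, where [:-0] would drop nothing.
    drop = set(tool_indices[:-keep_last]) if keep_last else set(tool_indices)
    return [
        {"role": "tool", "content": "[tool output pruned from context]"} if i in drop else msg
        for i, msg in enumerate(history)
    ]
```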
3. RAG-based Retrieval
Embed tool outputs, retrieve semantically relevant chunks.
- Pro: Smart retrieval based on current task
- Con: Complex, adds latency, embedding costs
Recommendation
The file-backed approach with summaries (the proposed solution above) provides the best balance:
- Preserves full data (no information loss)
- Drastically reduces context size
- Gives agent control over when to access details
- Simple implementation with clear mental model
Implementation Considerations
- Threshold tuning: 2KB inline vs file-backed
- Summary generation: LLM-based vs heuristic (first/last N lines + stats; see the sketch after this list)
- File cleanup: Delete after run or retain for debugging
- Tool schema for `read_tool_result` with filtering options (sketched below)
- Backwards compatibility: existing tool handlers unchanged
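For the summary-generation consideration, a purely heuristic fallback (no LLM call) could look like the following; the function name and the head/tail split are illustrative assumptions, not existing Strix code:

```python
def summarize_tool_output_heuristic(
    tool_name: str, result: str, head: int = 20, tail: int = 10
) -> str:
    """Summarize a tool output as size stats plus its first/last lines.

    Illustrative heuristic only; an LLM-based summary could replace or augment it.
    """
    lines = result.splitlines()
    stats = f"{tool_name}: {len(result)} chars, {len(lines)} lines"
    if len(lines) <= head + tail:
        return f"{stats}\n{result}"
    excerpt = "\n".join(lines[:head] + ["... [truncated] ..."] + lines[-tail:])
    return f"{stats}\n{excerpt}"
```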
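Likewise, for the `read_tool_result` schema, one possible interpretation of the `lines` filter syntax ("first:N", "last:N", "grep:pattern"); the helper name and storage layout are assumptions for illustration, not a definitive design:

```python
import re
from pathlib import Path


def filter_tool_result(result_file: Path, lines: str = "all") -> str:
    """Apply the assumed `lines` filter syntax to a stored tool result file."""
    text = result_file.read_text()
    if lines == "all":
        return text
    mode, _, arg = lines.partition(":")
    rows = text.splitlines()
    if mode == "first":
        return "\n".join(rows[: int(arg)])
    if mode == "last":
        return "\n".join(rows[-int(arg):])
    if mode == "grep":
        return "\n".join(r for r in rows if re.search(arg, r))
    raise ValueError(f"Unsupported lines filter: {lines!r}")
```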
Related
- Memory compression: `strix/llm/memory_compressor.py`
- Tool execution: `strix/tools/executor.py`
- Agent state: `strix/agents/state.py`