Description
Problem Statement
The current strands-agents SDK provides a SummarizingConversationManager that compresses conversation history into a single summary message. While effective for general context reduction, this approach has significant limitations:
- Loss of Message Structure: Summarization collapses multiple messages into one, losing the conversational flow and individual message context that may be important for the model's understanding.
- Inefficient for Large Tool Results: When agents process large API responses, database queries, or file contents as intermediate steps, the entire large response gets summarized even though only the final conclusion may be relevant.
- Poor Granular Control: Users cannot selectively preserve important messages while aggressively compressing less relevant ones.
- Tool Result Bloat: Large tool results that are no longer needed (e.g., raw data that has already been processed) continue to consume context space unnecessarily.
- Conversation Flow Disruption: Summarization can break the natural question-answer flow that models rely on for context understanding.
Consider this scenario: An agent processes a 50KB JSON API response, extracts key insights, and provides a summary to the user. The raw JSON is no longer needed, but the current system would either keep the entire response or summarize the entire conversation, potentially losing the structured interaction pattern.
Proposed Solution
Implement a PruningConversationManager that selectively compresses or removes individual messages while preserving the overall conversation structure and flow. Unlike summarization, pruning returns a list of messages where some have been compressed, removed, or truncated while others remain intact.
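To make the contrast with summarization concrete, here is a toy before/after sketch. The message shapes follow the Bedrock-style content format the SDK uses (`role`, `content`, `toolUse`, `toolResult`), but the exact field values are illustrative, not taken from the SDK:

```python
# Before pruning: four messages, one carrying a huge tool result
messages = [
    {"role": "user", "content": [{"text": "Fetch the user data"}]},
    {"role": "assistant", "content": [{"toolUse": {"name": "api_client", "toolUseId": "t1", "input": {}}}]},
    {"role": "user", "content": [{"toolResult": {"toolUseId": "t1", "content": [{"text": "{ ...50KB of raw JSON... }"}]}}]},
    {"role": "assistant", "content": [{"text": "The data shows three key patterns: ..."}]},
]

# After pruning: same message count, same roles, same question-answer flow;
# only the oversized tool result has been compressed in place.
pruned = [
    messages[0],
    messages[1],
    {"role": "user", "content": [{"toolResult": {"toolUseId": "t1", "content": [{"text": "[pruned: 50KB JSON, key fields retained]"}]}}]},
    messages[3],
]
```

Summarization would have replaced all four messages with one summary; pruning keeps the structure intact and shrinks only the bloated entry.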
Core Components
1. Pruning Strategy Interface
Define strategies for selective message compression:
```python
import copy
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional

from ...types.content import Message, Messages


class PruningStrategy(ABC):
    """Abstract interface for message pruning strategies."""

    @abstractmethod
    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        """Determine whether a message should be pruned."""
        ...

    @abstractmethod
    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        """Prune a message, returning the compressed version or None to remove it."""
        ...


class ToolResultPruningStrategy(PruningStrategy):
    """Prune large tool results while preserving tool-use context."""

    def __init__(self, max_tool_result_tokens: int = 500):
        self.max_tool_result_tokens = max_tool_result_tokens

    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        # Prune only if the message contains a tool result above the token budget
        for content in message.get("content", []):
            if "toolResult" in content:
                result_size = self._estimate_tool_result_tokens(content["toolResult"])
                if result_size > self.max_tool_result_tokens:
                    return True
        return False

    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        # Deep-copy so the original message (still referenced in agent.messages)
        # is not mutated when we compress its nested content blocks
        pruned_message = copy.deepcopy(message)
        for content in pruned_message["content"]:
            if "toolResult" in content:
                content["toolResult"] = self._compress_tool_result(content["toolResult"], agent)
        return pruned_message
```
2. PruningConversationManager Implementation
Extend the conversation manager pattern:
```python
class PruningConversationManager(ConversationManager):
    """Conversation manager that selectively prunes messages."""

    def __init__(
        self,
        pruning_strategies: List[PruningStrategy],
        preserve_recent_messages: int = 10,
        max_pruning_ratio: float = 0.6,
        enable_proactive_pruning: bool = True,
        pruning_threshold: float = 0.7,
    ):
        """Initialize the pruning conversation manager.

        Args:
            pruning_strategies: Strategies to apply, in order, when pruning messages.
            preserve_recent_messages: Number of most recent messages that are never pruned.
            max_pruning_ratio: Maximum fraction of messages that may be pruned in one pass.
            enable_proactive_pruning: Whether to prune before hitting a context-overflow error.
            pruning_threshold: Fraction of context-window usage that triggers proactive pruning.
        """
        super().__init__()
        self.pruning_strategies = pruning_strategies
        self.preserve_recent_messages = preserve_recent_messages
        self.max_pruning_ratio = max_pruning_ratio
        self.enable_proactive_pruning = enable_proactive_pruning
        self.pruning_threshold = pruning_threshold

    def apply_management(self, agent: "Agent", **kwargs: Any) -> None:
        """Apply the pruning management strategy."""
        if self.enable_proactive_pruning and self._should_prune_proactively(agent):
            self.reduce_context(agent, **kwargs)

    def reduce_context(self, agent: "Agent", e: Optional[Exception] = None, **kwargs: Any) -> None:
        """Reduce context through selective message pruning."""
        original_messages = agent.messages.copy()
        pruned_messages = self._prune_messages(agent.messages, agent)

        # Only commit the pruned history if it actually reduced token usage
        if self._validate_pruning_effectiveness(original_messages, pruned_messages, agent):
            agent.messages[:] = pruned_messages
            self.removed_message_count += len(original_messages) - len(pruned_messages)
        else:
            # Fall back to more aggressive pruning, or re-raise the triggering exception
            self._handle_pruning_failure(agent, e)
```
3. Built-in Pruning Strategies
Provide common pruning strategies out of the box:
```python
class LargeToolResultPruningStrategy(PruningStrategy):
    """Compress large tool results into short summaries."""

    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        # Use the agent's model to summarize large tool results
        return self._llm_compress_tool_result(message, agent)


class OldMessageRemovalStrategy(PruningStrategy):
    """Remove very old messages that are likely no longer relevant."""

    def __init__(self, max_message_age: int = 50):
        self.max_message_age = max_message_age

    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        return context.get("message_age", 0) > self.max_message_age


class DuplicateContentPruningStrategy(PruningStrategy):
    """Remove or compress messages with duplicate or near-duplicate content."""

    def __init__(self, similarity_threshold: float = 0.8):
        self.similarity_threshold = similarity_threshold

    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        # Use similarity detection to identify duplicated content
        return self._detect_content_similarity(message, context)


class IntermediateStepPruningStrategy(PruningStrategy):
    """Compress intermediate reasoning steps while preserving conclusions."""

    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        # Identify and compress intermediate reasoning
        return self._compress_intermediate_steps(message, agent)
```
4. Pruning Context and Metadata
Track pruning decisions and provide transparency:
```python
class PruningContext:
    """Context information used by strategies when making pruning decisions."""

    def __init__(self, messages: Messages, agent: "Agent", preserve_recent_messages: int = 10):
        self.messages = messages
        self.agent = agent
        self.preserve_recent_messages = preserve_recent_messages
        self.message_ages = self._calculate_message_ages()
        self.token_counts = self._calculate_token_counts()
        self.tool_usage_map = self._build_tool_usage_map()

    def get_message_context(self, index: int) -> Dict[str, Any]:
        """Get context information for a specific message."""
        return {
            "message_age": self.message_ages[index],
            "token_count": self.token_counts[index],
            "has_tool_use": self._has_tool_use(self.messages[index]),
            "has_tool_result": self._has_tool_result(self.messages[index]),
            "is_recent": index >= len(self.messages) - self.preserve_recent_messages,
        }
```
Use Cases
1. Data Processing Workflows
```python
# Agent that processes large datasets
agent = Agent(
    model=model,
    tools=[data_processor, api_client, file_reader],
    conversation_manager=PruningConversationManager(
        pruning_strategies=[
            LargeToolResultPruningStrategy(max_tool_result_tokens=1000),
            IntermediateStepPruningStrategy(),
        ],
        preserve_recent_messages=5,
        pruning_threshold=0.6,
    ),
)

# Large API response gets pruned after processing
result = agent("Fetch user data from API and analyze patterns")
# Raw API response is compressed; the analysis remains intact
```
2. Long-Running Research Sessions
```python
# Research agent with selective memory
research_agent = Agent(
    conversation_manager=PruningConversationManager(
        pruning_strategies=[
            OldMessageRemovalStrategy(max_message_age=50),
            DuplicateContentPruningStrategy(similarity_threshold=0.8),
            LargeToolResultPruningStrategy(max_tool_result_tokens=500),
        ],
        preserve_recent_messages=15,
        max_pruning_ratio=0.7,
    ),
)

# Maintains the research flow while pruning redundant information
for topic in research_topics:
    result = research_agent(f"Research {topic} and provide key insights")
```
3. Multi-Step Problem Solving
```python
# Problem-solving agent that preserves the solution structure
solver_agent = Agent(
    conversation_manager=PruningConversationManager(
        pruning_strategies=[
            IntermediateStepPruningStrategy(),
            LargeToolResultPruningStrategy(max_tool_result_tokens=800),
        ],
        preserve_recent_messages=8,
    ),
)

# Keeps the problem-solution structure while compressing intermediate work
result = solver_agent("Solve this complex optimization problem step by step")
```
4. Custom Pruning Strategy
```python
class BusinessLogicPruningStrategy(PruningStrategy):
    """Custom strategy for business-specific content pruning."""

    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        # Custom business logic for identifying prunable content
        return self._contains_temporary_data(message)

    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        # Custom compression logic
        return self._compress_business_data(message)


# Use the custom strategy alongside a built-in one
agent = Agent(
    conversation_manager=PruningConversationManager(
        pruning_strategies=[
            BusinessLogicPruningStrategy(),
            LargeToolResultPruningStrategy(),
        ]
    )
)
```
Alternative Solutions
1. Enhanced Summarization
- Extend SummarizingConversationManager with selective summarization
- Pros: Builds on existing architecture, familiar pattern
- Cons: Still loses message structure, limited granular control
2. Hierarchical Compression
- Implement multi-level compression with different strategies per level
- Pros: Very flexible, can optimize for different content types
- Cons: Complex configuration, potential over-engineering
3. Content-Aware Sliding Window
- Enhance SlidingWindowConversationManager with content-aware trimming
- Pros: Simple conceptual model, predictable behavior
- Cons: Less flexible than pruning, may remove important content
4. Hybrid Pruning-Summarization
- Combine pruning and summarization in a single manager
- Pros: Best of both approaches, maximum flexibility
- Cons: Increased complexity, potential conflicts between strategies
Additional Context
No response