[FEATURE] PruningConversationManager for Selective Message Compression #556

@roeetal

Description

Problem Statement

The current strands-agents SDK provides a SummarizingConversationManager that compresses conversation history into a single summary message. While effective for general context reduction, this approach has significant limitations:

  1. Loss of Message Structure: Summarization collapses multiple messages into one, losing the conversational flow and individual message context that may be important for the model's understanding.

  2. Inefficient for Large Tool Results: When agents process large API responses, database queries, or file contents as intermediate steps, the entire large response gets summarized even though only the final conclusion may be relevant.

  3. Poor Granular Control: Users cannot selectively preserve important messages while aggressively compressing less relevant ones.

  4. Tool Result Bloat: Large tool results that are no longer needed (e.g., raw data that has been processed) continue to consume context space unnecessarily.

  5. Conversation Flow Disruption: Summarization can break the natural question-answer flow that models rely on for context understanding.

Consider this scenario: An agent processes a 50KB JSON API response, extracts key insights, and provides a summary to the user. The raw JSON is no longer needed, but the current system would either keep the entire response or summarize the entire conversation, potentially losing the structured interaction pattern.

Proposed Solution

Implement a PruningConversationManager that selectively compresses or removes individual messages while preserving the overall conversation structure and flow. Unlike summarization, pruning returns a list of messages where some have been compressed, removed, or truncated while others remain intact.
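To illustrate the intended behavior, here is a hypothetical before/after conversation (message shapes follow the Bedrock-style content-block format the SDK uses; the field values and the placeholder text are invented for this sketch):

```python
# Hypothetical conversation before pruning: four messages, one carrying
# a large raw tool result that is no longer needed.
before = [
    {"role": "user", "content": [{"text": "Fetch the data"}]},
    {"role": "assistant", "content": [{"toolUse": {"toolUseId": "t1", "name": "api_client", "input": {}}}]},
    {"role": "user", "content": [{"toolResult": {"toolUseId": "t1", "content": [{"text": "<50KB of raw JSON>"}]}}]},
    {"role": "assistant", "content": [{"text": "Key insight: weekly active users doubled."}]},
]

# After pruning: same number of messages, same roles, same order; only
# the oversized tool result is replaced with a compact placeholder.
after = [dict(m) for m in before]
after[2] = {
    "role": "user",
    "content": [{"toolResult": {"toolUseId": "t1", "content": [{"text": "[pruned: raw API response]"}]}}],
}

assert len(after) == len(before)  # message count preserved
assert [m["role"] for m in after] == [m["role"] for m in before]  # flow preserved
assert after[3] == before[3]  # the conclusion survives intact
```

Summarization would instead collapse all four messages into one summary, losing the tool-use/tool-result alternation.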

Core Components

1. Pruning Strategy Interface

Define strategies for selective message compression:

```python
from abc import ABC, abstractmethod
from copy import deepcopy
from typing import Any, Dict, List, Optional

from ...types.content import Message, Messages

class PruningStrategy(ABC):
    """Abstract interface for message pruning strategies."""

    @abstractmethod
    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        """Determine whether a message should be pruned."""
        ...

    @abstractmethod
    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        """Prune a message, returning the compressed version or None to remove it."""
        ...

class ToolResultPruningStrategy(PruningStrategy):
    """Prune large tool results while preserving tool use context."""

    def __init__(self, max_tool_result_tokens: int = 500):
        self.max_tool_result_tokens = max_tool_result_tokens

    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        # Prune only when a tool result exceeds the configured token budget.
        for content in message.get("content", []):
            if "toolResult" in content:
                result_size = self._estimate_tool_result_tokens(content["toolResult"])
                if result_size > self.max_tool_result_tokens:
                    return True
        return False

    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        # Deep-copy so the caller's message list is never mutated in place.
        pruned_message = deepcopy(message)
        for content in pruned_message["content"]:
            if "toolResult" in content:
                content["toolResult"] = self._compress_tool_result(
                    content["toolResult"], agent
                )
        return pruned_message
```
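The helpers referenced above (`_estimate_tool_result_tokens`, `_compress_tool_result`) are left to the implementation. As a starting point, a minimal character-based token estimate could look like the sketch below; the 4-characters-per-token ratio is a rough assumption, not the model's real tokenizer:

```python
import json

def estimate_tool_result_tokens(tool_result: dict) -> int:
    """Crude token estimate: roughly 4 characters per token.

    A production implementation would use the provider's tokenizer; this
    heuristic only needs to be good enough to rank tool results by size.
    """
    serialized = json.dumps(tool_result, default=str)
    return max(1, len(serialized) // 4)
```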

2. PruningConversationManager Implementation

Extend the conversation manager pattern:

```python
class PruningConversationManager(ConversationManager):
    """Conversation manager that selectively prunes messages."""

    def __init__(
        self,
        pruning_strategies: List[PruningStrategy],
        preserve_recent_messages: int = 10,
        max_pruning_ratio: float = 0.6,
        enable_proactive_pruning: bool = True,
        pruning_threshold: float = 0.7,
    ):
        """Initialize the pruning conversation manager.

        Args:
            pruning_strategies: List of strategies to apply for message pruning.
            preserve_recent_messages: Number of recent messages to never prune.
            max_pruning_ratio: Maximum fraction of messages that may be pruned.
            enable_proactive_pruning: Whether to prune proactively based on threshold.
            pruning_threshold: Context usage threshold that triggers proactive pruning.
        """
        super().__init__()
        self.pruning_strategies = pruning_strategies
        self.preserve_recent_messages = preserve_recent_messages
        self.max_pruning_ratio = max_pruning_ratio
        self.enable_proactive_pruning = enable_proactive_pruning
        self.pruning_threshold = pruning_threshold

    def apply_management(self, agent: "Agent", **kwargs: Any) -> None:
        """Apply the pruning management strategy."""
        if self.enable_proactive_pruning and self._should_prune_proactively(agent):
            self.reduce_context(agent, **kwargs)

    def reduce_context(self, agent: "Agent", e: Optional[Exception] = None, **kwargs: Any) -> None:
        """Reduce context through selective message pruning."""
        original_messages = agent.messages.copy()
        pruned_messages = self._prune_messages(agent.messages, agent)

        # Validate that pruning actually reduced token usage
        if self._validate_pruning_effectiveness(original_messages, pruned_messages, agent):
            agent.messages[:] = pruned_messages
            self.removed_message_count += len(original_messages) - len(pruned_messages)
        else:
            # Fall back to more aggressive pruning or raise an exception
            self._handle_pruning_failure(agent, e)
```
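`_should_prune_proactively` could reduce to a simple threshold check against estimated context usage. How the manager estimates usage is left open; the standalone helper below is a hypothetical sketch of the decision itself:

```python
def should_prune_proactively(
    estimated_tokens: int,
    context_window: int,
    pruning_threshold: float = 0.7,
) -> bool:
    """True once estimated usage crosses the threshold fraction of the window.

    estimated_tokens would come from summing per-message token estimates;
    context_window from the configured model's limit.
    """
    return estimated_tokens >= context_window * pruning_threshold
```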

3. Built-in Pruning Strategies

Provide common pruning strategies out of the box:

```python
class LargeToolResultPruningStrategy(PruningStrategy):
    """Compress large tool results into short summaries."""

    def __init__(self, max_tool_result_tokens: int = 500):
        self.max_tool_result_tokens = max_tool_result_tokens

    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        # Use an LLM call to summarize oversized tool results
        return self._llm_compress_tool_result(message, agent)

class OldMessageRemovalStrategy(PruningStrategy):
    """Remove very old messages that are likely irrelevant."""

    def __init__(self, max_message_age: int = 50):
        self.max_message_age = max_message_age

    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        message_age = context.get("message_age", 0)
        return message_age > self.max_message_age

class DuplicateContentPruningStrategy(PruningStrategy):
    """Remove or compress messages with duplicate or near-duplicate content."""

    def __init__(self, similarity_threshold: float = 0.8):
        self.similarity_threshold = similarity_threshold

    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        # Use similarity detection to identify duplicate content
        return self._detect_content_similarity(message, context)

class IntermediateStepPruningStrategy(PruningStrategy):
    """Compress intermediate reasoning steps while preserving conclusions."""

    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        # Identify and compress intermediate reasoning
        return self._compress_intermediate_steps(message, agent)
```
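For DuplicateContentPruningStrategy, `_detect_content_similarity` could start as a plain string-similarity check. The sketch below uses `difflib.SequenceMatcher` purely as a dependency-free stand-in; an embedding-based comparison would scale better:

```python
from difflib import SequenceMatcher

def detect_content_similarity(
    text: str,
    previous_texts: list,
    threshold: float = 0.8,
) -> bool:
    """Return True if text closely matches any earlier message text.

    SequenceMatcher.ratio() is O(n*m) per pair, so a real implementation
    would want to cap the comparison window or precompute embeddings.
    """
    return any(
        SequenceMatcher(None, text, prev).ratio() >= threshold
        for prev in previous_texts
    )
```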

4. Pruning Context and Metadata

Track pruning decisions and provide transparency:

```python
class PruningContext:
    """Context information for pruning decisions."""

    def __init__(self, messages: Messages, agent: "Agent", preserve_recent_messages: int = 10):
        self.messages = messages
        self.agent = agent
        self.preserve_recent_messages = preserve_recent_messages
        self.message_ages = self._calculate_message_ages()
        self.token_counts = self._calculate_token_counts()
        self.tool_usage_map = self._build_tool_usage_map()

    def get_message_context(self, index: int) -> Dict[str, Any]:
        """Get context information for a specific message."""
        return {
            "message_age": self.message_ages[index],
            "token_count": self.token_counts[index],
            "has_tool_use": self._has_tool_use(self.messages[index]),
            "has_tool_result": self._has_tool_result(self.messages[index]),
            "is_recent": index >= len(self.messages) - self.preserve_recent_messages,
        }
```
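The bookkeeping in `_calculate_message_ages` can follow a simple convention, e.g. a message's age is the number of messages that arrived after it (a convention chosen for this sketch, not mandated by the SDK):

```python
def calculate_message_ages(num_messages: int) -> list:
    """Newest message has age 0; the oldest has age num_messages - 1."""
    return list(range(num_messages - 1, -1, -1))
```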

Use Cases

1. Data Processing Workflows

```python
# Agent that processes large datasets
agent = Agent(
    model=model,
    tools=[data_processor, api_client, file_reader],
    conversation_manager=PruningConversationManager(
        pruning_strategies=[
            LargeToolResultPruningStrategy(max_tool_result_tokens=1000),
            IntermediateStepPruningStrategy(),
        ],
        preserve_recent_messages=5,
        pruning_threshold=0.6,
    ),
)

# Large API response gets pruned after processing
result = agent("Fetch user data from API and analyze patterns")
# Raw API response is compressed; the analysis remains intact
```

2. Long-Running Research Sessions

```python
# Research agent with selective memory
research_agent = Agent(
    conversation_manager=PruningConversationManager(
        pruning_strategies=[
            OldMessageRemovalStrategy(max_message_age=50),
            DuplicateContentPruningStrategy(similarity_threshold=0.8),
            LargeToolResultPruningStrategy(max_tool_result_tokens=500),
        ],
        preserve_recent_messages=15,
        max_pruning_ratio=0.7,
    ),
)

# Maintains research flow while pruning redundant information
for topic in research_topics:
    result = research_agent(f"Research {topic} and provide key insights")
```

3. Multi-Step Problem Solving

```python
# Problem-solving agent that preserves solution structure
solver_agent = Agent(
    conversation_manager=PruningConversationManager(
        pruning_strategies=[
            IntermediateStepPruningStrategy(),
            LargeToolResultPruningStrategy(max_tool_result_tokens=800),
        ],
        preserve_recent_messages=8,
    ),
)

# Keeps the problem-solution structure while compressing intermediate work
result = solver_agent("Solve this complex optimization problem step by step")
```

4. Custom Pruning Strategy

```python
class BusinessLogicPruningStrategy(PruningStrategy):
    """Custom strategy for business-specific content pruning."""

    def should_prune_message(self, message: Message, context: Dict[str, Any]) -> bool:
        # Custom business logic for identifying pruneable content
        return self._contains_temporary_data(message)

    def prune_message(self, message: Message, agent: "Agent") -> Optional[Message]:
        # Custom compression logic
        return self._compress_business_data(message)

# Use the custom strategy
agent = Agent(
    conversation_manager=PruningConversationManager(
        pruning_strategies=[
            BusinessLogicPruningStrategy(),
            LargeToolResultPruningStrategy(),
        ]
    )
)
```

Alternative Solutions

1. Enhanced Summarization

  • Extend SummarizingConversationManager with selective summarization
  • Pros: Builds on existing architecture, familiar pattern
  • Cons: Still loses message structure, limited granular control

2. Hierarchical Compression

  • Implement multi-level compression with different strategies per level
  • Pros: Very flexible, can optimize for different content types
  • Cons: Complex configuration, potential over-engineering

3. Content-Aware Sliding Window

  • Enhance SlidingWindowConversationManager with content-aware trimming
  • Pros: Simple conceptual model, predictable behavior
  • Cons: Less flexible than pruning, may remove important content

4. Hybrid Pruning-Summarization

  • Combine pruning and summarization in a single manager
  • Pros: Best of both approaches, maximum flexibility
  • Cons: Increased complexity, potential conflicts between strategies

