# Filtering and Trimming Messages - Practice Exercises

## Overview
This notebook provides hands-on exercises to practice message management in LangGraph. You'll learn to filter, trim, and manipulate message histories for efficient conversational AI systems.

## Learning Objectives
By the end of these exercises, you will be able to:
- Explain how MessagesState handles messages in LangGraph
- Filter messages by type, content, and recency
- Trim message histories to control context window size
- Use RemoveMessage for selective message deletion
- Build efficient conversation management systems
- Handle token limits and optimize context usage

## Prerequisites
- Completion of the trim-filter-messages.ipynb tutorial
- Understanding of LangChain message types
- Basic knowledge of conversational AI concepts

In [None]:
%%capture --no-stderr
%pip install --quiet -U langchain_core langgraph langchain_openai tiktoken

In [None]:
import os, getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")

# Optional: Set up LangSmith for tracing
# _set_env("LANGSMITH_API_KEY")
# os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_PROJECT"] = "langchain-academy"

## Exercise 1: Basic Message Management

### Task
Create a simple chatbot that demonstrates basic message handling with MessagesState.
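
As a hint, the pattern behind a chat node can be sketched without any dependencies. The `Msg` class, `EchoModel`, and `append_reducer` below are hypothetical stand-ins for a LangChain message, a chat model, and the append behaviour of the messages reducer; they are not the library's API. The point is that a node returns only the new message, and the reducer merges it into the existing history.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Msg:
    """Hypothetical stand-in for a LangChain message (role and text only)."""
    role: str
    content: str


class EchoModel:
    """Hypothetical stand-in for a chat model: invoke() returns one AI message."""

    def invoke(self, messages: List[Msg]) -> Msg:
        return Msg("ai", f"You said: {messages[-1].content}")


def chat_node(state: Dict[str, List[Msg]], model: EchoModel) -> Dict[str, List[Msg]]:
    # Return only the new message as a partial update, not the full history.
    return {"messages": [model.invoke(state["messages"])]}


def append_reducer(existing: List[Msg], update: List[Msg]) -> List[Msg]:
    # Simplified stand-in for the append behaviour of the messages reducer.
    return existing + update


state = {"messages": [Msg("system", "Be concise."), Msg("human", "What is LangGraph?")]}
update = chat_node(state, EchoModel())
state = {"messages": append_reducer(state["messages"], update["messages"])}
print([m.role for m in state["messages"]])  # ['system', 'human', 'ai']
```

In the notebook itself, the equivalent node body would typically be `return {"messages": [model.invoke(state["messages"])]}`, with LangGraph applying the reducer for you.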

### TODO: Set up basic message-based chatbot

In [None]:
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import MessagesState, StateGraph, START, END
from IPython.display import Image, display

# TODO: Initialize the chat model
model = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# TODO: Implement basic chat node
def chat_node(state: MessagesState):
    print(f"Processing {len(state['messages'])} messages")
    # TODO: Call the model with the messages and return the response
    pass

# TODO: Implement message inspector node
def inspect_messages(state: MessagesState):
    print("\n=== Message Inspection ===")
    for i, msg in enumerate(state['messages']):
        print(f"{i+1}. {type(msg).__name__}: {str(msg.content)[:50]}...")
    print(f"Total messages: {len(state['messages'])}")
    return {}  # No state updates, just inspection

In [None]:
# TODO: Build basic chat graph
builder_basic = StateGraph(MessagesState)
# TODO: Add chat_node and inspect_messages
# TODO: Create flow: START -> chat_node -> inspect_messages -> END

graph_basic = builder_basic.compile()
display(Image(graph_basic.get_graph().draw_mermaid_png()))

In [None]:
# TODO: Test basic message handling
initial_messages = [
    SystemMessage(content="You are a helpful assistant that explains concepts clearly."),
    HumanMessage(content="What is LangGraph?"),
]

result_basic = graph_basic.invoke({"messages": initial_messages})
print("\nFinal conversation:")
for msg in result_basic['messages']:
    print(f"{type(msg).__name__}: {msg.content}")

## Exercise 2: Message Filtering by Type and Content

### Task
Implement various message filtering strategies to manage conversation context.
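
One possible approach to the four filters is sketched below. It relies only on `type(msg).__name__` and `msg.content`, so the same functions should also work with real LangChain messages. The small message classes are stand-ins defined here so the sketch runs on its own; drop them when using the notebook's imports.

```python
from typing import Any, List


class _Message:
    """Stand-in base class so this sketch runs without LangChain installed."""

    def __init__(self, content: str) -> None:
        self.content = content


# Named like the LangChain classes so type-name filtering behaves the same.
class SystemMessage(_Message):
    pass


class HumanMessage(_Message):
    pass


class AIMessage(_Message):
    pass


def filter_by_type(messages: List[Any], keep_types: List[str]) -> List[Any]:
    return [m for m in messages if type(m).__name__ in keep_types]


def filter_by_content(messages: List[Any], exclude_keywords: List[str]) -> List[Any]:
    lowered = [k.lower() for k in exclude_keywords]
    # str() guards against non-string content such as lists of content blocks.
    return [m for m in messages if not any(k in str(m.content).lower() for k in lowered)]


def filter_recent_messages(messages: List[Any], max_messages: int) -> List[Any]:
    # A plain messages[-0:] would return everything, so handle 0 explicitly.
    return messages[-max_messages:] if max_messages > 0 else []


def smart_filter_preserve_system(messages: List[Any], max_messages: int) -> List[Any]:
    is_system = [type(m).__name__ == "SystemMessage" for m in messages]
    slots = max(max_messages - sum(is_system), 0)
    others = [i for i, flag in enumerate(is_system) if not flag]
    recent = set(others[-slots:]) if slots else set()
    # Rebuild in original order: every system message plus the newest others.
    return [m for i, m in enumerate(messages) if is_system[i] or i in recent]


chat = [
    SystemMessage("You are a helpful coding assistant."),
    HumanMessage("Help me debug this Python code."),
    AIMessage("Please share your code."),
    HumanMessage("Never mind, I found the issue."),
    AIMessage("Great! Happy to help anytime."),
]
print(len(filter_by_content(chat, ["debug"])))  # 4
print([type(m).__name__ for m in smart_filter_preserve_system(chat, 3)])
```

Keyword matching here is a case-insensitive substring test, so "debug" also excludes "debugging". Whether that is desirable depends on your use case.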

### TODO: Implement message filtering functions

In [None]:
from typing import List
from langchain_core.messages import BaseMessage

# TODO: Implement filter to keep only specific message types
def filter_by_type(messages: List[BaseMessage], keep_types: List[str]) -> List[BaseMessage]:
    """Filter messages to keep only specified types."""
    # TODO: Return only messages whose type names are in keep_types
    pass

# TODO: Implement filter to remove messages with specific content
def filter_by_content(messages: List[BaseMessage], exclude_keywords: List[str]) -> List[BaseMessage]:
    """Filter out messages containing specific keywords."""
    # TODO: Remove messages that contain any of the exclude_keywords
    pass

# TODO: Implement filter to keep only recent messages
def filter_recent_messages(messages: List[BaseMessage], max_messages: int) -> List[BaseMessage]:
    """Keep only the most recent N messages."""
    # TODO: Return the last max_messages from the list
    pass

# TODO: Implement smart filter that preserves system messages
def smart_filter_preserve_system(messages: List[BaseMessage], max_messages: int) -> List[BaseMessage]:
    """Filter messages but always preserve system messages."""
    # TODO: Keep all system messages + most recent non-system messages up to max_messages total
    pass

In [None]:
# TODO: Create test conversation with various message types
test_messages = [
    SystemMessage(content="You are a helpful coding assistant."),
    HumanMessage(content="Help me debug this Python code."),
    AIMessage(content="I'd be happy to help! Please share your code."),
    HumanMessage(content="Here's my buggy code: print('hello world')"),
    AIMessage(content="That code looks fine to me. What error are you seeing?"),
    HumanMessage(content="Never mind, I found the issue. Thanks!"),
    AIMessage(content="Great! Happy to help anytime."),
    HumanMessage(content="Actually, can you help with a different debugging issue?"),
    AIMessage(content="Of course! What's the new issue you're facing?"),
]

# TODO: Test your filtering functions
print("Original messages:", len(test_messages))

# Test filter_by_type
human_ai_only = filter_by_type(test_messages, ["HumanMessage", "AIMessage"])
print(f"Human + AI only: {len(human_ai_only)} messages")

# Test filter_by_content  
no_debug_messages = filter_by_content(test_messages, ["debug", "buggy"])
print(f"Excluding debug-related: {len(no_debug_messages)} messages")

# Test filter_recent_messages
recent_only = filter_recent_messages(test_messages, 4)
print(f"Recent 4 messages: {len(recent_only)} messages")

# Test smart_filter_preserve_system
smart_filtered = smart_filter_preserve_system(test_messages, 5)
print(f"Smart filtered (preserve system): {len(smart_filtered)} messages")

## Exercise 3: Token-Based Message Trimming

### Task
Implement intelligent message trimming based on token counts to stay within model context limits.
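
The trimming logic can be sketched independently of any tokenizer. The version below uses the rough "4 characters per token" estimate and plain `(role, content)` tuples as a hypothetical stand-in for `BaseMessage`; swap in your tiktoken-based counter for real counts. LangChain also ships a `trim_messages` helper, which this sketch does not use.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content): hypothetical stand-in for BaseMessage


def estimate_tokens(text: str) -> int:
    # Rough heuristic: about 4 characters per token, never less than 1.
    return max(1, len(text) // 4)


def trim_by_tokens(messages: List[Message], max_tokens: int) -> Tuple[List[Message], int]:
    # System messages are always kept, even if they alone exceed the budget.
    keep = [role == "system" for role, _ in messages]
    used = sum(estimate_tokens(content) for role, content in messages if role == "system")
    # Walk backwards from the newest message and stop at the first one that
    # does not fit, so the kept history stays contiguous.
    for i in range(len(messages) - 1, -1, -1):
        role, content = messages[i]
        if role == "system":
            continue
        cost = estimate_tokens(content)
        if used + cost > max_tokens:
            break
        keep[i] = True
        used += cost
    return [m for m, k in zip(messages, keep) if k], used


history = [
    ("system", "s" * 40),   # about 10 tokens
    ("human", "a" * 80),    # about 20 tokens
    ("ai", "b" * 120),      # about 30 tokens
    ("human", "c" * 40),    # about 10 tokens
]
trimmed, used = trim_by_tokens(history, 50)
print([role for role, _ in trimmed], used)  # ['system', 'ai', 'human'] 50
```

Stopping at the first message that does not fit is a design choice: skipping it and continuing would pack more tokens in, but could leave gaps in the dialogue.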

### TODO: Implement token-based trimming

In [None]:
import tiktoken
from typing import Tuple

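# TODO: Initialize tokenizer (use the encoding that matches the chat model above;
# "gpt-4" resolves to a different encoding than "gpt-4o-mini", so counts would differ)
tokenizer = tiktoken.encoding_for_model("gpt-4o-mini")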

# TODO: Implement token counting function
def count_message_tokens(message: BaseMessage) -> int:
    """Count tokens in a message."""
    # TODO: Count tokens in message content
    pass

# TODO: Implement function to count total tokens in message list
def count_total_tokens(messages: List[BaseMessage]) -> int:
    """Count total tokens in a list of messages."""
    # TODO: Sum up tokens from all messages
    pass

# TODO: Implement smart trimming function
def trim_messages_by_tokens(messages: List[BaseMessage], max_tokens: int) -> Tuple[List[BaseMessage], int]:
    """Trim messages to fit within token limit, preserving system messages."""
    # TODO: Implement smart trimming:
    # 1. Always keep system messages
    # 2. Keep as many recent messages as possible within token limit
    # 3. Return (trimmed_messages, total_tokens)
    pass

# TODO: Implement function that estimates tokens without tokenizer
def estimate_tokens(text: str) -> int:
    """Quick token estimation (roughly 4 chars = 1 token)."""
    # TODO: Implement rough estimation
    pass

In [None]:
# TODO: Create test conversation with varying message lengths
long_conversation = [
    SystemMessage(content="You are an expert software architect with 20 years of experience in distributed systems, microservices, and cloud architecture."),
    HumanMessage(content="I need help designing a scalable e-commerce platform."),
    AIMessage(content="I'd be happy to help you design a scalable e-commerce platform. Let's break this down into key components: user management, product catalog, shopping cart, order processing, payment systems, inventory management, and analytics. Each of these needs to handle high traffic and scale independently."),
    HumanMessage(content="What technologies would you recommend for the backend services?"),
    AIMessage(content="For backend services, I recommend a microservices architecture using containerized applications. Consider using Node.js or Python with FastAPI for API services, PostgreSQL for transactional data, Redis for caching and sessions, RabbitMQ or Apache Kafka for message queuing, and Elasticsearch for search functionality. Deploy using Kubernetes on AWS, GCP, or Azure."),
    HumanMessage(content="How should we handle user authentication and authorization?"),
    AIMessage(content="Implement OAuth 2.0 with JWT tokens for authentication. Use a dedicated identity service like Auth0, AWS Cognito, or build your own using libraries like Passport.js. Implement role-based access control (RBAC) with permissions stored in your user service. Consider multi-factor authentication for enhanced security."),
]

# TODO: Test token counting and trimming
print("Token Analysis:")
for i, msg in enumerate(long_conversation):
    tokens = count_message_tokens(msg)
    print(f"{i+1}. {type(msg).__name__}: {tokens} tokens")

total_tokens = count_total_tokens(long_conversation)
print(f"\nTotal tokens: {total_tokens}")

# Test trimming with different limits
for limit in [200, 400, 600]:
    trimmed, actual_tokens = trim_messages_by_tokens(long_conversation, limit)
    print(f"\nLimit: {limit} tokens")
    print(f"Kept: {len(trimmed)} messages, {actual_tokens} tokens")
    for msg in trimmed:
        print(f"  - {type(msg).__name__}: {msg.content[:30]}...")

## Exercise 4: Using RemoveMessage for Selective Deletion

### Task
Implement selective message removal using LangGraph's RemoveMessage functionality.
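
The selection step can be worked out separately from LangGraph. In the sketch below, `Msg`, `ids_to_remove`, and the `threshold` parameter are hypothetical names introduced for illustration. In the notebook, each returned ID would be wrapped as `RemoveMessage(id=...)` and returned under the `"messages"` key so the reducer deletes it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Msg:
    """Hypothetical stand-in for a LangChain message with an ID."""
    id: str
    role: str
    content: str


def ids_to_remove(messages: List[Msg], keep_last: int = 4) -> List[str]:
    # System messages are never candidates for removal.
    non_system = [m for m in messages if m.role != "system"]
    # Everything older than the newest keep_last non-system messages is stale.
    stale = non_system[:-keep_last] if keep_last > 0 else non_system
    return [m.id for m in stale]


def should_cleanup(message_count: int, threshold: int = 8) -> str:
    # Routing decision for a conditional edge.
    return "cleanup" if message_count > threshold else "continue"


history = [Msg("0", "system", "You are a helpful assistant.")]
history += [Msg(str(i), "human" if i % 2 else "ai", f"turn {i}") for i in range(1, 8)]

print(ids_to_remove(history))  # ['1', '2', '3']
print(should_cleanup(9))       # cleanup
```

Note that this sketch counts the four most recent non-system messages; if you count system messages toward the four, adjust the slice accordingly.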

### TODO: Implement selective message removal

In [None]:
from langchain_core.messages import RemoveMessage
from langgraph.checkpoint.memory import MemorySaver

# TODO: Create state with message management
class ConversationState(MessagesState):
    message_count: int
    last_cleanup: str

# TODO: Implement chat node with message tracking
def tracked_chat_node(state: ConversationState):
    print(f"Processing conversation (current: {state.get('message_count', 0)} messages)")
    # TODO: Call model and increment message count
    response = model.invoke(state['messages'])
    # TODO: Return response and updated message_count
    pass

# TODO: Implement selective cleanup node
def cleanup_old_messages(state: ConversationState):
    print("Cleaning up old messages...")
    messages_to_remove = []
    
    # TODO: Identify messages to remove based on criteria:
    # - Remove messages older than the last 4 messages
    # - Never remove system messages
    # - Create a RemoveMessage(id=msg.id) for each message to remove
    
    current_time = "2024-01-01 12:00:00"  # Placeholder timestamp
    
    # TODO: Return RemoveMessage objects and update last_cleanup
    pass

# TODO: Implement conditional cleanup based on message count
def should_cleanup(state: ConversationState) -> str:
    # TODO: Return "cleanup" if message_count > 8, otherwise "continue"
    pass

In [None]:
# TODO: Build conversation graph with cleanup
builder_cleanup = StateGraph(ConversationState)
# TODO: Add nodes: tracked_chat_node, cleanup_old_messages
# TODO: Add conditional edges based on message count
# TODO: Create flow with cleanup when needed

# Add memory for persistence
memory = MemorySaver()
graph_cleanup = builder_cleanup.compile(checkpointer=memory)
display(Image(graph_cleanup.get_graph().draw_mermaid_png()))

In [None]:
# TODO: Test selective message removal
config = {"configurable": {"thread_id": "cleanup_test"}}

# Start with system message
initial_state = {
    "messages": [SystemMessage(content="You are a helpful assistant.")],
    "message_count": 1,
    "last_cleanup": "never"
}

# Simulate a long conversation
conversation_turns = [
    "Hi there!",
    "How are you doing today?", 
    "What's the weather like?",
    "Tell me a joke.",
    "What's 2+2?",
    "Explain quantum computing.",
    "What's your favorite color?",
    "How do neural networks work?"
]

state = initial_state
for turn in conversation_turns:
    print(f"\n--- Adding: '{turn}' ---")
    state = {**state, "messages": state["messages"] + [HumanMessage(content=turn)]}
    result = graph_cleanup.invoke(state, config)
    
    print(f"Messages in conversation: {len(result['messages'])}")
    print(f"Last cleanup: {result['last_cleanup']}")
    
    state = result

## Exercise 5: Advanced Message Management System

### Task
Build a sophisticated message management system that combines filtering, trimming, and selective removal with different strategies based on conversation context.
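
A minimal sketch of the context analysis step, assuming simple keyword matching is good enough for a first pass. The keyword lists mirror the ones used in this exercise (lower-cased for matching), while the `PROFILES` budgets and strategy assignments are hypothetical values chosen only for illustration.

```python
from typing import Any, Dict, List

TECHNICAL = ["code", "algorithm", "programming", "debug", "api"]
SUPPORT = ["help", "issue", "problem", "error", "fix"]

# Hypothetical budgets and strategies per type; tune these for your use case.
PROFILES: Dict[str, Dict[str, Any]] = {
    "technical": {"token_budget": 2000, "cleanup_strategy": "topic_focused"},
    "support": {"token_budget": 1000, "cleanup_strategy": "conservative"},
    "casual": {"token_budget": 500, "cleanup_strategy": "aggressive"},
}


def classify(texts: List[str]) -> Dict[str, Any]:
    joined = " ".join(texts).lower()
    tech_hits = [k for k in TECHNICAL if k in joined]
    support_hits = [k for k in SUPPORT if k in joined]
    # Ties go to "technical"; with no hits at all the chat is treated as casual.
    if tech_hits and len(tech_hits) >= len(support_hits):
        kind = "technical"
    elif support_hits:
        kind = "support"
    else:
        kind = "casual"
    return {
        "conversation_type": kind,
        "priority_topics": tech_hits + support_hits,
        **PROFILES[kind],
    }


result = classify(["How do I debug this API code?"])
print(result["conversation_type"], result["priority_topics"])
```

Substring matching is crude: "error" also matches "KeyError", and "api" would match "rapid". Word-level matching or a model-based classifier is a natural next step.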

### TODO: Implement advanced message management

In [None]:
from datetime import datetime
from typing import Dict, Any, Optional
import json

# TODO: Create advanced conversation state
class AdvancedConversationState(MessagesState):
    conversation_type: str  # "casual", "technical", "support"
    priority_topics: List[str]
    token_budget: int
    cleanup_strategy: str
    conversation_summary: str

# TODO: Implement context-aware message analyzer
def analyze_conversation_context(state: AdvancedConversationState):
    print("Analyzing conversation context...")
    messages = state['messages']
    
    # TODO: Analyze conversation to determine:
    # - conversation_type based on content
    # - priority_topics from recent messages
    # - appropriate token_budget based on type
    # - optimal cleanup_strategy
    
    # Simple analysis logic (enhance this)
    technical_keywords = ["code", "algorithm", "programming", "debug", "API"]
    support_keywords = ["help", "issue", "problem", "error", "fix"]
    
    # TODO: Implement analysis and return updated state
    pass

# TODO: Implement adaptive message manager
def adaptive_message_management(state: AdvancedConversationState):
    print(f"Managing messages with {state['cleanup_strategy']} strategy")
    
    messages = state['messages']
    strategy = state['cleanup_strategy']
    budget = state['token_budget']
    
    # TODO: Apply different strategies based on cleanup_strategy:
    # - "aggressive": Keep only recent + important messages
    # - "conservative": Trim only when necessary
    # - "topic_focused": Keep messages related to priority_topics
    # - "token_optimized": Optimize for token budget
    
    managed_messages = messages  # TODO: Apply your strategy
    
    # TODO: Return updated messages and any cleanup actions
    pass

# TODO: Implement intelligent response node
def intelligent_chat_response(state: AdvancedConversationState):
    print(f"Generating {state['conversation_type']} response")
    
    # TODO: Customize system prompt based on conversation context
    context_prompts = {
        "technical": "You are a technical expert. Provide detailed, accurate information.",
        "support": "You are a helpful support agent. Focus on solving the user's problem.",
        "casual": "You are a friendly conversational partner. Keep responses natural and engaging."
    }
    
    # TODO: Add context-appropriate system message and get response
    pass

In [None]:
# TODO: Build advanced conversation management graph
builder_advanced = StateGraph(AdvancedConversationState)
# TODO: Add nodes in sequence:
# analyze_conversation_context -> adaptive_message_management -> intelligent_chat_response

graph_advanced = builder_advanced.compile()
display(Image(graph_advanced.get_graph().draw_mermaid_png()))

In [None]:
# TODO: Test advanced message management with different conversation types
test_scenarios = [
    {
        "name": "Technical Discussion",
        "messages": [
            HumanMessage(content="I need help debugging my Python code."),
            AIMessage(content="I'd be happy to help with debugging! What error are you seeing?"),
            HumanMessage(content="I'm getting a KeyError when accessing a dictionary."),
            AIMessage(content="KeyError usually means the key doesn't exist. Can you show me the code?"),
            HumanMessage(content="Here's the code: data['user_id'] where data comes from an API."),
        ]
    },
    {
        "name": "Support Request",
        "messages": [
            HumanMessage(content="I'm having trouble logging into my account."),
            AIMessage(content="I can help with login issues. What happens when you try to log in?"),
            HumanMessage(content="It says 'invalid credentials' but I'm sure my password is correct."),
        ]
    },
    {
        "name": "Casual Chat", 
        "messages": [
            HumanMessage(content="What's your favorite movie?"),
            AIMessage(content="I enjoy discussing films! What genres do you like?"),
            HumanMessage(content="I love sci-fi movies, especially ones about AI."),
        ]
    }
]

for scenario in test_scenarios:
    print(f"\n=== Testing: {scenario['name']} ===")
    
    initial_advanced_state = {
        "messages": scenario["messages"],
        "conversation_type": "unknown",
        "priority_topics": [],
        "token_budget": 1000,
        "cleanup_strategy": "conservative",
        "conversation_summary": ""
    }
    
    result_advanced = graph_advanced.invoke(initial_advanced_state)
    
    print(f"Detected type: {result_advanced['conversation_type']}")
    print(f"Priority topics: {result_advanced['priority_topics']}")
    print(f"Cleanup strategy: {result_advanced['cleanup_strategy']}")
    print(f"Messages managed: {len(result_advanced['messages'])} total")

## Challenge Exercise: Dynamic Context Window Management

### Task
Create a system that dynamically adjusts context window size based on conversation complexity and user preferences, implementing multiple trimming strategies.
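
To make "dynamically adjusts context window size" concrete, here is a rough sketch that maps a heuristic complexity score to a token budget. The `TECH_TERMS` vocabulary, the weights, the saturation points, and the `low`/`high` bounds are all assumptions made for illustration, not recommended values.

```python
from typing import List

# Hypothetical vocabulary; replace with terms from your own domain.
TECH_TERMS = {"api", "kubernetes", "latency", "schema", "thread"}


def complexity_score(texts: List[str]) -> float:
    """Return a heuristic complexity score in [0.0, 1.0]."""
    words = [w.strip(".,!?").lower() for t in texts for w in t.split()]
    if not words:
        return 0.0
    avg_chars = sum(len(t) for t in texts) / len(texts)
    length_part = min(avg_chars / 400, 1.0)  # saturates at 400 chars per message
    tech_ratio = sum(w in TECH_TERMS for w in words) / len(words)
    tech_part = min(tech_ratio * 10, 1.0)  # saturates at 10% technical words
    question_part = sum("?" in t for t in texts) / len(texts)
    return 0.4 * length_part + 0.4 * tech_part + 0.2 * question_part


def context_budget(score: float, low: int = 500, high: int = 4000) -> int:
    # Linear interpolation between the smallest and largest allowed window.
    return int(low + (high - low) * score)


casual = complexity_score(["Nice weather today."])
technical = complexity_score(["Why does the API latency spike when the schema changes?"])
print(context_budget(casual), context_budget(technical))
```

The resulting budget could then be stored in `max_context_tokens` and passed to whichever trimming strategy is active.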

### TODO: Implement dynamic context management

In [None]:
# TODO: Define dynamic context management state
class DynamicContextState(MessagesState):
    max_context_tokens: int
    context_strategy: str  # "sliding_window", "importance_based", "topic_clustering"
    user_preferences: Dict[str, Any]
    conversation_complexity: float  # 0.0 to 1.0
    message_importance_scores: List[float]

# TODO: Implement conversation complexity analyzer
def analyze_complexity(state: DynamicContextState) -> float:
    """Analyze conversation complexity based on various factors."""
    messages = state['messages']
    
    # TODO: Calculate complexity based on:
    # - Average message length
    # - Technical vocabulary usage
    # - Question complexity
    # - Topic diversity
    pass

# TODO: Implement importance scorer
def score_message_importance(message: BaseMessage, context: List[BaseMessage]) -> float:
    """Score the importance of a message in context."""
    # TODO: Score based on:
    # - Message type (system messages = high importance)
    # - Recency
    # - Content relevance to current topic
    # - User indicators (questions, requests)
    pass

# TODO: Implement dynamic trimming strategies
def apply_dynamic_trimming(state: DynamicContextState) -> List[BaseMessage]:
    """Apply dynamic trimming based on state configuration."""
    messages = state['messages']
    strategy = state['context_strategy']
    max_tokens = state['max_context_tokens']
    
    if strategy == "sliding_window":
        # TODO: Implement sliding window approach
        pass
    elif strategy == "importance_based":
        # TODO: Implement importance-based filtering
        pass
    elif strategy == "topic_clustering":
        # TODO: Implement topic-based clustering
        pass
    
    return messages  # TODO: Return trimmed messages

print("Dynamic context management system defined - implement the analysis logic!")
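
As a hint for the "importance_based" strategy, the sketch below scores each message and keeps the highest-scoring ones that fit the budget, restoring the original order afterwards. Messages are plain `(role, content)` tuples as a hypothetical stand-in for `BaseMessage`, and the scoring weights and the 4-characters-per-token estimate are assumptions.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content): hypothetical stand-in for BaseMessage


def importance(index: int, total: int, role: str, content: str) -> float:
    if role == "system":
        return 1.0
    recency = (index + 1) / total  # newer messages score higher
    asks = 0.2 if role == "human" and "?" in content else 0.0
    return min(0.8 * recency + asks, 0.99)  # keep non-system below system


def trim_by_importance(messages: List[Message], max_tokens: int) -> List[Message]:
    total = len(messages)
    ranked = sorted(
        range(total),
        key=lambda i: importance(i, total, *messages[i]),
        reverse=True,
    )
    kept, used = set(), 0
    for i in ranked:
        cost = max(1, len(messages[i][1]) // 4)  # rough token estimate
        if used + cost <= max_tokens:
            kept.add(i)
            used += cost
    # Restore chronological order so the model sees a coherent dialogue.
    return [messages[i] for i in sorted(kept)]


history = [
    ("system", "x" * 40),
    ("human", "a" * 40),
    ("ai", "b" * 40),
    ("human", "c" * 39 + "?"),
]
print([role for role, _ in trim_by_importance(history, 30)])  # ['system', 'ai', 'human']
```

Unlike a sliding window, this approach can drop a message from the middle of the conversation, so it may split a question from its answer; scoring question and answer pairs together is one way to mitigate that.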

## Summary

In these exercises, you've practiced:
- Basic message handling with MessagesState in LangGraph
- Implementing various message filtering strategies (type, content, recency)
- Token-based message trimming to manage context window limits
- Using RemoveMessage for selective message deletion
- Building sophisticated message management systems with context awareness
- Creating adaptive strategies based on conversation characteristics

Key takeaways:
- **MessagesState**: LangGraph's built-in state for handling conversation messages
- **Message Filtering**: Essential for managing conversation context and relevance
- **Token Management**: Critical for staying within model context limits
- **RemoveMessage**: Powerful tool for selective message cleanup in persistent conversations
- **Adaptive Strategies**: Different conversation types benefit from different management approaches
- **Context Optimization**: Balance between preserving important information and managing resource constraints

These message management techniques are crucial for building production-ready conversational AI systems that can handle long-running conversations efficiently while maintaining context relevance.

Next, continue with the chatbot-summarization exercises to learn how to create conversation summaries for long-term memory!