![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

# Context Pruning: Intelligent Memory Cleanup

## Learning Objectives (30 minutes)
By the end of this notebook, you will be able to:
1. **Understand** why context accumulates "cruft" and degrades performance
2. **Implement** relevance scoring for memory records and conversations
3. **Create** intelligent pruning strategies for different types of context
4. **Design** automated cleanup processes for your Agent Memory Server
5. **Measure** the impact of pruning on agent performance and accuracy

## Prerequisites
- Completed previous notebooks in Section 5
- Understanding of Agent Memory Server and Redis
- Familiarity with your Redis University Class Agent

---

## Introduction

**Context Pruning** is the practice of intelligently removing irrelevant, outdated, or redundant information from your agent's memory to maintain optimal context quality. Like pruning a garden, removing the dead branches helps the healthy parts flourish.

### The Context Accumulation Problem

Over time, agents accumulate "context cruft":
- **Outdated preferences**: "I prefer morning classes" (from 2 semesters ago)
- **Irrelevant conversations**: Course browsing mixed with career planning
- **Redundant information**: Multiple similar course searches
- **Stale data**: Old course availability or requirements

### Our Solution: Intelligent Pruning

We'll implement:
1. **Relevance scoring** for memory records
2. **Time-based decay** for aging information
3. **Content deduplication** for redundant content (hash-based, so it catches exact matches after normalization)
4. **Context health monitoring** for proactive cleanup
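
Time-based decay is commonly modeled with an exponential curve. The snippet below is a minimal sketch rather than the scorer used later in this notebook: `decay_with_half_life` and `decay_with_time_constant` are hypothetical helper names, and the 30-day parameter is an assumed value. It shows the difference between a true half-life (the score halves every 30 days) and a time constant (the score falls to 1/e, about 0.37, after 30 days).

```python
import math


def decay_with_half_life(age_days: float, half_life_days: float = 30.0) -> float:
    """Score halves every `half_life_days` (hypothetical helper)."""
    return 0.5 ** (age_days / half_life_days)


def decay_with_time_constant(age_days: float, tau_days: float = 30.0) -> float:
    """Score falls to 1/e every `tau_days` (hypothetical helper)."""
    return math.exp(-age_days / tau_days)


for age in (0, 30, 60, 120):
    print(
        f"age={age:3d}d  half-life={decay_with_half_life(age):.3f}  "
        f"time-constant={decay_with_time_constant(age):.3f}"
    )
```

Which form you pick matters when you tune thresholds: with the same 30-day parameter, the time-constant form forgets faster than the half-life form.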

## Environment Setup

In [None]:
# Environment setup
import os
import asyncio
import json
from typing import List, Dict, Any, Optional, Tuple
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
import math
import hashlib
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
AGENT_MEMORY_URL = os.getenv("AGENT_MEMORY_URL", "http://localhost:8088")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

print("🔧 Environment Setup")
print("=" * 30)
print(f"Redis URL: {REDIS_URL}")
print(f"Agent Memory URL: {AGENT_MEMORY_URL}")
print(f"OpenAI API Key: {'✅ Set' if OPENAI_API_KEY else '❌ Not set'}")

In [None]:
# Import required modules
try:
    import redis
    from redis_context_course.models import StudentProfile
    from redis_context_course.course_manager import CourseManager
    from redis_context_course.redis_config import redis_config
    
    # Redis connection
    redis_client = redis.from_url(REDIS_URL)
    if redis_config.health_check():
        print("✅ Redis connection healthy")
    else:
        print("❌ Redis connection failed")
    
    print("✅ Core modules imported successfully")
    
except ImportError as e:
    print(f"❌ Import failed: {e}")
    print("Please ensure you've completed the setup from previous sections.")

## Memory Record and Relevance Framework

Let's create a framework for tracking and scoring memory relevance:

In [None]:
class MemoryType(Enum):
    """Types of memory records."""
    CONVERSATION = "conversation"
    PREFERENCE = "preference"
    COURSE_INTERACTION = "course_interaction"
    ACADEMIC_PROGRESS = "academic_progress"
    CAREER_INTEREST = "career_interest"
    SEARCH_HISTORY = "search_history"

@dataclass
class MemoryRecord:
    """Represents a memory record with relevance metadata."""
    id: str
    memory_type: MemoryType
    content: str
    timestamp: datetime
    student_id: str
    namespace: str = "default"
    
    # Relevance scoring factors
    access_count: int = 0
    last_accessed: Optional[datetime] = None
    relevance_score: float = 1.0
    importance_weight: float = 1.0
    
    # Content metadata
    content_hash: Optional[str] = None
    related_records: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    
    def __post_init__(self):
        if self.content_hash is None:
            self.content_hash = self._calculate_content_hash()
        if self.last_accessed is None:
            self.last_accessed = self.timestamp
    
    def _calculate_content_hash(self) -> str:
        """Calculate hash for content deduplication."""
        content_normalized = self.content.lower().strip()
        return hashlib.md5(content_normalized.encode()).hexdigest()[:16]
    
    def update_access(self):
        """Update access tracking."""
        self.access_count += 1
        self.last_accessed = datetime.now()
    
    def age_in_days(self) -> float:
        """Calculate age of record in days."""
        return (datetime.now() - self.timestamp).total_seconds() / 86400
    
    def days_since_access(self) -> float:
        """Calculate days since last access."""
        if self.last_accessed:
            return (datetime.now() - self.last_accessed).total_seconds() / 86400
        return self.age_in_days()

class RelevanceScorer:
    """Calculates relevance scores for memory records."""
    
    def __init__(self):
        # Scoring weights for different factors
        self.weights = {
            "recency": 0.3,      # How recent is the memory?
            "frequency": 0.25,   # How often is it accessed?
            "importance": 0.25,  # How important is the content type?
            "relevance": 0.2     # How relevant to current context?
        }
        
        # Importance weights by memory type
        self.type_importance = {
            MemoryType.ACADEMIC_PROGRESS: 1.0,
            MemoryType.PREFERENCE: 0.8,
            MemoryType.CAREER_INTEREST: 0.7,
            MemoryType.COURSE_INTERACTION: 0.6,
            MemoryType.CONVERSATION: 0.4,
            MemoryType.SEARCH_HISTORY: 0.3
        }
    
    def calculate_relevance_score(self, record: MemoryRecord, current_context: Optional[str] = None) -> float:
        """Calculate overall relevance score for a memory record."""
        
        # 1. Recency score (exponential decay)
        age_days = record.age_in_days()
        recency_score = math.exp(-age_days / 30)  # 30-day time constant: score falls to ~0.37 (1/e) at 30 days
        
        # 2. Frequency score (logarithmic)
        frequency_score = math.log10(record.access_count + 1)  # Log base 10: reaches 1.0 at 9 accesses
        frequency_score = min(frequency_score, 1.0)  # Cap at 1.0
        
        # 3. Importance score (by type)
        importance_score = self.type_importance.get(record.memory_type, 0.5)
        importance_score *= record.importance_weight
        
        # 4. Context relevance score
        context_score = self._calculate_context_relevance(record, current_context)
        
        # Combine scores
        total_score = (
            self.weights["recency"] * recency_score +
            self.weights["frequency"] * frequency_score +
            self.weights["importance"] * importance_score +
            self.weights["relevance"] * context_score
        )
        
        return min(total_score, 1.0)  # Cap at 1.0
    
    def _calculate_context_relevance(self, record: MemoryRecord, current_context: Optional[str]) -> float:
        """Calculate relevance to current context."""
        if not current_context:
            return 0.5  # Neutral score
        
        # Simple keyword matching (in real implementation, use embeddings)
        context_words = set(current_context.lower().split())
        record_words = set(record.content.lower().split())
        
        if not context_words or not record_words:
            return 0.5
        
        # Calculate Jaccard similarity
        intersection = len(context_words & record_words)
        union = len(context_words | record_words)
        
        return intersection / union if union > 0 else 0.0

# Initialize the relevance scorer
relevance_scorer = RelevanceScorer()

print("✅ Memory record and relevance framework initialized")
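
To see how the four factors combine, here is a small worked calculation. It restates the weights and formulas from `RelevanceScorer` for plain numbers; the standalone function `combined_score` is a hypothetical helper for illustration, and the two input records are assumed examples scored without any current context (so the context component is the neutral 0.5).

```python
import math

# Same weights as RelevanceScorer above
WEIGHTS = {"recency": 0.3, "frequency": 0.25, "importance": 0.25, "relevance": 0.2}


def combined_score(age_days, access_count, importance, context_score=0.5):
    """Restate the weighted sum for plain numbers (hypothetical helper)."""
    recency = math.exp(-age_days / 30)
    frequency = min(math.log10(access_count + 1), 1.0)
    total = (
        WEIGHTS["recency"] * recency
        + WEIGHTS["frequency"] * frequency
        + WEIGHTS["importance"] * importance
        + WEIGHTS["relevance"] * context_score
    )
    return min(total, 1.0)


# Academic progress: type importance 1.0 * importance_weight 1.0
fresh = combined_score(age_days=5, access_count=8, importance=1.0)
# Old preference: type importance 0.8 * importance_weight 0.5
stale = combined_score(age_days=120, access_count=1, importance=0.4)

print(f"fresh academic progress: {fresh:.3f}")
print(f"stale preference:        {stale:.3f}")
```

With these inputs the first record scores about 0.84 and the second about 0.28, which falls below the default `relevance_threshold` of 0.3.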

## Context Pruning Engine

Now let's create the main pruning engine that implements different cleanup strategies:

In [None]:
class PruningStrategy(Enum):
    """Different pruning strategies."""
    RELEVANCE_THRESHOLD = "relevance_threshold"  # Remove below threshold
    TOP_K_RETENTION = "top_k_retention"          # Keep only top K records
    TIME_BASED = "time_based"                    # Remove older than X days
    DEDUPLICATION = "deduplication"              # Remove duplicate content
    HYBRID = "hybrid"                            # Combination of strategies

@dataclass
class PruningConfig:
    """Configuration for pruning operations."""
    strategy: PruningStrategy
    relevance_threshold: float = 0.3
    max_records_per_type: int = 100
    max_age_days: int = 90
    enable_deduplication: bool = True
    preserve_important: bool = True

class ContextPruner:
    """Intelligent context pruning engine."""
    
    def __init__(self, relevance_scorer: RelevanceScorer):
        self.relevance_scorer = relevance_scorer
        self.pruning_stats = {
            "total_pruned": 0,
            "by_strategy": {},
            "by_type": {}
        }
    
    async def prune_memory_records(self, 
                                 records: List[MemoryRecord], 
                                 config: PruningConfig,
                                 current_context: Optional[str] = None) -> Tuple[List[MemoryRecord], Dict[str, Any]]:
        """Prune memory records based on configuration."""
        
        original_count = len(records)
        pruned_records = records.copy()
        pruning_report = {
            "original_count": original_count,
            "strategy": config.strategy.value,
            "operations": []
        }
        
        # Update relevance scores
        for record in pruned_records:
            record.relevance_score = self.relevance_scorer.calculate_relevance_score(record, current_context)
        
        # Apply pruning strategy
        if config.strategy == PruningStrategy.RELEVANCE_THRESHOLD:
            pruned_records, operation_report = self._prune_by_relevance(pruned_records, config)
            pruning_report["operations"].append(operation_report)
        
        elif config.strategy == PruningStrategy.TOP_K_RETENTION:
            pruned_records, operation_report = self._prune_by_top_k(pruned_records, config)
            pruning_report["operations"].append(operation_report)
        
        elif config.strategy == PruningStrategy.TIME_BASED:
            pruned_records, operation_report = self._prune_by_age(pruned_records, config)
            pruning_report["operations"].append(operation_report)
        
        elif config.strategy == PruningStrategy.DEDUPLICATION:
            pruned_records, operation_report = self._prune_duplicates(pruned_records, config)
            pruning_report["operations"].append(operation_report)
        
        elif config.strategy == PruningStrategy.HYBRID:
            # Apply multiple strategies in sequence
            strategies = [
                (self._prune_duplicates, "deduplication"),
                (self._prune_by_age, "time_based"),
                (self._prune_by_relevance, "relevance_threshold")
            ]
            
            for prune_func, strategy_name in strategies:
                pruned_records, operation_report = prune_func(pruned_records, config)
                operation_report["strategy"] = strategy_name
                pruning_report["operations"].append(operation_report)
        
        # Final statistics
        final_count = len(pruned_records)
        pruning_report["final_count"] = final_count
        pruning_report["pruned_count"] = original_count - final_count
        pruning_report["retention_rate"] = final_count / original_count if original_count > 0 else 1.0
        
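        # Update global stats, including the per-strategy and per-type counters
        pruned_count = pruning_report["pruned_count"]
        self.pruning_stats["total_pruned"] += pruned_count
        by_strategy = self.pruning_stats["by_strategy"]
        by_strategy[config.strategy.value] = by_strategy.get(config.strategy.value, 0) + pruned_count
        kept_objects = {id(record) for record in pruned_records}  # built-in id(): object identity
        by_type = self.pruning_stats["by_type"]
        for record in records:
            if id(record) not in kept_objects:
                by_type[record.memory_type.value] = by_type.get(record.memory_type.value, 0) + 1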
        
        return pruned_records, pruning_report
    
    def _prune_by_relevance(self, records: List[MemoryRecord], config: PruningConfig) -> Tuple[List[MemoryRecord], Dict[str, Any]]:
        """Prune records below relevance threshold."""
        original_count = len(records)
        
        # Keep records above threshold or marked as important
        kept_records = [
            record for record in records
            if record.relevance_score >= config.relevance_threshold or 
               (config.preserve_important and record.importance_weight > 0.8)
        ]
        
        return kept_records, {
            "operation": "relevance_threshold",
            "threshold": config.relevance_threshold,
            "original_count": original_count,
            "kept_count": len(kept_records),
            "pruned_count": original_count - len(kept_records)
        }
    
    def _prune_by_top_k(self, records: List[MemoryRecord], config: PruningConfig) -> Tuple[List[MemoryRecord], Dict[str, Any]]:
        """Keep only top K records by relevance score."""
        original_count = len(records)
        
        # Group by memory type and keep top K for each type
        records_by_type = {}
        for record in records:
            if record.memory_type not in records_by_type:
                records_by_type[record.memory_type] = []
            records_by_type[record.memory_type].append(record)
        
        kept_records = []
        for memory_type, type_records in records_by_type.items():
            # Sort by relevance score and keep top K
            type_records.sort(key=lambda r: r.relevance_score, reverse=True)
            kept_records.extend(type_records[:config.max_records_per_type])
        
        return kept_records, {
            "operation": "top_k_retention",
            "max_per_type": config.max_records_per_type,
            "original_count": original_count,
            "kept_count": len(kept_records),
            "pruned_count": original_count - len(kept_records)
        }
    
    def _prune_by_age(self, records: List[MemoryRecord], config: PruningConfig) -> Tuple[List[MemoryRecord], Dict[str, Any]]:
        """Prune records older than max age."""
        original_count = len(records)
        
        # Keep records newer than max age or marked as important
        kept_records = [
            record for record in records
            if record.age_in_days() <= config.max_age_days or
               (config.preserve_important and record.importance_weight > 0.8)
        ]
        
        return kept_records, {
            "operation": "time_based",
            "max_age_days": config.max_age_days,
            "original_count": original_count,
            "kept_count": len(kept_records),
            "pruned_count": original_count - len(kept_records)
        }
    
    def _prune_duplicates(self, records: List[MemoryRecord], config: PruningConfig) -> Tuple[List[MemoryRecord], Dict[str, Any]]:
        """Remove duplicate records based on content hash."""
        original_count = len(records)
        
        # Group by content hash
        hash_groups = {}
        for record in records:
            if record.content_hash not in hash_groups:
                hash_groups[record.content_hash] = []
            hash_groups[record.content_hash].append(record)
        
        # Keep the most relevant record from each group
        kept_records = []
        for hash_value, group_records in hash_groups.items():
            if len(group_records) == 1:
                kept_records.append(group_records[0])
            else:
                # Keep the most relevant record
                best_record = max(group_records, key=lambda r: r.relevance_score)
                kept_records.append(best_record)
        
        return kept_records, {
            "operation": "deduplication",
            "original_count": original_count,
            "kept_count": len(kept_records),
            "pruned_count": original_count - len(kept_records),
            "duplicate_groups": len([g for g in hash_groups.values() if len(g) > 1])
        }
    
    def get_pruning_statistics(self) -> Dict[str, Any]:
        """Get overall pruning statistics."""
        return self.pruning_stats.copy()

# Initialize the context pruner
context_pruner = ContextPruner(relevance_scorer)

print("✅ Context pruning engine initialized")
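
The hybrid branch applies several pruning steps in sequence, each one narrowing the list left by the previous step. That pattern can be sketched independently of the classes above. In this sketch records are plain dictionaries, and `run_pipeline`, `drop_duplicates`, and `drop_old` are hypothetical names used only for illustration. Unlike the engine, which keeps the most relevant record in each duplicate group, this simplified version keeps the first one it sees.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Step = Callable[[List[Record]], List[Record]]


def drop_duplicates(records: List[Record]) -> List[Record]:
    """Keep the first record seen for each normalized content string."""
    seen = set()
    kept = []
    for record in records:
        key = str(record["content"]).lower().strip()
        if key not in seen:
            seen.add(key)
            kept.append(record)
    return kept


def drop_old(records: List[Record], max_age_days: float = 60) -> List[Record]:
    """Keep records no older than max_age_days."""
    return [r for r in records if r["age_days"] <= max_age_days]


def run_pipeline(records: List[Record], steps: List[Tuple[str, Step]]):
    """Apply each step in order and record how many records it removed."""
    report = []
    for name, step in steps:
        before = len(records)
        records = step(records)
        report.append({"operation": name, "pruned_count": before - len(records)})
    return records, report


demo = [
    {"content": "searched for ML courses", "age_days": 15},
    {"content": "Searched for ML courses ", "age_days": 16},
    {"content": "asked about deadlines", "age_days": 90},
]
kept, report = run_pipeline(
    demo, [("deduplication", drop_duplicates), ("time_based", drop_old)]
)
print(len(kept), report)
```

Because each step reports its own `pruned_count`, the per-step numbers show which rule is doing most of the work.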

## Demonstration: Context Pruning in Action

Let's create some sample memory records and see how different pruning strategies work:

In [None]:
# Create sample memory records for demonstration
def create_sample_memory_records() -> List[MemoryRecord]:
    """Create sample memory records for testing pruning."""
    
    base_time = datetime.now()
    records = []
    
    # Recent academic progress (high importance)
    records.append(MemoryRecord(
        id="prog_001",
        memory_type=MemoryType.ACADEMIC_PROGRESS,
        content="Completed CS201 with grade A, now eligible for CS301",
        timestamp=base_time - timedelta(days=5),
        student_id="test_student",
        access_count=8,
        importance_weight=1.0
    ))
    
    # Old preference (should be pruned)
    records.append(MemoryRecord(
        id="pref_001",
        memory_type=MemoryType.PREFERENCE,
        content="I prefer morning classes",
        timestamp=base_time - timedelta(days=120),
        student_id="test_student",
        access_count=1,
        importance_weight=0.5
    ))
    
    # Recent preference (should be kept)
    records.append(MemoryRecord(
        id="pref_002",
        memory_type=MemoryType.PREFERENCE,
        content="I prefer online courses due to work schedule",
        timestamp=base_time - timedelta(days=10),
        student_id="test_student",
        access_count=5,
        importance_weight=0.8
    ))
    
    # Duplicate course searches
    for i in range(3):
        records.append(MemoryRecord(
            id=f"search_{i:03d}",
            memory_type=MemoryType.SEARCH_HISTORY,
            content="searched for machine learning courses",  # Same content
            timestamp=base_time - timedelta(days=15 + i),
            student_id="test_student",
            access_count=1,
            importance_weight=0.3
        ))
    
    # Various course interactions
    course_interactions = [
        "Viewed details for CS401: Machine Learning",
        "Checked prerequisites for MATH301",
        "Added CS402 to wishlist",
        "Compared CS401 and CS402 courses",
        "Asked about CS401 difficulty level"
    ]
    
    for i, interaction in enumerate(course_interactions):
        records.append(MemoryRecord(
            id=f"course_{i:03d}",
            memory_type=MemoryType.COURSE_INTERACTION,
            content=interaction,
            timestamp=base_time - timedelta(days=20 + i * 5),
            student_id="test_student",
            access_count=2 + i,
            importance_weight=0.6
        ))
    
    # Old conversations (low relevance)
    old_conversations = [
        "Asked about general course catalog",
        "Inquired about registration deadlines",
        "General questions about university policies"
    ]
    
    for i, conv in enumerate(old_conversations):
        records.append(MemoryRecord(
            id=f"conv_{i:03d}",
            memory_type=MemoryType.CONVERSATION,
            content=conv,
            timestamp=base_time - timedelta(days=60 + i * 10),
            student_id="test_student",
            access_count=1,
            importance_weight=0.4
        ))
    
    # Career interests
    records.append(MemoryRecord(
        id="career_001",
        memory_type=MemoryType.CAREER_INTEREST,
        content="Interested in AI and machine learning careers",
        timestamp=base_time - timedelta(days=30),
        student_id="test_student",
        access_count=4,
        importance_weight=0.9
    ))
    
    return records

# Create sample data
sample_records = create_sample_memory_records()

print(f"📚 Created {len(sample_records)} sample memory records")
print("\n📋 Record Distribution:")
type_counts = {}
for record in sample_records:
    type_counts[record.memory_type] = type_counts.get(record.memory_type, 0) + 1

for memory_type, count in type_counts.items():
    print(f"   • {memory_type.value}: {count} records")

# Show some sample records
print("\n🔍 Sample Records:")
for i, record in enumerate(sample_records[:5]):
    print(f"   {i+1}. [{record.memory_type.value}] {record.content[:50]}... (Age: {record.age_in_days():.1f} days)")

## Testing Different Pruning Strategies

Let's test each pruning strategy and see how they affect our memory records:

In [None]:
# Test different pruning strategies
print("🧪 Testing Different Pruning Strategies")
print("=" * 60)

# Current context for relevance scoring
current_context = "I want to take machine learning courses and plan my AI career path"

# Test configurations
test_configs = [
    {
        "name": "Relevance Threshold",
        "config": PruningConfig(
            strategy=PruningStrategy.RELEVANCE_THRESHOLD,
            relevance_threshold=0.4
        )
    },
    {
        "name": "Top-K Retention",
        "config": PruningConfig(
            strategy=PruningStrategy.TOP_K_RETENTION,
            max_records_per_type=2
        )
    },
    {
        "name": "Time-Based",
        "config": PruningConfig(
            strategy=PruningStrategy.TIME_BASED,
            max_age_days=45
        )
    },
    {
        "name": "Deduplication",
        "config": PruningConfig(
            strategy=PruningStrategy.DEDUPLICATION
        )
    },
    {
        "name": "Hybrid Strategy",
        "config": PruningConfig(
            strategy=PruningStrategy.HYBRID,
            relevance_threshold=0.3,
            max_age_days=60,
            max_records_per_type=3  # Note: not applied here; HYBRID runs deduplication, age, and relevance only
        )
    }
]

# Test each strategy
for test_case in test_configs:
    print(f"\n🎯 Testing: {test_case['name']}")
    print("-" * 40)
    
    # Apply pruning
    pruned_records, report = await context_pruner.prune_memory_records(
        sample_records.copy(),
        test_case['config'],
        current_context
    )
    
    # Display results
    print(f"📊 Results:")
    print(f"   Original: {report['original_count']} records")
    print(f"   Kept: {report['final_count']} records")
    print(f"   Pruned: {report['pruned_count']} records")
    print(f"   Retention Rate: {report['retention_rate']:.1%}")
    
    # Show operations performed
    if report['operations']:
        print(f"\n🔧 Operations:")
        for op in report['operations']:
            print(f"   • {op['operation']}: {op['pruned_count']} records removed")
    
    # Show what was kept by type
    kept_by_type = {}
    for record in pruned_records:
        kept_by_type[record.memory_type] = kept_by_type.get(record.memory_type, 0) + 1
    
    print(f"\n📋 Kept by Type:")
    for memory_type, count in kept_by_type.items():
        print(f"   • {memory_type.value}: {count} records")

print("\n" + "=" * 60)

## Relevance Score Analysis

Let's analyze how relevance scores are calculated and what factors influence them:

In [None]:
# Analyze relevance scores
print("📊 Relevance Score Analysis")
print("=" * 50)

# Calculate relevance scores for all records
current_context = "machine learning courses and AI career planning"

scored_records = []
for record in sample_records:
    score = relevance_scorer.calculate_relevance_score(record, current_context)
    scored_records.append((record, score))

# Sort by relevance score
scored_records.sort(key=lambda x: x[1], reverse=True)

print(f"📝 Context: '{current_context}'")
print("\n🏆 Top 10 Most Relevant Records:")
print(f"{'Rank':>4} | {'Score':>5} | {'Type':12s} | {'Age':>5} | {'Access':>6} | Content")
print("-" * 80)

for i, (record, score) in enumerate(scored_records[:10], 1):
    content_preview = record.content[:40] + "..." if len(record.content) > 40 else record.content
    print(f"{i:4d} | {score:.3f} | {record.memory_type.value[:12]:12s} | {record.age_in_days():4.0f}d | {record.access_count:6d} | {content_preview}")

print("\n📉 Bottom 5 Least Relevant Records:")
print(f"{'Rank':>4} | {'Score':>5} | {'Type':12s} | {'Age':>5} | {'Access':>6} | Content")
print("-" * 80)

for i, (record, score) in enumerate(scored_records[-5:], len(scored_records)-4):
    content_preview = record.content[:40] + "..." if len(record.content) > 40 else record.content
    print(f"{i:4d} | {score:.3f} | {record.memory_type.value[:12]:12s} | {record.age_in_days():4.0f}d | {record.access_count:6d} | {content_preview}")

# Analyze score distribution
scores = [score for _, score in scored_records]
print(f"\n📈 Score Statistics:")
print(f"   Average: {sum(scores)/len(scores):.3f}")
print(f"   Highest: {max(scores):.3f}")
print(f"   Lowest: {min(scores):.3f}")
print(f"   Above 0.5: {len([s for s in scores if s > 0.5])} records")
print(f"   Below 0.3: {len([s for s in scores if s < 0.3])} records")
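
The context component of the score comes from word-level Jaccard similarity, so it is sensitive to exact tokens. The sketch below repeats that calculation on the context string used above and one of the sample records. `jaccard` is a hypothetical standalone helper that mirrors the logic of `_calculate_context_relevance`.

```python
def jaccard(text_a: str, text_b: str) -> float:
    """Word-level Jaccard similarity on lowercased, whitespace-split tokens."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


context = "machine learning courses and AI career planning"
record = "Interested in AI and machine learning careers"

shared = set(context.lower().split()) & set(record.lower().split())
print(sorted(shared))
print(f"{jaccard(context, record):.2f}")
```

Four of the ten distinct words are shared, giving 0.40. Note that the stopword "and" counts as a match while "career" and "careers" do not, which is one reason an embedding-based or stemmed comparison is likely to behave better on real data.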

## 🧪 Hands-on Exercise: Design Your Pruning Strategy

Now it's your turn to experiment with context pruning:

In [None]:
# Exercise: Create your own pruning strategy
print("🧪 Exercise: Design Your Context Pruning Strategy")
print("=" * 60)

# TODO: Create a custom pruning strategy
class CustomPruningStrategy:
    """Custom pruning strategy that combines multiple factors."""
    
    def __init__(self):
        self.name = "Smart Academic Pruning"
    
    def should_keep_record(self, record: MemoryRecord, current_context: str = "") -> bool:
        """Decide whether to keep a record based on custom logic."""
        
        # Always keep recent academic progress
        if (record.memory_type == MemoryType.ACADEMIC_PROGRESS and 
            record.age_in_days() <= 180):
            return True
        
        # Keep recent preferences that are frequently accessed
        if (record.memory_type == MemoryType.PREFERENCE and 
            record.age_in_days() <= 60 and 
            record.access_count >= 3):
            return True
        
        # Keep career interests if they're relevant to current context
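        if record.memory_type == MemoryType.CAREER_INTEREST:
            context_lower = current_context.lower()
            # Match single keywords on whole words so "ai" does not match "available"
            context_words = {word.strip(".,!?;:") for word in context_lower.split()}
            if (context_words & {"career", "job", "work", "ai"}
                    or "machine learning" in context_lower):
                return True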
        
        # Keep course interactions if they're recent or frequently accessed
        if (record.memory_type == MemoryType.COURSE_INTERACTION and 
            (record.age_in_days() <= 30 or record.access_count >= 5)):
            return True
        
        # Prune old search history and conversations
        if record.memory_type in [MemoryType.SEARCH_HISTORY, MemoryType.CONVERSATION]:
            if record.age_in_days() > 30 and record.access_count <= 2:
                return False
        
        # Default: keep if relevance score is decent
        return record.relevance_score >= 0.4
    
    def prune_records(self, records: List[MemoryRecord], current_context: str = "") -> Tuple[List[MemoryRecord], Dict[str, Any]]:
        """Apply custom pruning logic."""
        original_count = len(records)
        
        kept_records = []
        pruning_reasons = {}
        
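        for record in records:
            # Refresh the score for this context so the default rule in
            # should_keep_record does not rely on a value left over from an earlier cell
            record.relevance_score = relevance_scorer.calculate_relevance_score(record, current_context)
            if self.should_keep_record(record, current_context):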
                kept_records.append(record)
            else:
                # Track why it was pruned
                reason = self._get_pruning_reason(record, current_context)
                pruning_reasons[record.id] = reason
        
        return kept_records, {
            "strategy": self.name,
            "original_count": original_count,
            "kept_count": len(kept_records),
            "pruned_count": original_count - len(kept_records),
            "pruning_reasons": pruning_reasons
        }
    
    def _get_pruning_reason(self, record: MemoryRecord, current_context: str) -> str:
        """Get reason why record was pruned."""
        if record.memory_type in [MemoryType.SEARCH_HISTORY, MemoryType.CONVERSATION]:
            if record.age_in_days() > 30 and record.access_count <= 2:
                return "Old and rarely accessed"
        
        if record.relevance_score < 0.4:
            return "Low relevance score"
        
        return "Custom logic"

# Test your custom strategy
custom_strategy = CustomPruningStrategy()
current_context = "I want to plan my AI career and take machine learning courses"

print(f"\n🎯 Testing Custom Strategy: {custom_strategy.name}")
print(f"📝 Context: '{current_context}'")
print("-" * 50)

# Apply custom pruning
custom_kept, custom_report = custom_strategy.prune_records(sample_records.copy(), current_context)

print(f"📊 Results:")
print(f"   Original: {custom_report['original_count']} records")
print(f"   Kept: {custom_report['kept_count']} records")
print(f"   Pruned: {custom_report['pruned_count']} records")
print(f"   Retention Rate: {custom_report['kept_count']/custom_report['original_count']:.1%}")

# Show pruning reasons
if custom_report['pruning_reasons']:
    print(f"\n🗑️ Pruning Reasons:")
    reason_counts = {}
    for reason in custom_report['pruning_reasons'].values():
        reason_counts[reason] = reason_counts.get(reason, 0) + 1
    
    for reason, count in reason_counts.items():
        print(f"   • {reason}: {count} records")

# Compare with hybrid strategy
hybrid_config = PruningConfig(strategy=PruningStrategy.HYBRID, relevance_threshold=0.4)
hybrid_kept, hybrid_report = await context_pruner.prune_memory_records(
    sample_records.copy(), hybrid_config, current_context
)

print(f"\n🔄 Comparison with Hybrid Strategy:")
print(f"   Custom Strategy: {len(custom_kept)} records kept")
print(f"   Hybrid Strategy: {len(hybrid_kept)} records kept")
print(f"   Difference: {len(custom_kept) - len(hybrid_kept)} records")

print("\n🤔 Reflection Questions:")
print("1. Which strategy better preserves important academic information?")
print("2. How does context-awareness affect pruning decisions?")
print("3. What are the trade-offs between aggressive and conservative pruning?")
print("4. How would you adapt this strategy for different student types?")

print("\n🔧 Your Turn: Try These Modifications:")
print("   • Add student-specific pruning rules")
print("   • Implement seasonal pruning (end of semester cleanup)")
print("   • Create domain-specific relevance scoring")
print("   • Add user feedback to improve pruning decisions")
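
As a starting point for the seasonal pruning idea, one possible approach is to tighten the age limit during an end-of-semester cleanup window. Everything in this sketch is an assumption made for illustration: the semester end dates, the 14-day window, the 90 and 45 day limits, and the function name `seasonal_max_age_days` are hypothetical and not part of the course package.

```python
from datetime import date, timedelta

# Hypothetical semester end dates; replace with your institution's calendar.
SEMESTER_ENDS = [date(2025, 5, 16), date(2025, 12, 12)]


def seasonal_max_age_days(today: date, normal: int = 90, cleanup: int = 45,
                          window_days: int = 14) -> int:
    """Return a stricter age limit for `window_days` after a semester ends."""
    for end in SEMESTER_ENDS:
        if end <= today <= end + timedelta(days=window_days):
            return cleanup
    return normal


print(seasonal_max_age_days(date(2025, 5, 20)))  # inside the cleanup window
print(seasonal_max_age_days(date(2025, 7, 1)))   # outside the cleanup window
```

The returned value could then be passed as `max_age_days` when building a `PruningConfig`.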

## Key Takeaways

From this exploration of context pruning, you've learned:

### 🎯 **Core Concepts**
- **Context accumulation** tends to degrade performance as stale and redundant records crowd out useful ones
- **Relevance scoring** combines multiple factors (recency, frequency, importance, context)
- **Intelligent pruning** preserves important information while removing cruft
- **Multiple strategies** serve different use cases and requirements

### 🛠️ **Implementation Patterns**
- **Multi-factor scoring** for nuanced relevance assessment
- **Strategy composition** for hybrid approaches
- **Content deduplication** using hashing techniques
- **Preservation rules** for critical information types
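
Hash-based deduplication only catches records whose normalized text is identical. A rough sketch of how near-duplicates could be found is shown below, using token-set Jaccard similarity as a cheap stand-in for embedding similarity. The 0.6 threshold and the helpers `token_set` and `find_near_duplicates` are assumptions for illustration, not part of the pruning engine.

```python
from typing import List, Set, Tuple


def token_set(text: str) -> Set[str]:
    """Lowercased, whitespace-split tokens of a string."""
    return set(text.lower().split())


def find_near_duplicates(texts: List[str], threshold: float = 0.6) -> List[Tuple[int, int]]:
    """Return index pairs whose token-set Jaccard similarity meets the threshold."""
    pairs = []
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            a, b = token_set(texts[i]), token_set(texts[j])
            union = a | b
            if union and len(a & b) / len(union) >= threshold:
                pairs.append((i, j))
    return pairs


texts = [
    "searched for machine learning courses",
    "searched for machine learning classes",
    "checked prerequisites for MATH301",
]
print(find_near_duplicates(texts))
```

The first two strings differ by one word and are flagged as a pair, while an MD5 hash of their content would treat them as unrelated. The pairwise loop is quadratic in the number of records, so it only suits small collections.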

### 📊 **Expected Benefits**
- **Reduced context noise** improves decision quality
- **Faster retrieval** with smaller memory footprint
- **Better relevance** through focused information
- **Proactive maintenance** prevents context degradation
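
These benefits are worth checking on your own data rather than assuming. One simple way to start is to compare the size of the context before and after pruning. The sketch below uses whitespace word count as a crude proxy for tokens. `context_size` and `pruning_impact` are hypothetical helpers, and a fuller measurement would use your model's tokenizer together with task-level accuracy checks.

```python
from typing import Dict, List


def context_size(contents: List[str]) -> int:
    """Approximate context size as a whitespace word count (crude token proxy)."""
    return sum(len(text.split()) for text in contents)


def pruning_impact(before: List[str], after: List[str]) -> Dict[str, float]:
    """Summarize how much smaller the context became after pruning."""
    size_before = context_size(before)
    size_after = context_size(after)
    reduction = 1 - size_after / size_before if size_before else 0.0
    return {
        "records_before": len(before),
        "records_after": len(after),
        "words_before": size_before,
        "words_after": size_after,
        "reduction": reduction,
    }


before = [
    "Completed CS201 with grade A",
    "searched for machine learning courses",
    "searched for machine learning courses",
]
after = before[:2]  # pretend deduplication removed the repeated search
impact = pruning_impact(before, after)
print(impact)
```

Size reduction alone does not show that answers improved, so pair it with a small set of test questions whose correct answers depend on the records you expect to keep.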

### 🔄 **Pruning Strategies**
- **Relevance threshold**: Remove below quality bar
- **Top-K retention**: Keep only the best records
- **Time-based**: Remove outdated information
- **Deduplication**: Eliminate redundant content
- **Hybrid**: Combine multiple approaches

### 🚀 **Next Steps**
In the next notebook, we'll explore **Context Summarization** - how to compress accumulated context into concise summaries while preserving essential information for decision-making.

The pruning techniques you've learned provide the foundation for maintaining clean, relevant context that can be effectively summarized.

---

**Ready to continue?** Move on to `04_context_summarization.ipynb` to learn about intelligent context compression!