# DSPy Cache Management

This tutorial demonstrates how to use DSPy's caching capabilities to improve performance and reduce API costs.

## What is Caching in DSPy?

Caching in DSPy stores the results of language model calls to avoid redundant API requests. This is especially useful during development, testing, and when you have repeated queries.
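
As a rough illustration of the idea (not DSPy's actual implementation), the sketch below wraps a stand-in function with an in-memory dictionary keyed by the prompt. `fake_lm` and `make_cached` are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict

def make_cached(fn: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap fn so that repeated prompts are answered from an in-memory dict."""
    store: Dict[str, str] = {}

    def cached(prompt: str) -> str:
        if prompt not in store:
            store[prompt] = fn(prompt)  # only reached on a cache miss
        return store[prompt]

    return cached

calls = []

def fake_lm(prompt: str) -> str:
    """Hypothetical stand-in for a language model call; records each invocation."""
    calls.append(prompt)
    return f"answer to: {prompt}"

cached_lm = make_cached(fake_lm)
first = cached_lm("What is 2 + 2?")
second = cached_lm("What is 2 + 2?")  # served from the dict, fake_lm is not called again
print(first == second, len(calls))
```

The second call returns the stored string, so the underlying function runs once even though it was requested twice.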

## Benefits:
- **Faster development**: Avoid waiting for repeated API calls
- **Cost reduction**: Reduce API usage and costs
- **Consistent results**: Identical inputs return the same stored output
- **Offline development**: Replay previously cached responses without network access
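
To make the cost benefit concrete, here is a small back-of-the-envelope estimate. The request counts and the per-call price are made-up numbers chosen for illustration, and `estimate_api_cost` is a hypothetical helper, not part of DSPy.

```python
def estimate_api_cost(total_requests: int, cache_hits: int, cost_per_call: float) -> dict:
    """Compare spend with and without a cache, given a count of cache hits."""
    if not 0 <= cache_hits <= total_requests:
        raise ValueError("cache_hits must be between 0 and total_requests")
    billed_calls = total_requests - cache_hits  # only misses reach the API
    return {
        "hit_rate": cache_hits / total_requests if total_requests else 0.0,
        "billed_calls": billed_calls,
        "cost_without_cache": total_requests * cost_per_call,
        "cost_with_cache": billed_calls * cost_per_call,
    }

# Hypothetical workload: 1,000 requests, 600 of them repeats, $0.002 per call
estimate = estimate_api_cost(total_requests=1000, cache_hits=600, cost_per_call=0.002)
print(estimate)
```

Under these assumed numbers only 400 of the 1,000 requests are billed, so spend scales with the miss rate rather than with total traffic.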

In [None]:
# Install required packages
import sys
import subprocess

def install_package(package):
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])

try:
    import dspy
except ImportError:
    install_package("dspy")
    import dspy

import os
import time
import json
from pathlib import Path
import hashlib

## Basic Cache Configuration

DSPy caches language model responses by default, and the behavior can be adjusted, for example through the `cache` argument of `dspy.LM`.
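
One configuration knob worth knowing about is the on-disk location. The sketch below assumes a recent DSPy release that reads the `DSPY_CACHEDIR` environment variable when `dspy` is first imported; verify this against your installed version. Because of that timing, the variable would need to be set before the import cell runs.

```python
import os
from pathlib import Path

# Assumption: DSPy reads DSPY_CACHEDIR at import time, so this must run
# before `import dspy` for the setting to take effect.
custom_cache_dir = Path("dspy_cache").resolve()
custom_cache_dir.mkdir(parents=True, exist_ok=True)
os.environ["DSPY_CACHEDIR"] = str(custom_cache_dir)

print(f"DSPY_CACHEDIR set to: {os.environ['DSPY_CACHEDIR']}")
```

Pointing the cache at a project-local directory makes it easier to inspect or delete cached responses for one project without touching others.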

In [None]:
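# Configure DSPy with caching
# dspy.LM caches responses by default; cache=True makes that choice explicit.
lm = dspy.LM('openai/gpt-4o-mini', api_key=os.getenv('OPENAI_API_KEY'), cache=True)

# Scratch directory for this tutorial.
# Note: DSPy's built-in cache manages its own storage location and does not write here.
cache_dir = Path("dspy_cache")
cache_dir.mkdir(exist_ok=True)

# Register the language model. Caching is controlled on the LM itself,
# not through a `cache` argument to dspy.configure.
dspy.configure(lm=lm)

print("DSPy configured with caching enabled")
print(f"Local scratch directory: {cache_dir.absolute()}")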

## Testing Cache Performance

Let's create a simple module to test caching performance.
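
Before spending API calls, the measurement approach itself can be tried offline. This is a minimal sketch that uses `functools.lru_cache` and an artificial 50 ms delay in place of a real model call; `slow_answer` is a hypothetical stand-in.

```python
import time
from functools import lru_cache

@lru_cache(maxsize=None)
def slow_answer(question: str) -> str:
    """Hypothetical stand-in for a model call: sleeps to imitate network latency."""
    time.sleep(0.05)
    return question.upper()

def timed(question: str) -> float:
    """Return the wall-clock seconds taken to answer one question."""
    start = time.perf_counter()
    slow_answer(question)
    return time.perf_counter() - start

cold = timed("what is 2 + 2?")  # first call pays the simulated latency
warm = timed("what is 2 + 2?")  # second call is served by lru_cache
print(f"cold: {cold:.4f}s, warm: {warm:.4f}s")
```

`time.perf_counter` is used here because it is a monotonic, high-resolution clock intended for measuring short durations.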

In [None]:
class SimpleQASignature(dspy.Signature):
    """Answer a question concisely."""
    
    question: str = dspy.InputField(desc="The question to answer")
    answer: str = dspy.OutputField(desc="A concise answer")

class CachedQASystem(dspy.Module):
    """Simple QA system to test caching."""
    
    def __init__(self):
        super().__init__()
        self.qa = dspy.ChainOfThought(SimpleQASignature)
    
    def forward(self, question: str) -> dspy.Prediction:
        return self.qa(question=question)

# Create the system
qa_system = CachedQASystem()

# Test questions
test_questions = [
    "What is the capital of France?",
    "What is 2 + 2?",
    "Who wrote Romeo and Juliet?",
    "What is the largest planet in our solar system?"
]

print("Testing cache performance...")

# First run (the cache is cold unless these prompts were cached in an earlier session)
print("\nFirst run (populating cache):")
first_run_times = []

for question in test_questions:
    start_time = time.time()
    result = qa_system(question=question)
    end_time = time.time()
    
    duration = end_time - start_time
    first_run_times.append(duration)
    
    print(f"Q: {question}")
    print(f"A: {result.answer}")
    print(f"Time: {duration:.2f}s\n")

# Second run (with cache)
print("Second run (using cache):")
second_run_times = []

for question in test_questions:
    start_time = time.time()
    result = qa_system(question=question)
    end_time = time.time()
    
    duration = end_time - start_time
    second_run_times.append(duration)
    
    print(f"Q: {question}")
    print(f"A: {result.answer}")
    print(f"Time: {duration:.2f}s\n")

# Performance comparison
avg_first_run = sum(first_run_times) / len(first_run_times)
avg_second_run = sum(second_run_times) / len(second_run_times)
speedup = avg_first_run / avg_second_run if avg_second_run > 0 else float('inf')

print("Performance Summary:")
print(f"Average time (first run): {avg_first_run:.2f}s")
print(f"Average time (cached): {avg_second_run:.2f}s")
print(f"Speedup: {speedup:.1f}x")

## Advanced Cache Management

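Let's implement more sophisticated cache management with a custom, application-level cache that sits in front of DSPy module calls. It works alongside DSPy's built-in cache rather than replacing it.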
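
The two core mechanisms, TTL expiry and LRU eviction, are easier to see in a small in-memory sketch first. `TinyTTLCache` is a hypothetical class for illustration; it limits by entry count rather than bytes and takes an injectable clock so expiry can be shown without sleeping.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional

class TinyTTLCache:
    """In-memory cache with one TTL for all entries and LRU eviction by entry count."""

    def __init__(self, max_entries: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock            # injectable so expiry can be tested without sleeping
        self._data = OrderedDict()    # key -> (stored_at, value), oldest use first

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl_seconds:
            del self._data[key]       # expired: drop the entry and report a miss
            return None
        self._data.move_to_end(key)   # mark as most recently used
        return value

    def put(self, key: str, value: Any) -> None:
        self._data[key] = (self.clock(), value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used entry

now = [0.0]  # fake clock value, advanced manually below
tiny_cache = TinyTTLCache(max_entries=2, ttl_seconds=10, clock=lambda: now[0])
tiny_cache.put("a", 1)
tiny_cache.put("b", 2)
tiny_cache.get("a")       # "a" becomes most recently used
tiny_cache.put("c", 3)    # over capacity: evicts "b"
print(tiny_cache.get("b"), tiny_cache.get("a"), tiny_cache.get("c"))  # None 1 3

now[0] = 11.0             # move past the 10-second TTL
print(tiny_cache.get("a"))  # None
```

The file-based manager in the next cell follows the same two rules, but tracks timestamps and sizes in a metadata file so the cache survives restarts.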

In [None]:
class AdvancedCacheManager:
    """Advanced cache manager with TTL, size limits, and statistics."""
    
    def __init__(self, cache_dir: str = "advanced_cache", 
                 max_size_mb: int = 100, default_ttl_hours: int = 24):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(exist_ok=True)
        self.max_size_bytes = max_size_mb * 1024 * 1024
        self.default_ttl_seconds = default_ttl_hours * 3600
        
        # Statistics
        self.stats = {
            'hits': 0,
            'misses': 0,
            'evictions': 0,
            'total_requests': 0
        }
        
        self.metadata_file = self.cache_dir / "cache_metadata.json"
        self.load_metadata()
    
    def _generate_cache_key(self, inputs: dict) -> str:
        """Generate a deterministic cache key from inputs."""
        # sort_keys makes the JSON text independent of dict insertion order
        sorted_inputs = json.dumps(inputs, sort_keys=True)
        # MD5 serves only as a compact fingerprint here, not as a security measure
        return hashlib.md5(sorted_inputs.encode()).hexdigest()
    
    def load_metadata(self):
        """Load cache metadata."""
        if self.metadata_file.exists():
            with open(self.metadata_file, 'r') as f:
                self.metadata = json.load(f)
        else:
            self.metadata = {}
    
    def save_metadata(self):
        """Save cache metadata."""
        with open(self.metadata_file, 'w') as f:
            json.dump(self.metadata, f, indent=2)
    
    def get(self, inputs: dict) -> dict:
        """Get cached result if available and not expired."""
        self.stats['total_requests'] += 1
        
        cache_key = self._generate_cache_key(inputs)
        cache_file = self.cache_dir / f"{cache_key}.json"
        
        if not cache_file.exists():
            self.stats['misses'] += 1
            return None
        
        # Check if expired
        if cache_key in self.metadata:
            created_at = self.metadata[cache_key]['created_at']
            ttl = self.metadata[cache_key].get('ttl', self.default_ttl_seconds)
            
            if time.time() - created_at > ttl:
                # Expired, remove
                cache_file.unlink()
                del self.metadata[cache_key]
                self.save_metadata()
                self.stats['misses'] += 1
                return None
        
        # Load cached result
        try:
            with open(cache_file, 'r') as f:
                cached_data = json.load(f)
            
            self.stats['hits'] += 1
            
            # Update access time
            if cache_key in self.metadata:
                self.metadata[cache_key]['last_accessed'] = time.time()
                self.save_metadata()
            
            return cached_data
        
        except (json.JSONDecodeError, FileNotFoundError):
            self.stats['misses'] += 1
            return None
    
    def put(self, inputs: dict, result: dict, ttl_hours: int = None):
        """Cache a result."""
        cache_key = self._generate_cache_key(inputs)
        cache_file = self.cache_dir / f"{cache_key}.json"
        now = time.time()
        
        # Prepare cache data
        cache_data = {
            'inputs': inputs,
            'result': result,
            'cached_at': now
        }
        
        # Save to file
        with open(cache_file, 'w') as f:
            json.dump(cache_data, f, indent=2)
        
        # Update metadata (an explicit ttl_hours of 0 is honored, not replaced by the default)
        ttl_seconds = (ttl_hours * 3600) if ttl_hours is not None else self.default_ttl_seconds
        self.metadata[cache_key] = {
            'created_at': now,
            'last_accessed': now,
            'ttl': ttl_seconds,
            'size_bytes': cache_file.stat().st_size
        }
        self.save_metadata()
        
        # Enforce the size limit after the write so the new entry counts toward the total
        self._enforce_size_limit()
    
    def _enforce_size_limit(self):
        """Enforce cache size limit by evicting least recently used items."""
        total_size = sum(
            meta['size_bytes'] for meta in self.metadata.values()
        )
        
        if total_size <= self.max_size_bytes:
            return
        
        # Sort by last accessed time (LRU)
        sorted_items = sorted(
            self.metadata.items(),
            key=lambda x: x[1]['last_accessed']
        )
        
        # Remove oldest items until under size limit
        for cache_key, meta in sorted_items:
            if total_size <= self.max_size_bytes:
                break
            
            # Remove file and metadata
            cache_file = self.cache_dir / f"{cache_key}.json"
            if cache_file.exists():
                cache_file.unlink()
            
            total_size -= meta['size_bytes']
            del self.metadata[cache_key]
            self.stats['evictions'] += 1
        
        self.save_metadata()
    
    def clear(self):
        """Clear all cache."""
        for cache_file in self.cache_dir.glob("*.json"):
            if cache_file.name != "cache_metadata.json":
                cache_file.unlink()
        
        self.metadata = {}
        self.save_metadata()
        
        # Reset stats
        self.stats = {
            'hits': 0,
            'misses': 0,
            'evictions': 0,
            'total_requests': 0
        }
    
    def get_stats(self) -> dict:
        """Get cache statistics."""
        hit_rate = (self.stats['hits'] / self.stats['total_requests'] 
                   if self.stats['total_requests'] > 0 else 0)
        
        total_size = sum(
            meta['size_bytes'] for meta in self.metadata.values()
        )
        
        return {
            **self.stats,
            'hit_rate': hit_rate,
            'cache_size_bytes': total_size,
            'cache_size_mb': total_size / (1024 * 1024),
            'cached_items': len(self.metadata)
        }

# Test advanced cache manager
advanced_cache = AdvancedCacheManager(
    cache_dir="advanced_cache",
    max_size_mb=10,
    default_ttl_hours=1
)

print("Advanced cache manager created")
print(f"Initial stats: {advanced_cache.get_stats()}")

## Cached DSPy Module

Let's create a DSPy module that uses our advanced cache.
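
The module in the next cell builds its cache key from a dictionary that includes the module and signature names alongside the question. The sketch below shows why that helps: a canonical JSON encoding makes the key independent of dictionary order, while the extra fields keep different modules from sharing entries. `make_cache_key` is a hypothetical helper, and it uses SHA-256 where the manager above uses MD5; either works as a non-security fingerprint.

```python
import hashlib
import json

def make_cache_key(inputs: dict) -> str:
    """Hash a canonical JSON encoding of the inputs."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key_a = make_cache_key({"question": "What is DSPy?", "module": "CachedModule"})
key_b = make_cache_key({"module": "CachedModule", "question": "What is DSPy?"})
key_c = make_cache_key({"question": "What is DSPy?", "module": "OtherModule"})

# Same fields in a different order give the same key; a different module name does not.
print(key_a == key_b, key_a == key_c)  # True False
```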

In [None]:
class CachedModule(dspy.Module):
    """DSPy module with advanced caching capabilities."""
    
    def __init__(self, cache_manager: AdvancedCacheManager = None, use_cache: bool = True):
        super().__init__()
        self.cache_manager = cache_manager or AdvancedCacheManager()
        self.use_cache = use_cache
        self.qa = dspy.ChainOfThought(SimpleQASignature)
    
    def forward(self, question: str) -> dspy.Prediction:
        # Prepare cache inputs
        cache_inputs = {
            'question': question,
            'module': 'CachedModule',
            'signature': 'SimpleQASignature'
        }
        
        # Try to get from cache
        if self.use_cache:
            cached_result = self.cache_manager.get(cache_inputs)
            if cached_result:
                # Return cached result
                return dspy.Prediction(
                    answer=cached_result['result']['answer'],
                    cached=True
                )
        
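        # Generate new result (a miss in this custom cache; DSPy's built-in
        # LM cache may still answer the call without contacting the API)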
        result = self.qa(question=question)
        
        # Cache the result
        if self.use_cache:
            cache_result = {
                'answer': result.answer
            }
            self.cache_manager.put(cache_inputs, cache_result)
        
        return dspy.Prediction(
            answer=result.answer,
            cached=False
        )

# Test cached module
cached_qa = CachedModule(advanced_cache)

# Test with repeated questions
test_questions = [
    "What is the speed of light?",
    "Who invented the telephone?",
    "What is the largest ocean?",
    "What is the speed of light?",  # Repeat
    "Who invented the telephone?",  # Repeat
]

print("Testing cached module:")
for i, question in enumerate(test_questions, 1):
    start_time = time.time()
    result = cached_qa(question=question)
    end_time = time.time()
    
    print(f"\n{i}. Q: {question}")
    print(f"   A: {result.answer}")
    print(f"   Cached: {result.cached}")
    print(f"   Time: {end_time - start_time:.3f}s")

# Show cache statistics
print("\nCache Statistics:")
stats = advanced_cache.get_stats()
for key, value in stats.items():
    if isinstance(value, float):
        print(f"  {key}: {value:.3f}")
    else:
        print(f"  {key}: {value}")

## Cache Strategies for Different Use Cases

In [None]:
class CacheStrategy:
    """Base class for cache strategies."""
    
    def should_cache(self, inputs: dict, result: dict) -> bool:
        """Determine if a result should be cached."""
        return True
    
    def get_ttl_hours(self, inputs: dict, result: dict) -> int:
        """Get TTL for this cache entry."""
        return 24

class DevelopmentCacheStrategy(CacheStrategy):
    """Aggressive caching for development."""
    
    def should_cache(self, inputs: dict, result: dict) -> bool:
        return True  # Cache everything
    
    def get_ttl_hours(self, inputs: dict, result: dict) -> int:
        return 168  # 1 week

class ProductionCacheStrategy(CacheStrategy):
    """Conservative caching for production."""
    
    def should_cache(self, inputs: dict, result: dict) -> bool:
        # Only cache if result seems high quality
        if 'confidence' in result:
            return float(result['confidence']) > 0.8
        return len(result.get('answer', '')) > 10  # Meaningful answers
    
    def get_ttl_hours(self, inputs: dict, result: dict) -> int:
        # Shorter TTL for production
        return 6  # 6 hours

class TestingCacheStrategy(CacheStrategy):
    """No caching for testing to ensure fresh results."""
    
    def should_cache(self, inputs: dict, result: dict) -> bool:
        return False  # Never cache during testing

class SmartCachedModule(dspy.Module):
    """Module with strategy-based caching."""
    
    def __init__(self, cache_manager: AdvancedCacheManager, 
                 cache_strategy: CacheStrategy):
        super().__init__()
        self.cache_manager = cache_manager
        self.cache_strategy = cache_strategy
        self.qa = dspy.ChainOfThought(SimpleQASignature)
    
    def forward(self, question: str) -> dspy.Prediction:
        cache_inputs = {
            'question': question,
            'module': 'SmartCachedModule'
        }
        
        # Try cache first
        cached_result = self.cache_manager.get(cache_inputs)
        if cached_result:
            return dspy.Prediction(
                answer=cached_result['result']['answer'],
                cached=True
            )
        
        # Generate new result
        result = self.qa(question=question)
        
        # Check if we should cache this result
        result_dict = {'answer': result.answer}
        
        if self.cache_strategy.should_cache(cache_inputs, result_dict):
            ttl_hours = self.cache_strategy.get_ttl_hours(cache_inputs, result_dict)
            self.cache_manager.put(cache_inputs, result_dict, ttl_hours)
        
        return dspy.Prediction(
            answer=result.answer,
            cached=False
        )

# Test different strategies
strategies = {
    'development': DevelopmentCacheStrategy(),
    'production': ProductionCacheStrategy(),
    'testing': TestingCacheStrategy()
}

print("Testing different cache strategies:")

for strategy_name, strategy in strategies.items():
    print(f"\n=== {strategy_name.upper()} STRATEGY ===")
    
    # Create fresh cache for each strategy
    strategy_cache = AdvancedCacheManager(
        cache_dir=f"cache_{strategy_name}",
        max_size_mb=5
    )
    strategy_cache.clear()  # Start fresh
    
    smart_module = SmartCachedModule(strategy_cache, strategy)
    
    # Test questions
    questions = [
        "What is artificial intelligence?",
        "Hi",  # Short answer - might not be cached in production
        "What is artificial intelligence?"  # Repeat
    ]
    
    for question in questions:
        result = smart_module(question=question)
        print(f"Q: {question}")
        print(f"A: {result.answer[:50]}...")
        print(f"Cached: {result.cached}")
    
    stats = strategy_cache.get_stats()
    print(f"Strategy stats: {stats['cached_items']} items, "
          f"{stats['hit_rate']:.1%} hit rate")

## Cache Monitoring and Maintenance

In [None]:
class CacheMonitor:
    """Monitor and maintain cache health."""
    
    def __init__(self, cache_manager: AdvancedCacheManager):
        self.cache_manager = cache_manager
    
    def health_check(self) -> dict:
        """Perform a comprehensive health check."""
        stats = self.cache_manager.get_stats()
        
        health = {
            'overall_status': 'healthy',
            'issues': [],
            'recommendations': []
        }
        
        # Check hit rate
        if stats['hit_rate'] < 0.3 and stats['total_requests'] > 10:
            health['issues'].append('Low cache hit rate')
            health['recommendations'].append('Consider adjusting TTL or cache strategy')
        
        # Check cache size against 80% of the configured limit
        size_limit_bytes = self.cache_manager.max_size_bytes
        if stats['cache_size_bytes'] > 0.8 * size_limit_bytes:
            health['issues'].append('Cache approaching size limit')
            health['recommendations'].append('Consider increasing cache size or reducing TTL')
        
        # Check eviction rate
        if stats['evictions'] > stats['cached_items']:
            health['issues'].append('High eviction rate')
            health['recommendations'].append('Increase cache size or optimize cache strategy')
        
        # Overall status
        if health['issues']:
            health['overall_status'] = 'needs_attention' if len(health['issues']) < 3 else 'unhealthy'
        
        return health
    
    def cleanup_expired(self) -> int:
        """Clean up expired cache entries."""
        current_time = time.time()
        expired_keys = []
        
        for cache_key, meta in self.cache_manager.metadata.items():
            created_at = meta['created_at']
            ttl = meta.get('ttl', self.cache_manager.default_ttl_seconds)
            
            if current_time - created_at > ttl:
                expired_keys.append(cache_key)
        
        # Remove expired entries
        for cache_key in expired_keys:
            cache_file = self.cache_manager.cache_dir / f"{cache_key}.json"
            if cache_file.exists():
                cache_file.unlink()
            del self.cache_manager.metadata[cache_key]
        
        if expired_keys:
            self.cache_manager.save_metadata()
        
        return len(expired_keys)
    
    def generate_report(self) -> str:
        """Generate a comprehensive cache report."""
        stats = self.cache_manager.get_stats()
        health = self.health_check()
        
        report = f"""
CACHE PERFORMANCE REPORT
========================

Statistics:
  Total Requests: {stats['total_requests']}
  Cache Hits: {stats['hits']}
  Cache Misses: {stats['misses']}
  Hit Rate: {stats['hit_rate']:.1%}
  
Storage:
  Cached Items: {stats['cached_items']}
  Cache Size: {stats['cache_size_mb']:.2f} MB
  Evictions: {stats['evictions']}
  
Health Status: {health['overall_status'].upper()}
"""
        
        if health['issues']:
            report += "\nIssues:\n"
            for issue in health['issues']:
                report += f"  - {issue}\n"
        
        if health['recommendations']:
            report += "\nRecommendations:\n"
            for rec in health['recommendations']:
                report += f"  - {rec}\n"
        
        return report

# Test cache monitoring
monitor = CacheMonitor(advanced_cache)

# Generate some cache activity
test_module = CachedModule(advanced_cache)
for i in range(10):
    question = f"What is the answer to question {i % 3}?"  # Some repeats
    test_module(question=question)

# Check health and generate report
health = monitor.health_check()
print("Cache Health Check:")
print(json.dumps(health, indent=2))

print("\n" + monitor.generate_report())

# Clean up expired entries
expired_count = monitor.cleanup_expired()
print(f"\nCleaned up {expired_count} expired entries")

## Cache Best Practices and Configuration

Let's implement a configuration system for cache management in different environments.
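
One way to choose the environment at runtime is to read it from an environment variable. This is a minimal sketch under stated assumptions: `APP_ENV` is a hypothetical variable name, `CacheSettings` is a hypothetical dataclass, and the values mirror the presets defined in the cell that follows.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class CacheSettings:
    enabled: bool
    max_size_mb: int
    default_ttl_hours: int

SETTINGS = {
    "development": CacheSettings(enabled=True, max_size_mb=100, default_ttl_hours=168),
    "production": CacheSettings(enabled=True, max_size_mb=500, default_ttl_hours=6),
    "testing": CacheSettings(enabled=False, max_size_mb=10, default_ttl_hours=1),
}

def settings_from_env(var_name: str = "APP_ENV") -> CacheSettings:
    """Pick settings from an environment variable (APP_ENV is a hypothetical name)."""
    env = os.environ.get(var_name, "development").strip().lower()
    if env not in SETTINGS:
        raise ValueError(f"Unknown environment {env!r}; expected one of {sorted(SETTINGS)}")
    return SETTINGS[env]

os.environ["APP_ENV"] = "production"
print(settings_from_env())
```

This sketch raises on an unknown name instead of silently falling back to development settings, which is a design choice: a typo in a production deployment then fails loudly rather than enabling week-long TTLs.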

In [None]:
class CacheConfig:
    """Configuration management for cache settings."""
    
    DEVELOPMENT = {
        'enabled': True,
        'max_size_mb': 100,
        'default_ttl_hours': 168,  # 1 week
        'strategy': 'aggressive',
        'monitor_interval_minutes': 60
    }
    
    PRODUCTION = {
        'enabled': True,
        'max_size_mb': 500,
        'default_ttl_hours': 6,
        'strategy': 'conservative',
        'monitor_interval_minutes': 15
    }
    
    TESTING = {
        'enabled': False,
        'max_size_mb': 10,
        'default_ttl_hours': 1,
        'strategy': 'none',
        'monitor_interval_minutes': 5
    }
    
    @classmethod
    def get_config(cls, environment: str) -> dict:
        """Get configuration for environment."""
        configs = {
            'development': cls.DEVELOPMENT,
            'production': cls.PRODUCTION,
            'testing': cls.TESTING
        }
        return configs.get(environment, cls.DEVELOPMENT)

class ConfigurableCacheManager:
    """Cache manager with environment-based configuration."""
    
    def __init__(self, environment: str = 'development'):
        self.environment = environment
        self.config = CacheConfig.get_config(environment)
        
        if self.config['enabled']:
            self.cache_manager = AdvancedCacheManager(
                cache_dir=f"cache_{environment}",
                max_size_mb=self.config['max_size_mb'],
                default_ttl_hours=self.config['default_ttl_hours']
            )
            self.monitor = CacheMonitor(self.cache_manager)
        else:
            self.cache_manager = None
            self.monitor = None
    
    def is_enabled(self) -> bool:
        """Check if caching is enabled."""
        return self.config['enabled']
    
    def get(self, inputs: dict):
        """Get from cache if enabled."""
        if not self.is_enabled():
            return None
        return self.cache_manager.get(inputs)
    
    def put(self, inputs: dict, result: dict, ttl_hours: int = None):
        """Put to cache if enabled."""
        if not self.is_enabled():
            return
        self.cache_manager.put(inputs, result, ttl_hours)
    
    def get_stats(self) -> dict:
        """Get cache statistics."""
        if not self.is_enabled():
            return {'enabled': False}
        
        stats = self.cache_manager.get_stats()
        stats.update({
            'environment': self.environment,
            'config': self.config
        })
        return stats
    
    def health_check(self) -> dict:
        """Perform health check."""
        if not self.is_enabled():
            return {'enabled': False, 'status': 'disabled'}
        return self.monitor.health_check()

# Test configurable cache manager
environments = ['development', 'production', 'testing']

print("Testing configurable cache manager:")

for env in environments:
    print(f"\n=== {env.upper()} ENVIRONMENT ===")
    
    cache_mgr = ConfigurableCacheManager(env)
    
    print(f"Cache enabled: {cache_mgr.is_enabled()}")
    
    if cache_mgr.is_enabled():
        # Test cache operations
        test_inputs = {'question': f'Test question for {env}'}
        test_result = {'answer': f'Test answer for {env}'}
        
        # Put and get
        cache_mgr.put(test_inputs, test_result)
        cached = cache_mgr.get(test_inputs)
        
        print(f"Cache test successful: {cached is not None}")
        
        # Show stats
        stats = cache_mgr.get_stats()
        print(f"Cache size limit: {stats['config']['max_size_mb']} MB")
        print(f"Default TTL: {stats['config']['default_ttl_hours']} hours")
    else:
        print("Caching disabled for this environment")

## Conclusion

This tutorial covered comprehensive caching strategies for DSPy applications:

### Key Features Demonstrated:

1. **Basic Caching**: Using DSPy's built-in cache functionality
2. **Advanced Cache Management**: Custom cache with TTL, size limits, and LRU eviction
3. **Cache Strategies**: Different approaches for development, production, and testing
4. **Monitoring**: Health checks and performance monitoring
5. **Configuration**: Environment-specific cache settings

### Best Practices:

1. **Environment-Specific Caching**:
   - Development: Aggressive caching for faster iteration
   - Production: Conservative caching with shorter TTL
   - Testing: Minimal or no caching, so every run exercises the model and returns fresh results

2. **Cache Management**:
   - Implement size limits to prevent disk space issues
   - Use TTL to ensure data freshness
   - Monitor hit rates and performance
   - Clean up expired entries regularly

3. **Strategy Selection**:
   - Cache deterministic results aggressively
   - Be cautious with time-sensitive information
   - Consider result quality when caching
   - Implement cache invalidation for critical updates

4. **Performance Optimization**:
   - Use appropriate cache keys to maximize hit rates
   - Balance cache size with available resources
   - Monitor and adjust TTL based on usage patterns
   - Consider cache warming for critical paths
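
As one example of choosing cache keys to raise hit rates, questions can be normalized before hashing so that trivial differences in case and spacing map to the same entry. This is a sketch with hypothetical helper names; it assumes case and whitespace do not change the meaning of a question, which is not true for every workload (code snippets or case-sensitive identifiers, for instance).

```python
import hashlib
import re

def normalize_question(question: str) -> str:
    """Trim, collapse internal whitespace, and lowercase."""
    return re.sub(r"\s+", " ", question.strip()).lower()

def question_key(question: str) -> str:
    """Hash the normalized question text."""
    return hashlib.sha256(normalize_question(question).encode("utf-8")).hexdigest()

variants = ["What is DSPy?", "  what is   DSPy? ", "WHAT IS DSPY?"]
distinct_keys = {question_key(q) for q in variants}
print(len(distinct_keys))  # 1
```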

### Production Considerations:

- **Security**: Encrypt cached data if it contains sensitive information
- **Distributed Caching**: Consider Redis or similar for multi-instance deployments
- **Backup**: Regular backups of important cached data
- **Monitoring**: Real-time alerts for cache performance issues
- **Testing**: Validate cache behavior under load
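
A small load check along these lines can be sketched with a thread pool. The example assumes a simple lock-guarded, in-memory cache (`LockedCache` and `fake_compute` are hypothetical names) and verifies that concurrent requests for the same key trigger only one computation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class LockedCache:
    """Dict-backed cache guarded by one lock so each key is computed at most once."""

    def __init__(self, compute):
        self._compute = compute
        self._store = {}
        self._lock = threading.Lock()

    def get(self, key):
        # Holding the lock during compute serializes misses; simple, but slow
        # computations for different keys cannot overlap.
        with self._lock:
            if key not in self._store:
                self._store[key] = self._compute(key)
            return self._store[key]

compute_calls = []

def fake_compute(key: str) -> str:
    """Hypothetical expensive call; records each invocation."""
    compute_calls.append(key)
    return key[::-1]

locked_cache = LockedCache(fake_compute)
requested_keys = ["alpha", "beta", "gamma"] * 50

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(locked_cache.get, requested_keys))

print(len(results), len(compute_calls))  # 150 3
```

The single lock is the simplest correct choice here. A per-key lock would let unrelated misses run in parallel at the cost of more bookkeeping.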

Proper caching can significantly improve the performance and cost-effectiveness of your DSPy applications while maintaining result quality and consistency.