# Module 7: Production Architecture & Framework Comparison

## 🎯 What You'll Learn

Deploy LLM systems to production and choose the right framework.

**Time:** 4-5 hours | **Difficulty:** Advanced

---

## 📚 Module Outline

### Part 1: Framework Comparison (45 min)
- LangChain vs LangGraph vs AutoGen vs CrewAI
- When to use each
- Migration strategies

### Part 2: Production Essentials (60 min)
- Observability (logging, metrics, tracing)
- Error handling and resilience
- Cost optimization

### Part 3: Deployment (60 min)
- Canary releases
- A/B testing
- Automated rollback

### Part 4: Monitoring (45 min)
- SLO tracking
- Alerting
- Debugging

### Part 5: Capstone Projects (60 min)
- 3 complete project examples
- Implementation guides

---

## Setup

**Time:** 5 minutes

In [None]:
%pip install -q pandas numpy

import pandas as pd
import numpy as np
from typing import Dict, List
import time

print('✅ Production tools ready!')
print('📖 Let\'s deploy to production!')

---

# Part 1: Framework Comparison

**Time:** 45 min | **Difficulty:** Intermediate

## 🎯 Which Framework Should I Use?

### Quick Decision Tree

```
❓ Need code execution?
   ├─ YES → AutoGen
   └─ NO ↓

❓ Need complex state management?
   ├─ YES → LangGraph
   └─ NO ↓

❓ Have 5+ agents with clear roles?
   ├─ YES → CrewAI
   └─ NO ↓

❓ Simple Q&A or RAG?
   └─ YES → LangChain
```
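
The decision tree above can be sketched as a small function. This is only an illustration of the ordering of the questions: the parameter names are assumptions made for this example, and treating LangChain as the default when no earlier branch matches is also an assumption, since the tree does not say what to do in that case.

```python
def choose_framework(needs_code_execution: bool,
                     needs_complex_state: bool,
                     agent_count: int,
                     has_clear_roles: bool) -> str:
    """Walk the decision tree top to bottom and return the first match."""
    if needs_code_execution:
        return "AutoGen"
    if needs_complex_state:
        return "LangGraph"
    if agent_count >= 5 and has_clear_roles:
        return "CrewAI"
    # Simple Q&A or RAG; also used here as the default (an assumption).
    return "LangChain"


print(choose_framework(False, True, 2, False))   # LangGraph
print(choose_framework(False, False, 6, True))   # CrewAI
print(choose_framework(False, False, 1, False))  # LangChain
```

Because the questions are checked in order, a project that needs both code execution and complex state resolves to AutoGen. If that ordering does not fit your project, evaluate both candidates rather than trusting the first match.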

## 📊 Feature Comparison

| Feature | LangChain | LangGraph | AutoGen | CrewAI |
|---------|-----------|-----------|---------|--------|
| **Learning Curve** | Easy | Medium | Medium | Easy |
| **Best For** | RAG, Q&A | Complex workflows | Code gen | Content |
| **Agent Count** | 1-3 | 3-10 | 2-5 | 3-10 |
| **Code Execution** | ⭐ | ⭐⭐ | ⭐⭐⭐ | ⭐ |
| **State Management** | ⭐ | ⭐⭐⭐ | ⭐ | ⭐ |
| **Role-Based** | ⭐ | ⭐⭐ | ⭐⭐ | ⭐⭐⭐ |

---

## ✅ Section Summary: Framework Choice

### What You Learned:
1. ✓ Each framework has strengths
2. ✓ Choose based on your use case
3. ✓ You can migrate between frameworks

### Decision Guide:
- 📌 **LangChain**: Simple RAG, fast prototyping
- 📌 **LangGraph**: Complex workflows, state, HITL
- 📌 **AutoGen**: Code generation and execution
- 📌 **CrewAI**: Role-based content creation

### Common Mistakes:
- ❌ Using LangChain for complex stateful workflows
- ❌ Using AutoGen when you don't need code execution
- ❌ Using CrewAI for simple Q&A
- ❌ Not evaluating multiple frameworks

---


# Part 2: Production Essentials

**Time:** 60 min | **Difficulty:** Advanced

## 🎯 Production Checklist

### 1. Observability
- ✅ Structured logging (JSON)
- ✅ Metrics (latency, cost, errors)
- ✅ Distributed tracing
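
As a minimal sketch of the first two items, the standard library's `logging` module can emit one JSON object per log line, with a trace ID attached to every record for a request. The field names (`trace_id`, `latency_ms`, `cost_usd`), the logger name, and the placeholder cost value are assumptions for illustration, not a required schema. Real distributed tracing would normally use a dedicated tracing library on top of this.

```python
import json
import logging
import time
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": round(record.created, 3),
            "level": record.levelname,
            "message": record.getMessage(),
            # Fields passed via `extra=` become attributes on the record.
            "trace_id": getattr(record, "trace_id", None),
            "latency_ms": getattr(record, "latency_ms", None),
            "cost_usd": getattr(record, "cost_usd", None),
        }
        return json.dumps(payload)


def build_logger(name: str = "llm_app") -> logging.Logger:
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers when a cell is re-run
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


logger = build_logger()
trace_id = uuid.uuid4().hex  # one ID per request, reused on all its log lines

start = time.perf_counter()
# ... the LLM call would go here ...
latency_ms = round((time.perf_counter() - start) * 1000, 2)

logger.info(
    "llm_call_completed",
    extra={"trace_id": trace_id, "latency_ms": latency_ms, "cost_usd": 0.0},
)
```

Keeping the same `trace_id` on every line for a request is what lets you follow one call across retries, cache lookups, and tool invocations when you search the logs later.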

### 2. Resilience
- ✅ Retry with backoff
- ✅ Circuit breakers
- ✅ Fallbacks
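
Fallbacks are the one item in this list that is easy to overlook, so here is a minimal sketch of a fallback chain. The function `call_with_fallbacks` and the two stand-in providers are hypothetical names invented for this example; in practice each provider would wrap a real model client.

```python
from typing import Callable, Sequence


def call_with_fallbacks(
    prompt: str,
    providers: Sequence[Callable[[str], str]],
    default: str = "Sorry, the service is temporarily unavailable.",
) -> str:
    """Try each provider in order and return the first successful answer."""
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # broad on purpose in this sketch
            print(f"  {provider.__name__} failed: {exc}")
    # Every provider failed: degrade gracefully instead of crashing.
    return default


def primary_model(prompt: str) -> str:
    raise TimeoutError("primary timed out")  # simulated outage


def backup_model(prompt: str) -> str:
    return f"[backup] answer to: {prompt}"


print(call_with_fallbacks("What is a circuit breaker?", [primary_model, backup_model]))
```

Ordering the providers from best to cheapest (or most reliable) keeps quality high in the normal case while still returning something useful during an outage.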

### 3. Security
- ✅ Prompt injection defense
- ✅ RBAC
- ✅ Audit logging
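
A minimal sketch of RBAC combined with audit logging is shown below. The roles, the permission names, and the in-memory `audit_log` list are all hypothetical; a real system would load roles from an identity provider and write audit entries to an append-only store. Prompt injection defense is not covered by this sketch.

```python
import json
import time
from typing import Dict, List, Set

# Hypothetical role-to-permission mapping for illustration.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "viewer": {"chat"},
    "analyst": {"chat", "run_sql_tool"},
    "admin": {"chat", "run_sql_tool", "manage_prompts"},
}

audit_log: List[str] = []  # stand-in for an append-only audit store


def authorize(user: str, role: str, action: str) -> bool:
    """Return True if the role allows the action, and record the decision."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append(json.dumps({
        "ts": time.time(),
        "user": user,
        "role": role,
        "action": action,
        "allowed": allowed,
    }))
    return allowed


print(authorize("dana", "viewer", "chat"))          # True
print(authorize("dana", "viewer", "run_sql_tool"))  # False
```

Denied requests are logged as well as allowed ones, since repeated denials are often the first visible sign of probing or misconfiguration.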

### 4. Cost
- ✅ Caching (up to ~60% savings at a 60% hit rate)
- ✅ Smart routing (up to ~40% savings, depending on traffic mix)
- ✅ Token budgets
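
Token budgets and smart routing can be sketched in a few lines. Everything here is an assumption made for illustration: the four-characters-per-token estimate is only a rough heuristic for English text, routing by prompt length is a deliberately simple policy, and `"small-model"` and `"large-model"` are placeholder names rather than real model IDs.

```python
class TokenBudget:
    """Track token spend against a fixed limit."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def try_spend(self, tokens: int) -> bool:
        """Reserve tokens if they fit in the remaining budget."""
        if self.used + tokens > self.limit:
            return False
        self.used += tokens
        return True


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token for English text);
    # use the provider's tokenizer for real accounting.
    return max(1, len(text) // 4)


def route_model(prompt: str, long_prompt_tokens: int = 500) -> str:
    """Send short prompts to a cheaper model, long ones to a stronger one."""
    if estimate_tokens(prompt) > long_prompt_tokens:
        return "large-model"
    return "small-model"


budget = TokenBudget(limit=1_000)
prompt = "Summarize the quarterly report in three bullet points."
if budget.try_spend(estimate_tokens(prompt)):
    print(f"Routing to {route_model(prompt)}; {budget.limit - budget.used} tokens left")
else:
    print("Budget exhausted: reject or queue the request")
```

Checking the budget before the call, rather than after, is what turns cost monitoring into cost control.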

---

## 💻 Code Example: Production Patterns

**Three critical patterns:**
1. Retry logic
2. Circuit breaker
3. Response caching

In [None]:
import time

# Pattern 1: Retry with Exponential Backoff
def retry_with_backoff(func, max_retries=3):
    """Call func up to max_retries times, doubling the wait after each failure."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            
            wait = 2 ** attempt  # 1s, then 2s, doubling on each retry
            print(f"  Attempt {attempt+1} failed, waiting {wait}s...")
            time.sleep(wait)

# Pattern 2: Circuit Breaker
class CircuitBreaker:
    """Stop calling a service after repeated consecutive failures."""
    
    def __init__(self, failure_threshold=5):
        self.failures = 0
        self.threshold = failure_threshold
        self.is_open = False  # once open, stays open until reset manually
    
    def call(self, func):
        if self.is_open:
            raise RuntimeError("Circuit breaker OPEN")
        
        try:
            result = func()
            self.failures = 0  # Reset on success
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.is_open = True
                print("  🚨 Circuit breaker OPEN")
            raise

# Pattern 3: Response Cache
class ResponseCache:
    """Cache LLM responses."""
    
    def __init__(self):
        self.cache = {}
        self.hits = 0
        self.misses = 0
    
    def get(self, key: str):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        return None
    
    def set(self, key: str, value: str):
        self.cache[key] = value
    
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total > 0 else 0.0

print("🏭 PRODUCTION PATTERNS\n")
print("Pattern 1: Retry with Backoff")
print("  → Recovers from transient failures")
print("  → 5% failure rate → ~0.0125% with 3 attempts (if failures are independent)\n")

print("Pattern 2: Circuit Breaker")
print("  → Fails fast when service is down")
print("  → Prevents cascade failures\n")

print("Pattern 3: Response Caching")
print("  → Saves up to ~60% of API costs on repeat-heavy workloads")
print("  → Instant responses for repeated queries\n")

print("💡 Use all three together for production-ready systems!")
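
A common refinement of the circuit breaker is a cooldown: after a set time, one trial call is let through (often called the half-open state), and the breaker closes again if that call succeeds. The sketch below is one way to do this under stated assumptions. The class name `RecoveringCircuitBreaker` and the `reset_timeout` parameter are made up for this example, and the code is single-threaded only.

```python
import time


class RecoveringCircuitBreaker:
    """Circuit breaker that allows a trial call after a cooldown."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0):
        self.threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # monotonic timestamp of when the breaker opened

    def call(self, func):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("Circuit breaker OPEN")
            # Cooldown elapsed: half-open, so let one trial call through.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.threshold:
                # Open the breaker, or re-open it after a failed trial call.
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        self.opened_at = None  # a success closes the breaker
        return result


def flaky():
    raise ConnectionError("service down")  # simulated failing dependency


breaker = RecoveringCircuitBreaker(failure_threshold=2, reset_timeout=0.1)
for _ in range(3):
    try:
        breaker.call(flaky)
    except (ConnectionError, RuntimeError) as exc:
        print(f"  {type(exc).__name__}: {exc}")

time.sleep(0.15)  # wait out the cooldown
print(breaker.call(lambda: "recovered"))
```

`time.monotonic()` is used instead of `time.time()` so that system clock adjustments cannot shorten or lengthen the cooldown.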

## ✅ Section Summary: Production

### What You Learned:
1. ✓ Production needs observability, resilience, security
2. ✓ Retry logic improves success rate
3. ✓ Circuit breakers prevent cascading failures
4. ✓ Caching can save up to ~60% of API costs on repeat-heavy workloads
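
The effect of retries on success rate can be estimated with a short calculation. It assumes each attempt fails independently with the same probability, which is optimistic during a real outage because failures then tend to be correlated. The helper name `failure_after_attempts` is made up for this example.

```python
def failure_after_attempts(p_fail: float, attempts: int) -> float:
    """Probability that every one of `attempts` independent tries fails."""
    return p_fail ** attempts


for attempts in (1, 2, 3):
    rate = failure_after_attempts(0.05, attempts)
    print(f"{attempts} attempt(s): {rate:.4%} overall failure rate")

# 1 attempt(s): 5.0000% overall failure rate
# 2 attempt(s): 0.2500% overall failure rate
# 3 attempt(s): 0.0125% overall failure rate
```

Each extra attempt multiplies the failure rate by 0.05, so most of the benefit arrives within the first two or three tries. This is why a small cap such as three attempts is usually enough.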

### Key Takeaways:
- 📌 **Observability**: Log, measure, trace everything
- 📌 **Resilience**: Retry, circuit break, fallback
- 📌 **Security**: Defense-in-depth at all layers
- 📌 **Cost**: Cache + route + compress (about 55% combined savings in this module's estimate; measure on your own traffic)

### Production Checklist:
- [ ] Structured logging with trace IDs
- [ ] Retry with exponential backoff
- [ ] Circuit breakers for external services
- [ ] Response caching (target: 60% hit rate)
- [ ] Cost monitoring and alerts
- [ ] Security: RBAC, injection defense, audit logs

---


# 📝 Module 7 Complete!

## 🎓 What You've Mastered

### Framework Selection
- ✅ Understand trade-offs
- ✅ Choose right framework
- ✅ Know migration paths

### Production Patterns
- ✅ Observability
- ✅ Resilience
- ✅ Cost optimization
- ✅ Deployment strategies

---

## 🎯 Quick Reference

**Framework Selection:**
- Code execution → AutoGen
- Complex state → LangGraph
- Role-based → CrewAI
- Simple RAG → LangChain

**Production Must-Haves:**
- Retry logic (max 3 attempts)
- Circuit breakers (prevent cascades)
- Caching (60% hit rate target)
- Monitoring (SLOs, alerts)

---

**See complete details in:** `Module_7_Production_Architecture.ipynb`

**You're ready for production! 🚀**