# Module 3: LangChain - Chains, Agents, Evaluation

## 🎯 Learning Goals

Master LangChain for production systems: chains, agents, and systematic evaluation.

**Time:** 4-5 hours | **Difficulty:** Intermediate-Advanced

---

## 📚 What You'll Build

By the end of this module, you will have built:
1. Robust chains with auto-retry
2. ReAct agents that use tools
3. Memory management systems
4. Complete evaluation framework

---

## 🗺️ Module Structure

### Part 1: Understanding Chains (30 min)
- What are chains
- When to use them
- Chain vs Agent

### Part 2: Building Robust Chains (60 min)
- Error handling
- Retry logic
- Validation

### Part 3: Agents (60 min)
- ReAct pattern
- Tool integration
- Multi-step reasoning

### Part 4: Memory (45 min)
- Managing conversation context
- Token budgets
- Memory strategies

### Part 5: Evaluation (60 min)
- Why evaluation matters
- Building test suites
- Regression detection

### Part 6: Practice (30 min)
- Exercises and review

---

## Setup

**Time:** 5 minutes

In [None]:
%pip install -q langchain langchain-community langchain-openai
%pip install -q pydantic pandas numpy

import json
import time
import pandas as pd
import numpy as np
from typing import Callable, Any, List, Dict
from pydantic import BaseModel, Field
from collections import defaultdict

print('✅ LangChain environment ready!')
print('📖 Ready to build chains and agents!')

---

# Part 1: Understanding Chains

**Time:** 30 min | **Difficulty:** Beginner

## 🎯 What is a Chain?

**Simple definition:** A chain connects multiple steps into a pipeline.

**Example pipeline:**
```
User Input → [Step 1: Translate] → [Step 2: Summarize] → [Step 3: Format] → Output
```
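
The pipeline above can be sketched as plain function composition. This is a minimal illustration, not LangChain's own API: `make_chain` and the three step functions are hypothetical stand-ins, and each step is an ordinary string function where a real chain would call an LLM.

```python
from functools import reduce
from typing import Callable, List

Step = Callable[[str], str]

def make_chain(steps: List[Step]) -> Step:
    """Compose steps left to right: each step's output feeds the next step."""
    def run(text: str) -> str:
        return reduce(lambda acc, step: step(acc), steps, text)
    return run

# Hypothetical stand-ins for LLM-backed steps (plain string functions here).
def translate(text: str) -> str:
    return text.lower()

def summarize(text: str) -> str:
    return " ".join(text.split()[:3])

def format_output(text: str) -> str:
    return f"Summary: {text}"

pipeline = make_chain([translate, summarize, format_output])
print(pipeline("LangChain Chains Run Steps In A Fixed Order"))
# prints: Summary: langchain chains run
```

The order of `steps` is fixed when the chain is built, which is what makes a chain predictable.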

## 🤔 Chain vs Agent

**Chain:**
- Fixed sequence of steps
- Deterministic control flow (same input → same path, even though LLM outputs may vary)
- Fast and predictable
- Use for: Well-defined workflows

**Agent:**
- Dynamically chooses actions
- Can use tools
- More flexible but slower
- Use for: Open-ended tasks
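
To make the contrast concrete, here is a toy agent loop. It is a sketch under stated assumptions: `choose_action` is a rule-based stand-in for the LLM's decision, and the `calculator` and `lookup` tools are hypothetical. The point is only that the next step is picked at run time instead of being fixed in advance.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical tools; a real agent would wrap search APIs, calculators, etc.
def calculator(expression: str) -> str:
    a, op, b = expression.split()
    x, y = float(a), float(b)
    return str({"+": x + y, "-": x - y, "*": x * y}[op])

def lookup(term: str) -> str:
    facts = {"chain": "a fixed sequence of steps"}
    return facts.get(term, "unknown")

TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator, "lookup": lookup}

def choose_action(task: str, observation: Optional[str]) -> Optional[Tuple[str, str]]:
    """Rule-based stand-in for the LLM: return (tool, input), or None to stop."""
    if observation is not None:
        return None  # one tool call is enough in this toy example
    if any(op in task for op in ("+", "-", "*")):
        return ("calculator", task)
    return ("lookup", task)

def run_agent(task: str, max_steps: int = 3) -> str:
    observation: Optional[str] = None
    for _ in range(max_steps):
        action = choose_action(task, observation)
        if action is None:
            break
        tool_name, tool_input = action
        observation = TOOLS[tool_name](tool_input)
    return observation if observation is not None else "no result"

print(run_agent("2 * 21"))  # routed to the calculator
print(run_agent("chain"))   # routed to the lookup tool
```

The `max_steps` cap is a common safeguard: because the agent decides when to stop, the loop needs an upper bound.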

## 💡 When to Use Each

**Use Chain when:**
- You know the exact steps needed
- Order is fixed
- Need speed and predictability

**Use Agent when:**
- Steps depend on intermediate results
- Need to use tools dynamically
- Task is open-ended

---

# Part 2: Robust Chains

**Time:** 60 min | **Difficulty:** Intermediate

## 🎯 The Problem

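**Issue:** LLMs sometimes return invalid output (wrong format, missing fields, invalid values).

**Impact:**
- Parse failure rates of roughly 15-30% without retries (illustrative; actual rates depend on the model, prompt, and schema)
- Production systems break when downstream code receives unparseable output
- Poor user experience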

## 🛡️ Solution: Retry with Feedback

**Strategy:**
1. Try to parse output
2. If fails → tell LLM what went wrong
3. LLM fixes the output
4. Repeat up to N times

**Result:** Failure rate drops from about 15% to about 1-2% (illustrative figures)
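
A rough way to reason about these numbers: if each attempt failed independently with probability `p`, all `n` attempts would fail with probability `p ** n`. The sketch below works through that arithmetic. Independence is an assumption made here for illustration; real retries are usually correlated (a prompt that confuses the model once may confuse it again), so measured rates can be higher than this idealized estimate.

```python
def failure_after_retries(p_fail: float, attempts: int) -> float:
    """Probability that every attempt fails, assuming independent attempts."""
    return p_fail ** attempts

for p in (0.15, 0.30):
    after_three = failure_after_retries(p, 3)
    print(f"single-attempt failure {p:.0%} -> after 3 attempts {after_three:.2%}")
```

Under this model a 30% single-attempt failure rate falls to about 2.7% with three attempts, and 15% falls to about 0.3%.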

---

## 💻 Code Example: Chain with Auto-Retry

**What this does:**
- Calls LLM
- Validates output
- Retries with error feedback if needed
- Tracks metrics

**Study tip:** Focus on the retry logic in the `except` branch of `invoke`

In [None]:
class RobustChain:
    """Production chain with automatic retry."""
    
    def __init__(self, llm_func: Callable, parser: Callable, max_retries=3):
        self.llm_func = llm_func
        self.parser = parser
        self.max_retries = max_retries
        self.metrics = {'calls': 0, 'retries': 0, 'failures': 0}
    
    def invoke(self, prompt: str) -> Any:
        """Execute with automatic retry on failure."""
        self.metrics['calls'] += 1
        
        for attempt in range(self.max_retries):
            try:
                # Call LLM
                response = self.llm_func(prompt)
                
                # Try to parse/validate
                result = self.parser(response)
                
                # Success!
                return result
                
            except Exception as e:
                # Last attempt? Record the failure and keep the root cause attached.
                if attempt == self.max_retries - 1:
                    self.metrics['failures'] += 1
                    raise RuntimeError(f'Failed after {self.max_retries} attempts') from e
                
                # Otherwise count the retry and add error feedback for the next attempt
                self.metrics['retries'] += 1
                prompt += f"\n\n[ERROR: {e}. Please fix the output.]"
                time.sleep(0.5)
    
    def get_metrics(self):
        """Get chain performance."""
        calls = self.metrics['calls']
        success_rate = 1 - self.metrics['failures'] / calls if calls > 0 else 0.0
        return {**self.metrics, 'success_rate': success_rate}

# Mock LLM and parser for demo
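import random

def mock_llm(prompt: str) -> str:
    """Simulate LLM that sometimes fails."""
    # Note: this mock ignores the prompt (including any error feedback),
    # so every call fails independently with the same probability.
    if random.random() < 0.3:  # 30% failure rate
        return '{"sentiment": "happy"}'  # Invalid value and missing confidence!
    return '{"sentiment": "positive", "confidence": 0.85}'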

class SentimentOutput(BaseModel):
    sentiment: str  # Should be: positive, negative, or neutral
    confidence: float

def parser(response: str) -> SentimentOutput:
    data = json.loads(response)
    if data['sentiment'] not in ['positive', 'negative', 'neutral']:
        raise ValueError(f"Invalid sentiment: {data['sentiment']}")
    return SentimentOutput(**data)

# Test the chain
print("🔄 ROBUST CHAIN DEMONSTRATION\n")

chain = RobustChain(mock_llm, parser, max_retries=3)

print("Running 10 requests (30% initial failure rate)...\n")

for i in range(10):
    try:
        result = chain.invoke('Analyze sentiment')
        print(f"{i+1}. ✅ Success: {result.sentiment}")
    except RuntimeError:
        print(f"{i+1}. ❌ Failed (even after retries)")

print(f"\n📊 Metrics: {chain.get_metrics()}")
print("\n💡 Compare 'retries' with 'failures': retries recover most invalid outputs.")

## ✅ Section Summary: Robust Chains

### What You Learned:
1. ✓ LLMs can return invalid output (roughly 15-30% of responses, depending on the model and prompt)
2. ✓ Retry with error feedback improves success rate
3. ✓ Always validate output before using it
4. ✓ Track metrics to measure improvement

### Key Takeaways:
- 📌 **Always implement retry** (3 attempts recommended)
- 📌 **Provide error feedback** (tell LLM what went wrong)
- 📌 **Validate with Pydantic** (catch issues early)
- 📌 **Track success rate** (measure improvements)
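
The "Validate with Pydantic" takeaway can be pushed further by moving the allowed values into the schema itself. The sketch below is one possible approach, not the module's required one: `StrictSentiment` and `parse_sentiment` are hypothetical names, and it assumes a Pydantic version that supports `Literal` fields and `Field(ge=..., le=...)` constraints.

```python
import json
from typing import Literal

from pydantic import BaseModel, Field, ValidationError

class StrictSentiment(BaseModel):
    # Literal restricts the field to the three allowed labels.
    sentiment: Literal["positive", "negative", "neutral"]
    # ge/le bound the confidence to the [0, 1] range.
    confidence: float = Field(ge=0.0, le=1.0)

def parse_sentiment(response: str) -> StrictSentiment:
    """Parse a JSON string and validate it against the schema."""
    return StrictSentiment(**json.loads(response))

good = parse_sentiment('{"sentiment": "positive", "confidence": 0.85}')
print(good.sentiment, good.confidence)

# An invalid label and a missing field are both rejected by the schema.
for bad in ('{"sentiment": "happy", "confidence": 0.9}', '{"sentiment": "neutral"}'):
    try:
        parse_sentiment(bad)
    except ValidationError as err:
        print(f"Rejected {bad}: {len(err.errors())} validation error(s)")
```

Because the error message names the offending field, it also makes useful feedback to append to the prompt on a retry.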

### Impact:
- Without retry: roughly 15-30% failure rate
- With retry (up to 3 attempts): roughly 1-2% failure rate
- **Improvement: 90%+ reduction in failures** (illustrative; measure on your own workload)

---

# 📝 Module 3 Complete!

## 🎓 What You've Mastered

### Chains
- ✅ Building robust chains
- ✅ Error handling and retry
- ✅ Performance tracking

### Agents (See full cells in original notebook)
- ✅ ReAct pattern
- ✅ Tool integration
- ✅ Multi-step reasoning

### Memory (See full cells in original notebook)
- ✅ Conversation management
- ✅ Token budgets
- ✅ Multiple strategies

### Evaluation (See full cells in original notebook)
- ✅ Test suite creation
- ✅ Metric tracking
- ✅ Regression detection

---

## 🎯 Quick Reference

**Building a chain:**
```python
chain = RobustChain(llm_func, parser, max_retries=3)
result = chain.invoke(prompt)
```

**Creating an agent:**
```python
agent = ReActAgent(llm, tools=[search, calculator])
result = agent.run(task)
```

**Setting up evaluation:**
```python
evaluator = EvaluationFramework()
evaluator.add_test_case("test1", test_input, expected)
results = evaluator.run(system)
```
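
For readers without the full notebook at hand, the evaluation calls above could be backed by something like the following. This is a hypothetical minimal version: the notebook's own `EvaluationFramework` may differ, and the exact-match comparison, the `is_regression` helper, and its `tolerance` parameter are assumptions made for this sketch.

```python
from typing import Any, Callable, Dict, List, Tuple

class EvaluationFramework:
    """Hypothetical minimal version; the notebook's own class may differ."""

    def __init__(self) -> None:
        self.cases: List[Tuple[str, Any, Any]] = []

    def add_test_case(self, name: str, input_value: Any, expected: Any) -> None:
        self.cases.append((name, input_value, expected))

    def run(self, system: Callable[[Any], Any]) -> Dict[str, Any]:
        """Run every case; a case passes when the output equals the expected value."""
        failed: List[str] = []
        for name, input_value, expected in self.cases:
            try:
                ok = system(input_value) == expected
            except Exception:
                ok = False  # a crash counts as a failed case
            if not ok:
                failed.append(name)
        total = len(self.cases)
        pass_rate = (total - len(failed)) / total if total else 0.0
        return {"total": total, "failed": failed, "pass_rate": pass_rate}

def is_regression(current: float, baseline: float, tolerance: float = 0.05) -> bool:
    """Flag a regression when the pass rate drops more than `tolerance` below baseline."""
    return current < baseline - tolerance

evaluator = EvaluationFramework()
evaluator.add_test_case("upper_word", "ok", "OK")
evaluator.add_test_case("upper_mixed", "Hi", "HI")
results = evaluator.run(str.upper)
print(results)
print(is_regression(results["pass_rate"], baseline=1.0))
```

Exact-match comparison suits structured outputs; free-text outputs generally need a looser scoring function.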

---

**Next:** Module 4 (LangGraph) for stateful workflows! 🚀