# Decorator-Based Middlewares for PII Protection - LangChain 1.0

**Module:** PII Guardrails with Decorator Patterns

**What you'll learn:**
- 🎯 `@before_model` and `@after_model` decorators
- 🚫 **Example 1**: REDACT strategy - Complete PII removal
- ⛔ **Example 2**: BLOCK strategy - Zero-tolerance PII detection
- 📊 Audit logging and compliance
- 🏭 Production-ready patterns

**PII Strategy Comparison:**

| Strategy | Description | Example | Best For |
|----------|-------------|---------|----------|
| `redact` | Replace with `[REDACTED_TYPE]` | `[REDACTED_EMAIL]` | Audit logs, complete anonymization |
| `block` | Stop the run and return an error when detected | Operation blocked | Zero-tolerance compliance |

**Time:** 1-2 hours

---

## Setup: Install Dependencies

In [None]:
!pip install --pre -U langchain langchain-openai langgraph
!pip install langgraph-checkpoint-sqlite

## Setup: Imports and Configuration

In [None]:
from google.colab import userdata
import os

os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')

# Core imports
from langchain.agents import create_agent, AgentState
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langgraph.checkpoint.memory import InMemorySaver
from typing import Annotated, Callable
from datetime import datetime
from functools import wraps
import json
import re

print("✅ Setup complete!")

## Setup: HR Database and Tools

In [None]:
# HR Employee Database with PII
EMPLOYEES = {
    "101": {
        "name": "Priya Sharma",
        "email": "priya.sharma@company.com",
        "phone": "+91-9876543210",
        "ssn": "123-45-6789",
        "credit_card": "4532-1234-5678-9012",
        "department": "Engineering",
        "role": "Senior Developer",
        "salary": 120000
    },
    "102": {
        "name": "Rahul Verma",
        "email": "rahul.verma@company.com",
        "phone": "+91-9876543211",
        "ssn": "987-65-4321",
        "credit_card": "5412-3456-7890-1234",
        "department": "Engineering",
        "role": "Engineering Manager",
        "salary": 180000
    },
    "103": {
        "name": "Anjali Patel",
        "email": "anjali.patel@company.com",
        "phone": "+91-9876543212",
        "ssn": "456-78-9012",
        "credit_card": "3782-822463-10005",
        "department": "HR",
        "role": "HR Director",
        "salary": 200000
    }
}

# HR Tools
@tool
def get_employee_info(employee_id: Annotated[str, "Employee ID"]) -> str:
    """Get employee information including contact details."""
    if employee_id in EMPLOYEES:
        emp = EMPLOYEES[employee_id]
        return f"""Employee: {emp['name']}
Email: {emp['email']}
Phone: {emp['phone']}
Department: {emp['department']}
Role: {emp['role']}"""
    return f"Employee {employee_id} not found"

@tool
def get_financial_info(employee_id: Annotated[str, "Employee ID"]) -> str:
    """Get financial information. SENSITIVE."""
    if employee_id in EMPLOYEES:
        emp = EMPLOYEES[employee_id]
        return f"""Financial Info - {emp['name']}:
Salary: ₹{emp['salary']:,}
Payment Card: {emp['credit_card']}
SSN: {emp['ssn']}"""
    return f"Employee {employee_id} not found"

print(f"✅ Loaded {len(EMPLOYEES)} employees with PII data")

## PII Detection Utility

In [None]:
class PIIDetector:
    """Utility for detecting PII in text."""
    
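    PATTERNS = {
        "email": r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
        # 16-digit cards in 4-4-4-4 groups, or 15-digit cards in 4-6-5 groups (e.g. Amex)
        "credit_card": r'\b(?:\d{4}[- ]?){3}\d{4}\b|\b\d{4}[- ]?\d{6}[- ]?\d{5}\b',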
        "phone": r'\+?\d{1,3}[-.]?\(?\d{3,4}\)?[-.]?\d{3,4}[-.]?\d{4}',
        "ssn": r'\b\d{3}-\d{2}-\d{4}\b',
    }
    
    @classmethod
    def detect_all(cls, text: str) -> dict:
        """Detect all PII types."""
        results = {}
        for pii_type, pattern in cls.PATTERNS.items():
            matches = re.findall(pattern, text)
            if matches:
                results[pii_type] = matches
        return results

print("✅ PII Detector ready")
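
A quick way to sanity-check the patterns is to run them against known samples. The sketch below is based on the 16-digit card, phone, and SSN patterns from `PIIDetector`, copied into a standalone dict so it runs on its own; the names `SAMPLE_PATTERNS` and `detect` are illustrative and not part of the notebook. It also shows that the phone pattern matches the first twelve digits of a card number, which matters whenever several patterns are applied to the same text.

```python
import re

# Copied from the detector above so this snippet is self-contained
SAMPLE_PATTERNS = {
    "credit_card": r'\b(?:\d{4}[- ]?){3}\d{4}\b',
    "phone": r'\+?\d{1,3}[-.]?\(?\d{3,4}\)?[-.]?\d{3,4}[-.]?\d{4}',
    "ssn": r'\b\d{3}-\d{2}-\d{4}\b',
}

def detect(text: str) -> dict:
    """Return {pii_type: [matches]} for every pattern that matches."""
    found = {}
    for pii_type, pattern in SAMPLE_PATTERNS.items():
        matches = re.findall(pattern, text)
        if matches:
            found[pii_type] = matches
    return found

ssn_hits = detect("SSN: 123-45-6789")
card_hits = detect("Card: 4532-1234-5678-9012")

print(ssn_hits)   # only the SSN pattern fires
print(card_hits)  # both the card and the phone pattern fire
```

Because of this overlap, the order in which patterns are applied during redaction affects the result.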

---
# Understanding Decorator-Based Middlewares

## Decorator Hook Points

```python
@before_agent     # Once at start of invocation
@before_model     # Before each LLM call (multiple times)
@after_model      # After each LLM response (multiple times)
@after_agent      # Once at end of invocation
```

## Execution Flow

```
User Input
    ↓
[@before_agent] ──────→ Session init, auth
    ↓
[@before_model] ──────→ Pre-process, validate
    ↓
[LLM Call] ───────────→ Generate response
    ↓
[@after_model] ───────→ Post-process, PII protection ⭐
    ↓
[Tool Calls?] ────────→ If needed
    ↓
[@after_agent] ───────→ Cleanup, logging
    ↓
Final Response
```

**For PII Protection:** `@after_model` sanitizes model responses (Example 1), and `@before_model` screens user input before it reaches the model (Example 2).
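
The ordering in the flow diagram can be traced with a few lines of plain Python. This is only a simulation of the hook order described above, not LangChain's scheduler; the function name `simulate_invocation` is made up for this sketch.

```python
def simulate_invocation(llm_calls: int) -> list[str]:
    """Return the order in which hook points fire for one agent invocation."""
    trace = ["before_agent"]               # once, at the start
    for call in range(llm_calls):
        trace.append("before_model")       # before every LLM call
        trace.append("model")
        trace.append("after_model")        # after every LLM response
        if call < llm_calls - 1:
            trace.append("tools")          # a tool call leads to another LLM call
    trace.append("after_agent")            # once, at the end
    return trace

# Two LLM calls: one that requests a tool, one that answers with the tool result
trace = simulate_invocation(2)
print(" -> ".join(trace))
```

The model-level hooks appear once per LLM call, while the agent-level hooks appear exactly once, which is why per-response sanitization belongs in `@after_model`.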

---

# Example 1: REDACT Strategy 🚫

## What is Redaction?

Complete removal of PII with type markers:

```
Input:  "Contact: john@company.com, Phone: +91-9876543210"
Output: "Contact: [REDACTED_EMAIL], Phone: [REDACTED_PHONE]"
```

**Use Cases:**
- Audit logs requiring complete anonymization
- Public data releases
- Training datasets
- GDPR/CCPA compliance
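
The input/output pair above can be reproduced with a minimal standalone sketch. The two patterns here are deliberately narrow and chosen only for this example (the phone pattern assumes a `+<country code>-<10 digits>` format); they are not the notebook's full detector.

```python
import re

# Illustrative patterns; real deployments need broader coverage
EMAIL = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')
PHONE = re.compile(r'\+\d{1,3}-\d{10}\b')

def redact(text: str) -> str:
    """Replace each match with a [REDACTED_<TYPE>] marker."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    text = PHONE.sub("[REDACTED_PHONE]", text)
    return text

redacted = redact("Contact: john@company.com, Phone: +91-9876543210")
print(redacted)
```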

---

## Lab 1.1: Implement REDACT with @after_model Decorator

In [None]:
# Global state for redaction logging
redaction_audit_log = []

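# Decorator implementation: a minimal teaching stand-in that wraps the hook
# function unchanged. LangChain 1.0 ships its own decorators of the same name in
# langchain.agents.middleware; this class is not that implementation.
class after_model:
    """Decorator for after-model hooks."""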
    
    def __init__(self, func: Callable):
        self.func = func
        wraps(func)(self)
    
    def __call__(self, state: AgentState) -> dict:
        return self.func(state)

# REDACT Decorator Function
@after_model
def redact_pii_from_response(state: AgentState) -> dict:
    """Redact PII from model responses using @after_model decorator."""
    
    messages = state.get("messages", [])
    if not messages:
        return {}
    
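    # Get last message (expected to be the AI response)
    last_msg = messages[-1]
    
    # Only non-empty plain-text content can be scanned with the regex patterns
    content = getattr(last_msg, 'content', None)
    if not isinstance(content, str) or not content:
        return {}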
    
    original_content = last_msg.content
    redacted_content = original_content
    redactions_made = []
    
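    # Detect and redact each PII type. Cards and SSNs are handled before phones
    # because the phone pattern also matches part of a 16-digit card number.
    pii_types_to_redact = ["email", "credit_card", "ssn", "phone"]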
    
    for pii_type in pii_types_to_redact:
        pattern = PIIDetector.PATTERNS.get(pii_type)
        if pattern:
            matches = re.findall(pattern, redacted_content)
            for match in matches:
                replacement = f"[REDACTED_{pii_type.upper()}]"
                redacted_content = redacted_content.replace(match, replacement)
                
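                # Log the redaction. The raw value is kept here for demonstration only;
                # a production audit log should store a hash or masked value instead.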
                redaction_entry = {
                    "type": pii_type,
                    "original": match,
                    "replacement": replacement,
                    "timestamp": datetime.now().isoformat()
                }
                redactions_made.append(redaction_entry)
                redaction_audit_log.append(redaction_entry)
    
    # If redactions were made, update the message
    if redactions_made:
        print(f"\n🚫 [@after_model] REDACT: {len(redactions_made)} PII item(s) redacted")
        for redaction in redactions_made:
            print(f"   • {redaction['type'].upper()}: {redaction['original']} → {redaction['replacement']}")
        
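        # Create a replacement message that reuses the original id. Assuming the
        # agent state merges messages by id (as LangGraph's add_messages reducer
        # does), this overwrites the unredacted message instead of appending a copy.
        new_msg = AIMessage(
            content=redacted_content,
            tool_calls=getattr(last_msg, 'tool_calls', None) or [],
            id=getattr(last_msg, 'id', None),
        )
        
        # Return only the changed message; the reducer merges it into the history
        return {"messages": [new_msg]}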
    
    return {}

print("✅ REDACT decorator function created!")

## Lab 1.2: Create Agent with REDACT Decorator

In [None]:
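# Register the hook with LangChain's own after_model decorator, imported under an
# alias so it does not replace the after_model class defined in Lab 1.1.
# Assumes the LangChain 1.0 middleware API, where hooks receive (state, runtime).
from langchain.agents.middleware import after_model as lc_after_model

@lc_after_model
def redact_middleware(state: AgentState, runtime) -> dict | None:
    """Middleware wrapper for redact decorator."""
    # An empty dict means "no change"; return None in that case
    return redact_pii_from_response(state) or None

# Create agent with REDACT middleware
redact_agent = create_agent(
    model="openai:gpt-4o-mini",
    tools=[get_employee_info, get_financial_info],
    middleware=[redact_middleware],
    system_prompt="""You are an HR assistant that provides employee information.
    
    All PII is automatically redacted for compliance.
    Be helpful and professional."""
)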

print("✅ HR Agent with REDACT strategy ready!")

## Lab 1.3: Test REDACT Strategy

In [None]:
print("=" * 70)
print("EXAMPLE 1: REDACT STRATEGY TEST")
print("=" * 70)

# Clear previous log
redaction_audit_log.clear()

# Test 1: Employee contact info
print("\n" + "="*70)
print("Test 1: Employee Contact Information")
print("="*70)

result = redact_agent.invoke({
    "messages": [{"role": "user", "content": "Get contact information for employee 101"}]
})

print("\n🤖 Agent Response (After Redaction):")
print(result['messages'][-1].content)

# Test 2: Financial info (more sensitive)
print("\n" + "="*70)
print("Test 2: Financial Information (High Sensitivity)")
print("="*70)

result = redact_agent.invoke({
    "messages": [{"role": "user", "content": "Get financial details for employee 102"}]
})

print("\n🤖 Agent Response (After Redaction):")
print(result['messages'][-1].content)

# Show complete audit log
print("\n" + "="*70)
print("📋 REDACTION AUDIT LOG (Compliance Report)")
print("="*70)

if redaction_audit_log:
    for i, entry in enumerate(redaction_audit_log, 1):
        print(f"\n{i}. PII Type: {entry['type'].upper()}")
        print(f"   Original Value: {entry['original']}")
        print(f"   Redacted To: {entry['replacement']}")
        print(f"   Timestamp: {entry['timestamp']}")
    
    print(f"\n✅ Total PII items redacted: {len(redaction_audit_log)}")
    print(f"\n📊 Breakdown by type:")
    type_counts = {}
    for entry in redaction_audit_log:
        pii_type = entry['type']
        type_counts[pii_type] = type_counts.get(pii_type, 0) + 1
    for pii_type, count in type_counts.items():
        print(f"   • {pii_type.upper()}: {count} instance(s)")
else:
    print("\nNo PII detected in responses.")

print("\n✅ REDACT strategy demonstration complete!")

---
# Example 2: BLOCK Strategy ⛔

## What is Blocking?

Stop execution immediately when PII is detected:

```
Input: "My SSN is 123-45-6789"
Result: ⛔ Operation BLOCKED - PII detected!
```

**Use Cases:**
- Zero-tolerance compliance requirements
- Healthcare (HIPAA)
- Financial systems (PCI-DSS)
- Systems that log everything
- Third-party integrations

**Key Difference:** BLOCK prevents the operation, REDACT sanitizes it.
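
One way to express blocking in plain Python is to raise an exception as soon as a critical pattern matches. The sketch below is illustrative only: `PIIBlockedError` and `ensure_no_critical_pii` are hypothetical names, and the lab that follows returns an error message through the middleware instead of raising.

```python
import re

class PIIBlockedError(Exception):
    """Raised when critical PII is found (hypothetical name for this sketch)."""

CRITICAL = {
    "ssn": re.compile(r'\b\d{3}-\d{2}-\d{4}\b'),
    "credit_card": re.compile(r'\b(?:\d{4}[- ]?){3}\d{4}\b'),
}

def ensure_no_critical_pii(text: str) -> str:
    """Return text unchanged, or raise if any critical pattern matches."""
    found = [name for name, pattern in CRITICAL.items() if pattern.search(text)]
    if found:
        raise PIIBlockedError(f"Blocked: detected {', '.join(found)}")
    return text

# Clean input passes through untouched
print(ensure_no_critical_pii("Tell me about employee 101's department"))

# Input with an SSN never reaches the next step
try:
    ensure_no_critical_pii("My SSN is 123-45-6789")
except PIIBlockedError as exc:
    print(exc)
```

Unlike redaction, nothing downstream ever sees the text, sanitized or not.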

---

## Lab 2.1: Implement BLOCK with @before_model Decorator

In [None]:
# Global state for blocking log
block_audit_log = []

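# Decorator for before-model: a minimal teaching stand-in that wraps the hook
# function unchanged. LangChain 1.0 ships its own before_model decorator in
# langchain.agents.middleware; this class is not that implementation.
class before_model:
    """Decorator for before-model hooks."""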
    
    def __init__(self, func: Callable):
        self.func = func
        wraps(func)(self)
    
    def __call__(self, state: AgentState) -> dict:
        return self.func(state)

# BLOCK Decorator Function
@before_model
def block_on_pii_in_input(state: AgentState) -> dict:
    """Block operations if PII detected in user input using @before_model decorator."""
    
    messages = state.get("messages", [])
    if not messages:
        return {}
    
    # Check the last user message
    for msg in reversed(messages):
        if hasattr(msg, 'type') and msg.type == "human" and hasattr(msg, 'content'):
            user_input = msg.content
            
            # Critical PII types that trigger blocking
            critical_pii_types = ["ssn", "credit_card"]
            
            detected_pii = {}
            for pii_type in critical_pii_types:
                pattern = PIIDetector.PATTERNS.get(pii_type)
                if pattern:
                    matches = re.findall(pattern, user_input)
                    if matches:
                        detected_pii[pii_type] = matches
            
            # If critical PII detected, BLOCK the operation
            if detected_pii:
                print(f"\n⛔ [@before_model] BLOCK: Critical PII detected in user input!")
                print(f"   Detected types: {list(detected_pii.keys())}")
                
                # Log the block event
                block_entry = {
                    "location": "user_input",
                    "detected_types": list(detected_pii.keys()),
                    "count": sum(len(v) for v in detected_pii.values()),
                    "timestamp": datetime.now().isoformat()
                }
                block_audit_log.append(block_entry)
                
                # Create error message
                pii_types_str = ", ".join([t.upper() for t in detected_pii.keys()])
                error_message = f"""⛔ OPERATION BLOCKED - PII DETECTED

Critical PII Types Detected: {pii_types_str}

This system has ZERO-TOLERANCE for sensitive PII in user input due to:
• Compliance requirements (GDPR, HIPAA, PCI-DSS)
• Data logging policies
• Security protocols

Please remove all sensitive personal information and try again.

Blocked at: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
Incident logged for audit."""
                
                # Stop execution immediately
                return {
                    "messages": messages + [AIMessage(content=error_message)],
                    "jump_to": "__end__"  # Terminate agent execution
                }
            
            break  # Only check most recent user message
    
    return {}

print("✅ BLOCK decorator function created!")

## Lab 2.2: Create Agent with BLOCK Decorator

In [None]:
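# Register the hook with LangChain's own before_model decorator, imported under an
# alias so it does not replace the before_model class defined in Lab 2.1.
# Assumes the LangChain 1.0 middleware API: hooks receive (state, runtime), and a
# hook that ends the run declares can_jump_to=["end"] and returns "jump_to": "end".
from langchain.agents.middleware import before_model as lc_before_model

@lc_before_model(can_jump_to=["end"])
def block_middleware(state: AgentState, runtime) -> dict | None:
    """Middleware wrapper for block decorator."""
    update = block_on_pii_in_input(state)
    if not update:
        return None
    # A non-empty update means the input was blocked: append only the error
    # message and jump to the exit target
    return {"messages": update["messages"][-1:], "jump_to": "end"}

# Create agent with BLOCK middleware
block_agent = create_agent(
    model="openai:gpt-4o-mini",
    tools=[get_employee_info, get_financial_info],
    middleware=[block_middleware],
    system_prompt="""You are an HR assistant with strict PII protection.
    
    Any critical PII (SSN, Credit Cards) in user input will be BLOCKED.
    This is non-negotiable for compliance."""
)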

print("✅ HR Agent with BLOCK strategy ready!")

## Lab 2.3: Test BLOCK Strategy

In [None]:
print("=" * 70)
print("EXAMPLE 2: BLOCK STRATEGY TEST")
print("=" * 70)

# Clear previous log
block_audit_log.clear()

# Test 1: Normal query (no PII) - Should work
print("\n" + "="*70)
print("Test 1: Normal Query Without PII (ALLOWED)")
print("="*70)

result = block_agent.invoke({
    "messages": [{"role": "user", "content": "Tell me about employee 101's department"}]
})

print("\n✅ Query allowed - No PII detected")
print(f"\n🤖 Agent Response:\n{result['messages'][-1].content[:200]}...")

# Test 2: Query with SSN (critical PII) - Should be BLOCKED
print("\n" + "="*70)
print("Test 2: Query with SSN (BLOCKED)")
print("="*70)

result = block_agent.invoke({
    "messages": [{"role": "user", "content": "My SSN is 123-45-6789. Can you help me?"}]
})

print(f"\n🤖 System Response:\n{result['messages'][-1].content}")

# Test 3: Query with Credit Card - Should be BLOCKED
print("\n" + "="*70)
print("Test 3: Query with Credit Card (BLOCKED)")
print("="*70)

result = block_agent.invoke({
    "messages": [{"role": "user", "content": "Update payment info to card 4532-1234-5678-9012"}]
})

print(f"\n🤖 System Response:\n{result['messages'][-1].content}")

# Show complete audit log
print("\n" + "="*70)
print("📋 BLOCK AUDIT LOG (Security Incidents)")
print("="*70)

if block_audit_log:
    for i, entry in enumerate(block_audit_log, 1):
        print(f"\n⛔ Incident #{i}")
        print(f"   Location: {entry['location']}")
        print(f"   Detected Types: {', '.join([t.upper() for t in entry['detected_types']])}")
        print(f"   PII Count: {entry['count']}")
        print(f"   Timestamp: {entry['timestamp']}")
        print(f"   Status: ⛔ BLOCKED")
    
    print(f"\n📊 Summary:")
    print(f"   Total blocked attempts: {len(block_audit_log)}")
    all_types = []
    for entry in block_audit_log:
        all_types.extend(entry['detected_types'])
    type_counts = {}
    for pii_type in all_types:
        type_counts[pii_type] = type_counts.get(pii_type, 0) + 1
    print(f"   Breakdown by PII type:")
    for pii_type, count in type_counts.items():
        print(f"      • {pii_type.upper()}: {count} block(s)")
else:
    print("\nNo blocking events recorded.")

print("\n✅ BLOCK strategy demonstration complete!")

---
# Comparison: REDACT vs BLOCK

## Side-by-Side Comparison

| Aspect | REDACT 🚫 | BLOCK ⛔ |
|--------|-----------|----------|
| **When** | After model generates response | Before model processes input |
| **Action** | Sanitize and allow | Stop and reject |
| **Decorator** | `@after_model` | `@before_model` |
| **Result** | Modified response returned | Error message, execution stopped |
| **Use Case** | Audit logs, training data | Zero-tolerance compliance |
| **User Experience** | Transparent | Explicit rejection |
| **Compliance** | Supports GDPR-style data minimization | Suits strict HIPAA/PCI-DSS policies |

## When to Use Each Strategy

### Use REDACT when:
- ✅ You need to provide responses but protect PII
- ✅ Audit logs must be anonymized
- ✅ Training ML models on conversation data
- ✅ Public data releases
- ✅ General GDPR compliance

### Use BLOCK when:
- ✅ Regulated industry (healthcare, finance)
- ✅ System logs all conversations
- ✅ Third-party integrations
- ✅ Zero-tolerance compliance requirements
- ✅ Critical PII must NEVER be processed

## Production Best Practice: Combine Both!

```python
# Layer 1: block critical PII (SSN, credit cards) in user input
# Layer 2: redact any PII that still appears in model output
# block_middleware and redact_middleware are the wrappers from Labs 2.2 and 1.2
agent = create_agent(
    model="openai:gpt-4o-mini",
    tools=[get_employee_info, get_financial_info],
    middleware=[block_middleware, redact_middleware],
)
```
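
The layering idea can also be tried without an agent. The sketch below is a plain-Python illustration, assuming a stand-in `fake_model` function in place of the LLM and two narrow patterns picked for the example; none of these names come from LangChain.

```python
import re

SSN = re.compile(r'\b\d{3}-\d{2}-\d{4}\b')
EMAIL = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

def fake_model(prompt: str) -> str:
    """Stand-in for the LLM call; always leaks an email address."""
    return "You can reach Priya at priya.sharma@company.com"

def guarded_call(user_input: str) -> str:
    # Layer 1: refuse before the model sees critical PII
    if SSN.search(user_input):
        return "BLOCKED: remove sensitive identifiers and try again"
    # Layer 2: sanitize whatever the model returns
    return EMAIL.sub("[REDACTED_EMAIL]", fake_model(user_input))

allowed = guarded_call("How do I contact Priya?")
blocked = guarded_call("My SSN is 123-45-6789, how do I contact Priya?")
print(allowed)
print(blocked)
```

The first layer decides whether the model runs at all; the second only ever sees responses from requests that passed the first.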

---

# Summary

## Key Learnings

### Decorator Pattern Benefits
- ✅ **Clear Separation**: Each decorator has one responsibility
- ✅ **Reusable**: Apply same decorator to multiple agents
- ✅ **Composable**: Stack multiple decorators
- ✅ **Testable**: Test decorators independently
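
To illustrate the testability point, a hook that only reads the state can be exercised with simple stand-ins, without a model or an agent. Everything in this sketch is hypothetical: `FakeMessage` mimics only the `type` and `content` attributes a hook reads, and `block_if_ssn` is a reduced hook written for the example.

```python
import re
from dataclasses import dataclass

@dataclass
class FakeMessage:
    """Stand-in for a chat message; only the fields the hook reads."""
    type: str
    content: str

SSN = re.compile(r'\b\d{3}-\d{2}-\d{4}\b')

def block_if_ssn(state: dict) -> dict:
    """Flag the run when the most recent human message contains an SSN."""
    for msg in reversed(state.get("messages", [])):
        if msg.type == "human":
            if SSN.search(msg.content):
                return {"blocked": True}
            break  # only the most recent human message is checked
    return {}

def test_clean_input_passes():
    state = {"messages": [FakeMessage("human", "Which department is employee 101 in?")]}
    assert block_if_ssn(state) == {}

def test_ssn_is_blocked():
    state = {"messages": [FakeMessage("human", "My SSN is 123-45-6789")]}
    assert block_if_ssn(state) == {"blocked": True}

def test_only_latest_human_message_counts():
    state = {"messages": [
        FakeMessage("human", "My SSN is 123-45-6789"),
        FakeMessage("ai", "Please do not share that."),
        FakeMessage("human", "Sorry. Which department is employee 101 in?"),
    ]}
    assert block_if_ssn(state) == {}

for test in (test_clean_input_passes, test_ssn_is_blocked, test_only_latest_human_message_counts):
    test()
print("all hook tests passed")
```

The third test documents a design choice worth reviewing: checking only the latest human message means earlier PII in the history is not re-examined.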

### Hook Points
- `@before_model`: Input validation, blocking
- `@after_model`: Output sanitization, redaction

### PII Strategies
- **REDACT**: Sanitize and allow
- **BLOCK**: Reject and stop

## Production Checklist

- [ ] Identify PII types in your system
- [ ] Map compliance requirements
- [ ] Choose appropriate strategy (or combine)
- [ ] Implement decorator-based middleware
- [ ] Add comprehensive audit logging
- [ ] Test edge cases
- [ ] Document for compliance
- [ ] Monitor and review logs regularly
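
For the audit-logging item, one option is to record a keyed fingerprint of each detected value rather than the value itself, so that log entries can be correlated without storing PII. This is a sketch under stated assumptions: `AUDIT_KEY` is a hypothetical placeholder that would come from a secret store, and `audit_entry` is an illustrative helper, not part of the notebook.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical placeholder; load the real key from a secret manager
AUDIT_KEY = b"replace-with-a-secret-from-your-vault"

def audit_entry(pii_type: str, raw_value: str) -> dict:
    """Build a log record that identifies a value without storing it."""
    fingerprint = hmac.new(AUDIT_KEY, raw_value.encode("utf-8"), hashlib.sha256).hexdigest()
    return {
        "type": pii_type,
        "value_fingerprint": fingerprint[:16],  # truncated keyed hash for correlation
        "replacement": f"[REDACTED_{pii_type.upper()}]",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

entry = audit_entry("ssn", "123-45-6789")
print(entry)
```

A keyed hash is used instead of a bare SHA-256 because values such as SSNs have so few possible combinations that an unkeyed hash could be reversed by enumeration.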

## Code Templates

### REDACT Template
```python
@after_model
def redact_pii(state: AgentState) -> dict:
    # 1. Get last message
    # 2. Detect PII
    # 3. Replace with [REDACTED_TYPE]
    # 4. Return updated messages
    pass
```

### BLOCK Template
```python
@before_model
def block_on_pii(state: AgentState) -> dict:
    # 1. Check user input
    # 2. Detect critical PII
    # 3. If found: return error + jump_to __end__
    # 4. Else: return {}
    pass
```

---

**Congratulations!** 🎉 You now understand decorator-based middleware for PII protection in LangChain 1.0!