# Agent Guardrails and Human-in-the-Loop (HITL)

Implement PII detection, content filtering, and human oversight for production agents.

**What you'll learn:**
- This notebook covers three PIIMiddleware strategies: redact, mask, block
- Redact replaces detected PII with a placeholder such as `[REDACTED_EMAIL]`
- Mask partially obscures the value (preserves context)
- Block raises an error instead of processing (for critical data)
- Custom patterns define domain-specific PII
- HITL adds human approval for sensitive actions
- Defense in depth: Use multiple protection layers

## Why Guardrails Matter

Production agents need protection against:
- Leaking personally identifiable information (PII)
- Processing secrets such as API keys and passwords
- Generating inappropriate content
- Exposing security vulnerabilities
- Violating compliance requirements

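**PII Middleware Strategies:**
1. **Redact**: Replace PII with a placeholder such as `[REDACTED_EMAIL]`
2. **Mask**: Partially obscure the value, for example showing only the last four digits of a card number
3. **Block**: Raise an error so the request is not processed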

## Strategy Reference

### PII Strategies
| Strategy | Effect | Use Case |
|----------|--------|----------|
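| redact | Replaces the match with a placeholder (e.g. `[REDACTED_EMAIL]`) | PII that shouldn't be logged |
| mask | Partially obscures the value (e.g. only the last four card digits stay visible) | Preserve context while hiding data |
| block | Raises an error instead of processing | Critical secrets (API keys) |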
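
To make the difference between the three strategies concrete, here is a minimal pure-Python sketch that imitates them with the standard `re` module. It is not how `PIIMiddleware` is implemented: the regexes, the `BlockedInput` exception, and the helper names are all hypothetical, and the placeholder and masking formats are assumptions about what such strategies typically produce.

```python
import re

# Hypothetical, simplified patterns for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CARD_RE = re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")
API_KEY_RE = re.compile(r"sk-[a-zA-Z0-9]{32}")


class BlockedInput(Exception):
    """Hypothetical error raised when blocked content is found."""


def redact(text: str, pattern: re.Pattern, label: str) -> str:
    # Redact: swap every match for a typed placeholder.
    return pattern.sub(f"[REDACTED_{label.upper()}]", text)


def mask_card(text: str) -> str:
    # Mask: hide most of the value but keep the last four digits for context.
    def _mask(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return "****-****-****-" + digits[-4:]

    return CARD_RE.sub(_mask, text)


def block(text: str, pattern: re.Pattern, label: str) -> str:
    # Block: refuse to continue as soon as the pattern appears.
    if pattern.search(text):
        raise BlockedInput(f"{label} detected; refusing to process")
    return text


print(redact("Mail me at udemy@kgptalkie.com", EMAIL_RE, "email"))
print(mask_card("My card is 4532-1234-5678-9010"))
```

Redaction discards the value entirely, masking keeps just enough to let a user recognise which card is meant, and blocking stops the request before any model call happens.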

### HITL Decisions
| Decision | Effect | Use Case |
|----------|--------|----------|
| approve | Execute as-is | Safe operations |
| edit | Modify then execute | Adjust parameters |
| reject | Block with feedback | Dangerous operations |
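
Each decision in the table corresponds to a small dictionary passed back when resuming the agent. The helpers below are a hypothetical convenience layer (the function names are not part of LangChain); the dictionary shapes are the ones used in the HITL examples later in this notebook.

```python
from typing import Any


def approve() -> dict[str, Any]:
    # Run the tool call exactly as the model proposed it.
    return {"type": "approve"}


def edit(tool_name: str, args: dict[str, Any]) -> dict[str, Any]:
    # Run the tool call with human-supplied arguments instead.
    return {"type": "edit", "edited_action": {"name": tool_name, "args": args}}


def reject(message: str) -> dict[str, Any]:
    # Skip the tool call and send feedback back to the model.
    return {"type": "reject", "message": message}


def resume_payload(*decisions: dict[str, Any]) -> dict[str, Any]:
    # Wrap decisions in the structure expected under Command(resume=...).
    return {"decisions": list(decisions)}


print(resume_payload(reject("Use a WHERE clause to specify records.")))
```

With these in place, a resume call could read `Command(resume=resume_payload(approve()))`.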

## Setup

In [None]:
import sys
sys.path.append('../')

import os
from dotenv import load_dotenv
load_dotenv()

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import create_agent
from langchain.messages import HumanMessage
from langchain.agents.middleware import PIIMiddleware
from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3
from scripts import base_tools

In [None]:
model = ChatGoogleGenerativeAI(model='gemini-2.5-flash')
system_prompt = """You are a helpful customer service assistant.
Assist users with their questions while protecting their privacy."""

# Setup checkpointer (create the db/ directory first so sqlite3 can open the file)
os.makedirs("db", exist_ok=True)
conn = sqlite3.connect("db/guardrails_agent.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)
checkpointer.setup()

## PII Protection Strategies

A single agent can stack several `PIIMiddleware` instances, one per PII type, each with its own strategy.

In [None]:
# Agent with multiple PII protection strategies
agent = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        # Strategy 1: BLOCK - Raise an error if an API key is detected
        PIIMiddleware("api_key", detector=r"sk-[a-zA-Z0-9]{32}", strategy="block"),

        # Strategy 2: REDACT - Replace emails with a placeholder such as [REDACTED_EMAIL]
        PIIMiddleware("email", strategy="redact", apply_to_input=True),

        # Strategy 3: MASK - Partially obscure credit card numbers
        PIIMiddleware("credit_card", strategy="mask", apply_to_input=True),
        
        # Additional protections
        PIIMiddleware("phone", detector=r"\d{3}-\d{3}-\d{4}", strategy="redact"),
        PIIMiddleware("url", strategy="redact", apply_to_input=True)
    ]
)

## Test Different Strategies

In [None]:
# Test 1: Email REDACTION (replaces the address with a placeholder)
config = {'configurable': {'thread_id': 'session1'}}

response = agent.invoke({
    'messages': [HumanMessage(
        "Hi, my name is Laxmi Kant. My email is udemy@kgptalkie.com"
    )]
}, config=config)

print("Test 1 - REDACT Strategy:")
print("Response:", response['messages'][-1].content)

In [None]:
# Test 2: Credit Card MASKING (partially obscures the number)

response = agent.invoke({
    'messages': [HumanMessage(
        "I need to update my payment. My card is 4532-1234-5678-9010"
    )]
}, config=config)

print("Test 2 - MASK Strategy:")
print("Response:", response['messages'][-1].content)
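
Many credit card detectors only treat a digit sequence as a card number if it passes the Luhn checksum, and LangChain's built-in `credit_card` detector may do the same (an assumption worth checking against your installed version). If it does, a sample number that fails Luhn would pass through unmasked. The sketch below is a plain implementation of the checksum so you can check test data before relying on it; `luhn_valid` is a hypothetical helper, not a LangChain function.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4532-1234-5678-9010"))  # False
print(luhn_valid("4111-1111-1111-1111"))  # True
```

The number used in Test 2 fails the checksum, so if the response still shows the full card number, try a Luhn-valid test number such as `4111-1111-1111-1111`.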

In [None]:
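# Test 3: API Key BLOCKING (raises instead of processing)

try:
    response = agent.invoke({
        'messages': [HumanMessage(
            "Here's my API key: sk-1234567890abcdefghijklmnopqrstuv"
        )]
    }, config=config)
    print("Response:", response['messages'][-1].content)
except Exception as e:
    # The block strategy is expected to raise when the pattern matches
    print("Test 3 - BLOCK Strategy:", type(e).__name__, e)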

In [None]:
# Test 4: Multiple PII types in one message

response = agent.invoke({
    'messages': [HumanMessage(
        """Hi, I'm Laxmi Kant from KGP Talkie.
        Email: udemy@kgptalkie.com, Phone: 555-123-4567,
        Website: https://kgptalkie.com"""
    )]
}, config=config)

print("Test 4 - Combined Protection:")
print("Response:", response['messages'][-1].content)

## Custom PII Patterns

Pass a regular expression as `detector` to define domain-specific PII types that the built-in detectors do not cover.

In [None]:
# Agent with custom patterns
agent = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        # Custom pattern: Employee IDs (EMP-123456)
        PIIMiddleware("employee_id", detector=r"EMP-\d{6}", strategy="mask"),
        
        # Custom pattern: Order IDs (ORD-ABC123)
        PIIMiddleware("order_id", detector=r"ORD-[A-Z0-9]{6}", strategy="redact"),
        
        # Standard patterns
        PIIMiddleware("email", strategy="redact"),
        PIIMiddleware("phone", detector=r"\d{3}-\d{3}-\d{4}", strategy="redact")
    ]
)

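# Use a fresh thread so earlier test messages are not part of this conversation
config = {'configurable': {'thread_id': 'custom_patterns'}}

response = agent.invoke({
    'messages': [HumanMessage(
        "My employee ID is EMP-123456 and order ID is ORD-ABC123"
    )]
}, config=config)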

print("Custom Patterns:")
print("Response:", response['messages'][-1].content)
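
Custom detectors are only as good as their regexes, so it helps to check them with the standard `re` module before wiring them into an agent. The sketch below reuses the patterns from the cell above; `find_pii` and `PATTERNS` are hypothetical names for illustration. It also shows one pitfall: without word boundaries, a pattern can match part of a longer identifier.

```python
import re

# Same regexes as the detectors defined above.
PATTERNS = {
    "employee_id": r"EMP-\d{6}",
    "order_id": r"ORD-[A-Z0-9]{6}",
    "phone": r"\d{3}-\d{3}-\d{4}",
}


def find_pii(text: str) -> dict[str, list[str]]:
    # Return every match per pattern so false positives are easy to spot.
    return {name: re.findall(pattern, text) for name, pattern in PATTERNS.items()}


sample = "My employee ID is EMP-123456 and order ID is ORD-ABC123"
print(find_pii(sample))

# A seven-digit reference is not an employee ID, yet the bare pattern matches part of it.
print(re.findall(r"EMP-\d{6}", "ref EMP-1234567"))      # ['EMP-123456']
print(re.findall(r"\bEMP-\d{6}\b", "ref EMP-1234567"))  # []
```

Whether to add `\b` anchors depends on your data; tightening a pattern reduces false positives but can also miss values embedded in longer strings.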

## Human-in-the-Loop (HITL)

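Add human approval for sensitive tool actions. `HumanInTheLoopMiddleware` pauses the run with an interrupt before a listed tool executes, and the run continues once a decision is passed back with `Command(resume=...)`. Because the paused state has to be persisted between the two calls, a checkpointer is required and the resume call must use the same `thread_id`.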

In [None]:
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langchain.tools import tool
from langgraph.types import Command

@tool
def write_file(path: str, content: str):
    """Write content to file."""
    try:
        with open(path, 'w') as f:
            f.write(content)
        return f"Successfully wrote to {path}"
    except Exception as e:
        return f"Error: {e}"

@tool
def execute_sql(query: str):
    """Execute SQL query."""
    return f"Would execute: {query}"

# Agent with HITL
agent = create_agent(
    model=model,
    tools=[write_file, execute_sql],
    checkpointer=checkpointer,
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "write_file": True,  # All decisions: approve, edit, reject
                "execute_sql": {"allowed_decisions": ["approve", "reject"]},  # No editing
            },
            description_prefix="Tool execution pending approval",
        )
    ]
)
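
The `interrupt_on` mapping above can be read as a per-tool policy: `True` allows every decision, while a dictionary narrows the list. The following sketch captures that reading in plain Python. It is a mental model rather than the middleware's actual logic, and `allowed_decisions` and `ALL_DECISIONS` are hypothetical names.

```python
from typing import Any

ALL_DECISIONS = ("approve", "edit", "reject")


def allowed_decisions(interrupt_on: dict[str, Any], tool_name: str) -> tuple[str, ...]:
    """Return the decisions a reviewer may give for `tool_name`.

    An empty tuple means the tool is not gated and runs without review.
    """
    rule = interrupt_on.get(tool_name, False)
    if rule is True:
        return ALL_DECISIONS
    if not rule:
        return ()
    return tuple(rule["allowed_decisions"])


# Same policy as the agent defined above.
policy = {
    "write_file": True,
    "execute_sql": {"allowed_decisions": ["approve", "reject"]},
}

for tool in ("write_file", "execute_sql", "web_search"):
    print(tool, "->", allowed_decisions(policy, tool))
```

A check like this can be used in a review UI to hide the "edit" button for tools where editing is not permitted.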

In [None]:
# HITL Example 1: APPROVE action
config = {"configurable": {"thread_id": "hitl_approve"}}

result = agent.invoke({
    "messages": [HumanMessage("Write 'Hello World' to test.txt")]
}, config=config)

if "__interrupt__" in result:
    print("Interrupt:", result['__interrupt__'][0].value['action_requests'][0])
    
    # Approve
    result = agent.invoke(
        Command(resume={"decisions": [{"type": "approve"}]}),
        config=config
    )
    print("\nApproved:", result['messages'][-1].content)

In [None]:
# HITL Example 2: EDIT action
config = {"configurable": {"thread_id": "hitl_edit"}}

result = agent.invoke({
    "messages": [HumanMessage("Write 'Original' to data.txt")]
}, config=config)

if "__interrupt__" in result:
    print("Original:", result['__interrupt__'][0].value['action_requests'][0])
    
    # Edit before execution
    result = agent.invoke(
        Command(resume={
            "decisions": [{
                "type": "edit",
                "edited_action": {
                    "name": "write_file",
                    "args": {"path": "data.txt", "content": "Modified content"}
                }
            }]
        }),
        config=config
    )
    print("\nEdited:", result['messages'][-1].content)

In [None]:
# HITL Example 3: REJECT action
config = {"configurable": {"thread_id": "hitl_reject"}}

result = agent.invoke({
    "messages": [HumanMessage("Delete all records from database")]
}, config=config)

if "__interrupt__" in result:
    print("Dangerous:", result['__interrupt__'][0].value['action_requests'][0])
    
    # Reject with feedback
    result = agent.invoke(
        Command(resume={
            "decisions": [{
                "type": "reject",
                "message": "Too dangerous. Use WHERE clause to specify records."
            }]
        }),
        config=config
    )
    print("\nRejected:", result['messages'][-1].content)
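
The three examples above each handle a single pending tool call, but an interrupt can carry several action requests at once. The sketch below builds one decision per request, in order, using a simple rule-based reviewer. It assumes each request is a dictionary with a `name` key and its arguments under `args` (falling back to `arguments`); check the printed interrupt value in your own run, since these key names are an assumption. `review` and `build_resume` are hypothetical helpers.

```python
from typing import Any


def request_args(request: dict[str, Any]) -> dict[str, Any]:
    # Assumption: arguments live under "args"; fall back to "arguments" if not.
    return request.get("args") or request.get("arguments") or {}


def review(request: dict[str, Any]) -> dict[str, Any]:
    """Apply a simple rule: reject DELETE statements that have no WHERE clause."""
    if request["name"] == "execute_sql":
        query = str(request_args(request).get("query", "")).upper()
        if query.lstrip().startswith("DELETE") and "WHERE" not in query:
            return {
                "type": "reject",
                "message": "Too dangerous. Use WHERE clause to specify records.",
            }
    return {"type": "approve"}


def build_resume(action_requests: list[dict[str, Any]]) -> dict[str, Any]:
    # One decision per pending request, in the same order.
    return {"decisions": [review(request) for request in action_requests]}


pending = [
    {"name": "write_file", "args": {"path": "test.txt", "content": "Hello World"}},
    {"name": "execute_sql", "args": {"query": "DELETE FROM records"}},
]
print(build_resume(pending))
```

In the notebook this would be used as `Command(resume=build_resume(result['__interrupt__'][0].value['action_requests']))`. An automatic reviewer like this is a pre-filter, and in practice ambiguous cases would still be routed to a person.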