# Agent Guardrails and Human-in-the-Loop (HITL)

Implement PII detection, content filtering, and human oversight for production agents.

**What you'll learn:**
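- PIIMiddleware strategies covered here: redact, mask, and block
- Redact replaces detected PII with a placeholder such as `[REDACTED_EMAIL]`
- Mask hides most of the value but keeps part of it visible (for example, the last four digits of a card number)
- Block raises an error so the request is not processed (for critical data)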
- Custom patterns define domain-specific PII
- HITL adds human approval for sensitive actions
- Defense in depth: Use multiple protection layers

## Why Guardrails Matter

Production agents need protection against:
- Leaking personally identifiable information (PII)
- Processing sensitive data (API keys, passwords)
- Inappropriate content generation
- Security vulnerabilities
- Compliance violations

**PII Middleware Strategies:**
1. **Redact**: Replace PII with a placeholder such as `[REDACTED_EMAIL]`
2. **Mask**: Partially hide the value (for example, `****-****-****-1234`)
3. **Block**: Raise an error so the request is not processed
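
To make the difference between the strategies concrete, here is a minimal plain-Python sketch for email addresses. It is not LangChain's implementation: the regex, the `PIIBlocked` exception, and the masked format are assumptions chosen for illustration, and the middleware's real output may look different.

```python
import re

# Simplified email pattern (an assumption; real detectors may be stricter).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class PIIBlocked(Exception):
    """Hypothetical error raised by the 'block' strategy in this sketch."""


def apply_strategy(text: str, strategy: str) -> str:
    """Apply one of three illustrative strategies to email addresses."""
    if strategy == "redact":
        # Swap the whole value for a typed placeholder.
        return EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    if strategy == "mask":
        # Keep the first character and the domain, hide the rest.
        def _mask(match: re.Match) -> str:
            local, domain = match.group(0).split("@", 1)
            return local[0] + "*" * (len(local) - 1) + "@" + domain
        return EMAIL_RE.sub(_mask, text)
    if strategy == "block":
        # Refuse to continue if any PII is present.
        if EMAIL_RE.search(text):
            raise PIIBlocked("email detected")
        return text
    raise ValueError(f"unknown strategy: {strategy}")


sample = "Mail me at john.doe@example.com"
print(apply_strategy(sample, "redact"))  # Mail me at [REDACTED_EMAIL]
print(apply_strategy(sample, "mask"))    # Mail me at j*******@example.com
```

Redact and mask let the conversation continue with sanitized text, while block stops it outright, which is why block suits secrets that should never reach the model.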

## Setup

In [None]:
import sys
sys.path.append('../')

import os
from dotenv import load_dotenv
load_dotenv()

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import create_agent
from langchain.messages import HumanMessage
from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3
from scripts import base_tools

In [None]:
model = ChatGoogleGenerativeAI(model='gemini-2.5-flash')
system_prompt = """You are a helpful customer service assistant.
Assist users with their questions while protecting their privacy."""

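# Set up the checkpointer (create the db/ directory first so sqlite3 can open the file)
os.makedirs("db", exist_ok=True)
conn = sqlite3.connect("db/guardrails_agent.db", check_same_thread=False)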
checkpointer = SqliteSaver(conn)
checkpointer.setup()

## 1. Agent Without Guardrails (Unsafe)

First, see the problem with unprotected agents.

In [None]:
agent = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer
)

print("Unsafe agent created (no PII protection)")

In [None]:
# Test with PII - This will expose sensitive information!
config = {'configurable': {'thread_id': 'unsafe_session'}}

response = agent.invoke({
    'messages': [HumanMessage(
        "Hi, my name is John Doe. My email is john.doe@example.com and my phone is 555-123-4567."
    )]
}, config=config)

print("Response:", response['messages'][-1].content)
print("\nNotice: PII was processed without protection!")

## 2. PIIMiddleware - Email Redaction

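Redact email addresses in user input before the model sees them. With `apply_to_input=True` only the input is covered; model output is not scanned unless `apply_to_output=True` is also set.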

In [None]:
from langchain.agents.middleware import PIIMiddleware

agent = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        PIIMiddleware(
            "email",
            strategy="redact",
            apply_to_input=True
        )
    ]
)

print("Agent with email redaction created")

In [None]:
# Test email redaction
config = {'configurable': {'thread_id': 'email_redact_session'}}

response = agent.invoke({
    'messages': [HumanMessage(
        "Hi, my name is Laxmi Kant. Here is my email info@kgptalkie.com"
    )]
}, config=config)

print("Original message:", "Hi, my name is Laxmi Kant. Here is my email info@kgptalkie.com")
print("\nAgent response:", response['messages'][-1].content)
print("\nEmail was redacted!")

## 3. PIIMiddleware - Masking Strategy

Partially hide PII instead of replacing it entirely. For credit cards, most digits are replaced with asterisks and only the last few stay visible.

In [None]:
agent = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        PIIMiddleware(
            "credit_card",
            strategy="mask",
            apply_to_input=True
        )
    ]
)

print("Agent with credit card masking created")

In [None]:
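# Test masking. The built-in credit_card detector is Luhn-validated,
# so use a test number that passes the checksum.
config = {'configurable': {'thread_id': 'mask_session'}}

response = agent.invoke({
    'messages': [HumanMessage(
        "I need to update my payment. My card is 4111-1111-1111-1111"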
    )]
}, config=config)

print("Response:", response['messages'][-1].content)
print("\nCredit card number was masked!")
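
LangChain's documentation describes the built-in `credit_card` detector as Luhn-validated, so a 16-digit number that fails the checksum may pass through unmasked. A minimal sketch of the Luhn check, written here in plain Python as an illustration rather than taken from the library, lets you verify test numbers before relying on them:

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_valid("4111-1111-1111-1111"))  # True
print(luhn_valid("4532-1234-5678-9010"))  # False
```

A number such as `4532-1234-5678-9010` looks like a card number but fails the checksum, so it is a poor choice for testing the masking behavior.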

## 4. PIIMiddleware - Blocking Strategy

Reject requests that contain sensitive data. With `strategy="block"`, the middleware raises an exception instead of passing the message to the model.

In [None]:
agent = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        PIIMiddleware(
            "api_key",
            detector=r"sk-[a-zA-Z0-9]{32}",  # Pattern for API keys
            strategy="block"
        )
    ]
)

print("Agent with API key blocking created")

In [None]:
# Test blocking (the middleware raises, so catch the exception)
config = {'configurable': {'thread_id': 'block_session'}}

try:
    response = agent.invoke({
        'messages': [HumanMessage(
            "Here's my API key: sk-1234567890abcdefghijklmnopqrstuv"
        )]
    }, config=config)
    print("Response:", response)
except Exception as e:
    print(f"Request blocked: {e}")
    print("\nAPI key was detected and request was blocked!")
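
Because the custom detector is an ordinary regular expression, it is worth checking which inputs it catches before trusting it. The sketch below runs the same pattern against a few hypothetical samples using plain `re.search`; it assumes the middleware applies the pattern as a normal regex search and does not call LangChain.

```python
import re

# Same pattern as the detector above.
API_KEY_PATTERN = r"sk-[a-zA-Z0-9]{32}"

# Hypothetical inputs, chosen to probe the pattern's edges.
samples = {
    "exact 32 chars": "sk-1234567890abcdefghijklmnopqrstuv",
    "too short": "sk-12345",
    "embedded in text": "key=sk-1234567890abcdefghijklmnopqrstuv;",
    "different prefix": "pk-1234567890abcdefghijklmnopqrstuv",
}

for label, text in samples.items():
    found = re.search(API_KEY_PATTERN, text) is not None
    print(f"{label}: {'detected' if found else 'missed'}")
```

Keys with a different prefix or length are missed, so a production pattern usually needs one entry per key format your systems issue.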

## 5. Multiple PII Protections

Stack several `PIIMiddleware` instances, one per PII type, for broader coverage.

In [None]:
agent = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        PIIMiddleware("api_key", detector=r"sk-[a-zA-Z0-9]{32}", strategy="block"),
        PIIMiddleware("email", strategy="redact", apply_to_input=True),
        PIIMiddleware("credit_card", strategy="mask", apply_to_input=True),
        PIIMiddleware("url", strategy="redact", apply_to_input=True)
    ]
)

print("Secure agent with multiple PII protections created")

In [None]:
# Test comprehensive protection
config = {'configurable': {'thread_id': 'secure_session'}}

response = agent.invoke({
    'messages': [HumanMessage(
        "Hi, I'm Laxmi Kant. Email: info@kgptalkie.com. Website: https://kgptalkie.com"
    )]
}, config=config)

print("Response:", response['messages'][-1].content)
print("\nMultiple PII types protected!")

## 6. Human-in-the-Loop (HITL)

Add human approval for sensitive tool actions.

In [None]:
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langchain.tools import tool
from langgraph.types import Command

@tool
def write_file(path: str, content: str):
    """Write content to file."""
    try:
        with open(path, 'w') as f:
            f.write(content)
        return f"Successfully wrote to {path}"
    except Exception as e:
        return f"Error: {e}"

@tool
def execute_sql(query: str):
    """Execute SQL query."""
    return f"Would execute: {query}"

agent = create_agent(
    model=model,
    tools=[write_file, execute_sql],
    checkpointer=checkpointer,
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "write_file": True,  # All decisions allowed (approve, edit, reject)
                "execute_sql": {"allowed_decisions": ["approve", "reject"]},  # No editing
            },
            description_prefix="Tool execution pending approval",
        )
    ]
)

print("Agent with HITL middleware created")

## 7. HITL - Approve Action

In [None]:
# Run until interrupt
config = {"configurable": {"thread_id": "hitl_session_1"}}

result = agent.invoke({
    "messages": [HumanMessage("Write 'Hello World' to test.txt")]
}, config=config)

# Check for interrupt
if "__interrupt__" in result:
    print("Interrupt detected!")
    print("Action:", result['__interrupt__'][0].value['action_requests'][0])
    
    # Approve the action
    result = agent.invoke(
        Command(resume={"decisions": [{"type": "approve"}]}),
        config=config
    )
    print("\nAction approved and executed!")
    print("Result:", result['messages'][-1].content)

## 8. HITL - Edit Action

In [None]:
# Run until interrupt
config = {"configurable": {"thread_id": "hitl_session_2"}}

result = agent.invoke({
    "messages": [HumanMessage("Write 'Original content' to data.txt")]
}, config=config)

# Edit the action before execution
if "__interrupt__" in result:
    print("Original action:", result['__interrupt__'][0].value['action_requests'][0])
    
    result = agent.invoke(
        Command(resume={
            "decisions": [{
                "type": "edit",
                "edited_action": {
                    "name": "write_file",
                    "args": {"path": "data.txt", "content": "Modified content"}
                }
            }]
        }),
        config=config
    )
    print("\nAction edited and executed!")
    print("Result:", result['messages'][-1].content)

## 9. HITL - Reject Action

In [None]:
# Run until interrupt
config = {"configurable": {"thread_id": "hitl_session_3"}}

result = agent.invoke({
    "messages": [HumanMessage("Delete all records from database")]
}, config=config)

# Reject the dangerous action
if "__interrupt__" in result:
    print("Dangerous action detected:", result['__interrupt__'][0].value['action_requests'][0])
    
    result = agent.invoke(
        Command(resume={
            "decisions": [{
                "type": "reject",
                "message": "This action is too dangerous. Please specify which records to delete with a WHERE clause."
            }]
        }),
        config=config
    )
    print("\nAction rejected with feedback!")
    print("Result:", result['messages'][-1].content)

## 10. HITL Decision Types

| Decision | Description | Use Case |
|----------|-------------|----------|
| approve | Execute as-is | Safe action |
| edit | Modify before executing | Adjust parameters |
| reject | Block with feedback | Dangerous action |
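
The three decision payloads used in the sections above can be built with small helpers, which keeps the dictionary shapes in one place. The helper names (`approve`, `edit`, `reject`, `resume_payload`) are hypothetical; only the dictionary keys mirror the examples in this notebook.

```python
from typing import Any


def approve() -> dict[str, Any]:
    """Decision: run the tool call as proposed."""
    return {"type": "approve"}


def edit(tool_name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Decision: run the tool call with replacement arguments."""
    return {"type": "edit", "edited_action": {"name": tool_name, "args": args}}


def reject(message: str) -> dict[str, Any]:
    """Decision: skip the tool call and send feedback to the agent."""
    return {"type": "reject", "message": message}


def resume_payload(*decisions: dict[str, Any]) -> dict[str, Any]:
    """Wrap decisions in the shape passed as Command(resume=...)."""
    return {"decisions": list(decisions)}


payload = resume_payload(
    edit("write_file", {"path": "data.txt", "content": "Modified content"})
)
print(payload)
```

The result would be passed as `agent.invoke(Command(resume=payload), config=config)`, exactly as in the edit example above.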

## 11. Production Safety Checklist

Before deploying your agent:

**PII Protection:**
- [ ] Email addresses redacted/masked
- [ ] Credit card numbers protected
- [ ] API keys and secrets blocked
- [ ] Phone numbers handled
- [ ] Custom domain-specific PII patterns added
- [ ] Both input and output protected
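
Phone numbers are not handled by any middleware configured in this notebook, so they need a custom pattern. A minimal sketch, assuming US-style numbers like the one in the unsafe example; the pattern and the `"phone"` type name are hypothetical and would need adapting to the formats you actually receive:

```python
import re

# Hypothetical pattern for numbers such as 555-123-4567, 555.123.4567, 555 123 4567.
PHONE_PATTERN = r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"

message = (
    "Hi, my name is John Doe. My email is john.doe@example.com "
    "and my phone is 555-123-4567."
)
print(re.findall(PHONE_PATTERN, message))  # ['555-123-4567']
```

The pattern could then be supplied the same way the API key pattern is, for example `PIIMiddleware("phone", detector=PHONE_PATTERN, strategy="redact")`.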

**Human Oversight:**
- [ ] HITL configured for dangerous operations
- [ ] File system access requires approval
- [ ] Database modifications require approval
- [ ] External API calls reviewed
- [ ] Proper decision types configured per tool

**General:**
- [ ] Tested with realistic examples
- [ ] Logging configured (without PII)
- [ ] Security monitoring enabled
- [ ] Checkpointer configured for production
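
For the "Logging configured (without PII)" item, one option is a `logging.Filter` that scrubs messages before any handler writes them. This is a sketch using only the standard library; the logger name and the email-only regex are assumptions, and a real deployment would cover every PII type it protects elsewhere.

```python
import logging
import re

# Simplified email pattern (an assumption; extend for other PII types).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace email addresses in log messages before they are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its args first, then scrub the result.
        record.msg = EMAIL_RE.sub("[REDACTED_EMAIL]", record.getMessage())
        record.args = ()
        return True  # keep the record; only its text is changed


logger = logging.getLogger("agent_audit")  # hypothetical logger name
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.addFilter(RedactEmailFilter())
logger.addHandler(handler)

logger.info("User %s asked about billing", "john.doe@example.com")
```

Attaching the filter to the handler rather than to individual call sites means a forgotten log statement cannot leak an address.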