# Agent Guardrails and Safety

Implement PII detection, content filtering, and safety mechanisms for production agents.

## Why Guardrails Matter

Production agents need protection against:
- ❌ Leaking personally identifiable information (PII)
- ❌ Processing sensitive data (API keys, passwords)
- ❌ Inappropriate content generation
- ❌ Security vulnerabilities
- ❌ Compliance violations

**PII Middleware Strategies:**
1. **Redact**: Remove PII completely
2. **Mask**: Replace with placeholder (***)
3. **Block**: Prevent request from processing
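
To make the three strategies concrete, here is a minimal sketch that uses only the standard library. It is not how `PIIMiddleware` is implemented; the function names (`redact`, `mask`, `block`), the `[REDACTED_EMAIL]` placeholder, and the `PIIBlockedError` exception are hypothetical and chosen only for illustration.

```python
import re

# Simple email pattern, good enough for a demonstration.
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


class PIIBlockedError(ValueError):
    """Hypothetical exception raised when the block strategy finds PII."""


def redact(text: str) -> str:
    # Replace every match with a fixed placeholder.
    return EMAIL.sub("[REDACTED_EMAIL]", text)


def mask(text: str) -> str:
    # Keep the first character and the domain, hide the rest of the local part.
    def _mask(match: re.Match) -> str:
        local, _, domain = match.group(0).partition("@")
        return local[0] + "*" * (len(local) - 1) + "@" + domain

    return EMAIL.sub(_mask, text)


def block(text: str) -> str:
    # Refuse to process the text at all if any match is present.
    if EMAIL.search(text):
        raise PIIBlockedError("email detected; request rejected")
    return text


sample = "Contact me at john.doe@example.com"
print(redact(sample))
print(mask(sample))
```

Redaction discards the value, masking keeps enough of it for a human to recognize which value was meant, and blocking stops the request before anything else runs.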

## Setup

In [None]:
import os
from dotenv import load_dotenv

load_dotenv()

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import create_agent
from langchain.messages import HumanMessage, AIMessage
from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3
from scripts import base_tools

In [None]:
model = ChatGoogleGenerativeAI(model='gemini-2.5-flash')
system_prompt = """You are a helpful customer service assistant.
Assist users with their questions while protecting their privacy."""

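# Set up the checkpointer (create the data directory first so sqlite3 can open the file)
os.makedirs("data", exist_ok=True)
conn = sqlite3.connect("data/guardrails_agent.db", check_same_thread=False)
checkpointer = SqliteSaver(conn=conn)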

## 1. Agent Without Guardrails (Unsafe)

First, see the problem with unprotected agents.

In [None]:
unsafe_agent = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer
)

print("⚠️ Unsafe agent created (no PII protection)")

In [None]:
# Test with PII - This will expose sensitive information!
config = {'configurable': {'thread_id': 'unsafe_session'}}

response = unsafe_agent.invoke({
    'messages': [HumanMessage(
        "Hi, my name is John Doe. My email is john.doe@example.com and my phone is 555-123-4567."
    )]
}, config=config)

print("Response:", response['messages'][-1].text)
print("\n⚠️ Notice: PII was processed without protection!")

## 2. PIIMiddleware - Email Redaction

Redact email addresses in user input before they reach the model.

In [None]:
from langchain.agents.middleware import PIIMiddleware

agent_email_redact = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        PIIMiddleware(
            "email",
            strategy="redact",
            apply_to_input=True
        )
    ]
)

print("✅ Agent with email redaction created")

In [None]:
# Test email redaction
config = {'configurable': {'thread_id': 'email_redact_session'}}

response = agent_email_redact.invoke({
    'messages': [HumanMessage(
        "Hi, my name is Laxmi Kant. Here is my email info@kgptalkie.com"
    )]
}, config=config)

# Check the processed message
print("Original message:", "Hi, my name is Laxmi Kant. Here is my email info@kgptalkie.com")
print("\nAgent response:", response['messages'][-1].text)
print("\n✅ Email was redacted!")

## 3. PIIMiddleware - Masking Strategy

Replace PII with asterisks instead of removing it.

In [None]:
agent_mask = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        PIIMiddleware(
            "credit_card",
            strategy="mask",
            apply_to_input=True
        )
    ]
)

print("✅ Agent with credit card masking created")

In [None]:
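# Test masking. The sample is a Luhn-valid test number, so it is still
# detected if the built-in credit_card detector validates checksums.
config = {'configurable': {'thread_id': 'mask_session'}}

response = agent_mask.invoke({
    'messages': [HumanMessage(
        "I need to update my payment. My card is 4111-1111-1111-1111"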
    )]
}, config=config)

print("Response:", response['messages'][-1].text)
print("\n✅ Credit card number was masked!")

## 4. PIIMiddleware - Blocking Strategy

Prevent requests containing sensitive data from processing.

In [None]:
agent_block = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        PIIMiddleware(
            "api_key",
            detector=r"sk-[a-zA-Z0-9]{32}",  # Pattern for API keys
            strategy="block"
        )
    ]
)

print("✅ Agent with API key blocking created")

In [None]:
# Test blocking (this should fail gracefully)
config = {'configurable': {'thread_id': 'block_session'}}

try:
    response = agent_block.invoke({
        'messages': [HumanMessage(
            "Here's my API key: sk-1234567890abcdefghijklmnopqrstuv"
        )]
    }, config=config)
    print("Response:", response)
except Exception as e:
    print(f"🚫 Request blocked: {e}")
    print("\n✅ API key was detected and request was blocked!")

## 5. Multiple PII Protections

Combine multiple PII middleware for comprehensive protection.

In [None]:
agent_secure = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        # Block API keys
        PIIMiddleware(
            "api_key",
            detector=r"sk-[a-zA-Z0-9]{32}",
            strategy="block"
        ),
        # Redact emails
        PIIMiddleware(
            "email",
            strategy="redact",
            apply_to_input=True
        ),
        # Mask credit cards
        PIIMiddleware(
            "credit_card",
            strategy="mask",
            apply_to_input=True
        ),
        # Redact URLs
        PIIMiddleware(
            "url",
            strategy="redact",
            apply_to_input=True
        )
    ]
)

print("✅ Secure agent with multiple PII protections created")

In [None]:
# Test comprehensive protection
config = {'configurable': {'thread_id': 'secure_session'}}

response = agent_secure.invoke({
    'messages': [HumanMessage(
        "Hi, I'm Laxmi Kant. Email: info@kgptalkie.com. "
        "Website: https://kgptalkie.com"
    )]
}, config=config)

print("Response:", response['messages'][-1].text)
print("\n✅ Multiple PII types protected!")
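
The idea of layering several protections can be sketched without any agent framework. The example below is an illustration under stated assumptions, not LangChain's implementation: the `Rule` dataclass, the `apply_rules` helper, and the `[REDACTED_...]` placeholders are hypothetical. It checks blocking rules first so that a rejected request is never partially transformed, then applies the redaction rules in order.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    strategy: str  # "block" or "redact"


RULES = [
    Rule("api_key", re.compile(r"sk-[a-zA-Z0-9]{32}"), "block"),
    Rule("email", re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"), "redact"),
    Rule("url", re.compile(r"https?://[^\s]+"), "redact"),
]


def apply_rules(text: str, rules=RULES) -> str:
    # Pass 1: reject the whole request if any blocking rule matches.
    for rule in rules:
        if rule.strategy == "block" and rule.pattern.search(text):
            raise ValueError(f"{rule.name} detected; request rejected")
    # Pass 2: apply each redaction rule in list order.
    for rule in rules:
        if rule.strategy == "redact":
            text = rule.pattern.sub(f"[REDACTED_{rule.name.upper()}]", text)
    return text


print(apply_rules("Email: info@kgptalkie.com. Website: https://kgptalkie.com"))
```

Rule order matters once patterns can overlap, for example a URL that embeds an email address, so it is worth testing combined inputs as well as each rule on its own.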

## 6. Custom PII Patterns

Define custom regex patterns for domain-specific PII.

In [None]:
# Custom pattern for employee IDs
agent_custom = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    checkpointer=checkpointer,
    middleware=[
        PIIMiddleware(
            "employee_id",
            detector=r"EMP-\d{6}",  # Pattern: EMP-123456
            strategy="mask"
        ),
        PIIMiddleware(
            "phone",
            detector=r"\d{3}-\d{3}-\d{4}",  # Pattern: 123-456-7890
            strategy="redact"
        )
    ]
)

print("✅ Agent with custom PII patterns created")

In [None]:
# Test custom patterns
config = {'configurable': {'thread_id': 'custom_pii_session'}}

response = agent_custom.invoke({
    'messages': [HumanMessage(
        "My employee ID is EMP-123456 and phone is 555-123-4567"
    )]
}, config=config)

print("Response:", response['messages'][-1].text)
print("\n✅ Custom PII patterns detected and protected!")
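
Before wiring a custom regex into a detector, it can help to check it against near-miss inputs with plain `re`. The sketch below compares the phone pattern used above with a stricter variant; the names `loose_phone`, `strict_phone`, and `employee_id` are hypothetical, and the stricter patterns are one possible tightening rather than a recommended standard.

```python
import re

# Pattern as used above: matches anywhere, even inside a longer digit run.
loose_phone = re.compile(r"\d{3}-\d{3}-\d{4}")

# Stricter variant: lookarounds reject matches that touch other digits.
strict_phone = re.compile(r"(?<!\d)\d{3}-\d{3}-\d{4}(?!\d)")

# Word boundaries stop "TEMP-1234567" from being read as an employee ID.
employee_id = re.compile(r"\bEMP-\d{6}\b")

samples = [
    "phone is 555-123-4567",   # genuine phone number
    "order 1555-123-45678",    # order number that merely contains the shape
]
for text in samples:
    print(text, bool(loose_phone.search(text)), bool(strict_phone.search(text)))
```

The loose pattern flags the order number as a phone number, while the strict one does not. Either pattern string can be passed as `detector=` in the same way as the examples above.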

## 7. Input vs Output Protection

Control whether to apply PII protection to input, output, or both.

In [None]:
# Protect only inputs (user messages)
agent_input_only = create_agent(
    model=model,
    tools=[base_tools.web_search],
    system_prompt=system_prompt,
    middleware=[
        PIIMiddleware(
            "email",
            strategy="redact",
            apply_to_input=True  # Input only; responses are filtered only if apply_to_output=True is also set
        )
    ]
)

print("Input protection: user data is redacted before reaching the model")
print("Output protection: not enabled in this agent, so responses are not filtered")
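
The difference between input-only and two-way protection can be shown with a stand-in model. This is a minimal sketch, assuming a hypothetical `guarded` wrapper and a `fake_model` function that deliberately puts an address in its reply; no real LLM or LangChain API is involved.

```python
import re
from typing import Callable

EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def scrub(text: str) -> str:
    return EMAIL.sub("[REDACTED_EMAIL]", text)


def guarded(model_fn: Callable[[str], str], *, scrub_input: bool = True,
            scrub_output: bool = True) -> Callable[[str], str]:
    # Wrap a text-in, text-out function with optional scrubbing on each side.
    def call(prompt: str) -> str:
        if scrub_input:
            prompt = scrub(prompt)
        reply = model_fn(prompt)
        return scrub(reply) if scrub_output else reply

    return call


def fake_model(prompt: str) -> str:
    # Stand-in for an LLM call that introduces an address of its own.
    return f"You said: {prompt} Please also write to support@example.com."


input_only = guarded(fake_model, scrub_output=False)
both = guarded(fake_model)

print(input_only("Mail me at a@b.io"))
print(both("Mail me at a@b.io"))
```

With input-only scrubbing, the address introduced by the model still reaches the caller, which is why protecting only one direction is not sufficient.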

## 8. Safety Best Practices

### ✅ DO:
- Use `block` strategy for highly sensitive data (API keys, passwords)
- Use `redact` for PII that shouldn't be logged
- Use `mask` when you need to preserve context
- Apply protection to both input AND output
- Test with realistic PII examples
- Log blocked requests for security monitoring

### ❌ DON'T:
- Only protect input (output can leak data too)
- Use overly broad patterns (they cause false positives)
- Rely solely on middleware (practice defense in depth)
- Forget to test edge cases
- Store unredacted PII in logs
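
One way to keep unredacted PII out of logs is to scrub records at the handler. The sketch below uses the standard `logging` module; the `RedactingFilter` class and the `guardrails_demo` logger name are hypothetical, and only email addresses are covered here.

```python
import io
import logging
import re

EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


class RedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its args first, then scrub the final string.
        record.msg = EMAIL.sub("[REDACTED_EMAIL]", record.getMessage())
        record.args = ()
        return True  # Keep the record, now without the address.


stream = io.StringIO()  # In-memory target so the output can be inspected.
handler = logging.StreamHandler(stream)
handler.addFilter(RedactingFilter())

logger = logging.getLogger("guardrails_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # Do not pass the record on to the root logger's handlers.

logger.info("Blocked request from %s", "john.doe@example.com")
print(stream.getvalue())
```

Attaching the filter to the handler rather than to individual call sites means blocked requests can still be logged for monitoring without each caller having to remember to scrub.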

## 9. Common PII Types and Patterns

Reference guide for PII detection patterns:

In [None]:
pii_patterns = {
    "email": r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",
    "phone_us": r"\d{3}-\d{3}-\d{4}",
    "ssn": r"\d{3}-\d{2}-\d{4}",
    "credit_card": r"\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}",
    "ip_address": r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}",
    "api_key": r"sk-[a-zA-Z0-9]{32}",
    "url": r"https?://[^\s]+"
}

print("Common PII Patterns:")
for pii_type, pattern in pii_patterns.items():
    print(f"  {pii_type}: {pattern}")
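
Shape-only patterns such as the `credit_card` regex above match any 16 digits. A common way to cut false positives is to add a Luhn checksum test on top of the regex. The sketch below is a standalone illustration; `luhn_ok` and `find_cards` are hypothetical helper names.

```python
import re

CARD = re.compile(r"\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}")


def luhn_ok(number: str) -> bool:
    # Luhn checksum: double every second digit from the right,
    # subtract 9 from results above 9, and require a total divisible by 10.
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return bool(digits) and total % 10 == 0


def find_cards(text: str) -> list[str]:
    # Keep only regex matches that also pass the checksum.
    return [m.group(0) for m in CARD.finditer(text) if luhn_ok(m.group(0))]


print(find_cards("cards 4111 1111 1111 1111 and 1234-5678-9012-3456"))
```

Only the first number passes the checksum, so the second is treated as an ordinary digit string. The same idea applies to the `ip_address` pattern, where each octet could additionally be checked to fall in the range 0 to 255.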

## 10. Production Safety Checklist

Before deploying your agent:

- [ ] PII middleware configured for all sensitive data types
- [ ] API keys and secrets blocked
- [ ] Email addresses redacted/masked
- [ ] Credit card numbers protected
- [ ] Phone numbers handled
- [ ] Custom domain-specific PII patterns added
- [ ] Both input and output protected
- [ ] Tested with realistic (synthetic) PII examples
- [ ] Logging configured (without PII)
- [ ] Security monitoring enabled

## 11. Key Takeaways

✅ **PIIMiddleware** provides three strategies: redact, mask, block

✅ **Redact** - Remove PII completely

✅ **Mask** - Replace with placeholder (preserves context)

✅ **Block** - Prevent processing (for critical data)

✅ **Custom patterns** - Define domain-specific PII

✅ **Defense in depth** - Use multiple protection layers

✅ **Test thoroughly** - Validate with realistic (synthetic) PII examples

💡 **Production Tip**: Always protect both input AND output

## Exercise

1. Create an agent with PII protection for your domain
2. Define custom PII patterns relevant to your use case
3. Test with various PII types (email, phone, etc.)
4. Experiment with different strategies (redact, mask, block)
5. Verify both input and output are protected

In [None]:
# Your code here
